Optimize Analysis Performance with a GPU

Media Server can use a graphics card (GPU) to perform some processing tasks. Using a GPU rather than the CPU can significantly increase the speed of analysis tasks that use Convolutional Neural Networks.

This section describes how to configure analysis tasks to achieve optimum performance when you accelerate processing using a GPU.

When you are using a GPU to accelerate processing, configure your analysis task with NumParallel=1. To specify the number of video frames to process concurrently on the GPU, set the parameter GPUNumParallel. The value of this parameter must be a power of 2, such as 4, 8, 16, 32, 64, and so on.

[Demographics]
Type=Demographics
Input=FaceDetect.DataWithSource
NumParallel=1
GPUNumParallel=32
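The power-of-2 requirement can be checked before you edit the configuration. The following is a minimal Python sketch and not part of Media Server; the function names are hypothetical. It treats 1 and 2 as powers of 2 in the mathematical sense, and whether Media Server accepts values that small is not stated here.

```python
def is_power_of_two(n: int) -> bool:
    """Return True when n is a positive power of two (1, 2, 4, 8, ...)."""
    # A power of two has exactly one bit set, so n & (n - 1) clears it to zero.
    return n > 0 and (n & (n - 1)) == 0


def largest_power_of_two_at_most(limit: int) -> int:
    """Return the largest power of two that does not exceed limit."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    # bit_length() gives the position of the highest set bit.
    return 1 << (limit.bit_length() - 1)
```

For example, if testing suggests that about 48 frames fit in GPU memory, largest_power_of_two_at_most(48) gives 32 as a candidate value for GPUNumParallel.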

Choose the highest value for GPUNumParallel that fits in the memory available on your GPU. If you set a value that is too high and the GPU runs out of memory, analysis fails. You can monitor the amount of GPU memory used by Media Server with the nvidia-smi tool.
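One way to watch memory use while you try different GPUNumParallel values is to poll nvidia-smi from a script. The following Python sketch assumes that nvidia-smi is on the PATH; the helper names are hypothetical. It reports memory for each whole GPU, not for the Media Server process alone, so other applications that use the GPU are included in the figures.

```python
import subprocess

# Query used and total memory per GPU, in MiB, as bare comma-separated numbers.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(text: str) -> list[tuple[int, int]]:
    """Parse lines such as '1234, 8192' into (used_mib, total_mib) pairs."""
    readings = []
    for line in text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        readings.append((used, total))
    return readings


def read_gpu_memory() -> list[tuple[int, int]]:
    """Run nvidia-smi once and return one (used_mib, total_mib) pair per GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_csv(result.stdout)
```

Call read_gpu_memory() at intervals while analysis is running. If the used figure approaches the total, the current GPUNumParallel value is probably close to the limit for that GPU.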

If you are processing low numbers of frames, Media Server might send video frames to the GPU before there are enough for a complete batch. You can configure the amount of time that Media Server waits for video frames by setting the configuration parameter GPUBatchingDuration. To maximize throughput and use the GPU most efficiently, configure a long GPUBatchingDuration so that Media Server waits until there is a full batch of video frames to analyze. If you are processing low volumes of frames or require the analysis results as rapidly as possible, you can reduce the duration.
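The trade-off between throughput and latency can be estimated with simple arithmetic. The following Python sketch is an illustration only. It assumes that frames reach the analysis task at a steady rate, and the function name is hypothetical. It does not model how Media Server schedules batches, and the units and default value of GPUBatchingDuration are not covered here.

```python
def seconds_to_fill_batch(batch_size: int, frames_per_second: float) -> float:
    """Estimate how long a batch of batch_size frames takes to accumulate.

    Assumes frames arrive at a constant frames_per_second.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return batch_size / frames_per_second
```

With GPUNumParallel=32 and 25 frames per second reaching the task, a batch fills in about 1.3 seconds. At 2 frames per second the same batch takes 16 seconds to fill, so a shorter GPUBatchingDuration might be preferable when results are needed quickly.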

If your analysis task supports the configuration parameter SampleInterval, you can decrease the interval, because analysis with a GPU is significantly faster than with a CPU and so Media Server can process significantly more frames. (For more information about sample intervals, see Choose the Rate of Ingestion.) In the case of face recognition and demographics analysis, which process the output of another analysis task, set the SampleInterval parameter on the first analysis task (face detection).

[FaceDetect]
Type=FaceDetect
// NumParallel is set because Face Detection uses the CPU for analysis
NumParallel=8
FaceDirection=Front
MinSize=200
SizeUnit=pixel
SampleInterval=0ms


[Demographics]
Type=Demographics
Input=FaceDetect.DataWithSource
NumParallel=1
GPUNumParallel=32

In this example, face detection attempts to process every frame because SampleInterval=0ms. Every frame that contains a detected face is written to the FaceDetect.DataWithSource track, and the demographics analysis task processes these frames using the GPU.
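To see how a sample interval affects the number of frames that reach the GPU, you can estimate the sampled frame rate. The following Python sketch assumes that a SampleInterval of N milliseconds means at most one frame is analyzed every N milliseconds, and that 0ms means every frame, as in the example above. The function name is hypothetical, and the calculation gives an upper bound because only frames that contain a detected face are passed on.

```python
def analyzed_frames_per_second(source_fps: float, sample_interval_ms: float) -> float:
    """Estimate how many frames per second the first analysis task samples."""
    if source_fps <= 0:
        raise ValueError("source_fps must be positive")
    if sample_interval_ms < 0:
        raise ValueError("sample_interval_ms must not be negative")
    if sample_interval_ms == 0:
        # An interval of zero means every frame is a candidate for analysis.
        return source_fps
    # The task cannot sample more frames than the source provides.
    return min(source_fps, 1000.0 / sample_interval_ms)
```

For a 25 frames per second source, an interval of 200 milliseconds gives at most 5 frames per second, and an interval of 0 gives all 25. You can combine this estimate with the GPUNumParallel value to judge how quickly batches fill.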

