When it comes to graphics processors, Nvidia generally offers two line-ups: GeForce (consumer) and Quadro (professional). As you’d expect, each has its pros and cons, and the balance is tipped by the environment the card is being installed in and the type of processing it has to do.
The first thing to understand is that both GeForce and Quadro are built on the same underlying architectures. New architectures are released regularly, however, and within each architecture there are multiple variants of the GPU itself.
For example, one of Nvidia’s most recent architectures is Turing, which comes in three variants: TU102, TU104, and TU106. Don’t be fooled into thinking the TU106 is more powerful than the TU102, though. In general terms, the lower the number, the more components are integrated into the processing unit, and therefore the greater the processing performance. This means two graphics cards that share the same architecture can differ hugely in capability.
Turning to Intelligent Video Analytics (IVA) applications, there is a series of MXM GPUs with features geared directly towards video processing. The Nvidia Quadro RTX series of MXM GPUs offers the same high performance we’ve seen with the Turing and Pascal architectures, but adds features for video-intensive applications such as machine vision and IVA that are not included in its GeForce RTX cousin.
Nvidia’s GPUDirect for Video, available only on Quadro RTX modules, allows manufacturers to write device drivers that efficiently transfer video frames in and out of Nvidia GPU memory. The result is faster processing: the GPU and I/O devices work together on synchronised transfers rather than sitting through forced wait times, latency drops because data moves in smaller chunks instead of one large block, and CPU overhead falls because data is copied through pinned host memory.
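To make the pinned-memory and chunked-transfer idea concrete, here is a minimal CUDA sketch. It does not use the GPUDirect for Video SDK itself (that API is exposed through vendor device drivers); it simply illustrates the underlying pattern the feature builds on: page-locked host memory and smaller asynchronous copies that can overlap with GPU work. The frame size, chunk count, and kernel are illustrative assumptions.

```cuda
// Illustrative sketch only: NOT the GPUDirect for Video API, just the underlying
// ideas described above -- pinned host memory and smaller asynchronous chunk
// transfers that overlap with GPU work -- using plain CUDA runtime calls.
#include <cuda_runtime.h>
#include <cstdio>

// Trivial placeholder kernel standing in for per-chunk video processing.
__global__ void process(unsigned char* data, size_t n) {
    size_t i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = 255 - data[i];   // e.g. invert pixel values
}

int main() {
    const size_t frameBytes = 1920 * 1080 * 4;       // one RGBA 1080p frame (assumption)
    const int    numChunks  = 4;                     // split the frame into smaller transfers
    const size_t chunkBytes = frameBytes / numChunks;

    unsigned char *hostFrame, *devFrame;
    // Pinned (page-locked) host memory is what makes the copies truly asynchronous
    // and is the mechanism the article credits with reducing CPU overhead.
    cudaHostAlloc((void**)&hostFrame, frameBytes, cudaHostAllocDefault);
    cudaMalloc((void**)&devFrame, frameBytes);

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Copy and process the frame chunk by chunk so transfers and kernels overlap,
    // instead of waiting for the whole frame to arrive before any work starts.
    for (int c = 0; c < numChunks; ++c) {
        size_t offset = c * chunkBytes;
        cudaMemcpyAsync(devFrame + offset, hostFrame + offset, chunkBytes,
                        cudaMemcpyHostToDevice, stream);
        int threads = 256;
        int blocks  = (int)((chunkBytes + threads - 1) / threads);
        process<<<blocks, threads, 0, stream>>>(devFrame + offset, chunkBytes);
        cudaMemcpyAsync(hostFrame + offset, devFrame + offset, chunkBytes,
                        cudaMemcpyDeviceToHost, stream);
    }
    cudaStreamSynchronize(stream);

    printf("processed %zu bytes in %d chunks\n", frameBytes, numChunks);

    cudaStreamDestroy(stream);
    cudaFree(devFrame);
    cudaFreeHost(hostFrame);
    return 0;
}
```

Because the copies are issued into a stream from pinned memory, each chunk’s kernel can start as soon as that chunk has landed on the GPU, rather than the whole frame being transferred before any processing begins.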
Another feature of Quadro RTX GPU modules is multi-GPU scalability. Many intensive AI applications, such as inference and video analysis, require multiple GPU cards to manage the load. Multi-GPU scalability allows several GPUs to be integrated into a single system, and multiple systems to then be used for the application, essentially scaling the number of Tensor cores into the thousands.
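The sketch below shows the single-system half of that scaling pattern at its simplest, using only standard CUDA runtime calls rather than any Quadro-specific API: the workload is sharded evenly across every GPU visible to the system, with each shard launched asynchronously on its own device. The kernel and workload size are placeholders.

```cuda
// Minimal multi-GPU sketch, not a production inference pipeline: the workload
// is sharded across every visible GPU using cudaSetDevice and per-device streams.
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

// Placeholder kernel standing in for an inference / video-analysis workload.
__global__ void analyse(float* data, size_t n) {
    size_t i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * 0.5f + 1.0f;
}

int main() {
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);
    if (deviceCount == 0) { printf("no CUDA devices found\n"); return 1; }

    const size_t totalElems = 1 << 24;                  // whole workload (assumption)
    const size_t perGpu     = totalElems / deviceCount; // shard it evenly

    std::vector<float*> buffers(deviceCount);
    std::vector<cudaStream_t> streams(deviceCount);

    // Launch each GPU's shard asynchronously so all devices work in parallel.
    for (int d = 0; d < deviceCount; ++d) {
        cudaSetDevice(d);
        cudaMalloc((void**)&buffers[d], perGpu * sizeof(float));
        cudaStreamCreate(&streams[d]);
        int threads = 256;
        int blocks  = (int)((perGpu + threads - 1) / threads);
        analyse<<<blocks, threads, 0, streams[d]>>>(buffers[d], perGpu);
    }

    // Wait for every device to finish, then clean up.
    for (int d = 0; d < deviceCount; ++d) {
        cudaSetDevice(d);
        cudaStreamSynchronize(streams[d]);
        cudaStreamDestroy(streams[d]);
        cudaFree(buffers[d]);
    }

    printf("processed %zu elements across %d GPU(s)\n", totalElems, deviceCount);
    return 0;
}
```

Scaling beyond a single chassis to the multi-system deployments described above would sit on top of this per-node pattern, typically through a networking or job-distribution layer, which is outside the scope of this sketch.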