MXM GPUs Bring Intelligent Video Analytics to the Edge

By Impulse Embedded News Published 11 November 2021

Imagine a rail operator sitting behind a bank of monitors overseeing a level crossing, or a team of security professionals in an airport checking for abandoned bags. Video Analytics have been around for over half a century, but what is changing is the way that the video is analysed.

Where once a human being would be watching multiple screens and determining cause for alarm, more sophisticated Artificial Intelligence based Intelligent Video Analytics (IVA) systems are being developed and installed in cities and infrastructure, allowing for the monitoring and processing of huge amounts of video data 24/7.

Academic establishments, construction site safety, smart parking, people counting at festivals, city traffic management and route optimisation — these are all types of applications in which IVA helps organisers and governors ensure safety and smooth operations by accurately processing volumes of information and alerting of incidents or imminent danger. But Intelligent Video Analytics expands beyond reactive monitoring of activities. For instance, when combined with technologies such as RADAR and LIDAR, IVA is being used for applications like Advanced Driver-Assistance Systems (ADAS) to further autonomous driving capabilities and safety.

With such a rich set of tools and deep learning libraries available for developers involved in video image recognition, the applications for intelligent video analytics are almost boundless. As camera resolution and quantity increases so does the volume of data being processed, and to process this on the Edge requires a powerful GPU in a compact, often rugged package.

Discreet GPU Cards

The world of Intelligent Video Analytics moves fast. Traditionally speaking, and by ‘traditionally’ we’re talking less than five years ago, IVA has depended on discreet GPU cards installed in desktop PCs or workstations, or in rack-mount server farms for larger installations, to process the resource-hungry AI algorithms on which IVA applications are based.

This arrangement, with cameras on the edge sending data back to a central control room where it is processed, is a perfect setup for many IVA applications. Factories transferring data from multiple camera feeds over high-speed Ethernet back to a tower PC in the same building for processing would be a great example of this kind of application. With this type of camera-to-server installation, data is harvested from the camera feeds and sent, in its entirety, to the control room for processing by the discreet GPU cards. Imagine the amount of data transfer consumed by ten 4k cameras operating 24 hours per day — fine when 100% of this data is being transmitted over existing Ethernet infrastructure, but where things get a little more challenging is when cameras are out in the field, dotted along motorways, in mobile camera units, on rolling stock, or in the harsh environment of the metropolis exposed to the city elements. In this situation, often the only means of data transfer is via sporadic Wi-Fi connections, existing twisted pair cabling, or cellular, none of which are practical for the amount of data, due to either commercial reasons or simply the lack of bandwidth.

This is where Embedded Edge AI Systems come into their own. Instead of transmitting the entirety of the data, the data itself can be processed right on the edge, in close proximity to the camera itself, with only the critical information being sent back to the control room. This can mean the difference between kilobytes and gigabytes, and opens the door to cellular data transmission, especially now we’re in the process of 5G being rolled out. With the advent of GPU processing on the Edge, mobile camera units can now process and alert of incidents in real time, independent of any established connectivity within its area.

One of the catalysts behind the movement to GPU processing on the Edge, along with the popular Nvidia Jetson family, is MXMs.

What is an MXM?

As with a lot of advances in technology centred around size restraints, such as M.2 modules, an MXM, or Mobile PCI Express Module, was originally designed for use in laptops. It was intended to provide an industry standard socket for graphics processors, which would allow the easy upgrade of GPU capabilities without the need to upgrade the whole system or be tied to proprietary hardware for the lifetime of the device.

In a similar vein to M.2 modules, this technology was quickly snapped up by manufacturers and integrators in the Embedded arena. Under the same space constraints as small commercial devices, Embedded systems require high processing power in a small footprint, but it isn’t just the power to size ratio which is making MXMs ever more popular in Embedded applications.

PCI cards are not, and never have been, designed for rugged applications where vibration is an issue. There are steps integrators can take to use these in harsh environments, such as designing GPU cages to minimise the effects of impacts and vibrations on the cards, but by and large they are intended for “carpeted spaces”, i.e. offices and control rooms where the environment is calm. Try and use a tower PC with a GPU card in a mobile camera unit, and its days are numbered.

MXMs are designed specifically for these harsh environments. Intended to be integrated into systems which themselves are tested and certified to adequately handle vibration, these modules are hardy and capable of being exposed to the impacts and tremors found in the harsh environment of life on the Edge. They too can be certified, with options for rail, oil and gas, marine and more, making the process of integration and certification of them into already certified embedded computing systems a much smoother process.

Which MXM GPU is right for your application?

When it comes to graphics processors there are generally two line-ups provided by NVidia: GeForce (consumer) and Quadro (pro). As you’d expect, each have their pros and cons, the balance of which is tipped by the environment they’re being installed in and the type of processing they have to do.

The first thing to understand is that both GeForce and Quadro are built on the same architecture. There are, however, versions of the architectures being released regularly, and within those versions there are even more versions.

For example, one of the latest architecture releases from Nvidia is named Turing, which is split into three different versions: TU102, TU104, and TU106. Don’t be fooled into thinking the TU106 is more powerful than the TU102, though. In general terms, the lower the number, the more components there are integrated into the processing unit, and therefore the greater the processing performance. What this means is that two graphics cards which share the same architecture can actually differ hugely in capabilities.

Turning to Intelligent Video Analytics applications, there is a series of MXM GPUs which includes features directly geared to video processing. The Nvidia Quadro RTX series of MXM GPU offers the same high performance we’ve seen with the Turing and Pascal architectures, but with added features to serve video-intensive applications such as machine vision and IVA — features not included in its GeForce RTX cousin.

Nvidia’s GPUDirect for Video, available only on Quadro RTX modules, allows manufacturers to write device drivers that efficiently transfer video frames in and out of NVIDIA GPU memory. What this brings is faster processing speeds, through the GPU and I/O devices working together to provide synchronised transfers rather than having forced wait times to ensure completion, reduced latency by transferring smaller chunks of data rather than sending the entire chunk in one go, and overall reduction in CPU overhead by copying data through pinned host memory.

Another feature included with Quadro RTX GPU modules is the ability to scale GPUs. Many intensive AI applications such as inference and video analysis require multiple GPU cards to manage the load. The multi-GPU scalability of Quadro RTX GPU modules allows for multiple GPUs to be integrated into a single system, then multiple systems used for the application, essentially scaling the number of Tensor cores into the thousands.

In Summary

With high processing power and a tiny footprint, MXM GPUs are enabling manufacturers and system integrators to provide efficient data capture and processing right on the Edge, right out of the box. Discreet GPU cards still dominate the industrial space, but for applications requiring more durability or a smaller footprint, there is little MXM GPUs can’t do that their big discreet GPU card cousins can. For more information on MXMs and using them with Embedded Computers, speak to Impulse Embedded on +44(0)1782 337 800, or click here to get in touch.

Rolling-stock data logger for Siemens Rail

Spurious error messages, especially those from rolling-stock subsystems, can be difficult and time consuming to diagnose. Siemens Rail, in Manchester, wished to install a data logging system onto their rolling stock to transmit data in real-time back to their depot for efficient data analysis by their team.