…right people – right place – right time!

How 3D Stereo Vision Technology is paving the way for future real-time applications


26th April - Valencia.

The world of vision technology has developed at a rapid pace. Only a little while ago, few could imagine a portable device no bigger than a deck of cards having the ability to capture images and video, store gigabytes of data, and act as a radio, GPS device, and telephone. Combining multiple technologies, optimised as highly compact modules, in an ‘embedded vision system’ that incorporates a small camera or image sensor, a powerful processor, and often I/O and display capability has made many application-specific systems possible.

Now the vision industry is making the leap to 3D sensor systems in real-time applications. This cutting-edge technology is already being used in fields such as automation technology, service robotics, autonomous vehicles, logistics, quality assurance and medical technology. It is being driven by the ability to capture stereoscopic vision in compact embedded modules.

Stereo vision, or stereoscopic vision, is the ability to obtain a spatial visual impression with both eyes by comparing their perceived images. When viewing an object, each eye sees it from a slightly different direction or angle. Each eye filters the information and sends it to the brain, where both visual impressions are processed into a combined image. This is what creates our three-dimensional depth perception. With a 2D camera, the distance or depth between the camera and the observed object is normally lost when an image is taken, but this depth information can be recovered by comparing several images taken from different, known camera viewpoints. Stereo vision technology does this with stereo matching algorithms, which find the pixels in the stereo images that correspond to the same 3D point in the captured scene.
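As a rough illustration of the matching idea described above, the sketch below compares a small window around one pixel in a left image row with shifted windows in the right image row, and picks the shift (the disparity) with the lowest sum of absolute differences. It then converts that disparity to a distance with the standard pinhole relation depth = focal length × baseline / disparity. The function names, the tiny one-row "images", the focal length and the baseline are all assumptions made for this example; real systems work on rectified 2D images and typically run far more refined algorithms in hardware.

```python
def match_pixel(left_row, right_row, x, half_window=1, max_disparity=4):
    """Return the disparity (in pixels) whose window in right_row best
    matches the window centred on left_row[x], using the sum of absolute
    differences (SAD) as the matching cost.

    Assumes rectified rows, so a point at x in the left row appears at
    x - d in the right row. The caller must keep x at least half_window
    pixels away from both ends of the row.
    """
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        # Stop once the candidate window would leave the right row.
        if x - d - half_window < 0:
            break
        cost = sum(
            abs(left_row[x + k] - right_row[x - d + k])
            for k in range(-half_window, half_window + 1)
        )
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Convert a disparity to a distance in metres (pinhole stereo model)."""
    if disparity_px <= 0:
        return float("inf")  # zero disparity: point is effectively at infinity
    return focal_px * baseline_m / disparity_px


# Hypothetical one-row images: the bright feature sits at x=5 on the left
# and at x=3 on the right, i.e. it is shifted by two pixels.
left = [10, 10, 10, 10, 50, 90, 50, 10, 10, 10]
right = [10, 10, 50, 90, 50, 10, 10, 10, 10, 10]

disparity = match_pixel(left, right, x=5)
# Assumed camera parameters: 700 px focal length, 0.1 m baseline.
depth_m = depth_from_disparity(focal_px=700, baseline_m=0.1, disparity_px=disparity)
print(disparity, depth_m)
```

Nearby objects produce large disparities and distant objects small ones, which is why depth resolution falls off with distance for a fixed baseline.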

Stereo vision is a robust technology and can be used to obtain depth information in many different machine vision applications. The fast processing offered by the powerful FPGAs integrated in the devices also allows 3D stereo vision systems to be used in critical real-time applications, such as autonomous driving and real-time medical applications.

Automotive Advances

New automotive applications such as advanced driver-assistance systems (ADAS) demand sophisticated detection capabilities using a combination of sensors, including image (camera), LiDAR, radar, ultrasonic, and far infrared (FIR) devices. Precise 3D stereo vision modules will enable accurate information gathering and distribution of decision cues.

LiDAR devices generate pulses of light and capture the responses that bounce off objects. In addition to observing road features like lane markings, LiDAR sensors — which operate in both light and dark conditions — can also be used to detect animals, pedestrians, and other vehicles. Meanwhile, FIR thermal sensors capture the IR radiation emitted by objects, work in the dark, and can be used to differentiate between animate and inanimate objects. Signals from all of these sensors have to be processed. In many cases, representations of the sensor data are then required to be displayed to the driver on one or more in-cabin displays.
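The pulse-and-echo principle behind LiDAR ranging can be sketched in a few lines: the light travels to the object and back, so the distance is half the round-trip time multiplied by the speed of light. The function name and the example timing below are assumptions for illustration only; real LiDAR units handle this in dedicated timing hardware.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact value in vacuum


def distance_from_round_trip(round_trip_s):
    """Distance in metres to the object that reflected the pulse.

    The pulse covers the distance twice (out and back), hence the
    division by two.
    """
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2


# Hypothetical echo received 200 nanoseconds after the pulse was emitted.
distance_m = distance_from_round_trip(200e-9)
print(f"{distance_m:.2f} m")
```

A 200 ns round trip corresponds to roughly 30 m, which shows how precise the timing has to be for centimetre-level accuracy.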

Increasingly, in-cabin applications (e.g., gesture recognition) and in-vehicle infotainment (IVI) applications (e.g., e-mirror, rear-view/backup) require some amount of artificial intelligence (AI) and machine learning (ML). It is forecast that AI will be employed in the majority of ADAS and IVI systems as soon as 2025.

Supporting multiple sensors and processing the data they generate requires an application processor (AP) with multiple I/Os and significant processing capabilities. Solution stacks built for this purpose significantly reduce the time and engineering expertise needed to implement automotive embedded vision applications by providing developers with modular hardware platforms, reference designs, neural network IP cores, and custom design services to accelerate and simplify automotive vision system design.

Edge Processing

In embedded computer hardware, cost reductions and gains in processing power and energy efficiency are being achieved through new developments in ‘edge processing’, which are also paving the way for faster image handling. Pairing a sensor module with a Graphics Processing Unit (GPU) that runs a neural network brings image processing closer to the sensor; this is known as ‘edge processing’, as opposed to processing in the cloud or sending the feed from the camera to a separate computer. This addresses latency, bandwidth and privacy concerns, and in doing so expands what is possible with computer vision.

Edge computing has had a significant impact on modern computer vision systems where timely business-critical decisions are being made based on what a camera sees. Often, sending images or video to a centralized location takes too long for the necessary decisions to be made in time. Sometimes the physical limitations of networks do not provide enough bandwidth to send all of the images or video to a centralized location. The final consideration is privacy: sending images or video to a centralized location may violate privacy laws.
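To give a feel for the bandwidth point, the sketch below estimates the data rate of an uncompressed video stream and checks it against an uplink. The resolution, frame rate and the 100 Mbit/s uplink are assumed figures chosen for illustration, not measurements; compressed streams need far less bandwidth, but compression adds its own delay and processing cost.

```python
def raw_stream_mbps(width, height, bytes_per_pixel, fps):
    """Data rate of an uncompressed video stream in megabits per second."""
    bits_per_frame = width * height * bytes_per_pixel * 8
    return bits_per_frame * fps / 1e6


def fits_uplink(stream_mbps, uplink_mbps):
    """True if the stream can be sent over the uplink without falling behind."""
    return stream_mbps <= uplink_mbps


# Assumed camera: 1920x1080, 3 bytes per pixel (RGB), 30 frames per second.
rate = raw_stream_mbps(1920, 1080, 3, 30)
print(f"{rate:.1f} Mbit/s")            # about 1.5 Gbit/s uncompressed
print(fits_uplink(rate, uplink_mbps=100))
```

Under these assumptions a single camera already exceeds the assumed uplink many times over, which is the case for processing frames next to the sensor and sending only the results.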

Edge processing also means that locations without efficient network infrastructure can be brought online quickly. For example, a hospital in a semi-rural location using robotic surgery would need access to its computer vision data in real time. The 6-9 second buffer that occasionally comes with a cloud connection would be detrimental to time-sensitive use cases such as this one.

Increasing 3D Capability

Vision technology is also shifting from image capture to object recognition and tracking. Robot usage in consumer and industrial applications has developed significantly, and with advanced 3D capability robots will now be able to differentiate objects and perceive the human form.

3D has always had advantages over 2D for robotics due to its ability to capture and comprehend a much richer set of data. It not only easily recognizes more types of objects, but also enables robots to orient themselves in three-dimensional space.

While many robotic systems use 3D cameras to identify and avoid obstacles, deep learning of the human form will enable robots to both interact with and, where necessary, avoid people. Service robots, security systems, warehouse and factory bots, autonomous delivery units, and robotic hospital/healthcare aides are just some of the hundreds of applications for this highly advanced, yet practical and affordable vision breakthrough. Conventional forms of 3D reconstruction can’t recognize humans; everything is treated as an object. In deep-learning 3D reconstruction, an algorithm is “taught” to recognize the human form. It removes the black holes or other forms of missing data in the camera field, including the rough and/or missing object edges encountered in less-adept systems.

3D vision capability is also having an impact on Time of Flight (ToF) systems that can capture the exact shape and position of moving objects, identifying their size, distance, and rate of movement even in complete darkness. 3D ToF vision systems are highly accurate and extremely valuable for industrial or environmental use. ToF measures depth pixel by pixel, giving it excellent edge perception, yet it doesn’t require a GPU or neural network. The onboard computing capability of ToF systems allows robots to convert raw data into precise depth images in real time. ToF vision can be used for advanced human-machine interface (HMI), 3D scanning, surveillance, and gaming, as well as a wide range of robotics applications. It’s expected that within five years, 30% of 2D cameras will be enhanced with 3D capability.

With so many new developments opening up a new world of vision applications, the need for specialist embedded engineers becomes ever more apparent. But meeting it is no easy feat, as there is a global shortage of highly skilled engineers with the right expertise to develop the next exciting innovation. It takes experts in the market to find the right match of skills for any given project. Companies such as CIS have honed their knowledge over 20 years in doing just that. Make sure your next project is covered: just call 0034 963 943 500 or email us on info@cis-ee.com