Computer Vision and Perceptual AI

June 21, 2023
Perceptual AI, also known as machine perception, perceptual computing or cognitive computing, is a subfield of artificial intelligence (AI) that aims to enable machines to perceive and understand the world in a manner similar to humans. It focuses on emulating human-like sensory capabilities, such as vision, hearing and touch, to interpret and make sense of the surrounding environment.
For perceptual AI to deliver benefit, machines must be able to extract meaningful information from sensory data and use it to understand, reason and interact with the world in a more natural and intelligent way. Needless to say, computer vision plays a huge role in this.
3D imaging and perceptual AI
Computer vision techniques are supporting the growth of AI in general, and perceptual AI is particularly reliant on these technologies. 3D imaging has a pivotal role in delivering perceptual AI.
By incorporating 3D imaging techniques, perceptual AI systems gain a deeper understanding of the physical world, enabling them to accurately analyze and interpret complex visual data. Various 3D imaging techniques contribute to the advancement of perceptual AI. You can read more about the most common techniques in our blog, “Selecting a camera for your machine vision application.”
Just as an example, stereo vision, which involves capturing multiple images of the same scene from different perspectives and using the disparities between them to calculate depth, enables AI systems to perceive depth and spatial layout, leading to better object recognition, scene understanding and spatial reasoning.
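The disparity-to-depth relation for a calibrated, rectified stereo pair can be sketched as follows; the focal length and baseline here are invented example values, not taken from any particular camera:

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth (metres) from the pinhole stereo relation Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px

# Example: a feature seen 40 px apart between the left and right images,
# with an 800 px focal length and a 12 cm baseline, sits about 2.4 m away:
z = depth_from_disparity(40, 800, 0.12)
print(f"{z:.2f} m")  # 2.40 m
```

Note the inverse relationship: the smaller the disparity, the farther the point, which is why stereo depth accuracy degrades with distance.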
Structured light and time-of-flight (ToF) imaging are also useful in providing real-time depth maps, enabling AI systems to understand the 3D structure of the environment and improve object segmentation, tracking and collision avoidance.
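The calculation behind a ToF measurement is a simple one: depth is half the round-trip time of the emitted light multiplied by the speed of light. A minimal sketch, with an invented timing value:

```python
C = 299_792_458.0  # speed of light in m/s

def tof_depth_m(round_trip_s):
    """Depth from a light pulse's round-trip time: half of c * t."""
    return C * round_trip_s / 2.0

# A pulse returning after 20 nanoseconds corresponds to roughly 3 m:
print(f"{tof_depth_m(20e-9):.2f} m")  # 3.00 m
```

The nanosecond timescales involved are why ToF sensors need specialized timing hardware rather than conventional readout electronics.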
Perceptual AI systems leveraging 3D imaging are transforming many applications such as robotics. With the ability to perceive 3D structures, robots can navigate complex environments, manipulate objects with precision, and interact with humans more safely. Industrial robots equipped with 3D imaging capabilities can optimize tasks such as bin picking, assembly and quality control, enhancing productivity and flexibility. Add perceptual AI, and a robotic attachment can handle items such as ripe fruit more carefully, reacting to pressure sensors for a gentle touch.
Sensors for perceptual AI
Multiple sensors can be used to gather data from the environment and enable machines to perceive. These sensors capture different types of information, such as visual, auditory or tactile, which are essential for different aspects of perception.
Visual sensors, such as those in cameras, play a fundamental role in perceptual AI. Cameras capture images or videos of a scene, essential for tasks such as object recognition, scene understanding and tracking. Additionally, cameras can be combined with specialized lenses, filters or techniques including multispectral imaging to capture specific visual information, such as thermal or hyperspectral data.
LiDAR (Light Detection and Ranging) sensors use laser technology to measure distances and create detailed 3D maps of the environment. LiDAR emits laser pulses and measures the time it takes for the light to bounce back from objects, allowing the calculation of precise distances.
Radar (Radio Detection and Ranging) sensors use radio waves to detect and track objects in the surrounding environment. They measure the time it takes for radio waves to reflect off objects and return to the sensor. Radar sensors are commonly used in applications where detecting objects at longer distances and in adverse weather conditions is crucial. They are employed in autonomous driving, collision avoidance systems, and surveillance applications.
Ultrasonic transducers emit high-frequency sound waves and measure the time it takes for the sound waves to bounce back from objects. These sensors provide distance measurements and are often used for object detection, proximity sensing and obstacle avoidance in robotics and automation applications.
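The distance calculation described above can be sketched in a few lines; the timing value is hypothetical, and real modules typically also compensate for temperature, which affects the speed of sound:

```python
SPEED_OF_SOUND_M_S = 343.0  # in dry air at ~20 °C

def ultrasonic_distance_m(echo_time_s):
    """Distance from an echo's round-trip time: half of v * t."""
    return SPEED_OF_SOUND_M_S * echo_time_s / 2.0

# An echo arriving 5.8 ms after the ping puts the obstacle roughly 1 m away:
print(f"{ultrasonic_distance_m(5.8e-3):.2f} m")  # 0.99 m
```

Because sound travels almost a million times slower than light, ultrasonic timings are in milliseconds rather than nanoseconds, which keeps the electronics simple and cheap.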
Microphones are essential auditory sensors that capture sound waves and convert them into electrical signals. They enable machines to perceive and interpret audio information, facilitating tasks such as speech recognition, sound classification, and acoustic scene analysis. Microphones are used in applications like voice assistants, noise detection systems and security systems.
Tactile sensors capture physical contact or pressure information. They can be in the form of pressure-sensitive mats, force sensors, or tactile arrays. Tactile sensors enable machines to perceive touch and pressure, supporting tasks such as object manipulation, grasp control and human-robot interaction.
By using a combination of sensors including visual, LiDAR, radar, auditory and tactile sensors, perceptual AI systems can gather comprehensive data about the environment. These sensors provide machines with the necessary input to analyze, interpret and make informed decisions based on the sensed information, ultimately enhancing their perception and interaction capabilities with the world.
For example, by combining visual data from cameras with distance information from LiDAR sensors, or by fusing visual and auditory data, a system can build a more holistic perception of the real world and act accordingly.
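A common building block of camera-LiDAR fusion is projecting a 3D LiDAR point into the camera image so depth can be attached to pixels. A minimal sketch using a pinhole model, assuming the point is already transformed into the camera's coordinate frame; the intrinsic parameters below are invented for illustration:

```python
def project_point(point_xyz, fx, fy, cx, cy):
    """Pinhole projection: u = fx*X/Z + cx, v = fy*Y/Z + cy."""
    x, y, z = point_xyz
    if z <= 0:
        return None  # point is behind the camera
    return (fx * x / z + cx, fy * y / z + cy)

# A LiDAR point 4 m ahead and 1 m to the right of the camera lands
# to the right of the image centre (640, 360) on a full-HD frame:
uv = project_point((1.0, 0.0, 4.0), fx=800, fy=800, cx=640, cy=360)
print(uv)  # (840.0, 360.0)
```

In a real system this is preceded by an extrinsic calibration step that aligns the LiDAR and camera coordinate frames, and lens distortion is corrected as well.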
5 features to consider when selecting a camera for perceptual AI
As imaging sensors are some of the most important in the execution of perceptual AI, we’ve made some suggestions of what to look out for when selecting a camera for the job.
Resolution: High-resolution cameras can provide clearer images and video, which is important for tasks like object recognition, data analysis and detailed scene understanding. Our Harrier range includes full HD cameras and 4K camera options.
Frame rate: The frame rate of the camera determines how many images or video frames are captured per second. A higher frame rate is beneficial for applications that require real-time perception, such as object tracking, motion analysis, or autonomous navigation. Consider the frame rate requirements of your specific application to ensure the camera can capture images at the desired speed.
Sensor size: The size of the camera sensor affects factors like low-light performance, dynamic range, and depth of field. Larger sensors typically capture more light, resulting in improved image quality and better performance in low-light conditions. They can also provide a shallower depth of field, which may be desirable for certain computer vision tasks. However, larger sensors can be more expensive and may require more substantial camera hardware. Some modules, such as the new Sony FCB-EV9520L, have a smaller (1/2.8”) sensor but additional technology to improve sensitivity to light. View our full range of autofocus-zoom cameras and select from a 1/3 up to 1/1.8 type sensor.
Spectral sensitivity: Some applications of perceptual AI may require cameras with specific spectral sensitivity. For instance, tasks related to infrared imaging, thermal analysis, or hyperspectral imaging may require cameras that are sensitive to specific wavelengths outside the visible spectrum.
Integration and compatibility: Consider the camera’s compatibility with the hardware and software infrastructure you plan to use. Look for cameras that have appropriate connectivity options (e.g., USB, Ethernet) and are supported by commonly used software frameworks or libraries for computer vision and machine learning. Harrier camera interface boards from Active Silicon can be added to block cameras to convert LVDS video output to various alternatives including HD/3G-SDI, USB 3 (UVC), HDMI, Ethernet IP and MIPI CSI-2. Harrier cameras are also among the most compact available which makes dropping them into existing systems easier.
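Two of the criteria above, resolution and frame rate, lend themselves to quick back-of-envelope checks before committing to a camera. The numbers below are invented for illustration:

```python
import math

def pixels_on_target(object_size_m, distance_m, hfov_deg, h_res_px):
    """Approximate horizontal pixels an object covers, pinhole model."""
    scene_width_m = 2 * distance_m * math.tan(math.radians(hfov_deg) / 2)
    return object_size_m / scene_width_m * h_res_px

def movement_per_frame_m(speed_m_s, fps):
    """Distance an object travels between consecutive frames."""
    return speed_m_s / fps

# Resolution: a 0.5 m object at 10 m, with a 60° horizontal field of
# view on a full-HD (1920 px wide) sensor, spans about 83 pixels:
print(round(pixels_on_target(0.5, 10.0, 60.0, 1920)))  # 83

# Frame rate: a vehicle at 15 m/s moves 0.5 m between frames at 30 fps,
# but only 0.25 m at 60 fps:
print(movement_per_frame_m(15, 30), movement_per_frame_m(15, 60))  # 0.5 0.25
```

If the estimated pixels-on-target or per-frame displacement falls outside what your detection or tracking algorithm can handle, that points to a higher-resolution or faster camera for the application.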