In years past, using 3D for machine vision applications was appealing but costly. In 2010, the release of a “disruptive” technology, Microsoft’s low-cost depth sensor, the Kinect, initiated a new low-cost branch of 3D camera technology. This paper gives a brief overview of some of the more commonly used 3D sensor technologies.
In laser triangulation, a part usually moves on a conveyor belt and passes through a laser line while a camera detects the reflected laser light. Simplistically, the angle of the detected reflected beam gives the height of the object on the belt. The advantages of using a laser triangulation method are the following:
• Very Mature Technology
• Lots of 3rd Party Software Available
• Ideal for Repetitive Scans
• Prepackaged Cameras or DIY
• Modest Costs
Some disadvantages of laser triangulation are that the part needs to be moving, that it is an active scanning method, and that it may be slow compared with other methods.
www.imagingtechnology.com
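The relationship between the detected beam angle and the object height in laser triangulation can be illustrated with a minimal sketch. It assumes a simplified geometry that is not spelled out in this paper: the laser points straight down at the belt, the camera views the laser line at a known angle from the laser axis, and the sideways shift of the line has already been converted from pixels to millimeters on the belt plane. The function name and the sample values are hypothetical.

```python
import math


def height_from_line_shift(shift_mm: float, camera_angle_deg: float) -> float:
    """Estimate surface height above the belt from the laser line shift.

    Assumed geometry (illustrative only): the laser is perpendicular to the
    belt and the camera looks at the line at camera_angle_deg, measured from
    the laser axis. A surface raised by h moves the line sideways by
    h * tan(angle), so h = shift / tan(angle).
    """
    if not 0.0 < camera_angle_deg < 90.0:
        raise ValueError("camera angle must be between 0 and 90 degrees")
    return shift_mm / math.tan(math.radians(camera_angle_deg))


# One scanned profile: line shift (mm) measured at successive belt positions.
profile_shift_mm = [0.0, 2.0, 5.0, 2.0, 0.0]
profile_height_mm = [height_from_line_shift(s, 45.0) for s in profile_shift_mm]
print(profile_height_mm)
```

Stacking one such profile per belt position is what builds up the 3D scan, which is why the part has to move through the laser line.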
Structured Light Projector
The Kinect sensor uses an IR structured light projector along with a depth sensor. An RGB camera is also available to create a color image of the scene for use along with the depth image.
Angular Field of View: 57° horizontal, 43° vertical
Frame Rate: ~30 Hz
Resolution: 640 x 480 (VGA); 320 x 240 (Depth Camera)
Working Distance: 0.8 m - 3.5 m (10 m)
Depth Resolution: 1 cm at 2 m
PROS
• Inexpensive ($200)
• Fast (30 fps)
• Lots of 3rd Party Software Available
• USB2
• Can Track Human Gestures
• Can combine RGB image with Depth Map
CONS
• Poor resolution
• Marginally Robust
• Subject to Ambient Lighting Interference
• Shiny and Dark objects don’t image well
• Very Noisy
• Most Development is Game Oriented
• Unknown Roadmap
Simplistically, a Time of Flight sensor sends an IR light pulse into the field of view (FOV) and analyzes the time delay of the reflections off of surfaces to determine how far each surface is from the sensor. In the past, Time of Flight technologies were very expensive and complex or difficult to use, and they were mostly used by the military or NASA. In recent years, however, the technology has evolved into low-cost, easier-to-deploy sensors within the 3D community. These sensors usually have a limited depth range and are typically known for gesture recognition.
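The time-delay principle can be made concrete with a short sketch. It assumes the round-trip delay of the pulse is available directly as a number of seconds, which is a simplification; the function names are hypothetical and do not come from any sensor SDK.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def distance_from_round_trip(delay_s: float) -> float:
    """Distance to a surface from the round-trip delay of a light pulse.

    The pulse travels to the surface and back, so the one-way distance
    is half of (speed of light * delay).
    """
    if delay_s < 0:
        raise ValueError("delay must be non-negative")
    return SPEED_OF_LIGHT_M_PER_S * delay_s / 2.0


def round_trip_for_distance(distance_m: float) -> float:
    """Round-trip delay expected for a surface at distance_m."""
    return 2.0 * distance_m / SPEED_OF_LIGHT_M_PER_S


# A surface 1 m away returns the pulse after roughly 6.7 nanoseconds.
delay = round_trip_for_distance(1.0)
print(f"{delay * 1e9:.2f} ns -> {distance_from_round_trip(delay):.3f} m")
```

The nanosecond scale of these delays gives a sense of how precise the timing electronics must be to resolve centimeter-level depth differences.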
• Resolution: 320 x 240
• Frame Rate: 25-60 fps
• Typically limited to near distances
• DS325: < 1.4 cm noise at 1 meter
• Gesture Recognition Algorithms
KINECT2
Angular Field of View: 70° horizontal, 60° vertical
Frame Rate: ~30 Hz
Resolution: 1920 x 1080 (Color); 512 x 424 (Depth Camera)
Working Distance: 0.8 m - 4 m
Depth Resolution: ?
A projector projects a pattern, usually of lines, boxes, or circles, onto the surface of a 3D object. A camera captures images of the scene, and the 3D structure is interpreted from the projected pattern. Below is a simple example of such a pattern:
[Figure: Camera and DLP Projector]
The structured light projector can use a variety of patterns with varying colors.
Stereo vision uses two identical sensors separated by a baseline distance so that their fields of view overlap. In the example below, both cameras are viewing the object of interest, Mr. Einstein.
The difference in location of the object in each image is termed the disparity and is converted into depth information.
[Figure: left camera image]
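The conversion from disparity to depth can be sketched as follows. The sketch assumes an idealized, rectified stereo pair with parallel optical axes, where depth equals focal length times baseline divided by disparity. The focal length, baseline, and disparity values are made up for illustration, and the function name is hypothetical.

```python
from typing import Optional


def depth_from_disparity(
    disparity_px: float, focal_length_px: float, baseline_m: float
) -> Optional[float]:
    """Depth (m) of a point from its disparity in a rectified stereo pair.

    Assumes both cameras share the same focal length (in pixels) and are
    separated by baseline_m. Returns None when no valid match exists
    (disparity <= 0), which corresponds to a hole in the depth map.
    """
    if disparity_px <= 0:
        return None
    return focal_length_px * baseline_m / disparity_px


# Hypothetical rig: 700 px focal length, 10 cm baseline.
for d in (70.0, 35.0, 7.0, 0.0):
    print(d, depth_from_disparity(d, focal_length_px=700.0, baseline_m=0.1))
```

Because depth is inversely proportional to disparity, nearby objects produce large disparities and distant ones produce small disparities, so depth precision falls off with distance for a fixed baseline.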
PROS
• Algorithms are very mature
• Passive Technology
• Can be used with open source code
• Algorithms are already embedded in TI DSP chips
• Can use any camera/lens sensor pair for wide range of working distances and resolutions
• Many affordable stereo cameras specific to working distances already on the market
• Can be used with structured light to increase accuracy and reduce data dropout
CONS
• Computationally expensive for higher resolution cameras
• Low feature areas cause data dropout
• Low Light is a Problem
[Figure: Fragmented depth]
One of the disadvantages of stereoscopic methods is that flat, featureless areas do not have matching points in both the left and right images, so there is “data dropout” in those areas. One solution is to add structured light to the stereoscopic method, which helps to fill in depth information in areas where point matching is not otherwise possible.
For fast, high resolution, high accuracy stereo vision, one may consider a specialty camera, such as the Chromasens:
• Single line-scan camera / Dual-lens
• Resolution from 10 µm to 700 µm
• High speeds (up to 60,000 lines/s)
• Flexible in base width
• Optional Color Information
• Utilizes GPU Technology for Fast, Real-Time Processing
For applications that require extremely detailed fine measurements, a special contact method may be considered.