3D Map Building Based on Stereo Vision

Summary

A real-time local 3D map building algorithm for navigation in autonomous land vehicles (ALVs) is proposed, based on binocular stereo vision. A model is constructed and the algorithm's error is analyzed from the 2D prediction error, yielding a depth factor that is used to build a local 3D map. A global 3D map can then be built by integrating the local 3D map with information from an inertial navigation system (INS) and a GPS.

Objective

A binocular stereo vision system (one that uses two cameras) is combined with a local 3D map-building algorithm and with INS and GPS data. The goal is to let an autonomous land vehicle (ALV) gather information about its surroundings.

Method

The stereo vision system consists of two cameras with parallel optical axes. The image pairs they produce have a resolution high enough that the residual vertical disparity is under one pixel, so a pixel's match can be searched for along the same image row.

A matching algorithm is used to create a disparity map. For each pixel independently, it finds the disparity that minimizes the sum of squared differences (SSD), then refines that estimate to sub-pixel precision by fitting a parabola to the SSD values around the minimum and taking the parabola's minimum. Disjoint or small regions of bad matches can be removed with a simple blob colouring algorithm. The matching algorithm runs in under 100 ms on a 2.4 GHz CPU.
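The matching step can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the window size, and the convention that a left-image pixel at column c appears at column c - d in the right image are all assumptions made for illustration. Images are plain lists of rows of grey values.

```python
def ssd(left, right, row, col_l, col_r, half):
    """Sum of squared differences between two (2*half+1)-wide square windows."""
    total = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            diff = left[row + dy][col_l + dx] - right[row + dy][col_r + dx]
            total += diff * diff
    return total


def parabola_offset(c_prev, c_min, c_next):
    """Vertex of the parabola through (-1, c_prev), (0, c_min), (1, c_next)."""
    denom = c_prev - 2.0 * c_min + c_next
    if denom <= 0.0:
        return 0.0  # flat or degenerate cost curve: keep the integer disparity
    return 0.5 * (c_prev - c_next) / denom


def match_pixel(left, right, row, col, max_disp, half=1):
    """Sub-pixel disparity for left pixel (row, col), or None if no window fits."""
    height, width = len(left), len(left[0])
    if row - half < 0 or row + half >= height or col + half >= width:
        return None
    costs = []
    for d in range(max_disp + 1):
        if col - d - half < 0:
            break  # the right-image window would leave the image
        costs.append(ssd(left, right, row, col, col - d, half))
    if not costs:
        return None
    best = min(range(len(costs)), key=costs.__getitem__)
    if 0 < best < len(costs) - 1:
        # Refine only when both neighbouring costs are available.
        return best + parabola_offset(costs[best - 1], costs[best], costs[best + 1])
    return float(best)
```

Because each pixel is matched independently, the loop over pixels parallelizes easily, which is one plausible reason the approach suits real-time use.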

To build a 3D map around the ALV, a change of coordinate systems takes place. Points are converted from the camera coordinate system (C-CS) to the ALV coordinate system (ALV-CS), using the 2D coordinates gathered from the image pairs. This conversion yields a depth factor, which is used to build the local 3D map from the disparity map.
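As a rough illustration of this conversion, the sketch below uses the standard pinhole relation for parallel cameras, Z = f * B / d, followed by a rigid transform into the vehicle frame. This is an assumption: the summary does not give the paper's exact depth-factor formula, and the parameter names (focal_px, baseline_m, cx, cy) and the mounting rotation and translation are hypothetical.

```python
def triangulate(u, v, disparity, focal_px, baseline_m, cx, cy):
    """Back-project pixel (u, v) with the given disparity into camera coordinates."""
    if disparity <= 0:
        return None  # no valid match, or a point at infinity
    z = focal_px * baseline_m / disparity  # depth along the optical axis
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return (x, y, z)


def camera_to_vehicle(point, rotation, translation):
    """Apply p_alv = R * p_cam + t, with R a 3x3 nested list and t a 3-tuple."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def depth_error(z, focal_px, baseline_m, disparity_err_px):
    """First-order depth uncertainty obtained by differentiating Z = f * B / d."""
    return z * z / (focal_px * baseline_m) * disparity_err_px
```

Under this model the depth uncertainty grows with the square of the distance, so a fixed sub-pixel matching error matters far more for distant points than for nearby ones.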

Experimental error is calculated using physical measurements and then added to the ALV-CS coordinates.

Result

The 3D data in closer regions is dense, while the data further away is sparse or even non-existent. Furthermore, no data is available for anything that is occluded or not detected by the cameras. A global 3D map is built in the world coordinate system (W-CS) by converting points from the ALV-CS to the W-CS. The W-CS is based on INS and GPS measurements.
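The ALV-CS to W-CS step can be sketched as follows. This is a simplified planar version under stated assumptions, not the paper's method: the INS is assumed to supply a yaw angle measured counter-clockwise from the world x axis, the GPS fix is assumed to be already converted to local metres, roll and pitch are ignored, and the grid-based merge and its cell size are hypothetical.

```python
import math


def vehicle_to_world(point, yaw_rad, position):
    """Rotate a vehicle-frame point by the yaw angle and shift it by the vehicle position."""
    x, y, z = point
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return (
        position[0] + c * x - s * y,
        position[1] + s * x + c * y,
        position[2] + z,
    )


def merge_into_global(global_map, local_points, yaw_rad, position, cell=0.5):
    """Store the highest observed z for each (x, y) grid cell of the global map."""
    for p in local_points:
        wx, wy, wz = vehicle_to_world(p, yaw_rad, position)
        key = (math.floor(wx / cell), math.floor(wy / cell))
        global_map[key] = max(global_map.get(key, wz), wz)
    return global_map
```

Calling merge_into_global once per frame with the current pose accumulates successive local maps into one world-frame map. A full implementation would use the complete 3D attitude from the INS rather than yaw alone.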

Inference

The algorithm was built for autonomous land vehicles and may not be suitable for implementation on a UAV. However, it is worth looking into, given that it performs well at building global 3D maps. Local 3D maps generally contain less information about distant points, but when integrated with an INS and a GPS they could map the surrounding region reasonably well. The setup would require two cameras as well as a GPS and an INS mounted on the UAV. Another advantage is that the algorithm runs in real time: if it were incorporated into WARG’s architecture, the live video feed could be pipelined into it and the resulting 3D map could be fed into a new navigation module.

In short, this model has both advantages and drawbacks, but it is worth looking into for 3D map building and navigation.

References

  1. https://www.researchgate.net/publication/224643999_3D_Map_Building_Based_on_Stereo_Vision

  2. J. M. Saez and F. Escolano, "A global 3D map-building approach using stereo vision," in Proceedings of the 2004 IEEE International Conference on Robotics and Automation, pp. 1197-1202.

  3. M. W. M. G. Dissanayake, P. Newman, S. Clark, et al., "A solution to the simultaneous localization and map building (SLAM) problem," IEEE Transactions on Robotics and Automation, vol. 17, no. 3, Jun. 2001, pp. 229-241.

  4. Y. L. Xiong and L. Matthies, "Error analysis of a real-time stereo system," in Proceedings of the 1997 IEEE Computer Society International Conference on Computer Vision and Pattern Recognition, pp. 1087-1093.