Talk:Computer stereo vision

From Wikipedia, the free encyclopedia

Improving the Article

I'm very unhappy with the quality of the article. The description given in the outline section applies to block matching but doesn't really match more modern algorithms. This should follow Scharstein's taxonomy. The article discusses the least-squares information measure in great detail, even though it is hardly used anywhere. There's also no discussion of local vs. global algorithms, and smoothness is only covered very briefly.

I would be willing to donate the introductory sections of my PhD thesis, including figures (see [1], pages 14 to 20 and probably parts of pages 21 to 31). However, I don't see how to merge this with anything in the current article. Also, this work doesn't yet discuss deep-learning-based methods, which I think should really be included. User:Cone83 —Preceding undated comment added 16:56, 17 December 2017 (UTC)

Merge from Image rectification

There seems to be a lot of overlap, but I would like to know what others think. User A1 (talk)

Image rectification uses the intrinsics of a camera (parameters internal to a camera, such as focal length and pixel dimensions) to undo distortions related to the imaging process. This is quite different from stereo vision, which involves the extrinsics (as well as the intrinsics) between 2 (or more) different cameras in order to combine imaging information from the cameras into consistent geometric information (such as depth). That is: not a lot of overlap, more of a parent<-child relationship.

Above was from Gary Bradski (talk) 06:19, 8 July 2009 (UTC)

I agree with Gary that there is a parent (stereo) child (rectification) relation here. But the image rectification article should probably be extended with a better discussion of the mathematical foundations for why it can be done, and of specific techniques for how it is done, to make it more independent of the stereo article. --KYN (talk) 23:24, 8 July 2009 (UTC)
I agree that Image rectification should NOT be merged into stereo vision. It is necessary for the proper interpretation of images from a single camera, and is also useful in determining camera motion/distance relationships for a single camera. Dmwpowers (talk) 03:48, 20 February 2010 (UTC)

Additional refinement

This line:

The image must first be removed of distortions, such as barrel distortion to ensure that the observed image is purely projectional.

should read:

The image should first have its distortions removed. Lenses introduce a Taylor series of radial angular distortions, called pincushion/barrel distortion depending on whether they stretch or compress angles from the optical axis to the edges of the lens; removing them assures that the image is angularly calibrated to absolute angles. LoneRubberDragon (talk) 09:20, 26 February 2010 (UTC)

http://en.wikipedia.org/wiki/Pincushion_distortion
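The radial correction described above can be sketched numerically. This is a minimal illustration of a low-order polynomial (Brown-Conrady-style) radial model with made-up coefficients, not code from the article; a proper inverse of a measured distortion model would usually be solved iteratively.

```python
import numpy as np

def undistort_points(pts, k1, k2, center):
    """Apply a low-order radial correction to distorted pixel coordinates.

    Sketch of a polynomial (Brown-Conrady-style) radial model; the
    coefficients k1, k2 and the principal point `center` are illustrative.
    pts: (N, 2) array of distorted points."""
    d = pts - center                          # offset from the optical axis
    r2 = np.sum(d**2, axis=1, keepdims=True)  # squared radius per point
    scale = 1.0 + k1 * r2 + k2 * r2**2        # radial scale factor
    return center + d * scale

# Illustrative values only: a positive k1 moves points away from the
# center, compensating barrel (compressive) distortion.
corrected = undistort_points(np.array([[120.0, 80.0]]), k1=1e-7, k2=0.0,
                             center=np.array([100.0, 100.0]))
```

With a negative k1 the same formula pulls points inward, which would correct pincushion distortion instead.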

Second, there is no mention of the two fields of stereo imagery: small-displacement stereo image model reconstruction and large-displacement stereo image model reconstruction. For example, in large-displacement stereography, a street corner with 10 cameras placed in various random locations can be used to reconstruct a model of the intersection, if key features are identified and then robustly back-projected into a stereo model of the intersection. In small-displacement stereography, which you do cover, several "cameras" are placed closely together in space relative to the scene, so that differential image displacement methods can be used to estimate the image depth of features from the small parallax differences. If you look at my comments on your Super-resolution imaging algorithm, you can use the gradient methods to solve for stereographic depth estimation, where the region of interest (ROI) is a partition of the original image, most easily estimated in a decomposition hierarchy: starting with the whole-image displacement, then half-image panels, then fourths, eighths, and sixteenths, until the entire image has stereo depth estimates mapped. Smoothness is achieved by Nyquist-overlapping the halves, quarters, eighths, and so on, so that the partitions estimate depth and the overlaps create a detailed map to the Nyquist criterion of the ROI algorithm window. LoneRubberDragon (talk) 09:20, 26 February 2010 (UTC)
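The small-displacement matching this comment describes can be illustrated with a toy one-dimensional block matcher. This is a hedged sketch on synthetic data (a sum-of-absolute-differences window cost; all names and parameters are invented here), not the article's method:

```python
import numpy as np

def block_match_1d(left, right, block=5, max_disp=8):
    """Estimate per-pixel horizontal disparity by minimising the sum of
    absolute differences (SAD) between small windows of two scanlines."""
    half = block // 2
    disp = np.zeros(len(left), dtype=int)
    for x in range(half, len(left) - half):
        patch = left[x - half:x + half + 1]
        best, best_cost = 0, np.inf
        for d in range(0, max_disp + 1):
            if x - half - d < 0:          # candidate window leaves the image
                break
            cost = np.sum(np.abs(patch - right[x - half - d:x + half + 1 - d]))
            if cost < best_cost:
                best, best_cost = d, cost
        disp[x] = best
    return disp

# Synthetic pair: the right scanline is the left one shifted by 3 pixels,
# so interior pixels should recover a disparity of 3.
rng = np.random.default_rng(0)
left_row = rng.random(64)
right_row = np.roll(left_row, -3)
disp = block_match_1d(left_row, right_row)
```

A hierarchical version would run the same matcher coarse-to-fine on downsampled scanlines, along the lines of the decomposition hierarchy suggested above.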

There is also little mention of the trigonometric geometric-optics math involved in estimating distances from the known angular information about the lens coverage, calibrated to remove radial angular distortions. LoneRubberDragon (talk) 09:20, 26 February 2010 (UTC)
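For a rectified stereo pair, the trigonometry reduces to similar triangles: depth Z = f·B/d, where f is the focal length in pixels, B the baseline between the cameras, and d the disparity. A minimal sketch with illustrative numbers (not values from the article):

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth of a point from its disparity in a rectified stereo pair:
    Z = f * B / d (similar triangles between the two camera rays)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

# Illustrative values: 700 px focal length, 12 cm baseline, 14 px disparity.
z = depth_from_disparity(focal_px=700.0, baseline_m=0.12, disparity_px=14.0)
# 700 * 0.12 / 14 = 6.0 metres
```

The reciprocal relationship is why depth resolution degrades quadratically with distance: a one-pixel disparity error matters far more for distant points than for near ones.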

I know what you mean about these algorithms, from image processing work at Irvine Sensors Corporation on WPAFB and Sandia National Lab projects, but the average reader will get little from your superficial gloss. I find too many Wikipedia articles are superficial glosses, with little meat, except as a bookmark for people who are already experts in the field. LoneRubberDragon (talk) 09:20, 26 February 2010 (UTC)

I have made a few changes in the text to make the meaning more precise and to remove unclear phrasing. Gwestheimer (talk) 18:52, 31 May 2011 (UTC)

Add links to examples and tutorials

I think there is room to add links to some examples and tutorials on stereo vision to the page. I wanted to know what other users thought of this, and perhaps what sections to add? Avi.nehemiah (talk) 20:00, 15 January 2014 (UTC)

Detailed definition

I added a detailed definition. I struggled to find good readable articles; the citation I have given is the best I have found. I have tried to keep it readable while attempting to define what stereo vision processing does. More work is probably needed, but the hour is late.

Thepigdog (talk) 16:04, 13 May 2014 (UTC)

I am looking at the smoothness condition. The autocorrelation formula seems to be often used. Strictly speaking, smoothness is a property of the world, which should be learnt. Consider, for example, a completely random world: images would have no smoothness (e.g. the random-dot example). Any formula provided for smoothness is then a heuristic.
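One common heuristic form of such a smoothness term is a pairwise penalty on neighbouring disparities added to the per-pixel data cost, as in global (energy-minimisation) stereo methods. A minimal sketch, with names and weights invented for illustration:

```python
import numpy as np

def disparity_energy(disp, data_cost, lam=1.0):
    """Energy of a 1-D disparity labelling: per-pixel data cost plus a
    pairwise smoothness penalty on neighbouring disparities. As noted
    above, the smoothness term is a heuristic prior, not something
    derived from the scene itself."""
    data = sum(data_cost[i, d] for i, d in enumerate(disp))
    smooth = sum(abs(disp[i] - disp[i + 1]) for i in range(len(disp) - 1))
    return data + lam * smooth

# Zero data cost isolates the smoothness term: one unit jump, weight 2.
energy = disparity_energy([0, 1, 1], np.zeros((3, 2)), lam=2.0)
```

With lam = 0 the penalty vanishes, which matches the random-world case above where smoothness carries no information.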

I think probably in the human brain neural net, smoothness is learnt.

http://www2.ece.ohio-state.edu/ion/documents/IEEE_aero.pdf

Thepigdog (talk) 01:43, 14 May 2014 (UTC)

I removed {{morereferences|date=August 2009}} {{expert-subject|Robotics|date=June 2009}}

Because they are old. I have added some citations and references. It's hard to find good ones.

Thepigdog (talk) 11:29, 14 May 2014 (UTC)

3D world model

Stereo vision should be used to construct a 3D model of the world and the objects in it. A stereo vision system should be constantly adding to and updating the 3D model. Creating disparity maps from scratch may not be necessary. Over time a model will build up, so that each new image can be compared with a predicted model and depths. Perhaps this leads to a slightly different view of the role of stereoscopy. The main role of computer vision should be to patch together a 3D model out of all the images given.

Should this be part of this article? I have seen some references that refer to a robot constructing a 3D model of the environment.

Thepigdog (talk) 02:58, 15 May 2014 (UTC)