Talk:Viola–Jones object detection framework

This article is within the scope of WikiProject Computer Vision, a collaborative effort to improve the coverage of Computer Vision on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Mid	This article has been rated as Mid-importance on the importance scale.

Robotics Start‑class Low‑importance

This article is within the scope of WikiProject Robotics, a collaborative effort to improve the coverage of Robotics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Start	This article has been given a rating which conflicts with the project-independent quality rating in the banner shell. Please resolve this conflict if possible.
Low	This article has been rated as Low-importance on the project's importance scale.

Addition and Subtraction

The concept of the integral image is explained in terms of sums, but the meaning of the shaded and unshaded sub-rectangles in the feature types is never explained. It may take the non-expert reader a while to figure out that:

the summed area table can also be used for differences, or weighted sums,
the shading in the feature types probably represents a difference (or, equivalently, positive and negative weightings).

The shading notation should be explained. JohnAspinall (talk) 18:46, 3 August 2010 (UTC)
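To make the point above concrete, here is a minimal Python sketch of a summed-area table and a two-rectangle feature computed as a difference of two rectangle sums. It is not taken from the article or the paper: the function names, the zero-padded table layout, and the sign convention (shaded minus unshaded) are assumptions chosen for illustration.

```python
def integral_image(img):
    """Build a summed-area table with an extra leading row and column of zeros.

    ii[y][x] holds the sum of all pixels above and to the left of (x, y).
    """
    height, width = len(img), len(img[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle whose top-left pixel is (x, y): four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Hypothetical two-rectangle feature: left half unshaded, right half shaded.

    The value is a difference of two sums, i.e. weights of -1 and +1.
    """
    half = w // 2
    unshaded = rect_sum(ii, x, y, half, h)
    shaded = rect_sum(ii, x + half, y, half, h)
    return shaded - unshaded


img = [[1, 2, 3, 4],
       [5, 6, 7, 8]]
ii = integral_image(img)
print(rect_sum(ii, 0, 0, 4, 2))          # sum of the whole image
print(two_rect_feature(ii, 0, 0, 4, 2))  # right half minus left half
```

The same table serves both rectangles, so a difference (or any weighted sum) of rectangle sums costs only a handful of lookups regardless of rectangle size.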

Feature Count

At the time of writing, the page states that a 24x24 pixel window contains 45,396 possible features, with a [why?] marker against the claim. The Viola-Jones 2002 paper (ref 1) states the following towards the end of page 2:

Given that the base resolution of the detector is 24x24, the exhaustive set of rectangle features is quite large, over 180,000. Note that unlike the Haar basis, the set of rectangle features is overcomplete.

This suggests the correct figure is 'over 180,000', not 45,396. However, as I do not understand how either figure was calculated (and the paper does not appear to offer any explanation), I have not updated the page. Perhaps someone more knowledgeable in the field than I might be able to shed some light on this? 2.222.0.127 (talk) 18:36, 1 May 2013 (UTC)

Furthermore, Lienhart et al. [1] suggest that the number of features for the same 24x24 window is 117,941. They do describe the method by which this figure was calculated, although no workings are provided and I do not follow their exact method. 2.222.0.127 (talk) 21:36, 1 May 2013 (UTC)

The revised 2003 paper for the Viola-Jones algorithm (ref 2) states that:

Given that the base resolution of the detector is 24x24, the exhaustive set of rectangle features is quite large, 160,000. Note that unlike the Haar basis, the set of rectangle features is overcomplete.
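One way to arrive at a figure close to the 160,000 quoted above is to enumerate every scale and position of a small set of upright rectangle prototypes. The Python sketch below assumes five prototypes (two-rectangle in both orientations, three-rectangle in both orientations, and one four-rectangle), each scaled by whole multiples of its base size. That prototype set is an assumption for illustration, not something the quoted passages spell out, and the sketch does not attempt to explain the 45,396 or 117,941 figures.

```python
def count_features(window_w, window_h, prototypes):
    """Count all placements of scaled prototypes inside a window.

    Each prototype is (base_w, base_h). Widths and heights grow in whole
    multiples of the base size so the sub-rectangles stay equal in size.
    """
    total = 0
    for base_w, base_h in prototypes:
        for w in range(base_w, window_w + 1, base_w):
            for h in range(base_h, window_h + 1, base_h):
                # Number of top-left positions where a w-by-h feature fits.
                total += (window_w - w + 1) * (window_h - h + 1)
    return total


# Assumed prototype set (hypothetical choice, see the note above):
PROTOTYPES = [
    (2, 1),  # two rectangles side by side
    (1, 2),  # two rectangles stacked
    (3, 1),  # three rectangles side by side
    (1, 3),  # three rectangles stacked
    (2, 2),  # four rectangles in a checkerboard
]

print(count_features(24, 24, PROTOTYPES))  # 162336
```

Under these assumptions the two-rectangle prototypes contribute 43,200 each, the three-rectangle prototypes 27,600 each, and the four-rectangle prototype 20,736. A different prototype set or scaling rule would give a different total, which may be why the sources disagree.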

References

[1] Lienhart, Rainer; Kuranov, Alexander; Pisarevsky, Vadim (2003). Empirical Analysis of Detection Cascades of Boosted Classifiers for Rapid Object Detection.

KLT Algorithm

KLT is not an algorithm used for face detection. It is a feature tracking algorithm, plain and simple. The source provided is a MATLAB demo that uses KLT to track features in a bounding box found *after* face detection is performed separately using Viola-Jones. I am modifying this section accordingly. Marcman411 (talk) 14:04, 19 August 2017 (UTC)

Viola-Jones is mainly of historical interest

The article should state clearly that Viola-Jones is obsolete as a face-recognition algorithm (the problem for which it was invented). From about 2014, the focus of research shifted to DeepFace and related deep-learning algorithms. The Viola-Jones detection framework still deserves mention in a Wikipedia article because it is of historical importance, though perhaps a general historical-survey article covering Viola-Jones and other algorithms would be more appropriate. Longitude2 (talk) 13:44, 3 May 2020 (UTC)

Not accurate. It is still extensively used in many places, especially in devices which are not meant to be powerful computers (such as consumer cameras). al-Shimoni (talk) 05:42, 3 November 2022 (UTC)
