What do the "left" and "right" values mean in the Haar cascade XML files?

In OpenCV Haar cascade files, what are the "left" and "right" values, and how do they relate to the "threshold" value? Thanks!

Just for reference, here is the file structure:

<haarcascade_frontalface_alt type_id="opencv-haar-classifier">
  <size>20 20</size>
  <stages>
    <_>
      <!-- stage 0 -->
      <trees>
        <_>
          <!-- tree 0 -->
          <_>
            <!-- root node -->
            <feature>
              <rects>
                <_>3 7 14 4 -1.</_>
                <_>3 9 14 2 2.</_></rects>
              <tilted>0</tilted></feature>
            <threshold>4.0141958743333817e-003</threshold>
            <left_val>0.0337941907346249</left_val>
            <right_val>0.8378106951713562</right_val></_></_>
        <_>
+6
4 answers

“Left” and “right” refer to the gradient values of a particular shape. These shapes are not literally a left rectangle and a right rectangle. Instead, the names refer to the sections of a specific configuration (there can be more than two sections). David Haar's paper has a diagram that helps explain this.

Here is the ascii view (= filled, - blank):

 ==== ==-- =--=
 ==== ==-- =--=
 ---- ==-- =--=
 ---- ==-- =--=

In general, the naming is a poor convention. The sections should instead be called “gradient top”, “gradient bottom” (2); “gradient left”, “gradient right” (2); and “gradient left”, “gradient center”, “gradient right” (3), respectively. Line, edge, and other shapes should be named so that their sections are uniquely identified.

+2

In the OpenCV source code you will find cvhaar.cpp, which gives some insight into how the Haar cascade works. Unfortunately, it is essentially uncommented, and the documentation does not help. Here is my understanding of how it works.

In the icvEvalHidHaarClassifier() function, a sum is computed for the feature of a single CvHidHaarTreeNode.
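As a rough illustration of that sum: each <_> entry under <rects> in the question's XML appears to be x, y, width, height and a weight, and the feature response is the weighted sum of the pixel totals over those rectangles. The sketch below is a minimal version under that assumption. The names WeightedRect, integralImage, rectSum and featureSum are hypothetical, and the code ignores tilted features and the scaling and normalization that OpenCV applies internally.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one <_>x y w h weight</_> entry under <rects>.
struct WeightedRect {
    int x, y, w, h;
    double weight;
};

using Image = std::vector<std::vector<double>>;

// Builds a (rows+1) x (cols+1) summed-area table:
// ii[r][c] holds the sum of all pixels above and to the left of (r, c).
Image integralImage(const Image& img) {
    const std::size_t rows = img.size();
    const std::size_t cols = rows ? img[0].size() : 0;
    Image ii(rows + 1, std::vector<double>(cols + 1, 0.0));
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            ii[r + 1][c + 1] = img[r][c] + ii[r][c + 1] + ii[r + 1][c] - ii[r][c];
        }
    }
    return ii;
}

// Sum of the pixels inside one rectangle, using four table lookups.
double rectSum(const Image& ii, const WeightedRect& r) {
    return ii[r.y + r.h][r.x + r.w] - ii[r.y][r.x + r.w]
         - ii[r.y + r.h][r.x] + ii[r.y][r.x];
}

// Feature response: weighted sum over all rectangles of the feature.
double featureSum(const Image& ii, const std::vector<WeightedRect>& rects) {
    double sum = 0.0;
    for (const WeightedRect& r : rects) {
        sum += r.weight * rectSum(ii, r);
    }
    return sum;
}
```

On a uniform 20x20 window, the two rectangles from the question (3 7 14 4 with weight -1 and 3 9 14 2 with weight 2) cancel out to 0, because the second covers half the area of the first with twice the weight.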

If this sum is less than the threshold, the "left" node is followed and the process repeats. Otherwise the "right" node is followed, and the process likewise repeats. This is reflected in the following statement:

 idx = sum < t ? node->left : node->right; 

The loop ends when the chosen "left" or "right" index is no longer positive. At that point no further feature sum is computed, and the output value stored for that leaf is returned as the result of the classifier.

I put “left” and “right” in quotation marks because, as you say, they have nothing to do with the position of the feature. Instead, they reflect how the cascade “falls”: below the threshold it falls to the left, and at or above the threshold it falls to the right.
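The traversal described above can be sketched as follows. This is a simplified illustration, not OpenCV's code: Node, WeakClassifier and evalWeakClassifier are hypothetical names, the feature sums are passed in precomputed, and the leaf convention (a non-positive index idx ends the walk and selects alpha[-idx]) is an assumption based on my reading of cvhaar.cpp.

```cpp
#include <vector>

// Hypothetical, simplified stand-in for one tree node.
struct Node {
    double threshold;  // compared against the node's feature sum
    int left;          // next index when sum < threshold
    int right;         // next index otherwise
};

// Hypothetical weak classifier: a small tree plus its leaf outputs.
struct WeakClassifier {
    std::vector<Node> nodes;    // nodes[0] is the root
    std::vector<double> alpha;  // leaf values, selected by -idx
};

// featureSums[i] is the already-computed feature response for nodes[i].
// varianceNormFactor scales the thresholds for the current window.
double evalWeakClassifier(const WeakClassifier& c,
                          const std::vector<double>& featureSums,
                          double varianceNormFactor) {
    int idx = 0;
    do {
        const Node& node = c.nodes[idx];
        const double t = node.threshold * varianceNormFactor;
        // Fall "left" below the threshold, "right" otherwise.
        idx = featureSums[idx] < t ? node.left : node.right;
    } while (idx > 0);  // a non-positive index marks a leaf
    return c.alpha[-idx];
}
```

For a single-node tree (a stump) like the one in the question, this would mean left = 0 and right = -1, with alpha holding the left_val and right_val numbers in that order.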

Now back to how these nodes are represented. In the XML, the nodes appear not as indexes but as values:

 <left_val>0.0337941907346249</left_val> <right_val>0.8378106951713562</right_val> 

In the loader, elements such as <left_val> and <right_val> are looked up by name with cvGetFileNodeByName(). I don't know exactly how OpenCV uses these values afterwards, but I hope you now at least understand better how the cascade works.

+2

Paul, really?

I think left_val / right_val is used as:

 sum_stage += (sum_feature < feature_threshold*stddev)?(left_val):(right_val) 
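Spelled out for a whole stage, that idea might look like the sketch below. It assumes every tree in the stage is a single-node stump and that a window passes when the accumulated sum reaches a per-stage threshold. The stage threshold does not appear in the XML excerpt above, and Stump, stageSum and passesStage are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stump: one feature threshold and two leaf values.
struct Stump {
    double threshold;
    double leftVal;
    double rightVal;
};

// featureSums[i] is the precomputed feature response for stumps[i];
// stddev scales each threshold for the current window.
double stageSum(const std::vector<Stump>& stumps,
                const std::vector<double>& featureSums,
                double stddev) {
    double sum = 0.0;
    for (std::size_t i = 0; i < stumps.size(); ++i) {
        const Stump& s = stumps[i];
        sum += (featureSums[i] < s.threshold * stddev) ? s.leftVal : s.rightVal;
    }
    return sum;
}

// Assumed rule: the window survives the stage if the sum reaches the threshold.
bool passesStage(const std::vector<Stump>& stumps,
                 const std::vector<double>& featureSums,
                 double stddev,
                 double stageThreshold) {
    return stageSum(stumps, featureSums, stddev) >= stageThreshold;
}
```

With the stump from the question, a feature sum below 4.0141958743333817e-003 (times stddev) would contribute 0.0337941907346249 to the stage sum, and any other feature sum would contribute 0.8378106951713562.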
+2

As far as I understand, the original paper is "Rapid Object Detection using a Boosted Cascade of Simple Features" by Paul Viola and Michael Jones. It is based on Haar-like features, hence the name. I suggest getting it from the IEEE website. (If you don't have an account, check for other versions on Google Scholar.)

The classifiers are also described in "Facial Feature Detection Using Haar Classifiers" (Wilson, Fernandez). You can find it on the ACM website or on the CSA website.

0
