Understanding the LBP OpenCV Implementation

I need help with face detection based on LBP, and that is why I am writing this.

I have the following questions about face detection as implemented in OpenCV:

  • In lbpCascade_frontal_face.xml (this is from OpenCV): what are internalNodes, leafValues, tree, features, etc.? I know that they are used in the algorithm, but I do not understand the meaning of each of them. For example, why do we use a specific feature and not another for a particular stage? How do we decide which feature / node to choose?
  • What are the feature values in LBP_frontal_face_classifier.xml? I know that each is a vector of 4 integers, but how should I use these features? I thought that stage 0 accesses the first feature, but the access does not follow that pattern. What is the access pattern for these features?

  • All articles in the literature provide only a high-level overview. Their descriptions mainly consist of calculating the LBP from a neighborhood of pixels. But how are these LBP values used against the elements in the classifier?

  • How does an integral image help in calculating the LBP value of a pixel? I know how it is used for Haar features. I need to understand it for LBP.

I have read several articles, but none of them clearly describes how LBP-based face detection works or gives the algorithm in detail. If someone wants to develop a face detection program on their own, no document describes the steps they should follow.

Please help me if you can. I will be grateful.

1 answer

I refer you to an earlier answer of mine, which touches on the topic but does not explain the format of the XML cascade.

Let's look at a fake cascade, modified for clarity, with just one stage and three features.

<!-- stage 0 -->
<_>
  <maxWeakCount>3</maxWeakCount>
  <stageThreshold>-0.75</stageThreshold>
  <weakClassifiers>
    <!-- tree 0 -->
    <_>
      <internalNodes> 0 -1 3 -67130709 -21569 -1426120013 -1275125205 -21585 -16385 587145899 -24005</internalNodes>
      <leafValues> -0.65 0.88</leafValues></_>
    <!-- tree 1 -->
    <_>
      <internalNodes> 0 -1 0 -163512766 -769593758 -10027009 -262145 -514457854 -193593353 -524289 -1</internalNodes>
      <leafValues> -0.77 0.72</leafValues></_>
    <!-- tree 2 -->
    <_>
      <internalNodes> 0 -1 2 -363936790 -893203669 -1337948010 -136907894 1088782736 -134217726 -741544961 -1590337</internalNodes>
      <leafValues> -0.71 0.68</leafValues></_></weakClassifiers></_>

Somewhat later in the file...

<features>
  <_> <rect> 0 0 3 5</rect></_>
  <_> <rect> 0 0 4 2</rect></_>
  <_> <rect> 0 0 6 3</rect></_>
  <_> <rect> 0 1 4 3</rect></_>
  <_> <rect> 0 1 3 3</rect></_>

...

First, let's look at the stage tags:

  • maxWeakCount is the number of weak classifiers in the stage. A weak classifier is what the comments call a <!-- tree --> and what I call an LBP feature.
    • In this example, the number of LBP features in stage 0 is 3.
  • stageThreshold is the minimum value the feature weights must add up to for the stage to pass.
    • In this example, the threshold for the stage is -0.75.

Moving on to the tags describing an LBP feature:

  • internalNodes is an array of 11 integers. The first two are meaningless for LBP cascades. The third is the index into the <features> table of <rect>s at the end of the XML file (a <rect> describes the geometry of the feature). The last 8 values are eight 32-bit values, which together make up the 256-bit LUT that I talked about in my previous answer. This LUT is generated by a training process that I don't fully understand myself.
    • In this example, the first feature of the stage references rectangle 3, which is described by the four integers 0 1 4 3.
  • leafValues are the two weights (pass / fail) associated with a feature. Depending on the bit selected from internalNodes during feature evaluation, one of these two weights is added to a total. This total is compared to the stage's <stageThreshold>. Then bool stagePassed = (sum >= stageThreshold - EPS); where EPS is 1e-5, determines whether the stage has passed or failed. The weights are also determined by the training process.
    • In this example, the first feature's fail weight is -0.65, and its pass weight is 0.88.
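
To make the tag layout above concrete, here is a minimal Python sketch that parses the fake stage and splits each internalNodes array into its feature index and its eight LUT words. The function name parse_stage and the dictionary keys are my own inventions for illustration; this is not how OpenCV's loader is written.

```python
import xml.etree.ElementTree as ET

# The one-stage fake cascade from above (comments dropped).
STAGE_XML = """
<_>
  <maxWeakCount>3</maxWeakCount>
  <stageThreshold>-0.75</stageThreshold>
  <weakClassifiers>
    <_>
      <internalNodes> 0 -1 3 -67130709 -21569 -1426120013 -1275125205 -21585 -16385 587145899 -24005</internalNodes>
      <leafValues> -0.65 0.88</leafValues></_>
    <_>
      <internalNodes> 0 -1 0 -163512766 -769593758 -10027009 -262145 -514457854 -193593353 -524289 -1</internalNodes>
      <leafValues> -0.77 0.72</leafValues></_>
    <_>
      <internalNodes> 0 -1 2 -363936790 -893203669 -1337948010 -136907894 1088782736 -134217726 -741544961 -1590337</internalNodes>
      <leafValues> -0.71 0.68</leafValues></_>
  </weakClassifiers>
</_>
"""


def parse_stage(xml_text):
    """Parse one stage element into plain Python data (hypothetical helper)."""
    stage = ET.fromstring(xml_text.strip())
    weak = []
    for node in stage.find("weakClassifiers"):
        ints = [int(v) for v in node.find("internalNodes").text.split()]
        # ints[0:2] are unused for LBP, ints[2] indexes <features>,
        # ints[3:11] are the eight 32-bit words of the 256-bit LUT.
        lut_words = [w & 0xFFFFFFFF for w in ints[3:11]]  # view as unsigned
        fail_w, pass_w = (float(v) for v in node.find("leafValues").text.split())
        weak.append({"feature": ints[2], "lut": lut_words,
                     "fail": fail_w, "pass": pass_w})
    return {
        "max_weak_count": int(stage.find("maxWeakCount").text),
        "stage_threshold": float(stage.find("stageThreshold").text),
        "weak": weak,
    }


stage = parse_stage(STAGE_XML)
print([w["feature"] for w in stage["weak"]])  # feature indices: [3, 0, 2]
```

Masking with 0xFFFFFFFF only changes how the signed integers from the file are viewed; the 32 bits of each word stay the same.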

Finally, the <features> tag. It consists of an array of <rect> tags, each containing 4 integers that describe the geometry of a feature. Given the processing window (24x24 in your case), the first two integers are the feature's x and y pixel offsets within the processing window, and the next two integers are the width and height of one subrectangle out of the 9 needed to evaluate the LBP feature.

In essence, then, a <rect> ft.x ft.y ft.width ft.height </rect> located within a pW.width x pW.height processing window that is checking for a face at pW.x x pW.y corresponds to:

http://i.stack.imgur.com/NL0XX.png
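
The same geometry can be written out as a small Python sketch. It assumes the nine subrectangles R0..R8 are numbered row by row with R4 in the center, which is what the clockwise bit order further down implies; the helper name sub_rects is hypothetical.

```python
def sub_rects(ft, window_xy=(0, 0)):
    """Return the nine subrectangles R0..R8 as (x, y, width, height) tuples.

    ft is the <rect> content (ft.x, ft.y, ft.width, ft.height) and
    window_xy is the top-left corner (pW.x, pW.y) of the processing window.
    Numbering is row-major: R0 R1 R2 / R3 R4 R5 / R6 R7 R8 (an assumption).
    """
    ft_x, ft_y, ft_w, ft_h = ft
    x0 = window_xy[0] + ft_x
    y0 = window_xy[1] + ft_y
    return [(x0 + col * ft_w, y0 + row * ft_h, ft_w, ft_h)
            for row in range(3) for col in range(3)]


# Rectangle 3 from the example cascade, window at the image origin.
rects = sub_rects((0, 1, 4, 3))
print(rects[0], rects[4], rects[8])
```

For the rectangle 0 1 4 3, the whole 3x3 block spans 12x9 pixels starting at offset (0, 1), which fits inside a 24x24 window.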

To evaluate the LBP, it is enough to read the integral image at points p[0..15] and use p[BR]+p[TL]-p[TR]-p[BL] to calculate the integral of each of the nine subrectangles. The central subrectangle, R4, is compared with the 8 others, clockwise starting from R0, to get an 8-bit LBP (bits packed [msb 01258763 lsb]).
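
A minimal Python sketch of this step follows. Two things in it are assumptions rather than something stated above: the 16 points p[0..15] are numbered row by row over the 4x4 grid of corners, and a bit is set when the neighbor's sum is greater than or equal to the center's sum. The function names are my own.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column at the top/left."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def lbp_code(ii, ft, window_xy=(0, 0)):
    """8-bit LBP of one feature <rect>, read from the integral image ii."""
    ft_x, ft_y, ft_w, ft_h = ft
    x0 = window_xy[0] + ft_x
    y0 = window_xy[1] + ft_y
    # The 16 corner points of the 3x3 block, row-major (assumed numbering).
    p = [ii[y0 + r * ft_h][x0 + c * ft_w] for r in range(4) for c in range(4)]

    def rect_sum(k):
        # Corners of subrectangle Rk inside the 4x4 grid of points.
        r, c = divmod(k, 3)
        tl = 4 * r + c
        tr, bl, br = tl + 1, tl + 4, tl + 5
        return p[br] + p[tl] - p[tr] - p[bl]

    center = rect_sum(4)
    code = 0
    # Clockwise from R0, first neighbor ends up in the most significant bit.
    for k in (0, 1, 2, 5, 8, 7, 6, 3):
        code = (code << 1) | (1 if rect_sum(k) >= center else 0)
    return code


# Tiny 3x3 image, 1x1 subrectangles: only the top row is >= the center.
img = [[9, 9, 9],
       [0, 5, 0],
       [0, 0, 0]]
print(lbp_code(integral_image(img), (0, 0, 1, 1)))  # 224, i.e. 0b11100000
```

Whatever the size of the subrectangles, each feature costs only 16 reads from the integral image, which is why the integral image helps for LBP just as it does for Haar features.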

This 8-bit LBP is then used as an index into the feature's (2^8 = 256)-bit LUT (the last eight values of <internalNodes>), selecting a single bit. If this bit is 1, the feature is inconsistent with a face; if 0, it is consistent with a face. The corresponding weight (from <leafValues>) is then returned and added to the weights of all the other features to obtain the stage total. This total is then compared with <stageThreshold> to determine whether the stage passed or failed.
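
Here is a minimal Python sketch of the lookup and the stage decision. How the 256 bits are spread over the eight words is an assumption on my part (the top 3 bits of the LBP pick the word, the low 5 bits pick the bit within it), and the names lut_bit and eval_stage are hypothetical.

```python
EPS = 1e-5


def lut_bit(lut_words, code):
    """Select one bit of the 256-bit LUT using the 8-bit LBP as index.

    Assumed layout: code >> 5 picks one of the eight 32-bit words,
    code & 31 picks the bit inside that word.
    """
    word = lut_words[code >> 5] & 0xFFFFFFFF  # treat signed ints as unsigned
    return (word >> (code & 31)) & 1


def eval_stage(weak_classifiers, codes, stage_threshold):
    """Sum one weight per feature and compare against the stage threshold.

    weak_classifiers is a list of (lut_words, fail_weight, pass_weight);
    codes holds the 8-bit LBP already computed for each feature.
    """
    total = 0.0
    for (lut_words, fail_w, pass_w), code in zip(weak_classifiers, codes):
        # Bit 1: inconsistent with a face, so the fail weight is added.
        total += fail_w if lut_bit(lut_words, code) else pass_w
    return total >= stage_threshold - EPS, total


# Synthetic LUTs (all zeros or all ones) with the weights of the example.
weak = [([0] * 8, -0.65, 0.88),
        ([-1] * 8, -0.77, 0.72),
        ([0] * 8, -0.71, 0.68)]
passed, total = eval_stage(weak, [10, 20, 30], -0.75)
print(passed, round(total, 2))  # True 0.79
```

A window is reported as a face only if every stage of the cascade passes in turn; a single failed stage rejects the window immediately.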

If there is anything else I have not explained well enough, I can clarify.
