It seems that you do not know about the Point Cloud Library (PCL). It is an open-source library designed to process point clouds and RGB-D data, which relies on OpenNI for low-level operations and provides many high-level algorithms, such as registration, segmentation and recognition.
A very interesting shape/object recognition algorithm is the implicit shape model. To detect a global object (for example, a car or an open hand), you first detect its possible parts (wheels, body, etc., or fingers, palm, wrist, etc.) with a local feature detector, and then infer the position of the global object by considering the density and relative positions of its parts. For example, if I can detect five fingers, a palm and a wrist in a certain area, there is a good chance that I am in fact looking at a hand; however, if I only find one finger and a wrist somewhere, it could be a pair of false detections. A research article on this implicit shape model algorithm can be found here.
PCL has a couple of tutorials dedicated to shape recognition and, luckily, one of them covers the implicit shape model, which has been implemented in PCL. I have never tested that implementation, but from what I could read in the tutorial, you can feed it your own point clouds to train the classifier.
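To give you an idea of what the training step looks like, here is a rough sketch based on my reading of that tutorial. I have not run it myself, and the class and method names (`pcl::ism::ImplicitShapeModelEstimation`, `trainISM`, etc.) are taken from the tutorial as I remember it, so they may differ between PCL versions:

```cpp
// Sketch of training an implicit shape model on your own labelled clouds,
// following the PCL implicit shape model tutorial (untested, names may vary
// between PCL versions).
#include <pcl/point_types.h>
#include <pcl/features/fpfh.h>
#include <pcl/recognition/implicit_shape_model.h>

// Assume these have been filled with your own training data:
// one cloud + normals + class id (e.g. 0 = open hand, 1 = fist) per example.
std::vector<pcl::PointCloud<pcl::PointXYZ>::Ptr> training_clouds;
std::vector<pcl::PointCloud<pcl::Normal>::Ptr>   training_normals;
std::vector<unsigned int>                        training_classes;

// Local feature descriptor used to describe the object parts (FPFH in the tutorial).
pcl::FPFHEstimation<pcl::PointXYZ, pcl::Normal, pcl::Histogram<153> >::Ptr fpfh
  (new pcl::FPFHEstimation<pcl::PointXYZ, pcl::Normal, pcl::Histogram<153> >);
fpfh->setRadiusSearch(30.0);
pcl::Feature<pcl::PointXYZ, pcl::Histogram<153> >::Ptr feature_estimator(fpfh);

// Train the implicit shape model on the labelled clouds.
pcl::ism::ImplicitShapeModelEstimation<153, pcl::PointXYZ, pcl::Normal> ism;
ism.setFeatureEstimator(feature_estimator);
ism.setTrainingClouds(training_clouds);
ism.setTrainingNormals(training_normals);
ism.setTrainingClasses(training_classes);
ism.setSamplingSize(2.0f);

pcl::ism::ImplicitShapeModelEstimation<153, pcl::PointXYZ, pcl::Normal>::ISMModelPtr
  model(new pcl::features::ISMModel);
ism.trainISM(model);

// At detection time the tutorial calls findObjects() on a test cloud to cast
// votes for the object centre, then findStrongestPeaks() on the vote list.
```

Again, treat this as a starting point for reading the tutorial rather than as working code.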
You did not mention it explicitly in your question, but since your goal is to program a hand-controlled application, you might actually be interested in a real-time shape detection algorithm. You would have to test how fast the implicit shape model provided in PCL is, but I think this approach is better suited to offline shape recognition.
If you do need real-time shape recognition, I think you should first use a hand tracking algorithm (which is usually faster than full detection) to know where to look, rather than trying to perform full shape detection on every frame of your RGB-D stream. You could, for example, track the hand location by segmenting the depth map (e.g. by thresholding at an appropriate depth) and then detecting its extremities.
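As a minimal sketch of that idea, assuming you can get each depth frame as an OpenCV `cv::Mat` in millimetres (the depth band and the helper name `segmentHand` are just illustrative), the segmentation step could look like this:

```cpp
// Minimal sketch: segment a hand candidate by thresholding the depth map.
// Assumes `depth` is a CV_16UC1 depth image in millimetres from the RGB-D stream;
// the 400-900 mm band is an arbitrary example and depends on your setup.
#include <opencv2/opencv.hpp>
#include <vector>

std::vector<cv::Point> segmentHand(const cv::Mat& depth)
{
    // Keep only points within a plausible "hand" depth band.
    cv::Mat mask;
    cv::inRange(depth, cv::Scalar(400), cv::Scalar(900), mask);

    // Remove small speckles before extracting contours.
    cv::morphologyEx(mask, mask, cv::MORPH_OPEN,
                     cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(5, 5)));

    // Take the largest blob in the band as the hand candidate.
    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

    std::vector<cv::Point> hand;
    double best_area = 0.0;
    for (const auto& c : contours)
    {
        const double area = cv::contourArea(c);
        if (area > best_area) { best_area = area; hand = c; }
    }
    return hand;  // empty if nothing was found in the depth band
}
```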
Then, once you know roughly where the hand is, it should be easier to decide whether it is making one of the gestures relevant to your application. I am not sure exactly what you mean by fist/grab gestures, but I suggest you define and use gestures that are both easy to perform and easy to distinguish from one another.
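One simple heuristic you could try (just an illustration, not something from the PCL tutorial): count the deep convexity defects of the hand contour, i.e. the valleys between fingers. Several deep defects suggest an open hand, almost none suggests a fist. The thresholds below are guesses and would need tuning:

```cpp
// Illustrative heuristic: distinguish an open hand from a fist by counting
// deep convexity defects of the contour returned by the segmentation step.
#include <opencv2/opencv.hpp>
#include <vector>

bool looksLikeOpenHand(const std::vector<cv::Point>& hand_contour)
{
    if (hand_contour.size() < 5)
        return false;

    // Convex hull as point indices, as required by convexityDefects().
    std::vector<int> hull_indices;
    cv::convexHull(hand_contour, hull_indices, false, false);
    if (hull_indices.size() < 3)
        return false;

    std::vector<cv::Vec4i> defects;
    cv::convexityDefects(hand_contour, hull_indices, defects);

    // A defect's depth is stored in fixed point (1/256 pixel); count only the
    // defects deep enough to correspond to the gap between two fingers.
    int deep_defects = 0;
    for (const auto& d : defects)
        if (d[3] / 256.0 > 20.0)   // ~20 px, tune for your camera and hand size
            ++deep_defects;

    // An open hand typically shows 3-4 deep defects, a fist close to none.
    return deep_defects >= 3;
}
```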
Hope this helps.