Object Detection with Keras: A Simple Way to Do Faster R-CNN or YOLO

This question may already have been answered, but I could not find a simple answer to it. I created a pipeline using Keras to classify Simpsons characters (dataset here).
I have 20 classes: given an image as input, I return the name of the character. It is pretty simple. My dataset contains images with the main character in the picture, and the only label is the character's name.

Now I would like to add object detection: draw a bounding box around each character in the picture and predict which character it is. I do not want to use a sliding window because it is very slow. So I thought about using Faster R-CNN (github repo) or YOLO (github repo). Do I have to add bounding box coordinates for each image in my training set? Is there a way to do object detection (and get bounding boxes on my test set) without providing coordinates for the training set?

In short, I would like to create a simple object detection model, and I do not know whether it is possible to build a simpler YOLO or Faster R-CNN.

Thanks so much for any help.

deep-learning object-detection classification keras
2 answers

The purpose of YOLO or Faster R-CNN is to produce bounding boxes. So in short, yes, you will need to label data in order to train them.

Take a shortcut:

  • 1) Label a handful of bounding boxes (for example, 5 per character).
  • 2) Train Faster R-CNN or YOLO on this very small dataset.
  • 3) Run the model against the full dataset.
  • 4) It will get some right and a lot wrong.
  • 5) Train Faster R-CNN on the ones that are bounded correctly; your training set should now be much larger.
  • 6) Repeat until you get the desired result.
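
The loop above can be sketched roughly as follows. This is only a minimal illustration, not a real Faster R-CNN or YOLO training script: `train_fn` and `predict_fn` are hypothetical callables standing in for whatever detector you use, and the confidence threshold is an assumed stand-in for deciding which predictions are "bounded correctly" (a manual review step could take its place).

```python
def bootstrap_labels(images, seed_labels, train_fn, predict_fn,
                     min_confidence=0.9, max_rounds=5):
    """Grow a labeled set by retraining and keeping confident predictions.

    images       -- iterable of image ids
    seed_labels  -- dict: image id -> hand-labeled boxes (steps 1 and 2)
    train_fn     -- hypothetical: train_fn(labels) -> model
    predict_fn   -- hypothetical: predict_fn(model, image_id) -> (boxes, confidence)
    """
    labels = dict(seed_labels)           # copy so the seed set is not mutated
    model = train_fn(labels)             # step 2: train on the tiny seed set
    for _ in range(max_rounds):
        added = 0
        for image_id in images:          # step 3: run on the full dataset
            if image_id in labels:
                continue
            boxes, confidence = predict_fn(model, image_id)
            # Steps 4 and 5: keep only predictions we trust (assumed criterion).
            if confidence >= min_confidence:
                labels[image_id] = boxes
                added += 1
        if added == 0:                   # nothing new to learn from, stop early
            break
        model = train_fn(labels)         # step 6: retrain on the larger set
    return model, labels
```

Images that never reach the threshold are simply left unlabeled, so it is probably worth inspecting them by hand at the end.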

You may already have a suitable architecture in mind: "Now I would like to add object detection: draw a bounding box around each character in the picture and predict which character it is."

So you have just split the task into two parts:
1. Add an object detector for persons that returns bounding boxes.
2. Classify the bounding boxes using the convnet you have already built.
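
As a rough sketch of how the two parts could be glued together: `detect_fn` and `classify_fn` below are hypothetical placeholders for the person detector and for your existing classifier, and the image is assumed to be a plain list of pixel rows so the example has no dependencies.

```python
def detect_and_classify(image, detect_fn, classify_fn):
    """Run a detector, then classify each detected region.

    image       -- list of rows, each row a list of pixel values (assumed layout)
    detect_fn   -- hypothetical: detect_fn(image) -> list of (x1, y1, x2, y2)
    classify_fn -- hypothetical: classify_fn(crop) -> character name
    """
    results = []
    for (x1, y1, x2, y2) in detect_fn(image):
        # Part 1 gave us a box; cut that region out of the image.
        crop = [row[x1:x2] for row in image[y1:y2]]
        # Part 2: the existing classifier names the character in the crop.
        results.append(((x1, y1, x2, y2), classify_fn(crop)))
    return results
```

With NumPy image arrays the crop would usually be `image[y1:y2, x1:x2]`, and it would most likely need resizing to the classifier's input size before prediction.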

For part 1, you should be fine using a feature detector (for example, a convnet pretrained on COCO or ImageNet) with an object detector (again, YOLO or Faster R-CNN) on top to detect persons. However, you may find that people in cartoons (let's say the Simpsons are people) are not recognized properly, because the feature detector was trained on real images, not cartoon images. In that case, you can try retraining a few layers of the feature detector on cartoon images so that it learns cartoon features, following the transfer learning approach.
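
Retraining only a few layers might look something like the sketch below. The helper relies only on the `layers` list and the `trainable` flag that Keras models expose. The choice of VGG16, the input shape, and the number of layers to unfreeze are assumptions made for illustration; the answer does not name a specific backbone.

```python
def freeze_all_but_last(model, n_trainable):
    """Freeze every layer except the last `n_trainable` ones."""
    layers = list(model.layers)
    cutoff = len(layers) - n_trainable
    for index, layer in enumerate(layers):
        # Early layers keep their pretrained weights; late layers get retrained.
        layer.trainable = index >= cutoff
    return model


def build_finetunable_backbone(n_trainable=4):
    """Example with Keras (assumed backbone: VGG16 pretrained on ImageNet).

    Needs TensorFlow installed and downloads the pretrained weights on first use.
    """
    from tensorflow import keras

    base = keras.applications.VGG16(
        weights="imagenet", include_top=False, input_shape=(224, 224, 3)
    )
    # Compile (or recompile) the final model after changing `trainable`.
    return freeze_all_but_last(base, n_trainable)
```

How many layers to unfreeze is a judgment call: unfreezing more gives the network more room to adapt to cartoon images, but it also needs more labeled data.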
