I used the pre-trained GoogleNet from https://github.com/BVLC/caffe/tree/master/models/bvlc_googlenet and fine-tuned it on my own data (~100K images, 101 classes). After one day of training I reached 62% top-1 and 85% top-5 accuracy, and then tried to use this network to classify new images.
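For reference, I launched the fine-tuning with the standard Caffe command-line tool, roughly like this (the solver file is my own; its name here is illustrative):

    # Fine-tune starting from the pre-trained GoogLeNet weights.
    # "solver.prototxt" is my own solver file (name shown for illustration).
    ./build/tools/caffe train \
        --solver=models/bvlc_googlenet/solver.prototxt \
        --weights=models/bvlc_googlenet/bvlc_googlenet.caffemodel \
        --gpu=0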
I just followed the example from https://github.com/BVLC/caffe/blob/master/examples/classification.ipynb.
Here is my Python code:
import caffe
import numpy as np

caffe_root = './caffe'
MODEL_FILE = 'caffe/models/bvlc_googlenet/deploy.prototxt'
PRETRAINED = 'caffe/models/bvlc_googlenet/bvlc_googlenet_iter_200000.caffemodel'

caffe.set_mode_gpu()
net = caffe.Classifier(MODEL_FILE, PRETRAINED,
                       mean=np.load('ilsvrc_2012_mean.npy').mean(1).mean(1),
                       channel_swap=(2, 1, 0),
                       raw_scale=255,
                       image_dims=(224, 224))

def caffe_predict(path):
    input_image = caffe.io.load_image(path)
    print path
    print input_image

    prediction = net.predict([input_image])
    print prediction
    print "----------"
    print 'prediction shape:', prediction[0].shape
    print 'predicted class:', prediction[0].argmax()

    proba = prediction[0][prediction[0].argmax()]
    ind = prediction[0].argsort()[-5:][::-1]  # top-5 predictions
    return prediction[0].argmax(), proba, ind
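As a sanity check on the preprocessing, the same prediction can also be done with caffe.Net and an explicit caffe.io.Transformer instead of caffe.Classifier. A minimal sketch, assuming the deploy file's input blob is named "data" (as in the BVLC GoogLeNet deploy file) and using a placeholder image path:

    # Sanity check: bypass caffe.Classifier and preprocess explicitly.
    # Assumes the deploy file's input blob is named "data".
    net = caffe.Net(MODEL_FILE, PRETRAINED, caffe.TEST)

    transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
    transformer.set_transpose('data', (2, 0, 1))      # HWC -> CHW
    transformer.set_mean('data', np.load('ilsvrc_2012_mean.npy').mean(1).mean(1))
    transformer.set_raw_scale('data', 255)            # [0,1] -> [0,255]
    transformer.set_channel_swap('data', (2, 1, 0))   # RGB -> BGR

    image = caffe.io.load_image('some_image.jpg')     # placeholder path
    net.blobs['data'].data[...] = transformer.preprocess('data', image)
    out = net.forward()
    print out['prob'][0].argmax()                     # top-1 class index

If this gives the same flat output as caffe.Classifier, the problem is not in the preprocessing pipeline.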
In my deploy.prototxt file, I changed only the last layer so that it predicts my 101 classes:
layer {
  name: "loss3/classifier"
  type: "InnerProduct"
  bottom: "pool5/7x7_s1"
  top: "loss3/classifier"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 101
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "loss3/classifier"
  top: "prob"
}
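One thing worth verifying here is that this layer name matches the one in the training prototxt: Caffe copies weights between nets by layer name, so if the classifier layer was renamed, it would stay at its random Xavier initialization in the deploy net, which would produce exactly this kind of flat output. A quick check of whether the snapshot actually contains trained weights for it (a sketch; the layer name is taken from the deploy file above):

    # If the classifier weights were never loaded from the snapshot, their
    # statistics will look like a fresh Xavier init (zero bias, small weights).
    net = caffe.Net(MODEL_FILE, PRETRAINED, caffe.TEST)
    w = net.params['loss3/classifier'][0].data   # weight matrix
    b = net.params['loss3/classifier'][1].data   # bias vector
    print 'weights mean/std:', w.mean(), w.std()
    print 'bias    mean/std:', b.mean(), b.std()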
Here is the distribution of the softmax output:
[[ 0.01106235 0.00343131 0.00807581 0.01530041 0.01077161 0.0081002 0.00989228 0.00972753 0.00429183 0.01377776 0.02028225 0.01209726 0.01318955 0.00669979 0.00720005 0.00838189 0.00335461 0.01461464 0.01485041 0.00543212 0.00400191 0.0084842 0.02134697 0.02500303 0.00561895 0.00776423 0.02176422 0.00752334 0.0116104 0.01328687 0.00517187 0.02234021 0.00727272 0.02380056 0.01210031 0.00582192 0.00729601 0.00832637 0.00819836 0.00520551 0.00625274 0.00426603 0.01210176 0.00571806 0.00646495 0.01589645 0.00642173 0.00805364 0.00364388 0.01553882 0.01549598 0.01824486 0.00483241 0.01231962 0.00545738 0.0101487 0.0040346 0.01066607 0.01328133 0.01027429 0.01581303 0.01199994 0.00371804 0.01241552 0.00831448 0.00789811 0.00456275 0.00504562 0.00424598 0.01309276 0.0079432 0.0140427 0.00487625 0.02614347 0.00603372 0.00892296 0.00924052 0.00712763 0.01101298 0.00716757 0.01019373 0.01234141 0.00905332 0.0040798 0.00846442 0.00924353 0.00709366 0.01535406 0.00653238 0.01083806 0.01168014 0.02076091 0.00542234 0.01246306 0.00704035 0.00529556 0.00751443 0.00797437 0.00408798 0.00891858 0.00444583]]
This looks like a near-uniform distribution, as if the network is not discriminating between classes at all.
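A uniform distribution over 101 classes would put roughly 1/101 ≈ 0.0099 on every class, which matches the values above (everything falls between ~0.003 and ~0.026). A small check that quantifies this, with probs being the prediction[0] vector from the code above:

    # Compare the softmax output against a uniform distribution over 101 classes.
    probs = prediction[0]
    print 'max prob:', probs.max()            # ~0.026 here vs. 1/101 ~= 0.0099
    print 'uniform :', 1.0 / len(probs)
    # Entropy close to log(101) ~= 4.615 means the output is near-uniform.
    print 'entropy :', -(probs * np.log(probs)).sum(), 'vs', np.log(len(probs))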
Thanks for any help or hints. Best wishes, Alex