How to recognize custom objects
The real power of the convolutional neural network approach to image recognition comes from its flexibility. It's fairly easy to retrain the top levels of a network to spot new kinds of objects, even on a low-powered mobile device. I'll show you how you can use the LearningExample application to spot an object or logo you care about, and then how to add that capability to your own application.
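To make "retraining the top levels" a little more concrete, here's a minimal Python sketch of the general idea: the lower layers of the network are left untouched and treated as a fixed feature extractor, and only a small classifier is fit on the feature vectors they produce. This is not the code the app runs. The logistic regression, the function names, and the tiny three-number feature vectors are all assumptions made for illustration; a real network produces much longer vectors.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_top_layer(features, labels, epochs=200, lr=0.5):
    """Fit a logistic-regression classifier on fixed feature vectors.

    `features` stands in for the output of the frozen lower layers
    (hypothetical data here); `labels` are 1 for positive, 0 for negative.
    """
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to the score
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(weights, bias, x):
    """Return a score in (0, 1) for how strongly x looks like a positive."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Made-up feature vectors: three "bottle" frames, three "not bottle" frames.
positives = [[0.9, 0.1, 0.8], [0.8, 0.2, 0.9], [1.0, 0.0, 0.7]]
negatives = [[0.1, 0.9, 0.2], [0.2, 0.8, 0.1], [0.0, 1.0, 0.3]]
features = positives + negatives
labels = [1] * len(positives) + [0] * len(negatives)

weights, bias = train_top_layer(features, labels)
bottle_score = predict(weights, bias, [0.85, 0.15, 0.8])
other_score = predict(weights, bias, [0.15, 0.85, 0.2])
```

Because only this small set of weights is being learned, training needs very little computation, which is why it's practical on a phone.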
You'll need Xcode and an iPhone, preferably a 5 or 5S for best performance. Clone this git repository to your local machine, open the LearningExample project from the examples folder, build it, and run it on your iPhone.
You should see a screen like this. The first thing you need to do to create your model is capture 100 frames that contain the object you want to recognize. These are the 'positive' images, and once you press the 'Start Learning' button, the phone will capture whatever's in the viewfinder as an example of what it should be looking for.
Picking good positive examples is an art form, not a science, but here are some tips. You should think about how the object you want to recognize is likely to appear when users are pointing their phones at it. For this example I'll be using a wine bottle, and I've decided that I want the model to be good at recognizing an upright wine bottle from a couple of feet away, since that's how bottles are likely to appear in the restaurant photos I happen to be interested in. That means I won't try to put the bottle on its side, take pictures from above, or do close-ups of the label when I'm collecting examples.
It's important to make sure that the network is really recognizing the bottle, though, and not other objects in the background. To help with this, I'm going to shoot from different angles so the bottle appears in front of different objects, and move it around my desk a bit to vary the lighting.
Once you have a plan for collecting your positive images, press the 'Start Learning' button and the app will continuously capture example frames over the course of a minute or so, depending on the processing speed of your phone. You'll see a progress bar at the top. Once that's completely blue, the message should change again.
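The capture step can be pictured as a simple loop that keeps taking frames until it has the 100 it needs, updating the progress bar as it goes. Here's a rough Python sketch of that bookkeeping. The `PositiveCollector` class and its methods are hypothetical names of my own, not part of the app, and the "frames" are just placeholder lists standing in for camera images.

```python
class PositiveCollector:
    """Collects positive examples until a target count is reached.

    Hypothetical helper for illustration only; the target of 100
    matches the number of frames the tutorial asks for.
    """

    def __init__(self, target=100):
        self.target = target
        self.examples = []

    def is_done(self):
        return len(self.examples) >= self.target

    def progress(self):
        """Fraction of the target collected so far, from 0.0 to 1.0."""
        return len(self.examples) / self.target

    def add_frame(self, frame):
        """Store a frame unless the target is already met; return progress."""
        if not self.is_done():
            self.examples.append(frame)
        return self.progress()


collector = PositiveCollector(target=100)
for frame_id in range(120):  # stand-in for the live camera stream
    collector.add_frame([float(frame_id)])
    if collector.is_done():
        break  # the progress bar would be completely filled at this point
```

How long the loop takes in practice depends on how quickly the phone can process each frame, which is why the capture time varies between devices.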