Open source pure python Selective Search and advanced object recognition with Labellio
There are various use cases for object recognition, categorized as follows.
- Object Classification, to recognize an object, assuming there is a single object in the image
- Localization, to find the bounding boxes of individual objects, assuming there are multiple objects in the image
- Segmentation, to assign labels to pixels
- Image Captioning, to learn a description of visual semantics by feeding features extracted by a CNN into an RNN
In a previous post, we explained labellio_cli, which handles Object Classification, the simplest of these tasks. To add Localization to the tasks available with Labellio, we are happy to open source a pure Python implementation of Selective Search on GitHub, which you can also use with other tools such as Caffe.
What is Selective Search?
You could achieve Localization with an Exhaustive Search that naively slides windows across the image and scans each one. However, there would be too many regions to process, and you would be limited in the window shapes and sizes you can afford to try.
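To get a feel for how quickly the number of windows grows, here is a minimal sketch that only counts them. The image size, window sizes and stride are made-up values for illustration, and `count_windows` is a hypothetical helper, not part of any library.

```python
def count_windows(image_w, image_h, window_sizes, stride):
    """Count the windows an exhaustive sliding-window scan would visit."""
    total = 0
    for win_w, win_h in window_sizes:
        if win_w > image_w or win_h > image_h:
            continue  # this window does not fit in the image at all
        steps_x = (image_w - win_w) // stride + 1
        steps_y = (image_h - win_h) // stride + 1
        total += steps_x * steps_y
    return total


# Assumed setup: a 640x480 image, square windows at three scales, stride of 8 pixels.
sizes = [(64, 64), (128, 128), (256, 256)]
print(count_windows(640, 480, sizes, stride=8))  # prints 8215
```

Even with only three square window sizes, that is several thousand classifier calls for one image, and adding more aspect ratios or a finer stride multiplies the count.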
A solution is an algorithm called Selective Search, which groups similar regions at the pixel level and selects candidates from them. The well-known approach called R-CNN combines Selective Search with a CNN. A similar technique is used in the GoogLeNet pipeline. Unfortunately, there has not been an easy-to-use open implementation of it. Caffe has example code that does Selective Search, but it requires MATLAB, which is itself hard to install and set up, and you may give up like we did. To improve this situation, today we are open sourcing a Selective Search module written in pure Python.
This module does not require or assume Labellio. Because it is pure Python, you can use it with Deep Learning libraries other than Caffe, such as Chainer, and we believe it will serve as a general solution for object recognition. A bit off topic, but there is another proposed technique called OverFeat, which builds regression for localization into its neural network.
Selective Search usage: Use Labellio to train a cake classification model and extract object regions with Selective Search
As an example of how powerful Selective Search is, the following recognizes the different cakes in an image of a plate holding several of them.
First of all, let’s build a cake classifier. We collect training data for mont blancs, cheese cakes and shortcakes from Bing search. We cannot be sure that the image search returns only the expected results, so we go through the manual labelling mode. As expected, the search results included some objects that are not actually mont blanc cakes.
Have a coffee while you wait for the training to complete.
Then download the model once it completes.
That’s it for creating a model. The accuracy is 96%, which is good enough for this demo.
Detect candidate regions by Selective Search
Next is the main topic of Localization using Selective Search. We are using the following image that has two cakes in it.
Selective Search subdivides the image with an existing algorithm and then repeatedly combines the pieces into bigger regions based on their similarity. In the original paper, Felzenszwalb's method is used for the initial subdivision. It looks like the following if you color the extracted small regions.
It combines these subdivided regions into bigger ones by looking at the similarity of their color histograms and textures, and finally builds one region. In Alpaca's Selective Search module, all of this is done by calling a single API. It is a relatively simple implementation, but the result is good. We also welcome pull requests if you can improve something such as speed.
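The grouping step can be sketched in a few lines. This is a simplified illustration, not the code of our module: it uses only a color histogram intersection as the similarity and leaves out texture, and the region ids, sizes and two-bin histograms are toy values. All function names here are hypothetical.

```python
def hist_intersection(h1, h2):
    """Similarity of two normalized histograms: sum of bin-wise minima."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def merge(r1, r2):
    """Merge two regions; the new histogram is the size-weighted average."""
    size = r1["size"] + r2["size"]
    hist = [(a * r1["size"] + b * r2["size"]) / size
            for a, b in zip(r1["hist"], r2["hist"])]
    return {"size": size, "hist": hist}


def group(regions, neighbors):
    """Greedily merge the most similar neighboring pair until no pairs remain.

    regions:   dict mapping an integer id to {"size": int, "hist": [float, ...]}
    neighbors: set of frozenset({id_a, id_b}) for adjacent regions
    Returns the merge order as a list of (id_a, id_b, new_id) tuples.
    """
    regions = dict(regions)
    neighbors = set(neighbors)
    next_id = max(regions) + 1
    history = []

    def pair_similarity(pair):
        a, b = pair
        return hist_intersection(regions[a]["hist"], regions[b]["hist"])

    while neighbors:
        best = max(neighbors, key=pair_similarity)
        a, b = sorted(best)
        regions[next_id] = merge(regions[a], regions[b])
        # Anything that touched a or b now touches the merged region.
        rewired = set()
        for pair in neighbors:
            if pair == best:
                continue
            if a in pair or b in pair:
                other = next(i for i in pair if i not in (a, b))
                rewired.add(frozenset({other, next_id}))
            else:
                rewired.add(pair)
        neighbors = rewired
        del regions[a], regions[b]
        history.append((a, b, next_id))
        next_id += 1
    return history


# Toy input: regions 0 and 1 have similar colors, region 2 does not.
toy_regions = {
    0: {"size": 100, "hist": [0.9, 0.1]},
    1: {"size": 100, "hist": [0.8, 0.2]},
    2: {"size": 50, "hist": [0.1, 0.9]},
}
toy_neighbors = {frozenset({0, 1}), frozenset({1, 2})}
print(group(toy_regions, toy_neighbors))  # prints [(0, 1, 3), (2, 3, 4)]
```

Every intermediate region created along the way (ids 3 and 4 in the toy run) is a candidate, which is how one pass yields candidates at several scales.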
You can install it via pip, if you have pip installed.
$ pip install selectivesearch
There is only a single API, selective_search, to which you pass the loaded image and parameters. You can see an example here.
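As a rough sketch of how the call fits into a script, the snippet below assumes the signature and return value described in the project README: `selective_search(image, scale=..., sigma=..., min_size=...)` returning a labelled image and a list of region dicts with `rect` as (x, y, w, h) and `size` as a pixel count. Please check these against the version you install. The helpers `filter_regions` and `propose_regions`, and all threshold values, are hypothetical and only for illustration.

```python
def filter_regions(regions, min_pixels, max_aspect):
    """Drop duplicate, tiny and very elongated boxes.

    Each region is assumed to be a dict with 'rect' = (x, y, w, h)
    and 'size' = number of pixels in the region.
    """
    kept = []
    seen = set()
    for region in regions:
        rect = tuple(region["rect"])
        x, y, w, h = rect
        if rect in seen or region["size"] < min_pixels:
            continue
        if w == 0 or h == 0 or max(w / h, h / w) > max_aspect:
            continue
        seen.add(rect)
        kept.append(rect)
    return kept


def propose_regions(image, min_pixels=2000, max_aspect=1.2):
    """Run Selective Search on an H x W x 3 image array and filter the boxes."""
    # Imported lazily so filter_regions can be used without the package installed.
    import selectivesearch

    # scale, sigma and min_size control the initial segmentation;
    # the values here are illustrative, not recommendations.
    _, regions = selectivesearch.selective_search(
        image, scale=500, sigma=0.9, min_size=10)
    return filter_regions(regions, min_pixels, max_aspect)
```

You would call `propose_regions` with an image loaded by, for example, `skimage.io.imread`, and tune the two filter thresholds for your own images.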
The following shows the bounding boxes produced by Selective Search.
Finally, you classify each region using the CNN. We pass the sub-images to Labellio CLI and get the following result. We set a threshold on the result probability to filter out noise and keep only the regions of interest.
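The thresholding step might look like the sketch below. The `classify` argument is a stand-in for whatever classifier you use: we assume it takes a sub-image and returns a dict of label probabilities, which is an assumption for illustration and not the Labellio CLI interface. The function name and the threshold value are also hypothetical.

```python
def filter_detections(candidates, classify, threshold=0.8):
    """Classify each candidate and keep only confident predictions.

    candidates: iterable of (box, sub_image) pairs, box = (x, y, w, h)
    classify:   callable mapping a sub-image to {label: probability}
    Returns a list of (box, label, probability) tuples.
    """
    detections = []
    for box, sub_image in candidates:
        probabilities = classify(sub_image)
        label = max(probabilities, key=probabilities.get)
        if probabilities[label] >= threshold:
            detections.append((box, label, probabilities[label]))
    return detections
```

With a NumPy-style image array, the sub-image for a box (x, y, w, h) would be `image[y:y + h, x:x + w]`. A lower threshold keeps more boxes but lets through more background regions.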
As you can see, you can easily build localized object recognition using Selective Search and CNN. We hope this helps you use more of Labellio with our Selective Search implementation.