Open-source pure Python Selective Search and advanced object recognition with Labellio

Object recognition covers several kinds of tasks, which can be categorized as follows.

  • Object Classification, to recognize an object assuming there is one in the image
  • Localization, to locate the bounding boxes of individual objects, assuming there are multiple objects in the image
  • Segmentation, to assign labels to pixels
  • Image Captioning, to learn descriptions of visual semantics by feeding features extracted by a CNN into an RNN

Previously, we introduced labellio_cli, which handles Object Classification, the simplest of these tasks.  To add Localization to the tasks available with Labellio, we are happy to open source a pure Python implementation of Selective Search on GitHub, which you can also use with other tools such as Caffe.

What is Selective Search?

To achieve Localization, you could do an Exhaustive Search that naively slides windows across the image.  However, there would be too many regions to process, and you would be limited in the window shapes and sizes you can cover.
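To get a feel for how quickly the number of windows grows, here is a minimal sketch that counts the windows an exhaustive scan would visit. The image size, window sizes and stride are hypothetical values chosen for illustration only.

```python
def count_sliding_windows(img_w, img_h, win_sizes, stride):
    """Count the windows an exhaustive sliding-window scan would visit."""
    total = 0
    for win_w, win_h in win_sizes:
        if win_w > img_w or win_h > img_h:
            continue  # this window does not fit in the image
        steps_x = (img_w - win_w) // stride + 1
        steps_y = (img_h - win_h) // stride + 1
        total += steps_x * steps_y
    return total


# Hypothetical settings: a 640x480 image, three square window sizes, 8-pixel stride.
sizes = [(64, 64), (128, 128), (256, 256)]
n_windows = count_sliding_windows(640, 480, sizes, stride=8)
print(n_windows)  # prints 8215
```

Even with only three fixed square shapes, every one of these windows would have to go through the classifier, and adding more aspect ratios or a finer stride multiplies the count further.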

A solution is an algorithm called Selective Search, which groups similar regions at the pixel level and selects candidates from them.  R-CNN is a well-known approach that combines Selective Search with a CNN.  A similar technique is used in the GoogLeNet pipeline.  Unfortunately, there has been no easy-to-use open implementation of it.  Caffe has example code that runs Selective Search, but it requires MATLAB, which is itself hard to install and set up, and you may give up as we did.  To solve this, today we are open sourcing a Selective Search module written in pure Python.

This module does not require or assume Labellio, and because it is pure Python, you can use it with Deep Learning libraries other than Caffe, such as Chainer.  We believe it will serve as a general solution for object recognition.  A bit off topic, but there is another proposed technique called OverFeat, which builds regression for localization into its neural network.

Selective Search usage: Use Labellio to train a cake classification model and extract object regions with Selective Search

As an example of how powerful Selective Search is, the following recognizes different cakes in an image with several of them on a plate.

First of all, let’s build a cake classifier.  We collect training data of mont blancs, cheesecakes and shortcakes from Bing search.  We are not sure that the image search returns the expected results, so we go through the manual labelling mode.  As expected, the search results included some objects that are not actually mont blanc cakes.


Have a coffee while you wait for the training to complete.

 


Then download the model once training completes.


That’s it for creating a model.  The accuracy is 96%, which is good enough for this demo.

Detect candidate regions by Selective Search

Next is the main topic: Localization using Selective Search.  We use the following image, which has two cakes in it.

[Image: two cakes, from https://flic.kr/p/7CoYmV]

Selective Search subdivides the image with an existing algorithm and repeatedly combines the pieces into bigger regions based on their similarity.  In the original paper, Felzenszwalb's method is used for the initial subdivision.  If you color the extracted small regions, the result looks like the following.

[Image: the extracted small regions, each drawn in its own color]

It combines these subdivided regions into bigger ones by looking at the similarity of color histograms and textures, and finally builds a single region.  In Alpaca’s Selective Search module, all of this is done by calling one API.  It is a relatively simple implementation, but the results are good.  We also welcome pull requests if you can improve something, such as speed.
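The grouping step can be illustrated with a toy sketch. This is not the module's code: it uses only color-histogram intersection as the similarity, considers every pair of regions rather than only neighboring ones, and ignores texture. The region dictionaries with "box", "size" and "hist" keys are a made-up format for this example.

```python
from itertools import combinations


def hist_intersection(h1, h2):
    """Similarity of two normalized histograms: sum of bin-wise minima."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def merge(r1, r2):
    """Merge two regions: union bounding box, size-weighted histogram."""
    size = r1["size"] + r2["size"]
    hist = [(a * r1["size"] + b * r2["size"]) / size
            for a, b in zip(r1["hist"], r2["hist"])]
    x0, y0, x1, y1 = r1["box"]
    u0, v0, u1, v1 = r2["box"]
    box = (min(x0, u0), min(y0, v0), max(x1, u1), max(y1, v1))
    return {"box": box, "size": size, "hist": hist}


def hierarchical_grouping(regions):
    """Greedily merge the most similar pair until one region remains.

    Returns every region seen along the way (initial and merged) as candidates.
    """
    active = list(regions)
    candidates = list(regions)
    while len(active) > 1:
        i, j = max(combinations(range(len(active)), 2),
                   key=lambda p: hist_intersection(active[p[0]]["hist"],
                                                   active[p[1]]["hist"]))
        merged = merge(active[i], active[j])
        active = [r for k, r in enumerate(active) if k not in (i, j)]
        active.append(merged)
        candidates.append(merged)
    return candidates


# Toy input: two similar regions and one different region (2-bin histograms).
toy = [
    {"box": (0, 0, 10, 10), "size": 100, "hist": [0.9, 0.1]},
    {"box": (10, 0, 20, 10), "size": 100, "hist": [0.8, 0.2]},
    {"box": (0, 10, 20, 20), "size": 200, "hist": [0.1, 0.9]},
]
candidates = hierarchical_grouping(toy)
```

In this toy run the two similar regions are merged first, and the last merge produces a single region covering the whole area. Each intermediate region is kept as a candidate bounding box, which is what makes the approach produce candidates at several scales.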

If you have pip, you can install the module with it.

$ pip install selectivesearch

There is a single API, selective_search, to which you pass a loaded image and parameters.  You can see an example here.

The following shows the bounding boxes produced by Selective Search.
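As a rough sketch of how a call might look: the code below assumes, based on the project's README, that selective_search takes an RGB image array plus scale, sigma and min_size, and returns a label image together with a list of region dicts carrying 'rect' as (x, y, w, h) and 'size' in pixels. The parameter values and the filter_regions and propose_regions helpers are illustrative choices of this example, not part of the module.

```python
def filter_regions(regions, min_pixels=2000, max_aspect=1.2):
    """Keep distinct, reasonably large, not-too-elongated boxes.

    `regions` is assumed to be a list of dicts with a 'rect' entry of
    (x, y, w, h) and a 'size' entry in pixels.
    """
    kept = set()
    for r in regions:
        x, y, w, h = r["rect"]
        if r["size"] < min_pixels or w == 0 or h == 0:
            continue  # too small, or a degenerate box
        if w / h > max_aspect or h / w > max_aspect:
            continue  # too elongated
        kept.add((x, y, w, h))  # the set drops duplicate rectangles
    return sorted(kept)


def propose_regions(image):
    """Run Selective Search on an RGB image array and filter the result."""
    import selectivesearch  # third-party: pip install selectivesearch

    # scale, sigma and min_size control the initial segmentation;
    # these particular values are illustrative, not recommendations.
    _, regions = selectivesearch.selective_search(
        image, scale=500, sigma=0.9, min_size=10)
    return filter_regions(regions)
```

The thresholds in filter_regions are arbitrary; in practice you would tune them to the size and shape of the objects you expect.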

[Image: bounding boxes found by Selective Search]

Finally, you classify each region using the CNN.  We pass the sub-images to the Labellio CLI and get the following result.  We set a threshold on the resulting probability to filter out noise and keep only the regions of interest.

[Image: classification result for the candidate regions]
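The classify-and-threshold step could be sketched as follows. The classify argument is a hypothetical callable standing in for the trained classifier; how you actually invoke the Labellio CLI is not shown here. The 0.9 threshold is an arbitrary example value, and the image is assumed to be stored as rows of pixels.

```python
def crop(image, box):
    """Cut box = (x, y, w, h) out of an image stored as rows of pixels."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def detect(image, boxes, classify, threshold=0.9):
    """Classify each candidate box and keep only confident predictions.

    `classify` is a hypothetical callable that returns a dict mapping
    label -> probability for one sub-image.
    """
    detections = []
    for box in boxes:
        scores = classify(crop(image, box))
        label, prob = max(scores.items(), key=lambda kv: kv[1])
        if prob >= threshold:
            detections.append((box, label, prob))
    return detections
```

Regions whose best label falls below the threshold are dropped, which removes most of the background boxes that Selective Search also proposes.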

As you can see, you can easily build localized object recognition using Selective Search and a CNN.  We hope our Selective Search implementation helps you get more out of Labellio.
