First, IxorThink developed a model that lets Spott retrieve duplicate and similar products from its product database.
The model takes nothing more than an image of the product as input and outputs the images and IDs of duplicate or similar products.
IxorThink used dHashes (difference hashes) to retrieve duplicate images, and VGG16 feature extraction with an Annoy index to retrieve similar images.
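To illustrate the duplicate-detection idea, here is a minimal sketch of a difference hash in plain Python. It is not IxorThink's implementation: the function names (`resize_nearest`, `dhash`, `hamming`), the 8x8 hash size, and the nearest-neighbour downsampling are all assumptions made for illustration, and the image is assumed to be already converted to a grayscale list of pixel rows.

```python
def resize_nearest(gray, width, height):
    """Nearest-neighbour downsample of a grayscale image (list of pixel rows)."""
    src_h, src_w = len(gray), len(gray[0])
    return [
        [gray[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def dhash(gray, hash_size=8):
    """Difference hash: one bit per horizontally adjacent pixel pair.

    The image is shrunk to (hash_size + 1) x hash_size pixels, so each row
    yields hash_size comparisons and the hash has hash_size ** 2 bits.
    """
    small = resize_nearest(gray, hash_size + 1, hash_size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            # Bit is 1 when brightness decreases from left to right.
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Hypothetical test images: a left-to-right gradient, a uniformly
# brighter copy of it, and a horizontally mirrored copy.
gradient = [[x * 3 for x in range(64)] for _ in range(48)]
brighter = [[v + 20 for v in row] for row in gradient]
mirrored = [row[::-1] for row in gradient]
```

Because the hash only records whether neighbouring pixels get brighter or darker, `gradient` and `brighter` hash to the same value, while `mirrored` differs in every bit. In practice, two images would be treated as duplicates when the Hamming distance between their hashes falls below a small threshold chosen for the dataset.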
Secondly, IxorThink built a visual fashion recognition model that allows searching for a product in Spott's product database given a street image of that product. The difficulty lies in filtering out the background to recognise the product. IxorThink used Siamese networks and a triplet loss function to develop and train a model for this task.
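The triplet loss mentioned above can be sketched as follows. This is a generic formulation, not the exact loss IxorThink trained with: the squared Euclidean distance, the margin of 1.0, and the function names are assumptions. The anchor would be the embedding of a street image, the positive the embedding of the matching catalogue image, and the negative the embedding of a different product.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss for a single (anchor, positive, negative) triplet.

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin`; otherwise it grows with the violation.
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings, for illustration only.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]
far_negative = [3.0, 0.0]    # already well separated from the anchor
close_negative = [1.0, 0.0]  # as close to the anchor as the positive
```

With these values, `far_negative` gives a loss of zero because it already satisfies the margin, while `close_negative` is penalised. During training, minimising this loss pulls matching street and catalogue images together in embedding space and pushes non-matching products apart.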
Thirdly, IxorThink developed a video tracker. An object within a bounding box is tracked across several video frames. The image inside the bounding box is used as input to the visual fashion recognition model.
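The hand-off from the tracker to the recognition model amounts to cropping each frame to the tracked box. A minimal sketch, assuming the frame is a list of pixel rows and the box is given as `(x, y, width, height)` in pixel coordinates; both conventions and the name `crop_box` are assumptions, not details from the project:

```python
def crop_box(frame, box):
    """Return the part of `frame` inside `box`, clamped to the frame edges."""
    x, y, w, h = box
    height, width = len(frame), len(frame[0])
    # Clamp so a box that drifts partly outside the frame still yields a crop.
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(width, x + w), min(height, y + h)
    return [row[x0:x1] for row in frame[y0:y1]]


# Hypothetical 4x5 frame whose pixel value encodes its position (10 * row + column).
frame = [[10 * r + c for c in range(5)] for r in range(4)]
inside = crop_box(frame, (1, 1, 2, 2))
clamped = crop_box(frame, (3, 2, 10, 10))
```

The crop for each tracked frame would then be passed to the recognition model in place of a full street image.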
Together, these models form a software application that tracks and recognises a product based on an image within a video frame.