Similarity Learning for Dense Label Transfer

June 16, 2018

Here is a bried description of our work Similarity Learning for Dense Label Transfer. This work got second position in CVPR-2018 DAVIS Challenge on Interactive Video Object Segmentation (VOS), and we were awarded Adobe creative license.

In interactive VOS, at test time we are given a set of scribbles, denoting objects of interest, through interactive input. Then, we have to segment the object in all the frames of that video. Based on the performance, we may be provided feedback as scribbles on worst segmentation output frame.

It is a challenging task as the model has to learn dense object representation from very sparse annotation and it should be able to consider scribbles provided as feedback on worst output frame. Also, the model should balance good segmentation accuracy with inference time.

In this work, we introduce a simple and flexible method for video object segmentation based on similarity learning. The proposed method can learn to perform dense label transfer from one image to the other. More specifically, the objective is to learn a similarity metric for dense pixel-wise correspondence between two images. The idea is that for a given object to be segmented, most of the embeddings are correlated to each other, so even few pixels should be sufficient to learn the similarity metric. Also, just few pixel embeddings should be enough to propagate the segmentation label to the other images.

This learned model can then be used in a label transfer framework to propagate object annotations from a reference frame to all the other frames in a video. Unlike previous methods, our similarity learning approach works fairly well across various domains, even when no domain adaptation is involved.

Using the proposed approach, we achieved the second place in the first DAVIS challenge for interactive video object segmentation, in both quality and speed tracks.


Dev


Copyright 2015 Viveka Kulharia. Last update (ET): 2020-11-03 11:24:14 +0000. Atom Feed