Probabilistic fusion of stereo with color and contrast for bi-layer segmentation

Vladimir Kolmogorov, Antonio Criminisi, Andrew Blake, Geoffrey Cross and Carsten Rother.

In IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 28(9):1480-1492, September 2006.
Preliminary version ("Bi-layer Segmentation of Binocular Stereo Video") appeared in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2005 (best paper honorable mention award).


This paper describes models and algorithms for the real-time segmentation of foreground from background layers in stereo video sequences. Automatic separation of layers from color/contrast or from stereo alone is known to be error-prone. Here, color, contrast and stereo matching information are fused to infer layers accurately and efficiently. The first algorithm, Layered Dynamic Programming (LDP), solves stereo in an extended 6-state space that represents both foreground/background layers and occluded regions. The stereo-match likelihood is then fused with a contrast-sensitive color model that is learned on the fly, and stereo disparities are obtained by dynamic programming. The second algorithm, Layered Graph Cut (LGC), does not directly solve stereo. Instead, the stereo-match likelihood is marginalized over disparities to evaluate foreground and background hypotheses, and then fused with a contrast-sensitive color model like the one used in LDP. Segmentation is solved efficiently by ternary graph cut.
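
The marginalization step used by LGC can be illustrated with a minimal sketch for a single pixel. This is not the paper's implementation. It assumes, purely for illustration, that the match likelihood at each disparity is exp(-cost), that the foreground occupies all disparities at or above a threshold, and that the prior over disparities is uniform within each layer. The names `log_mean_exp`, `layer_log_likelihoods`, `match_costs` and `fg_min_disparity` are hypothetical.

```python
import math


def log_mean_exp(log_values):
    """Return log(mean(exp(v))) computed stably via the log-sum-exp trick."""
    peak = max(log_values)
    total = sum(math.exp(v - peak) for v in log_values)
    return peak + math.log(total / len(log_values))


def layer_log_likelihoods(match_costs, fg_min_disparity):
    """Marginalize a per-disparity match likelihood over each layer.

    match_costs[d] is an assumed matching cost at disparity d for one pixel;
    the likelihood at disparity d is taken to be exp(-match_costs[d]).
    Disparities >= fg_min_disparity are assigned to the foreground (assumption),
    with a uniform disparity prior inside each layer (assumption).
    """
    log_lik = [-cost for cost in match_costs]
    background = log_lik[:fg_min_disparity]
    foreground = log_lik[fg_min_disparity:]
    if not background or not foreground:
        raise ValueError("both layers need at least one disparity")
    return log_mean_exp(foreground), log_mean_exp(background)


def stereo_log_ratio(match_costs, fg_min_disparity):
    """Positive values favor foreground, negative values favor background."""
    log_fg, log_bg = layer_log_likelihoods(match_costs, fg_min_disparity)
    return log_fg - log_bg
```

For example, `stereo_log_ratio([5.0, 5.0, 0.0, 5.0], 2)` is positive because the best match lies in the assumed foreground disparity range. In a full system, a term of this kind would be combined with the color and contrast terms before the graph cut; that combination is not shown here.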

Both algorithms are evaluated with respect to ground truth data and found to have similar performance, substantially better than stereo or color/contrast alone. However, they differ considerably in computational efficiency. The algorithms are demonstrated in the application of background substitution and shown to give good-quality composite video output.


PAMI version: [.pdf]
CVPR version: [.pdf]