In recent years, graph cuts have emerged as a powerful optimization technique
for minimizing energy functions that arise in low-level vision problems. Graph
cuts avoid the problems of local minima inherent in other approaches (such as
gradient descent). The goal of this thesis is to apply graph cuts to a
classical computer vision problem --- scene reconstruction from multiple
views, i.e. computing the 3-dimensional shape of the scene.
This thesis provides a technical result which greatly facilitates the derivation of the scene reconstruction algorithm. Our result should also be useful for developing other energy minimization algorithms based on graph cuts. Previously such algorithms explicitly constructed graphs where a minimum cut also minimizes the appropriate energy. It is natural to ask for what energy functions we can construct such a graph. We answer this question for the class of functions of binary variables that can be written as a sum of terms containing three or fewer variables. We give a simple criterion for functions in this class which is necessary and sufficient, as well as a necessary condition for any function of binary variables. We also give a general-purpose graph construction for functions that satisfy our criterion.
Using this result, we develop several algorithms for the scene reconstruction problem. We formulate this problem in an energy minimization framework. The energy that we minimize encodes all the constraints of the problem: it treats the input images symmetrically, handles visibility properly, and imposes spatial smoothness while preserving discontinuities. We discuss several smoothness terms that we can use. Experimental results show that our algorithms for two views are among the top performers, and that the accuracy of the reconstruction is greatly improved when more views are used.
In the last chapter of the thesis we address the issue of the efficiency of min-cut/max-flow algorithms for computer vision applications. We compare the running times of several standard algorithms, as well as a new algorithm that we have recently developed. We benchmark these algorithms on a number of typical graphs arising in vision. In many cases our new algorithm works several times faster than other methods.