Energy minimization with Graph Cuts

Graph Cuts can be used to conveniently minimize energies often used in computer vision. This page is a quick summary of Boykov, Veksler, and Zabih paper "Fast Approximate Energy Minimization via Graph Cuts". Energies that can be minimized are described, then two minimization algorithms are summarized, alpha-expansion and alpha-beta swap, and finally two practical examples are shown.

Energy that can be minimized

Graph Cuts finds the optimal solution to a binary problem. However when each pixel can be assigned many labels, finding the solution can be computationally expensive. For the following type of energy, a serie of graph cuts can be used to find a convenient local minimum:

$\begin{eqnarray*} E(f) & = & \sum_{p \in \mathcal{P}} D_p\left(i_p,f_p\right) + \sum_{p,q \in \mathcal{N}} V_{p,q}\left(f_p,f_q\right) \end{eqnarray*}$

The first term is known as the data term. It ensures that the current labeling is coherent with the observed data . It penalizes a label to pixel if it is too different with the observed data .

The second term is known as the smooth term. It ensures that the overal labeling is smooth. It penalizes two neighboring labels and if they are too different.

On this smoothing term, there are 3 constraints:

$V(\alpha,\beta) = 0 \Leftrightarrow \alpha = \beta$ , or
$V(\alpha,\beta) \neq 0 \Leftrightarrow \alpha \neq \beta$
$V(\alpha,\beta) = V(\beta,\alpha) \geq 0$
$V(\alpha,\beta) \leq V(\alpha,\gamma) + V(\gamma,\beta)$

The first two terms tell that an energy between two different labels $\alpha$ and $\beta$ should be non-zero. If it is zero, that means the two labels are the same.

The last term defines the triangle rule. A shortcut is always cheaper or similar than taking the whole path.

If the smooth term only satisfies the first two terms, it is said a semi-metric term. If the last term is also satisfied, it is said a metric term.

The alpha-expansion algorithm can only be used with metric term. Otherwise, the alpha-beta swap can be used with semi-metric and metric term.

Minimization algorithm

As just stated, the alpha-expansion algorithm can only be used when the smooth term is metric. If it is otherwise semi-metric, the alpha-beta swap algorithm will be used.

Whenever the alpha-expansion algorithm is used, it is guaranteed that the local minimum is within a known factor of the global minimum. This factor is:

$\begin{eqnarray*} 2c & = & 2 \max_{p,q \in \mathcal{N}} \frac{\max_{\alpha\neq\b... ...,\beta\right)}{\min_{\alpha\neq\beta}V\left(\alpha,\beta\right)} \end{eqnarray*}$

By definition this factor can be as small as 2. That means if the maximum possible smooth term is $V(\alpha,\beta)=1$ and the miniminum possible is $V(\alpha,\beta)=0$ , the local minimum that the alpha-expansion algorithm will find is at most twice the global minimum $E^\ast$ .

The alpha-beta swap does not guarantee any closeness to the global minimum.

Alpha-Expansion

The main idea of the alpha-expansion algorithm is to successively segment all $\alpha$ and non- $\alpha$ pixels with graph cuts and the algorithm will change the value of $\alpha$ at each iteration. The algorithm will iterate through each possible label for $\alpha$ until it converges.

At each iteration, the $\alpha$ region $\mathcal{P}_\alpha$ can only expand. This changes somehow the way to set the graph weights. Also when two neighboring nodes does not currently have the same label, an intermediate node is inserted and links are weighted so they are relative to the distance to the $\alpha$ label.

Figure 1: Alpha-expansion graph setting

$w(\alpha,p)$	=	$D(\alpha)$
$w(\bar{\alpha},p)$	=	, if $p \notin \mathcal{P}_\alpha$
$w(\bar{\alpha},p)$	=	$\infty$ , if $p \in \mathcal{P}_\alpha$
	=	$V(f_p,\alpha)$ , and similar for
$w(a,\bar{\alpha})$	=
	=	$V(f_p,\alpha)$ , when
		$p\in\alpha$ if the link $p - \alpha$ is cut
		$p\in\bar{\alpha}$ if the link $p - \bar{\alpha}$ is cut

This algorithm is shown to be a slightly faster than the following algorithm.

Alpha-Beta Swap

The main idea of the alpha-beta swap algorithm is to successively segment all $\alpha$ pixels from $\beta$ pixels with graph cuts and the algorithm will change the $\alpha-\beta$ combination at each iteration. The algorithm will iterate through each possible combination until it converges.

Within an iteration the graph is constructed in a normal way so it can segment efficiently between the $\alpha$ region and the $\beta$ region. Special care must be taken with nodes that are neither in the $\alpha$ nor in the $\beta$ region. That means, for a pixel, the terminal link weight is the data term plus the sum of all links to neighbors which are neither in the $\alpha$ nor in the $\beta$ region.

Figure 2: Alpha-Beta swap graph setting

$w(p,\alpha)$	=	$D(\alpha) + \sum_{\begin{array}{c}q\in\mathcal{N}_p\ q\notin\mathcal{P}_\alpha\ q\notin\mathcal{P}_\beta\end{array}} V(f_q,\alpha)$ , for $p \in \mathcal{P}_\alpha$ or $p\in\mathcal{P}_\beta$
$w(p,\beta$	=	$D(\beta) + \sum_{\begin{array}{c}q\in\mathcal{N}_p\ q\notin\mathcal{P}_\alpha\ q\notin\mathcal{P}_\beta\end{array}} V(f_q,\beta)$ , for $p \in \mathcal{P}_\alpha$ or $p\in\mathcal{P}_\beta$
	=	$V(\alpha,\beta)$
		$p\in\alpha$ if the link $p - \alpha$ is cut
		$p\in\beta$ if the link $p - \beta$ is cut

Image restoration

A quick implementation of the alpha-expansion algorithm allows image restoration. Here an image with embedded squares of intensity values 255, 191, 128, 64. Noise were added to the original image so intensity are $i'=i \pm 10$ . Possible labels are all integers between 0 and 255. The algorithm will perform $\alpha - \bar{\alpha}$ segmentations until it converges.

The data term used here is a simple squared difference $D(i_p,f_p) = \left(i_p-f_p\right)^2$ . The smoothing term used here is the Potts model $V(f_p,f_q) = \lambda T(f_p \neq f_q)$ , where $T(\mathrm{expr})=1$ if $\mathrm{expr}$ is true, or zero otherwise. Doing some graph analysing, it turned out that using $\lambda=2331$ will make internal link weights competitive with the terminal links.

Figure 3: Restoration using the alpha-expansion algorithm


Observed image	Restored image at convergence

Here only a few seconds is needed for the algorithm to converge.

Deformation map

Recovering the deformation map could be useful in a registration framework. Here again the alpha-expansion could be used. An original image is deformed. The upper part is moved right and the bottom part of the image is moved left. The resulting image is .

Here the data term used is the squared difference between the pixel intensity and , where is the translation of pixel from its original position. The smooth term used here is again the Potts model with $\lambda=25$ . Possible labels are all possible translation of $\pm 15$ horizontal pixels and $\pm 3$ vertical pixels.

Figure 4: Knowing two images, before and after deformation, the alpha-expansion algorithm is used to recover the deformation between the two images


Real deformation applied to original image	Computed horizontal deformation map

Here the real deformation (on left of last picture) applied is unknown. The algorithm runed within a couple of minutes until convergence. Here many pixels have been incorrectly labeled, but the overal deformation map shows indeed that the upper part moved right (bright means it moved right), and the bottom part moved left (dark means it moved left).

Herve Lombaert 2006-11-29