Seam carving

Examples - Modern

Last edited: March 26. 2021


You are faced with the following problem: Given an image with a considerable amount unimportant space. You want to resize the image in such a way that the contents of the image are preserved while only the unimportant pieces are removed. There are several situations in which this may be a relevant issue to consider. For instance, if you are making a web page you will most likely want the content on the page to have roughly the same size as the window of the browser is reduced. Reducing the window size should (at least up to some critical point) only reduce empty space and not distort the content within it. Another situation where this is relevant is in image sharing and storing. If you have limited storing space or the medium you are to post the image on only accepts images of certain dimensions, it might be relevant to have a method for resizing the image without distorting the proportions of its contents.

Seam carving is an algorithm that does exactly this; it is a content-aware image-resizer. Although the problem itself seems neither correlated to physics nor programming, there are in fact beautiful analogues to draw from this algorithm to physical systems and it involves the use of dynamic programming.

(Much of the implementation is based on what is presented in this lecture on Seam Carving given in a course on Computational Thinking at MIT)

Description of the algorithm

Here we will outline the fundamental algorithm we are to implement for resizing the image. The terms in italic will be defined more precisely in the following sections.

  • Choose an image you want to rescale in the horizontal direction.
  • Choose the number of pixels you want to rescale the image by, $p$.
  • Repeat the following $p$ times:
    • Calculate the energy of each pixel in the image.
    • Find the seam from the top to the bottom of the image with the smallest energy.
    • Remove the seam.

Intuitively we seek in each iteration a path connecting the top and bottom of the image such that as little information as possible is contained in the pixels constituting the path.

The energy of an image

In principle, the main task of the problem is to find out which pixels of an image can be removed with the smallest amount of loss possible. It is therefore convenient to associate an energy function that measures the information content of each pixel.

As a starter, let's introduce some notation and definitions to be able to make our ideas more precise. Denote by $\mathbf{I}$ an $n\times m$ image. Intuitively, we think of the image as essentially a $n\times m$ array of scalar values representing the brightness of each pixel of the image. However, in our discussion we are going to benefit from defining the image as a mapping \begin{align} \mathbf{I} : [1,\dots, n] \times [1,\dots,m] &\to \mathbb{R} \label{eq:img}\\ (i,j) &\mapsto \mathrm{Brightness}(i,j), \nonumber \end{align} assigning to each of the pixels of the image a level of brightness. We define the brightness of a pixel as the mean value of its red, green and blue-content, with $1.0$ being the highest value.

In [2]:
# importing useful packages
using  Images, LinearAlgebra, ImageFiltering, Statistics, ImageView, Plots

# loading an image
img_path = "./wave.jpg"
image = load(img_path)
Out[2]: