
Computer Vision Project: Implementing Intelligent Scissors

For a project in my computer vision class, I was tasked with implementing Intelligent Scissors. The details of the algorithm are described in the paper below:

http://www.cs.cornell.edu/courses/cs4670/2012fa/readings/mort-sigg95.pdf

The algorithm dates back to 1995 and has since been incorporated into most advanced image-editing software. Basically, the user clicks a point on the image, and a line path appears that traces from that point to the current mouse position. The path is intelligent in the sense that it follows only significant edges, so that when cropping it will, for the most part, cleanly separate objects.

The bare-bones structure of the code was provided, along with the GUI. What I had to implement were the functions that make Intelligent Scissors possible.

The Procedure

Each pixel in an image is interpreted as a node with 8 neighboring nodes:

Each link connects the center node to one of its neighbors. The intensity values of the neighboring pixels are used to calculate the “cost” of the link from the center node to that neighbor. This is computed through the following equations:

D(link 1) = |img(i+1, j) - img(i, j-1)| / sqrt(2)

D(link 0) = |(img(i, j-1) + img(i+1, j-1))/2 - (img(i, j+1) + img(i+1, j+1))/2| / 2

cost(link) = (maxD - D(link)) * length(link)

The first equation shows how the magnitude of the intensity difference is computed across diagonal links (link 1 in this example). The second shows the horizontal link calculation (link 0). I do this for all the links, and for all RGB channels. In the end, I combine the per-channel values into a single magnitude as such:

D(link) = sqrt( (DR(link)*DR(link)+DG(link)*DG(link)+DB(link)*DB(link))/3 )

Finally, I plug everything into the third equation, where maxD is the largest D(link) value over the whole image and length(link) is sqrt(2) or 1 depending on whether the link is diagonal or not. To apply this calculation efficiently, I use image filters to compute the costs, which is done with cross-correlation.
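To make the formulas concrete, here is a minimal per-pixel sketch of these equations, assuming a NumPy image indexed as img[i, j] with three color channels. This is for illustration only (the function name link_costs_at is hypothetical); the project itself computes the same quantities with cross-correlation kernels.

```python
import numpy as np

def link_costs_at(img, i, j, maxD):
    """Per-pixel version of the link-cost equations above (illustrative only).
    `img` is an H x W x 3 float array; the neighborhood is assumed in bounds."""
    # Diagonal link 1: difference of the two pixels straddling the link
    D1 = np.abs(img[i + 1, j] - img[i, j - 1]) / np.sqrt(2)            # per channel
    # Horizontal link 0: difference of the averaged pixel pairs on each side
    D0 = np.abs((img[i, j - 1] + img[i + 1, j - 1]) / 2
                - (img[i, j + 1] + img[i + 1, j + 1]) / 2) / 2          # per channel
    # Combine the RGB channels into one magnitude per link
    D0 = np.sqrt((D0 ** 2).sum() / 3)
    D1 = np.sqrt((D1 ** 2).sum() / 3)
    # Turn derivative magnitudes into link costs: strong edges -> low cost
    cost0 = (maxD - D0) * 1.0            # horizontal/vertical links have length 1
    cost1 = (maxD - D1) * np.sqrt(2)     # diagonal links have length sqrt(2)
    return cost0, cost1
```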

Thus, I first implemented the ability to apply filters to images. This consists of two functions:

  • pixel_filter
  • image_filter

pixel_filter does the above cost calculation one pixel at a time, and it can only take in one kernel at a time. It loops through each kernel value and sums up the products of the kernel values with the corresponding pixel values. Additionally, a scale and an offset are taken into account. What is returned is an RGB pixel holding the result.
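As a rough sketch (not the actual project code), a pixel_filter along these lines might look like the following; the exact scale/offset convention (divide by scale, then add offset) is an assumption.

```python
import numpy as np

def pixel_filter(img, i, j, kernel, scale=1.0, offset=0.0):
    """Cross-correlate a single kernel with the neighborhood centered on pixel
    (i, j) and return the resulting RGB value."""
    kh, kw = kernel.shape
    result = np.zeros(3)
    for ki in range(kh):
        for kj in range(kw):
            r = i + ki - kh // 2          # image row lined up with this kernel entry
            c = j + kj - kw // 2          # image column (assumed in bounds)
            result += kernel[ki, kj] * img[r, c]
    return result / scale + offset        # scale/offset convention is an assumption
```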

image_filter applies pixel_filter to the entire image and updates each pixel of the result image with the value returned by pixel_filter. A selection parameter allows image_filter to apply the filter only to certain pixels. One additional thing I did to prevent the darkening of the edges is to add a black border around origImg. This border is 2 pixels wide because that is the minimum required for a 5x5 kernel. To do this, I first created a new black image with origImg's dimensions plus 4 in width and height, then inserted the original image into the black image.
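Again as a hedged sketch built on the pixel_filter above, an image_filter could pad with a 2-pixel black border and honor a boolean selection mask roughly like this:

```python
import numpy as np

def image_filter(img, kernel, scale=1.0, offset=0.0, selection=None):
    """Apply pixel_filter to every (selected) pixel, using a 2-pixel black
    border so a kernel up to 5x5 never reads outside the image."""
    h, w, _ = img.shape
    padded = np.zeros((h + 4, w + 4, 3), dtype=float)   # black image, 2 px larger per side
    padded[2:2 + h, 2:2 + w] = img                      # paste the original into the middle
    out = np.array(img, dtype=float)
    for i in range(h):
        for j in range(w):
            if selection is not None and not selection[i, j]:
                continue                                # leave unselected pixels untouched
            out[i, j] = pixel_filter(padded, i + 2, j + 2, kernel, scale, offset)
    return out
```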

As extra credit, I implemented the ability for the user to selectively apply filters to an area with a brush. This is useful because it allows the user to apply the live wire only over the edge. To do this, I made the mouse position the top-left corner of a square; that square becomes the selection that is saved to the selection parameter of the filter functions. The provided brushSelections[] array made this very easy to implement.
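A minimal sketch of that brush selection, assuming the selection is stored as a boolean mask (the name brush_select is hypothetical and stands in for the provided brushSelections[] machinery):

```python
def brush_select(selection, mouse_i, mouse_j, brush_size):
    """Mark a brush_size x brush_size square, whose top-left corner is the
    mouse position, in the boolean selection mask."""
    h, w = selection.shape
    selection[mouse_i:min(mouse_i + brush_size, h),
              mouse_j:min(mouse_j + brush_size, w)] = True
    return selection
```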

With the filters implemented, I could apply a kernel of any size to the image. An example blurring filter is applied below:

The kernel is a 3x3 box of ones, divided by 9:

1 1 1
1 1 1
1 1 1

Normal:

After applying the blur filter 3 times:
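For reference, repeating that 3x3 box blur with the image_filter sketch above might look like this (img is assumed to be a NumPy RGB image loaded elsewhere):

```python
import numpy as np

box = np.ones((3, 3))                 # 3x3 blur kernel: all ones, divided by 9 via scale
blurred = img                         # img: the original NumPy RGB image (assumed given)
for _ in range(3):                    # blur applied three times, as in the image above
    blurred = image_filter(blurred, box, scale=9.0)
```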

Next, I implemented the function InitNodeBuf. This function uses image_filter to apply the link-cost kernels mentioned above to the image. I iterate through all the pixels once to find the maxD value, then loop again, this time using maxD to compute the cost of each link. The link costs, row, and column values are then stored in an array of struct Node with imgwidth*imgheight elements.
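A sketch of what InitNodeBuf might look like, assuming the filtered D(link) values are already available as an (H, W, 8) array and that even-numbered links are axis-aligned while odd-numbered links are diagonal (as in the link 0 / link 1 example above); a plain dictionary stands in for struct Node:

```python
import numpy as np

def init_node_buf(D):
    """Fill the node buffer from an (H, W, 8) array of D(link) values.
    Assumes even links have length 1 and odd links have length sqrt(2)."""
    h, w, _ = D.shape
    length = np.array([1.0, np.sqrt(2)] * 4)     # per-link length factors
    maxD = D.max()                               # first pass: the largest D(link) value
    nodes = []
    for i in range(h):                           # second pass: compute every link cost
        for j in range(w):
            nodes.append({"row": i, "col": j,
                          "linkCost": (maxD - D[i, j]) * length})
    return nodes                                 # imgwidth * imgheight elements
```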

Now that all the nodes are initialized, I can implement the intelligent scissors algorithm:

Begin
    initialize the priority queue pq to be empty;
    initialize each node to the INITIAL state;
    set the total cost of seed to be zero;
    insert seed into pq;
    while pq is not empty
        extract the node q with the minimum total cost in pq;
        mark q as EXPANDED;
        for each neighbor node r of q
            if r has not been EXPANDED
                if r is still INITIAL
                    insert r in pq with the sum of the total cost of q and the link cost from q to r as its total cost;
                    mark r as ACTIVE;
                else if r is ACTIVE, i.e., already in the pq
                    if the sum of the total cost of q and the link cost between q and r is less than the total cost of r
                        update the total cost of r in pq;
End

The following is an image of what results from the above algorithm. When the user sets a seed, the minimum-cost path from every pixel in the image back to the seed pixel is computed. This is important because our last function, MinimumPath, traces from the mouse back to the seed using this data.

The algorithm relies on a priority queue that orders the nodes by total cost, so extracting the head of the queue always yields the node with the minimum total cost.
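Here is a minimal Python sketch of that expansion using heapq as the priority queue; since heapq has no decrease-key operation, the cost update is emulated by pushing a fresh entry and skipping stale ones. The neighbors and link_cost callables, and the use of hashable node indices, are assumptions for illustration.

```python
import heapq

INITIAL, ACTIVE, EXPANDED = 0, 1, 2

def compute_paths(nodes, seed, neighbors, link_cost):
    """Expand outward from the seed, Dijkstra-style. `nodes` is an iterable of
    hashable node indices, `neighbors(q)` yields the up-to-8 neighbors of q,
    and `link_cost(q, r)` returns the precomputed cost of the link from q to r."""
    state = {q: INITIAL for q in nodes}
    total = {seed: 0.0}
    prev = {seed: None}                          # back-pointers used by MinimumPath
    pq = [(0.0, seed)]
    while pq:
        cost_q, q = heapq.heappop(pq)            # node with the minimum total cost
        if state[q] == EXPANDED:
            continue                             # stale entry left behind by an update
        state[q] = EXPANDED
        for r in neighbors(q):
            if state[r] == EXPANDED:
                continue
            new_cost = cost_q + link_cost(q, r)
            if state[r] == INITIAL or new_cost < total[r]:
                total[r] = new_cost
                prev[r] = q
                state[r] = ACTIVE
                heapq.heappush(pq, (new_cost, r))  # no decrease-key: push a new entry
    return prev, total
```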

The last function, MinimumPath, takes the mouse position as input and constructs a linked list representing the minimum path. The initial head of the linked list is the pixel at the mouse location, and by accessing that pixel's node data, we can follow the pointer to the previous node on the path. Each node is added to the head of the linked list until the prevNode pointer becomes NULL, signifying that the seed has been reached.
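A sketch of that trace-back, assuming the prev back-pointer map produced by the expansion sketch above:

```python
def minimum_path(prev, mouse_node):
    """Walk the prev (prevNode) pointers from the pixel under the mouse back to
    the seed, prepending each node so the list runs seed -> mouse."""
    path = []
    node = mouse_node
    while node is not None:            # prev[seed] is None, so the loop stops at the seed
        path.insert(0, node)           # add to the head of the "linked list"
        node = prev[node]
    return path
```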

With all these functions implemented, the program can now accurately trace and crop objects.

Example mask and usage:

Artifact

Finally, I needed to use this tool to create an artifact: a composite image. I start with the following source images:

Gangnam style pic

Original horse in prairie

Next, I crop out each important object:

cropped horse

I use these masks to crop out the objects above:

horse mask

Then I create an inverted horse mask to get only the background of the horse image:


background only

Finally, I combine all the images and do a little cropping for overlaps to get the result:
