Michael Wu (myw9)

CS 5670: Computer Vision

September 24, 2012

Project 2: Feature Detection and Matching

I. Design

Feature Detection

Feature detection was performed by computing Harris values c(H) from 2x2 weighted Harris matrices H. To provide invariance to rotation, a 5x5 Gaussian mask was used to weight the contribution of neighboring pixels to each Harris matrix. For computational efficiency, Harris values were estimated using the determinant and trace of the Harris matrix, which avoids an explicit eigenvalue computation. When selecting the pixel locations that represent features, a simple dynamic thresholding method was used to choose an appropriate Harris value threshold. The mean (µ) and standard deviation (σ) of the Harris values were calculated across the image. Assuming the values follow a roughly Gaussian distribution, an initial threshold of µ + 2σ was chosen, so that approximately 2.3% of the pixels exceed the threshold and become feature candidates.
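The thresholding step can be sketched as follows. This is a minimal illustration, not the project code: it assumes c(H) = det(H) / trace(H), which is one common way to combine the determinant and trace (the report does not state the exact formula), and the function names are hypothetical.

```python
from statistics import mean, pstdev

def harris_value(a, b, c, eps=1e-12):
    """Corner strength of the symmetric matrix H = [[a, b], [b, c]].

    Assumption: c(H) = det(H) / trace(H). The report only says that the
    determinant and trace are used. eps guards against a zero trace.
    """
    det = a * c - b * b
    trace = a + c
    return det / (trace + eps)

def initial_threshold(values, k=2.0):
    """Return mu + k * sigma over a flat list of Harris values."""
    return mean(values) + k * pstdev(values)
```

For example, initial_threshold([0, 0, 4, 4]) gives µ = 2 and σ = 2, so the starting threshold is 6.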

 

gaussian_dist.png

Figure 1. Gaussian Distribution
(http://en.wikipedia.org/wiki/Normal_distribution)

 

Using this initial threshold, local maxima were identified within a 5x5 window and used as feature locations. If the number of detected features was less than the MIN_FEATURES constant, the threshold was decreased incrementally until the condition was met.
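A rough sketch of the local-maximum selection with the dynamic threshold is given below. The fixed decrement step and the iteration cap are assumptions made for illustration, since the report does not say how the threshold is lowered; min_features stands in for the MIN_FEATURES constant.

```python
def local_maxima(harris, threshold, radius=2):
    """Return (row, col) positions whose value exceeds the threshold and is
    the maximum of the surrounding window (5x5 when radius is 2)."""
    rows, cols = len(harris), len(harris[0])
    found = []
    for r in range(rows):
        for c in range(cols):
            v = harris[r][c]
            if v <= threshold:
                continue
            # Window is clipped at the image border.
            window = [harris[rr][cc]
                      for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                      for cc in range(max(0, c - radius), min(cols, c + radius + 1))]
            if v >= max(window):
                found.append((r, c))
    return found

def detect(harris, threshold, min_features, step=0.1, max_iters=100):
    """Lower the threshold by a fixed step (an assumed rule) until at least
    min_features local maxima are found or max_iters is reached."""
    features = local_maxima(harris, threshold)
    iters = 0
    while len(features) < min_features and iters < max_iters:
        threshold -= step
        features = local_maxima(harris, threshold)
        iters += 1
    return features, threshold
```

Because ties are accepted by the `v >= max(window)` check, a flat plateau above the threshold would produce several adjacent features; a stricter comparison could be used if that matters.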

 

The pixel location and angle are recorded for each detected feature. To determine the angle (orientation) of a feature, the image was first low-pass filtered with a 7x7 Gaussian kernel, which reduces noise that could drastically change the gradient at a single location. After pre-filtering, 3x3 Sobel filters were applied to obtain the gradients in the x and y directions. The angle is then calculated as the arctangent of the y-gradient divided by the x-gradient.
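The orientation computation can be illustrated with a short sketch. It assumes the input image has already been blurred and is stored as a list of rows, and it uses atan2 instead of a plain arctangent of the quotient so that a zero x-gradient does not cause a division by zero; whether the project code did the same is not stated.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def gradient_angle(image, r, c):
    """Angle in radians of the Sobel gradient at interior pixel (r, c).

    image is assumed to be pre-filtered (low-passed) already.
    """
    gx = gy = 0.0
    for i in range(3):
        for j in range(3):
            p = image[r + i - 1][c + j - 1]
            gx += SOBEL_X[i][j] * p
            gy += SOBEL_Y[i][j] * p
    # atan2 handles gx == 0 and keeps the full [-pi, pi] range.
    return math.atan2(gy, gx)
```

On an image whose intensity increases left to right the angle is 0, and on one that increases top to bottom it is π/2.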

Feature Description

Two feature descriptors were implemented: (1) a simple 5x5 square window descriptor (without orientation) and (2) a simplified MOPS descriptor. In the previous feature detection stage, pixel locations within 20 pixels of the image boundary are ignored and not used as features. This prevents descriptors from having varying lengths, which would otherwise cause issues during matching.

 

The simplified MOPS descriptor sub-samples an 8x8 patch from a 41x41 pixel region around the feature using a low-passed image (7x7 Gaussian kernel). Low-pass filtering the image prior to sub-sampling prevents aliasing in the reduced image. The 8x8 patch is oriented by mapping each patch coordinate through the inverse of the rotation transform and applying bilinear interpolation to estimate the pixel value at the resulting non-integer location.

 

bilinear_interpolation.png

Figure 2. Bilinear Interpolation
(http://en.wikipedia.org/wiki/Bilinear_interpolation)
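The oriented sub-sampling can be sketched as below. This is an illustration under stated assumptions: the 5-pixel sample spacing, the sign convention of the rotation, and the function names are hypothetical, and the caller is expected to keep every sample point at least one pixel inside the image.

```python
import math

def bilinear(image, x, y):
    """Interpolate image at real-valued (x, y); x is the column, y the row."""
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * image[y0][x0] + fx * image[y0][x0 + 1]
    bottom = (1 - fx) * image[y0 + 1][x0] + fx * image[y0 + 1][x0 + 1]
    return (1 - fy) * top + fy * bottom

def sample_patch(image, cx, cy, angle, size=8, spacing=5.0):
    """Sample a size x size patch centered on (cx, cy), rotated by angle.

    Each patch coordinate (u, v), measured from the feature location, is
    rotated into image coordinates and read with bilinear interpolation.
    spacing=5.0 is an assumed step that keeps an unrotated 8x8 grid
    inside a 41x41 region.
    """
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    half = (size - 1) / 2.0
    patch = []
    for i in range(size):
        v = (i - half) * spacing
        row = []
        for j in range(size):
            u = (j - half) * spacing
            x = cx + u * cos_a - v * sin_a
            y = cy + u * sin_a + v * cos_a
            row.append(bilinear(image, x, y))
        patch.append(row)
    return patch
```

Iterating over destination pixels and looking up source positions (rather than pushing source pixels forward) guarantees that every patch entry receives exactly one value.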

 

*Note: The rotation is applied after centering the origin on the feature location. After rotation and interpolation, the 8x8 patch is normalized to have a zero mean and unit variance. The normalization provides invariance to affine intensity changes. 
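The normalization step is small enough to show directly. The sketch below works on a flattened patch; returning all zeros for a constant patch is an assumption made here to avoid dividing by zero.

```python
from statistics import mean, pstdev

def normalize_patch(values, eps=1e-8):
    """Shift and scale a flat list of pixel values to zero mean, unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma < eps:
        # Constant patch: no contrast to normalize (assumed handling).
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]
```

If every pixel is replaced by a·p + b with a > 0, the mean and standard deviation change by the same transform, so the normalized values are unchanged. That is the affine intensity invariance mentioned above.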

Feature Matching

The “ratio test” was implemented as the matching score: the score equals the distance from the current feature to its best match in the second image divided by the distance from the current feature to its second-best match in that image. This ratio lies between 0 and 1, where values near 0 indicate a strong, distinctive match and values near 1 indicate an ambiguous match.
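A minimal sketch of the ratio score follows. It assumes the distance is the sum of squared differences (SSD) between descriptors and that the second image provides at least two candidate descriptors; the function names are hypothetical.

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def ratio_match(desc, candidates):
    """Return (index of the best candidate, best / second-best distance).

    Requires at least two candidates. If the second-best distance is zero,
    the match is treated as fully ambiguous (ratio 1.0), an assumed convention.
    """
    dists = sorted((ssd(desc, c), i) for i, c in enumerate(candidates))
    (best, best_i), (second, _) = dists[0], dists[1]
    ratio = best / second if second > 0 else 1.0
    return best_i, ratio
```

For instance, with distances 1, 25, and 4 to three candidates, the best match is the first candidate and the score is 1/4 = 0.25.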

II. Evaluation

Plots

graf_roc.png

Figure 3. “Graf” Results

 

“Graf” AUC Summary

Descriptor/Matching Method    Area Under Curve
Simple + SSD                  0.673416
Simple + Ratio Test           0.708075
MOPS + SSD                    0.900277
MOPS + Ratio Test             0.928643
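For reference, an area under an ROC curve can be approximated from sampled (false positive rate, true positive rate) points with the trapezoidal rule. The sketch below is a generic illustration and is not necessarily how the AUC values in this report were produced.

```python
def roc_auc(points):
    """Trapezoidal area under a list of (fpr, tpr) points."""
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

A diagonal curve from (0, 0) to (1, 1) gives 0.5, which corresponds to chance-level matching, while a perfect matcher gives 1.0.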

 

yose_roc.png

Figure 4. “Yosemite” Results

 

“Yosemite” AUC Summary

Descriptor/Matching Method    Area Under Curve
Simple + SSD                  0.902653
Simple + Ratio Test           0.888056
MOPS + SSD                    0.963887
MOPS + Ratio Test             0.976831

 

 

harris_graf.png

 

Figure 5. Harris Values of “Graf” img1.ppm (harris.tga)

 

 

harris_yosemite.png

 

Figure 6. Harris Values of “Yosemite” Yosemite1.jpg (harris.tga)

 

Benchmark Results

Average AUC (Leuven)

Matching \ Descriptor    5x5 Window    MOPS
SSD                      0.357021      0.702188
Ratio Test               0.557175      0.727467

                                                                       
Average AUC (Bikes)

Matching \ Descriptor    5x5 Window    MOPS
SSD                      0.356457      0.645929
Ratio Test               0.516427      0.649424

 

Average AUC (Wall)

Matching \ Descriptor    5x5 Window    MOPS
SSD                      0.397541      0.682856
Ratio Test               0.564342      0.661867

                                                                       

Strengths and Weaknesses

By design, the simplified MOPS descriptor combined with Harris corner detection is relatively invariant to translation (using small patches), rotation (using oriented patches), and affine intensity changes (normalizing pixel values). However, the implementation was not designed to be invariant to changes in image scale. Other weaknesses include (1) limited feature detection near image boundaries (due to the 20-pixel cut-off mentioned in the design section) and (2) no invariance to 3D transformations.

III. Additional Images

 

FPGA Features - Harris Detection, Simplified MOPS Descriptor, and Ratio Test Matching

fpga.jpg