In this project I make use of homographies to warp images such that transforms between images are projective. This is a technique often used to create panoramas.
I took two images of the campanile standing in the same spot but slightly turning my phone for the second photo. I also took several photos of rectanuglar objects in my room to rectify as well as some other scenes that I anticipate mosaicing.
Campanile Image 1
Campanile Image 2
Room Image 1
Room Image 2
Landscape Image 1
Landscape Image 2
Coffee Table
Dali Book
Vulfpeck Poster
A homography is a transformation that maps points from one plane to another using a 3x3 matrix. Given two images, we want to find the homography matrix H such that corresponding points in both images satisfy:
Here, p and p' are homogeneous coordinates of corresponding points in the two images, and H is the 3x3 homography matrix. To recover H, we need a set of point correspondences between the two images.
The homography matrix H has 8 degrees of freedom (since the 9th element can be set to 1 for scaling). Given at least 4 pairs of corresponding points, we can solve for the elements of H using a system of linear equations.
For each pair of points (x, y) in the first image and (x', y') in the second image, we can derive the following equations:
To eliminate the non-linear dependency on w', we divide by w' and express the system as:
Rearranging the terms, we obtain two linear equations for each correspondence:
We can represent the system of equations in matrix form as Ah = b, where h is a vector containing the 8 unknown elements of the homography matrix, and A and b are constructed from the point correspondences.
For each correspondence, the following two rows are added to A:
We can then solve this overdetermined system using a least-squares method to recover the vector h, which can be reshaped into the 3x3 matrix H:
Once the homography matrix H is computed, it can be used to warp one image into the coordinate space of another image, or to rectify an image by mapping it to a known geometric shape (e.g., a square or rectangle).
Once the homography is estimated, the next step is to re-project one image onto another to create a seamless stitch. Let's assume we are projecting Image 1 onto Image 2. To ensure smooth re-projection, we use an inverse warp technique. This begins by calculating the bounding box for the final image. We first take the four corners of Image 1 and apply the homography to map it onto the coordinate space of Image 2, determining the area where the final image will appear. Next, we interpolate the pixel values from Image 1 to find their corresponding locations in the output image.
To confirm that my warping function and homography is working correctly, I rectified a few images. This means taking photos of a rectangualr object at an angle so that the object does not appear perfectly rectangualr, then warping that image onto a rectangle to show the object 'straight on'
Coffee Table
Rectified Coffee Table Rectified
Dali Book
Rectified Dali Book
Vulfpeck Poster
Rectified Vulfpeck Poster
Now for the fun part I blended together images to create a mosaic. I used my two images of the campanile. The means to do this is to use my warp function to warp one image onto the other using the homography created from the correspondence points. Next you need to stitch the images together. However, if you simply just take an average of pixels when stitching them together you will get inconsistent results. A better option is to use blending with an alpha mask which allows for a smooth transition where the imags overlap. You further need to keep track of if a given pixel is contributed to by one or two images and normalize accordingly.
Image 1
Image 2
Mosaiced Campanile
You might notice in the bottom middle of my image there is an edge where the images don't line up too well. This is due to the fact that I turned my body instead of the camera when taking these photos. This inspired me to take more photos of the inside of my room and a beautiful landscape to create more mosaics, this time being careful to only turn the camera around its focal point. Below are the results of these mosaics.
Image 1
Image 2
Mosaiced Room
Image 1
Image 2
Mosaiced Landscape
In this part of the project the goal is to create mosaics like in part A, but instead fo manually selecting correspondences, I'll automaticlaly detect significant features in images and use those points for correspondences. Much os this part of the project is based on the “Multi-Image Matching using Multi-Scale Oriented Patches” by Brown et al. paper.
The first approach to detecting features uses Harris Corner detection which is detailed in section 2 of the MOPS paper. The idea is to take an outer product of gradients in the x and y direction giving you a harris matrix. Then looking at a window of this matrix you take the harmonic mean of the eigenvalues to determine a points "corner-strength". If both eigenvalues are small that's a flat region, if one eigenvalue is large and one small, that's an edge, if btoh eigenvalues are large, that's a corner. All points above a certain strength that aren't too close are considered interest points. Unfortunately the result of this is a very dense set of points that would be inefficient to use.
Output of harris corner detection before suppression
One solution to our overly dense set of interest points is to use adaptive non-maximal suppression. This technique is detailed in section 3 of the MOPS paper. At a basically level we would like to have say n points that are relatively spread out and are highly interesting. The algorithm to give us exactly this result defines a supression radius which is the minimum distance to a another keypoint with a significantly stronger response strength, mathematically: \( r_i = \min_i \left| \vec{x_i} - \vec{x_j} \right| \quad \forall \vec{x_j} : h(\vec{x_i}) < c_{\text{robust}} h(\vec{x_j}) \). We compute this for every keypoint and then take the K highest points. Per the paper's recommendation I set \(c_{robust} = 0.9\), and took 500 points.
Output of harris corner detection with suppression
Using the keypoints found from ANMS, we can now extract 8x8 feature patches from a larger 40x40 patch around each keypoint. The idea is to downsample from the 40x40 patch and then bias/gain noramlize. 5 features form the Campanile image are shown below.
Feature Extraction from Campanile Image 1
Now that we have have features extracted for images, we can match these features and find good pairs of corresponding points between two images. For this I used Lowe's thresholding as detailed in section 5 of MOPS. This technique is leveraged ont he idea that the NN-1(nearest neighbor) will be a significantly better match than the NN-2(second nearest neighbor); thus, we take a pair if the ratio of 1-NN / 2-NN is below some threshold. Figure 6b in MOPS indicates that a good threshold is 0.4 which I found to be much too harsh in my case. I found that it's best to test a few thresholds for the specific pair of images and to err on the side of allowing some bad matchings in and rely on the RANSAC algorithm to produce a good homography.
Campanile Matching w/ threshold = 0.4
Campanile Matching w/ threshold = 0.65
Campanile Matching w/ threshold = 0.8
Landscape Matching w/ threshold = 0.4
Landscape Matching w/ threshold = 0.65
Landscape Matching w/ threshold = 0.8
Room Matching w/ threshold = 0.4
Room Matching w/ threshold = 0.65
Room Matching w/ threshold = 0.8
Even after our feature matching we can see that there exists several outlier matchings in our plots. To fix this we'll use RANSAC (Random Sample Consensus). The idea of this algorithm is to randomly select 4 key point pairs and compute a homogrpahy for those pairs. Then determine weather those pairs are inliers by evalutaing if dist(Hp, p') < \( \epsilon\). Repeat this n times saving the homogrpahy computed over the most inlier points. I chose n=5000 and \( \epsilon = 5 \). This is the homography that we'll use to warp our image. Below I've plotted the pairs that RANSAC identified as inliers.
Campanile Matching w/ RANSAC
Using the homogrpahies found with the RANSAC algorithm we can now make even better Moasaics than in Part A. Below we can see the side by side of manual correspondances and automatic correspondances.
Room Mosaic w/ RANSAC
Room Mosaic w/ manual correspondances
Landscape Mosaic w/ RANSAC
Landscape Mosaic w/ manual correspondances
Campanile Mosaic w/ RANSAC
Campanile Mosaic w/ manual correspondances
This project was very interesting and taught me how to read and implement a research paper very nicely. It was very cool to see how effective the RANSAC algorithm is. When I initially learned of the RANSAC algortihm, I was skeptical; it seemed inefficient and not robust. However after running it several times and getting great results every time I was convinced. I also found the results of rectification quite satisfying and as an art enthusiast it's cool to see how it's been used to identify floor patterns on old paintings.