Harris Corner Detector

Background

Corners are features that are distinctive and reasonably invariant to viewpoint. In flat regions, regardless of the direction in which we move the window, the pixel intensities (and hence the gradients) do not change. This means there is no distinctive keypoint in such a region, as all pixels look the same.

In edge regions, moving the window along the edge causes no change in the window's contents. Shifting the window perpendicular to the edge (in the direction of the gradient) does produce a change, but only in that single direction, so the point cannot be localized along the edge.

Corners, unlike flat and edge regions, yield a significant change in gradient magnitude and orientation when the window is shifted in any direction. This occurs because the pixel intensity (or RGB) values in the window change substantially under the shift, which makes corners well suited to precise localization. Therefore, corners are the local features that the detector looks for.

The basic approach is this: corners are detected by running a small window over an image. The detector window is designed to compute intensity changes. We are interested in three scenarios:

  1. Flat regions: Areas of zero or small intensity changes in all directions (e.g., a constant region).
  2. Edges: Areas of changes in one direction but small or no changes in the orthogonal direction (e.g., along a boundary).
  3. Corners: Areas of significant changes in all directions (e.g., when the window contains a corner or isolated points).
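
The behaviour of these three cases can be checked numerically. The sketch below is illustrative only; the synthetic image values, window size, and the helper name `ssd_shift_response` are assumptions of this write-up. It builds tiny flat, edge, and corner patches and measures how much a small window changes when shifted in several directions, using a sum of squared differences that anticipates the formulation in the next section.

import numpy as np

def ssd_shift_response(img, center, shifts, half=3):
    # Sum of squared differences between the window around `center`
    # and the same window shifted by each (dy, dx) in `shifts`.
    cy, cx = center
    ref = img[cy - half:cy + half + 1, cx - half:cx + half + 1]
    out = {}
    for dy, dx in shifts:
        moved = img[cy + dy - half:cy + dy + half + 1,
                    cx + dx - half:cx + dx + half + 1]
        out[(dy, dx)] = float(np.sum((moved - ref) ** 2))
    return out

flat = np.full((21, 21), 100.0)                         # constant region
edge = np.zeros((21, 21)); edge[:, 10:] = 255.0         # vertical edge
corner = np.zeros((21, 21)); corner[10:, 10:] = 255.0   # one bright quadrant

shifts = [(0, 2), (2, 0), (2, 2)]
for name, img in [('flat', flat), ('edge', edge), ('corner', corner)]:
    # flat: no change anywhere; edge: change only across the edge; corner: change everywhere
    print(name, ssd_shift_response(img, (10, 10), shifts))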

Mathematical Analysis of Harris Corner Detection

Let’s first dive into the mathematical formulation of the Harris Corner Detector.

Patch Shifting and Intensity Changes

Consider a patch (window) of size $w \times h$ in the image. Shifting this patch by $(x, y)$ moves each pixel $(s, t)$ inside it to $(s + x, t + y)$. The intensity values at the original and shifted locations are denoted $I(s, t)$ and $I(s + x, t + y)$, respectively.

The weighted sum of squared differences (SSD) between the original and shifted patches is given by:

$$ F(x, y) = \sum_{s} \sum_{t} w(s, t) \left[ I(s + x, t + y) - I(s, t) \right]^2 $$

where $w(s, t)$ is a weighting function that assigns weight to each pixel within the patch.

Taylor Series Expansion

To approximate the intensity of the shifted patch, we employ the Taylor series expansion around the point $(s, t)$:

$$ I(s + x, t + y) \approx I(s, t) + x \cdot I_x(s, t) + y \cdot I_y(s, t) $$

where:

  • $I_x(s, t) = \frac{\partial I}{\partial x} \bigg|_{(s, t)}$
  • $I_y(s, t) = \frac{\partial I}{\partial y} \bigg|_{(s, t)}$
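
As a quick sanity check of this linearization, the sketch below (a made-up smooth test image; the first array index plays the role of $s$ and the second of $t$) compares the true shifted intensity with the first-order approximation.

import numpy as np

# A smooth synthetic image so that the first-order expansion is accurate
ss, tt = np.mgrid[0:100, 0:100]
I = np.sin(ss / 15.0) * np.cos(tt / 20.0)

# Ix: derivative along the first index (shifted by x); Iy: along the second (shifted by y)
Ix, Iy = np.gradient(I)

s, t = 40, 50          # reference pixel
x, y = 2, 1            # small shift applied to s and t respectively

actual = I[s + x, t + y]
approx = I[s, t] + x * Ix[s, t] + y * Iy[s, t]
print(actual, approx)  # the two values should agree closely for small shifts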

Expanding the SSD

Substituting the Taylor expansion into the equation above:

$$ F(x, y) = \sum_{s} \sum_{t} w(s, t) \left[ x \cdot I_x(s, t) + y \cdot I_y(s, t) \right]^2 $$

This equation can be expressed in matrix form as:

$$ F(x, y) = \begin{pmatrix} x & y \end{pmatrix} M \begin{pmatrix} x \\ y \end{pmatrix} $$

where the gradient covariance matrix M is: $$ M = \sum_{s,t} w(s,t) \begin{pmatrix}I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix} $$
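
The equivalence between the expanded SSD and the quadratic form can be verified directly on a small patch. The sketch below uses random stand-in gradients and a box weight, all assumed purely for illustration.

import numpy as np

rng = np.random.default_rng(0)
Ix = rng.standard_normal((7, 7))      # stand-in x-gradients over a 7x7 patch
Iy = rng.standard_normal((7, 7))      # stand-in y-gradients
w = np.ones((7, 7))                   # box weighting for simplicity

# Structure (Harris) matrix accumulated over the patch
M = np.array([[np.sum(w * Ix**2),   np.sum(w * Ix * Iy)],
              [np.sum(w * Ix * Iy), np.sum(w * Iy**2)]])

x, y = 1.5, -0.5                      # an arbitrary shift
quadratic_form = np.array([x, y]) @ M @ np.array([x, y])
direct_sum = np.sum(w * (x * Ix + y * Iy)**2)
print(quadratic_form, direct_sum)     # identical up to floating-point error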

Weighting Function $w(s, t)$

The weighting function $w(s, t)$ generally adopts one of the following forms:

  1. Box Filter: Preferred when computational speed is paramount and the noise level is low.

    $$ w(s, t) = \begin{cases} 1 & \text{if } (s, t) \text{ is inside the patch} \\ 0 & \text{otherwise} \end{cases} $$

  2. Gaussian Filter: Employed when data smoothing is essential, providing better noise resistance.

    $$ w(s, t) = e^{-\frac{s^2 + t^2}{2\sigma^2}} $$
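
Both windows are easy to write down explicitly; the short sketch below (arbitrary size and $\sigma$, chosen only for illustration) prints the two weight masks.

import numpy as np

size, sigma = 5, 1.0
half = size // 2
s, t = np.mgrid[-half:half + 1, -half:half + 1]

box_window = np.ones((size, size))                          # equal weight inside the patch
gaussian_window = np.exp(-(s**2 + t**2) / (2 * sigma**2))   # centre-weighted, smoother

print(box_window)
print(np.round(gaussian_window, 3))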

Eigenvalue Analysis

The matrix $M$, often referred to as the Harris matrix (or structure tensor), is real and symmetric, so it can be analyzed through its eigenvalues and eigenvectors. The eigenvectors of $M$ point in the directions of greatest and least spread of the gradient data, and the corresponding eigenvalues are proportional to that spread. Geometrically, this can be pictured as an ellipse enclosing the gradient distribution, whose major and minor axes are determined by the eigenvalues. By analyzing the eigenvalues of $M$, we can distinguish between flat regions, edges, and corners:

  1. Flat Regions: Minimal intensity changes; the patch is nearly constant ($\lambda_1 \approx \lambda_2 \approx 0$)
  2. Edges: Significant intensity changes in one direction; the patch contains an edge ($\lambda_1$ is large, $\lambda_2$ is small (or vice versa))
  3. Corners: Significant intensity changes in all directions; the patch contains a corner (Both $\lambda_1$ and $\lambda_2$ are large)
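
A small numerical illustration of this classification (the example matrices below are made up for demonstration, not computed from a real image):

import numpy as np

examples = {
    'flat':   np.array([[0.01, 0.00], [0.00, 0.02]]),  # both eigenvalues near zero
    'edge':   np.array([[9.00, 0.00], [0.00, 0.05]]),  # one large, one small
    'corner': np.array([[8.00, 1.00], [1.00, 7.00]]),  # both large
}
for name, M in examples.items():
    lam_small, lam_big = np.linalg.eigvalsh(M)   # eigenvalues of the symmetric M, ascending
    print(f'{name:6s}  smaller eigenvalue = {lam_small:.2f}, larger eigenvalue = {lam_big:.2f}')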

Corner Response Function

Instead of computing the eigenvalues explicitly, which is computationally expensive, the Harris–Stephens detector uses a corner response function $R$ defined as:

$$ R = \det(M) - k \cdot (\text{trace}(M))^2 $$

where:

  • $\det(M) = \lambda_1 \lambda_2$ is the determinant of $M$.
  • $\text{trace}(M) = \lambda_1 + \lambda_2$ is the trace of $M$.
  • $k$ is an empirical sensitivity constant, typically set around $0.04$–$0.06$ (it must stay below $0.25$ for an ideal corner with $\lambda_1 = \lambda_2$ to yield a positive response).

Interpretation of $R$:

  • Corners: $R$ is large and positive, as both $\lambda_1$ and $\lambda_2$ are large.
  • Edges: $R$ is negative, with one large and one small eigenvalue.
  • Flat Regions: $R$ is near zero, with both eigenvalues being small.

The constant $k$ acts as a sensitivity factor. A smaller $k$ makes the detector more likely to identify corners, while a larger $k$ increases the threshold for corner detection.
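
Written in terms of the eigenvalues, the response is easy to evaluate by hand; the numbers below are illustrative only.

# R = det(M) - k * trace(M)^2, expressed through the eigenvalues of M
def corner_response(lam1, lam2, k=0.05):
    return lam1 * lam2 - k * (lam1 + lam2) ** 2

print(corner_response(8.0, 7.0))     # corner: both eigenvalues large  -> R large and positive
print(corner_response(9.0, 0.05))    # edge: one large, one small      -> R negative
print(corner_response(0.02, 0.01))   # flat: both small                -> R near zero

The complete step-by-step implementation below follows the same recipe on a real image.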

import numpy as np
import matplotlib.pyplot as plt
import cv2
# Load the image in grayscale
def load_image(image_path):
    image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    image = cv2.normalize(image, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    return image
# Step 1 : Compute Image Derivatives, Ix and Iy
def compute_image_gradients(image):
    # Compute the gradients in x and y direction
    Ix = cv2.Sobel(image, cv2.CV_64F, 1, 0, ksize=3)
    Iy = cv2.Sobel(image, cv2.CV_64F, 0, 1, ksize=3)
    return Ix, Iy
# Step 2: Computing Ix^2, Iy^2, IxIy
def compute_squared_gradients(Ix, Iy):
    Ix2 = Ix**2
    Iy2 = Iy**2
    IxIy = Ix * Iy
    return Ix2, Iy2, IxIy
# Step 3: Apply the window function to Ix2, Iy2 and IxIy. We will use a Gaussian filter.
def apply_gaussian_filter(image, kernel_size=5, sigma=1.0):
    return cv2.GaussianBlur(image, (kernel_size, kernel_size), sigma)
# Step 4: Compute the Harris Matrix
def compute_harris_matrix(Sxx, Syy, Sxy):
    M = np.zeros((2, 2))
    M[0, 0] = Sxx ; M[0, 1] = Sxy
    M[1, 0] = Sxy ; M[1, 1] = Syy
    return M 
# Step 5: Compute the Harris Response
def compute_harris_response(M, k=0.05):
    det_M = np.linalg.det(M)
    trace_M = np.trace(M)
    R = det_M - k * (trace_M**2)
    return R
# Step 6: Compute the Harris Response for each pixel, given Sxx, Syy, Sxy
def harris_response_matrix(Sxx,Syy,Sxy,k = 0.05):
    R = np.zeros(Sxx.shape)
    for i in range(Sxx.shape[0]):
        for j in range(Sxx.shape[1]):
            M = compute_harris_matrix(Sxx[i,j],Syy[i,j],Sxy[i,j])
            R[i,j] = compute_harris_response(M,k)
    return R
# Step 7: Thresholding the response
def threshold_response(R, threshold_ratio = 0.01):
    corners = np.zeros_like(R)
    corners[R > threshold_ratio * R.max()] = 1
    return corners

After thresholding the corner response function, non-maximum suppression is applied to retain only the local maxima, ensuring that only the strongest corners survive. As described in the original Harris–Stephens paper, a corner-region pixel (i.e., one with a positive response) is selected as a nominated corner pixel if its response is an 8-way local maximum within an $m \times m$ window.

# Step 8: Non-maximum suppression
def non_max_suppression(R, corners, window_size=3):
    final_corners = []
    offset = window_size // 2
    for i in range(offset, R.shape[0] - offset):
        for j in range(offset, R.shape[1] - offset):
            if corners[i, j]:
                window = R[i - offset:i + offset + 1, j - offset:j + offset + 1]
                if R[i, j] == np.max(window):
                    final_corners.append((j, i))  # store as (x, y), the order expected by cv2.circle
    return final_corners
# Draw the corners on the image
def draw_corners(image, corners, color=(0, 0, 255), marker_size=1):
    # Convert grayscale to BGR for colored markers
    img_with_corners = cv2.cvtColor(image, cv2.COLOR_GRAY2BGR)
    for corner in corners:
        cv2.circle(img_with_corners, corner, marker_size, color, 1)
    return img_with_corners
def display_image(ax, image, title):
    """Display an image on a given axes."""
    ax.imshow(image, cmap='gray' if len(image.shape) == 2 else None)
    ax.set_title(title)
    ax.axis('off')

def normalize_and_display(ax, image, title):
    """Normalize and display an image."""
    normalized_image = cv2.normalize(image, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    display_image(ax, normalized_image, title)
def Harris_Corner_Detection(image_path, threshold=0.0001, k=0.1, window_size=3, 
                            gaussian_kernel_size=5, sigma=1.0):
    
    plt.figure(figsize=(20, 20))
    
    # Load Image
    image = load_image(image_path)
    display_image(plt.subplot(4, 3, 1), image, 'Original Grayscale Image')

    # Compute Image Gradients
    Ix, Iy = compute_image_gradients(image)
    normalize_and_display(plt.subplot(4, 3, 2), Ix, 'Gradient in X (Ix)')
    normalize_and_display(plt.subplot(4, 3, 3), Iy, 'Gradient in Y (Iy)')

    # Compute Squared Gradients
    Ix2, Iy2, IxIy = compute_squared_gradients(Ix, Iy)
    normalize_and_display(plt.subplot(4, 3, 4), Ix2, 'Squared Gradient Ix²')
    normalize_and_display(plt.subplot(4, 3, 5), Iy2, 'Squared Gradient Iy²')
    normalize_and_display(plt.subplot(4, 3, 6), IxIy, 'Product Gradient IxIy')

    # Apply Gaussian Filter
    Sxx = apply_gaussian_filter(Ix2, gaussian_kernel_size, sigma)
    Syy = apply_gaussian_filter(Iy2, gaussian_kernel_size, sigma)
    Sxy = apply_gaussian_filter(IxIy, gaussian_kernel_size, sigma)
    normalize_and_display(plt.subplot(4, 3, 7), Sxx, 'Smoothed Sxx')
    normalize_and_display(plt.subplot(4, 3, 8), Syy, 'Smoothed Syy')
    normalize_and_display(plt.subplot(4, 3, 9), Sxy, 'Smoothed Sxy')

    # Compute Harris Response
    R = harris_response_matrix(Sxx, Syy, Sxy, k)
    normalize_and_display(plt.subplot(4, 3, 10), R, 'Harris Response R')

    # Thresholding
    corners_binary = threshold_response(R, threshold)
    display_image(plt.subplot(4, 3, 11), corners_binary, 'Thresholded Corners (Binary)')

    # Non-Maximum Suppression and Draw Corners
    corners = non_max_suppression(R, corners_binary, window_size)
    img_with_corners = draw_corners(image, corners)
    display_image(plt.subplot(4, 3, 12), cv2.cvtColor(img_with_corners, cv2.COLOR_BGR2RGB), 'Detected Corners')

    plt.tight_layout()
    plt.show()
    
    return img_with_corners
# Using OpenCV's Harris Corner Detection
def harris_corner_detection_opencv(image_path, block_size=3, ksize=3, k=0.1, threshold=0.0001):
    # Load the image
    image = load_image(image_path)
    
    # Detect corners using Harris Corner Detection
    corners = cv2.cornerHarris(image, block_size, ksize, k)
    
    # Threshold the corners
    corners_binary = np.zeros_like(corners)
    corners_binary[corners > threshold * corners.max()] = 1
    
    # Non-maximum suppression
    corners = non_max_suppression(corners, corners_binary, window_size=3)
    
    # Draw corners on the image
    img_with_corners = draw_corners(image, corners)
    
    return img_with_corners
# Compare function for opencv and custom 

def compare_harris_corner_detection(image_path, block_size=3, ksize=3, k=0.1, threshold=0.0001):
    A = Harris_Corner_Detection(image_path, threshold=threshold, k=k)  # Your custom implementation
    B = harris_corner_detection_opencv(image_path, block_size=block_size, ksize=ksize, k=k, threshold=threshold)  # OpenCV implementation
    
    # Create 2 subplots A and B 
    fig, axs = plt.subplots(1, 2, figsize=(15, 15))

    # Plot A
    axs[0].imshow(cv2.cvtColor(A, cv2.COLOR_BGR2RGB))
    axs[0].set_title('Custom Harris Corner Detection')
    axs[0].axis('off')

    # Plot B
    axs[1].imshow(cv2.cvtColor(B, cv2.COLOR_BGR2RGB))
    axs[1].set_title('OpenCV Harris Corner Detection')
    axs[1].axis('off')

    # Draw a dashed separator between the two subplots (in figure coordinates)
    fig.add_artist(plt.Line2D([0.5, 0.5], [0.05, 0.95], transform=fig.transFigure,
                              color='black', linestyle='--', linewidth=2))

    plt.tight_layout()
    plt.show()
compare_harris_corner_detection('Board.png')
compare_harris_corner_detection('Grid.png', block_size=3, ksize=3, k=0.1, threshold=0.0001)

References

  1. Wikipedia - Harris Corner Detector
  2. OpenCV Documentation: Harris Corner Detection
  3. Medium - Harris Corner and Edge Detector
  4. Medium - Local Feature Descriptors: Harris and Hessian Corner Detector
  5. Gonzalez, R. C., & Woods, R. E. (2002). Digital Image Processing (2nd ed.). Prentice Hall.