The Eight-Point Algorithm

The assumption that we can get Fundament matrix seems rather large which is defined by a matrix product of the camera parameters But it's possible to get if given two images of the same scene without knowing the extrinsic or intrinsic parameters of the camera. The method we will discuss is doing so called Eight-Point Algorithm.

Core observations

Each correspondence $p_i=(u_i,v_i,1)$ and $p_i'=(u_i',v_i',1)$ gives us the epipolar constraint:

$p_i^TFp_i'=0$

We can expand this formula as follows:

$\begin{bmatrix} u_iu_i'&u_iv_i'&u_i&v_iu_i'&v_iv_i'&v_i&u_i'&v_i'&1 \end{bmatrix} \begin{bmatrix} F_{11} \\\\ F_{12} \\\\ F_{13} \\\\ \vdots \\\\ F_{33} \end{bmatrix}=0$

Since this constraint contains one degree of freedom,but the Fundamental matrix has 8 degrees of freedom (because it's up to scale). So we need 8 of these constraints to determine the Fundamental matrix:

$\begin{bmatrix} u_1u_1'&u_1v_1'&u_1&v_1u_1'&v_1v_1'&v_1&u_1'&v_1'&1 \\\\ \vdots \\\\ u_8u_8'&u_8v_8'&u_8&v_8u_8'&v_8v_8'&v_8&u_8'&v_8'&1 \end{bmatrix} \begin{bmatrix} F_{11} \\\\ F_{12} \\\\ F_{13} \\\\ \vdots \\\\ F_{33} \end{bmatrix}=0$

SVD method

The eight constrains can be compactly written as

$W\textbf{f}=0$

where $W \in \R ^{N\times 9},\textbf{f}\in \R^{9}$. In practice, it often is better to use more than eight correspondences and
create a larger $W$ matrix to reduce noisy measurements. We can use SVD to get a good estimate of Fundamental matrix $\hat{F}$：Let $W=U\Sigma V^T$,$\hat{\textbf{f}}$ is the last column of $V$. We may get a full rank of $\hat{F}$,however we know that fundamental matrix has rank-2. So we will use $\textbf{SVD}$ again to get the best rank-2 estimation of $\hat{F}$. So we need to solve the following optimization problem:

$\begin{array}{ll} \min\limits_{F} & \left|\left| F-\hat{F}\right|\right|_F \\\\ \text{subject to} & \det{F}=0 \end{array}$

This problem is again solved by $\textbf{SVD}$, where $\hat{F}=U\Sigma V^T$. The best rank-2 approximation $F$ is found by

$F=U\begin{bmatrix} \Sigma_1 &0&0\\\\ 0 & \Sigma_2&0\\\\ 0&0&0 \end{bmatrix}V^T$

Since $\textbf{SVD}$ is frequently met in computer vision, if you're not familiar with it as me, I strongly recommend to read this for a profound understanding.

Code implementation

'''
LLS_EIGHT_POINT_ALG  computes the fundamental matrix from matching points using 
linear least squares eight point algorithm
Arguments:
    points1 - N points in the first image that match with points2
    points2 - N points in the second image that match with points1

    Both points1 and points2 are from the get_data_from_txt_file() method
Returns:
    F - the fundamental matrix such that (points2)^T * F * points1 = 0
Please see lecture notes and slides to see how the linear least squares eight
point algorithm works
'''
def lls_eight_point_alg(points1, points2):
    # TODO: Implement this method!
    assert (points1.shape == points2.shape)
    W = np.ones((points1.shape[0], 9))
    W[:,0]=points2[:,0]*points1[:,0]
    W[:,1]=points2[:,0]*points1[:,1]
    W[:,2]=points2[:,0]
    W[:,3]=points1[:,0]*points2[:,1]
    W[:,4]=points1[:,1]*points2[:,1]
    W[:,5]=points2[:,1]
    W[:,6]=points1[:,0]
    W[:,7]=points1[:,1]
    f_initial = np.linalg.svd(W)[2][-1,:]
    U,S,V_t = np.linalg.svd(f_initial.reshape(3,3),full_matrices=False)
    U_trunk,S_trunk,V_t_trunk = U[:,:2],np.diag(S[:2]),V_t[:2,:]
    F = U_trunk@S_trunk@V_t_trunk
    return F

The Normalized Eight-Point Algorithm

The problem of standard Eight-Point Algorithm

$W$ should have one singular value equal to (or near) zero, with the other singular values being zero. However, the correspondences $p_i=(u_i,v_i,1)$ will often have extremely large values in the first and second coordinates(i.e. $p_i=(1800,1000,1)$). If the image points used to construct $W$ are in a relatively small region of the image(i.e. $W=W_c+\Delta W$. Maybe $W_c[:-1]$ is greate larger than $\Delta W[:-1]$). If we let $W=\overline{w}\cdot \textbf{1}^T+[\delta w_1;\delta w_2;...;\delta w_9]$, then we say $W$ has dominated term $\overline{w}\cdot \textbf{1}^T$ which has rank-1. According to Perturbation theory of SVD:

$\begin{aligned} &\sigma_1(W)\approx \sigma_1(\overline{w}\cdot \textbf{1}^T) \approx O(u_0u_0') \quad (\text{large!})\\\\ &\sigma_2(W) \approx \sigma_1(\Delta W)\approx O(\delta u,\delta v) \quad (\text{samll})\\\\ & \dots \\\\ &\sigma_9(W) \approx \sigma_7(\Delta W)\approx O(\delta u,\delta v) \quad (\text{samll}) \end{aligned}$

So the condition number of $W$: $\kappa=\frac{\sigma_1}{\sigma_9}$ will be ralatively large which causes a little puturbation make a great difference to $\textbf{f}$.

If you have the experience of learning numerical analysis, a common way to deal with this unstability is: Normalization.

Normalization

The key intuition is that since the main perturbation lies in the large coordinates $(u_i,v_i)$, we can make them small by translation and scaling. To translate, the origin of the new coordinate system should be located
at the centroid of the image points:

$x_i'=x_i-\frac{1}{N}\sum_{j=1}^Nx_j$

Their coordinates maybe also large, so we multiply each image point by a scaling factor such that the mean square distance of the transformed image points from the origin should be 2 pixels. To calculate the scaling factor $s$:

$\begin{aligned} & x_i''= sx_i'\\\\ & \frac{1}{N}\sum_{i=1}^N ||x_i''||^2=2 \\\\ & \Rightarrow \frac{1}{N}\sum_{i=1}^Ns^2||x_i-\overline{x}||^2=2 \\\\ & \Rightarrow s=\sqrt{\frac{2N}{\sum_{i=1}^N||x_i-\overline{x}||^2}} \end{aligned}$

Thus we construct transformation matrices $T,T'$:

$T=\begin{bmatrix} \frac{1}{s} &0&-\frac{\overline{u}}{s} \\\\ 0 &\frac{1}{s} & -\frac{\overline{v}}{s} \\\\ 0&0&1 \end{bmatrix}$

$T'$ is the same. So $q_i=Tp_i \quad q_i'=T'p_i'$.

Denormalization

Using the new, normalized coordinates, we can compute the new $F_q$ using the regular least-squares Eight Point Algorithm. Since $p_i ^T T^T F_qT'p_i'=0$ and $p_i^TFp_i'=0$, we finally get $F=T^T F_qT'$.

Code implementation

'''
NORMALIZED_EIGHT_POINT_ALG  computes the fundamental matrix from matching points
using the normalized eight point algorithm
Arguments:
    points1 - N points in the first image that match with points2
    points2 - N points in the second image that match with points1

    Both points1 and points2 are from the get_data_from_txt_file() method
Returns:
    F - the fundamental matrix such that (points2)^T * F * points1 = 0
Please see lecture notes and slides to see how the normalized eight
point algorithm works
'''
def normalized_eight_point_alg(points1, points2):
    x1_mean = np.mean(points1,axis=0)
    x2_mean = np.mean(points2,axis=0)
    s1 = 1/(np.sqrt(np.mean(np.sum((points1-x1_mean)**2))/2))
    s2 = 1/(np.sqrt(np.mean(np.sum((points2-x2_mean)**2))/2))

    T1=np.zeros((3,3))
    T1[0,0]=s1
    T1[1,1]=s1
    T1[2,0]= -s1*x1_mean[0]
    T1[2,1]= -s1*x1_mean[1]
    T1[2,2]=1
    T2=np.zeros((3,3))
    T2[0,0]=s2
    T2[1,1]=s2
    T2[2,0]= -s2*x2_mean[0]
    T2[2,1]= -s2*x2_mean[1]
    T2[2,2]=1

    norm_points1 = points1@T1
    norm_points2 = points2@T2
    F_norm = lls_eight_point_alg(norm_points1,norm_points2)
    F = T2@F_norm@T1.T
    return F

Image Rectification

When two images are parallel to each other...

Let's first compute the Essential matrix $E$ in the
case of parallel image planes. We assume that the two cameras have the same $K$ and that there in no relative rotation between the camares ($R=I$). In this case, let's assume that there is only a translation alone $x$-axis, giving $T=(T_x,0,0)$. This gives:

$\begin{aligned} E=\left[T_{\times}\right]R=\begin{bmatrix} 0&0&0\\\\ 0&0&-T_x \\\\ 0&T_x&0 \end{bmatrix} \end{aligned}$

Once $K$ is known, we can find the directions of the epipolar lines associated with points in the image planes. Let us compute the direction of the
epipolar line $l$ associated with point $p'$:

$l=Ep'=\begin{bmatrix} 0&0&0\\\\ 0&0&-T_x \\\\ 0&T_x&0 \end{bmatrix} \begin{bmatrix} u' \\\\v'\\\\ 1 \end{bmatrix}=\begin{bmatrix} 0 \\\\ -T_x \\\\ T_xv' \end{bmatrix}$

We see that the direction of $l$ is horizontal, as is the direction of $l'$. If we use the epipolar constraint $p^TEp'=0$, we get:

$v=v'$

So if we can let two images parallel, we can get:

The epipolar lines are horizontal
Each pair of correspondences $p_i,p_i'$ satisfy that$v_i=v_i'$

So next we will talk about how can we make any two images parallel, which is also called rectification.

General process

Before we talk about the algotithm, let's think about what we have: just image points. So the only way to proceed is to get Fundamental matrix $F$ estimated by Normalized Eight Point Algorithm.

As we get Fundamental matrix, we can compute epipolar lines $l_i,l_i'$ for each correspondence $p_i$ and $p_i'$. We know that $e,e'$ lie in all epipolar lines in each image. Thus we can formulate a linear system of equations and solve using $\textbf{SVD}$ to find the epipole $e$:

$\begin{bmatrix} l_1^T \\\\ \vdots \\\\ l_n^T \end{bmatrix}e=0$

After finding the epipoles $e$ and $e'$, we will notice that if they are points at infinity alone the horizontal axis, the two images are parallel. So next we should find a homography to map epipole $e$ to infinity alone the horizontal axis. Specifically, this means that we
want to find a pair of homographies $H_1$ and $H_2$ that we can apply to the imagesto map the epipoles to infinity.

So we conclude that how to rectify images:

First, use Normalized Eight Point Algorithm to find Fundamental matrix
Second, use the Fundamental matrix to find each point's epipolar line
Third，use the relationship between epipolar lines and epipole to estimate epipole by $\textbf{SVD}$.
Finally, since the epipole $e$ may be not points at infinity, we should find homography map $H_1,H_2$ to map the epipoles to infinity. Once they are at infinity alone horizontal axis, then two images are parallel.

Find $H_1$ and $H_2$

Let's start by finding a homography $H_2$ that maps the second epipole $e'$ to a point on the horizontal axis at infinity $(f,0,0)$. We can consider the homography as the process of translation and rotation.

We can apply a transformation that translates the second image such that the center is at $(0,0,1)$ in homogeneous coordinates:

$T=\begin{bmatrix} 1&0&-\frac{\text{weight}}{2}\\\\ 0&1&-\frac{\text{height}}{2}\\\\ 0&0&1 \end{bmatrix}$

After translating, if the translated epipole $Te'$ is located at $(e_1',e_2',1)$, then the rotation applied is:

$R=\begin{bmatrix} \alpha \frac{e_1'}{\sqrt{e_1'^2+e_2'^2}} & \alpha \frac{e_2'}{\sqrt{e_1'^2+e_2'^2}} &0 \\\\ -\alpha \frac{e_2'}{\sqrt{e_1'^2+e_2'^2}} & \alpha \frac{e_1'}{\sqrt{e_1'^2+e_2'^2}} &0 \\\\ 0&0&1 \end{bmatrix}$

where $\alpha=1$ if $e_1' \geq 0$ and $\alpha = -1$ otherwise.

Then the coordinate becomes $(f,0,1)$. To bring the point to infinity on the horizontal axis, apply a transformation:

$G=\begin{bmatrix} 1&0&0\\\\ 0&1&0\\\\ -\frac{1}{f} &0&1 \end{bmatrix}$

So the final $H_2=T^{-1}GRT$

Code implementation

'''
COMPUTE_H computes a homography to map an epipole to infinity along the horizontal axis 
Arguments:
    e - the epipole
    im2 - the image
Returns:
    H - homography matrix
'''
def compute_H(e, im):
    height = im.shape[0]
    width = im.shape[1]
    T = np.eye(3)
    T[0,2] = -width/2
    T[1,2] = -height/2
    translated_e = e@T.T
    alpha = 1 if translated_e[0]>=0 else -1
    R = np.eye(3)
    R[0,0] = alpha*translated_e[0]/np.sqrt(translated_e[0]**2+translated_e[1]**2)
    R[0,1] = alpha*translated_e[1]/np.sqrt(translated_e[0]**2+translated_e[1]**2)
    R[1,0] = -R[0,1]
    R[1,1] = R[0,0]
    rotated_e = translated_e@R.T
    G = np.eye(3)
    G[2,0] = -1/rotated_e[0]
    return np.linalg.inv(T)@G@R@T

Since I currently haven't fully understood the algorithm of finding $H_1$ by minimizing the sum of squared distance from projected $p_i$ to $p_i'$, I just show code implementation:

'''
COMPUTE_MATCHING_HOMOGRAPHIES determines homographies H1 and H2 such that they
rectify a pair of images
Arguments:
    e2 - the second epipole
    F - the Fundamental matrix
    im2 - the second image
    points1 - N points in the first image that match with points2
    points2 - N points in the second image that match with points1
Returns:
    H1 - the homography associated with the first image
    H2 - the homography associated with the second image
'''
def compute_matching_homographies(e2, F, im2, points1, points2):
    def skew_symmetric_matrix(e):
        return np.array([[0, -e[2], e[1]],
                         [e[2], 0, -e[0]],
                         [-e[1], e[0], 0]])
    # e1 = compute_epipole(F.T)
    M = skew_symmetric_matrix(e2) @ (F) + np.outer(e2, np.ones(3))
    H2 = compute_H(e2, im2)
    points1_hat = points1@M.T@H2.T
    points1_hat /= points1_hat[:,2][:,np.newaxis]
    points2_hat = points2@H2.T
    points2_hat /= points2_hat[:,2][:,np.newaxis]
    a = np.linalg.lstsq(points1_hat, points2_hat[:,0], rcond=None)[0]
    Ha = np.eye(3)
    Ha[0] = a
    H1 = Ha@H2@M
    return H1, H2

guochuan

epipolar-geometry

The Eight-Point Algorithm

Core observations

SVD method

Code implementation

The Normalized Eight-Point Algorithm

The problem of standard Eight-Point Algorithm

Normalization

Denormalization

Code implementation

Image Rectification

When two images are parallel to each other...

General process

Find $H_1$ and $H_2$

Code implementation

Hingle