Tutorial: Iterative Closest Point (ICP)

@for Computer vision people
@author Kai Ruhl
@since 2021-04

Iterative Closest Point is an algorithm for 3D point cloud registration, i.e. to align two partly overlapping point clouds such that distances are minimized. Failure conditions are:

If there is no overlap.
If the overlapping geometry is ambiguous, e.g. two planes or spheres.
If the initial pose guess is too far off.

Introductions to ICP usually either have no math at all [vid], or alternatively too much of it with almost no explanation (and some steps left out, for fun). So in this tutorial we go through all the math, step by step, even if it takes a good long while. We illustrate everything with plots from Python, more specifically Jupyter QtConsole, which you can install by sudo apt-get install python3-qtconsole jupyter-core and then start with jupyter qtconsole, where we configure SVG with %config InlineBackend.figure_format = 'svg'. You might also need to install numpy and matplotlib [doc] if not yet present.

Point Cloud Registration

We have a stationary target point cloud $$Q$$ and a movable point cloud $$P$$, and we search for a 3D rigid transformation ("pose") of rotation matrix $$R_{{\color{gray} 3 \times 3}}$$ and translation vector $$\vec{t}_{{\color{gray} 3 \times 1}}$$ such that $$R\cdot\vec{p}_i + \vec{t}$$ is near to some spot $$q_j$$ on the surface spanned by $$Q$$, similar to what we did in mathematical optimization. In Python:

# jupyter-qtconsole
%config InlineBackend.figure_format = 'svg'
import numpy as np; np.set_printoptions(precision=2, linewidth=100)
import matplotlib.pyplot as plt; import mpl_toolkits.mplot3d

# some (target) surface Q
f   = lambda x : .2*x * np.sin(3.*x)
x0_ = np.linspace(-2.5, +2.0, 30)
y0_ = f(x0_)
z0_ = np.zeros(len(x0_))
Q   = np.vstack([x0_, y0_, z0_]) # 3x30

# rotated and translated (source) surface P
def rotmat(a, b, c): # rz, ry, rx
    """See Wikipedia link above."""
    sa, sb, sc = np.sin([a, b, c])
    ca, cb, cc = np.cos([a, b, c])
    row0 = [ca*cb, ca*sb*sc - sa*cc, ca*sb*cc + sa*sc]
    row1 = [sa*cb, sa*sb*sc + ca*cc, sa*sb*cc - ca*sc]
    row2 = [-sb, cb*sc, cb*cc]
    return np.array([row0, row1, row2])
x1_ = np.linspace(-2.0, +2.5, 30)
y1_ = f(x1_)
z1_ = np.zeros(len(x1_))
R1  = rotmat(np.radians(15), 0, 0)
t1_ = np.asarray([.3, .9, 0.])
P   = (R1.dot(np.vstack([x1_, y1_, z1_])).T  + t1_).T # 3x30

# we search for:
R, t_ = np.identity(3), np.zeros(3)

Usually, it is not feasible to take all source points $$p_i$$, so we make a subselection too.

# source points
n_q = Q.shape[-1]
n_i = 5
i_  = np.round((np.linspace (0, 1, n_i+1) + (.5/n_i))[:-1] * n_q).astype(int)

# plot both surfaces
fig = plt.figure(); fig.clear()
ax  = fig.subplots(); ax.clear()
def lines(R, t_, i_):
    """Plots pseudo surfaces and samples."""
    ax.clear()
    ax.set_xlim(-3, +3)
    ax.set_ylim(-1.5, +2.5)
    ax.set_yticks([-1, 0, 1, 2])
    ax.grid()
    ax.plot(Q[0], Q[1], c='blue') # target surface
    X = np.copy(P)
    if R is not None:
        X = R.dot(X)
    if t_ is not None:
        X = (X.T + t_).T
    ax.plot(X[0], X[1], c='red')  # source surface
    if i_ is not None:
        ax.plot(X[0, i_], X[1, i_], 'rx') # samples
    return fig
lines(R, t_, i_)

i01_icp_surfaces — ICP: target surface Q (blue) and source surface P (red)

... and there we have two partially overlapping surfaces and the source points used to compute the rigid transformation. Note that our representation is kind of funny: We only have surface points with z=0, and pretend that the surface extends evenly into z-direction. This allows us to use 2D plots and still compute everything with 3D matrixes. Also, if you want to see the above figure again, type fig in the console.

Point-to-Plane

Early ICP variants [chen1991icra] [besl1992tpami] mostly used point-to-point correspondences such that for as many points as possible $$\sum_i \left \| [R\cdot\vec{p}_i + \vec{t}] - \vec{q}_j \right \|^2$$ was minimized, where $$j$$ is a suitable target spot for the source index $$i$$. Modern variants [pomerleau2015ftr] [diez2015acs] [tam2013tvcg] [rusinkiewicz2001tdim] tend towards point-to-plane, which allows for penalty-free sliding to the side whenever geometry is ambiguous, optimizing $$\sum_i \left ( ([R\cdot\vec{p}_i + \vec{t}] - \vec{q}_j) \cdot n_j \right )^2$$ instead, where $$n_j$$ is the normal at $$q_j$$. The latter dot product has the effect that "sideways distances" do not factor in, just the distance along the normal. We start with getting corresponding surfaces with point-to-plane distances.

def indexplane(Q, j):
    """Plane in Hessian normal form."""
    assert j > 0 and j < Q.shape[-1] -1
    q0_, q1_, q2_ = Q.T[j-1:j+2]
    q3_ = np.copy (q2_); q3_[2] = 1.     # fake z
    n_ = np.cross(q3_ - q0_, q2_ - q0_)  # plane normal
    n_ = n_ / np.linalg.norm(n_, axis=0) # normalize
    return q1_, n_ # Hessian normal form

def planedist(p_, q_, n_):
    """Distance between point p and plane (q, n)."""
    plane_dist = n_.dot(p_ - q_) # see p2p link above
    pp_ = p_ - plane_dist * n_   # perpendicular point
    return np.abs(plane_dist), pp_

def indexplanedist(p_, j):
    """Distance between point p and plane at q_j."""
    if j < 1 or j >= Q.shape[-1] -1:
        return 9999., Q.T[j], Q.T[j], Q.T[j]
    q_, n_ = indexplane(Q, j)
    dist, pp_ = planedist(p_, q_, n_)
    return dist, pp_, q_, n_

# testing
indexplane(P, i_[0])
planedist(P.T[i_[0]], *indexplane(Q, 8))
indexplanedist(P.T[i_[0]], 8)

We take $$p_3$$ and $$q_8$$ and see that the distance of 0.2 and normals of [-0.5, 0.8]^T are plausible. But not every correspondence is plausible, so we require the points being somewhat near, and the normals being not too different either.

max_normal_angle_cos = np.cos(np.radians(20))
max_point_distance   = 1.5

def isnear(p_, np_, dist, pp_, q_, nq_):
    """Correspondence p and q valid?"""
    if dist > max_point_distance:
        return False # hit point too far away
    if np.linalg.norm(p_ - q_) > max_point_distance:
        return False # original point too far away
    cos_a = np_.dot(nq_)
    if cos_a < max_normal_angle_cos:
        return False # outside normal cone
    return True

def getj(P, i):
    """Best q_j correspondence for p_i."""
    p_, np_ = indexplane(P, i)
    j_ = range(Q.shape[-1]) # 0..29
    dists1 = [indexplanedist(p_, j) for j in j_] # dist, pp_, q_, nq_
    valid1 = [isnear(p_, np_, *d) for d in dists1]
    if not np.any(valid1):
        return (-1, 9999, p_, p_, np_, i, p_, np_)
    # crash score if not near
    dists2 = [(dists1[j][0] if valid1[j] else 9999.,
              *dists1[j][1:]) for j in j_]
    j = np.argmin([d[0] for d in dists2])
    return (j,) + dists2[j] + (i, p_, np_)

# testing
isnear(*indexplane(P, i_[0]), *indexplanedist(P.T[i_[0]], 8))
[getj(P, i) for i in i_]

With the above, we can take any point $$p_i$$ to get some point $$q_j$$ that minimizes the distance along the latter's normal. Let us see if it works out, starting with some point and arrow drawing capabilities.

def arrow(x, y, dx, dy, text, color, zoom=1.):
    zx, zy = dx*zoom, dy*zoom # arrow end point offset
    if not (dx == 0 and dy == 0):
        ax.arrow(x, y, zx, zy, color=color, head_width=.1, \
                 length_includes_head=True)
    if text is not None:
        sx, sy = np.sign(dx), np.sign(dy) # [-1, 0, +1]
        tx, ty = sx * 5, sy * 5 # text offset
        ha = ['right', 'center', 'left'][int(sx+1)]
        va = ['top', 'center', 'bottom'][int(sy+1)]
        ax.annotate(text, (x+zx, y+zy), (tx, ty), ha=ha, va=va, \
            textcoords='offset points', color=color, \
            bbox=dict(fc='w', ec='w', alpha=0.4, pad=1.))

def point(x, y, text, color, marker='o', va='bottom'):
    if marker is not None:
        ax.plot(x, y, marker=marker, color=color)
    if text is not None:
        ax.annotate(text, (x, y), (0, 5), ha='center', va=va, \
            textcoords='offset points', color=color, \
            bbox=dict(fc='w', ec='w', alpha=0.4, pad=1.))

... and on to actually drawing our correspondences.

def icparrow(j, dist, pp_, q_, nq_, i, p_, np_):
    if j < 0: return
    point(*p_[:2], '%d'%(i,), 'red', 'x')
    point(*q_[:2], None, 'blue', 'o', 'top')
    #point(q_[0], q_[1]-.7, '%d'%(j,), 'blue', None)
    plane = np.vstack([q_, pp_]).T
    ax.plot(plane[0], plane[1], c='gray', alpha=0.6)
    arrow(*p_[:2], *(pp_ - p_)[:2], '%.1f'%(dist,), 'gray')

def icparrows(R, t_):
    X = (R.dot(P).T + t_).T
    dists = [getj(X, i) for i in i_]
    [icparrow(*d) for d in dists]
    return fig

R, t_ = np.identity(3), np.zeros(3) # reset
lines(R, t_, i_); icparrows(R, t_)

i02_icp_point_to_plane — ICP: source point (red) to target plane (blue to gray)

... and now we have correspondences and a distance metric for our points $$p_i$$. Whew, quite some code for just that, isnt it?

Linear Formulation

The pose ($$R, \vec{t}$$) that we want to solve is not linear at all, but all we have is linear algebra. So... what happens next is that everyone cheats a lot, starting with formulating ICP as many iterations of solving $$A\vec{x}=\vec{b}$$. Surprisingly, it still works in many cases!

Cheat 1: Treat ICP as linear problem.

$${\color{orange} \underbrace{{\color{black} \begin{bmatrix} a_{p_1,r_x} & a_{p_1,r_y} & a_{p_1,r_z} & a_{p_1,t_x} & a_{p_1,t_y} & a_{p_1,t_z} \\ a_{p_2,r_x} & a_{p_2,r_y} & a_{p_2,r_z} & a_{p_2,t_x} & a_{p_2,t_y} & a_{p_2,t_z} \\ & & \vdots & & & \\ a_{p_n,r_x} & a_{p_n,r_y} & a_{p_n,r_z} & a_{p_n,t_x} & a_{p_n,t_y} & a_{p_n,t_z} \end{bmatrix}}}_{A}} \cdot {\color{orange} \underbrace{{\color{black} \begin{bmatrix} r_x \\ r_y \\ r_z \\ t_x \\ t_y \\ tz \end{bmatrix}}}_{\vec{x}}} = {\color{orange} \underbrace{{\color{black} \begin{bmatrix} \textrm{residual}_{p_1} \\ \textrm{residual}_{p_2} \\ \vdots \\ \textrm{residual}_{p_n} \end{bmatrix}}}_{\vec{b}}}$$

Each row in $$A$$ is for one point $$p_i$$. Using that row, a point reacts to our current estimate of the unknown rigid transformation $$\vec{x}$$, and also has a distance error ("residual") in $$\vec{b}$$. Repeat for the next row, and so on. Determining what to put into $$A$$ and $$b$$ will be our next task [low2004tr].

We start with the insight that $$\sum_i \left ( (R\cdot\vec{p}_i + \vec{t} - \vec{q}_j) \cdot n_j \right )^2$$ has a really pesky non-linearity in the rotation matrix $$R$$ because there are a lot of sines and cosines in it. Note that Wikipedia has a strange notation with $$\alpha=r_z$$, $$\beta=r_y$$ and $$\gamma=r_x$$ that we nonetheless follow here.

$$R=\begin{bmatrix} \cos\alpha \cos\beta & \cos\alpha \sin\beta \sin\gamma - \sin\alpha \cos\gamma & \cos\alpha \sin\beta \cos\gamma + \sin\alpha \sin\gamma \\ \sin\alpha \cos\beta & \sin\alpha \sin\beta \sin\gamma + \cos\alpha \cos\gamma & \sin\alpha \sin\beta \cos\gamma - \cos\alpha \sin\gamma \\ -\sin\beta & \cos\beta \sin\gamma & \cos\beta \cos\gamma \end{bmatrix}$$

Cheat 2: Assume small rotations, then linearize.

Putting all that into a matrix seems hard, so... what if we assume that the rotation is only small, and then say that $$\sin(x)=x$$ and $$\cos(x)=1$$?

ax.clear(); ax.grid(); x_ = np.linspace(0, 5);
ax.plot(x_, np.sin(x_), 'b-'); point(0, -.2, 'sin(0)', 'blue', None);
ax.plot(x_, np.cos(x_), 'r-'); point(0, .75, 'cos(0)', 'red' , None);

i11_icp_rotation_linear — ICP: small rotations treated as linear

... because $$\sin(x)$$ is evenly rising around 0, and $$\cos(x)$$ is just starting to come down, is the reasoning. Seriously, I am not making this up! Look at all the Rusinkiewicz papers, latest [rusinkiewicz2019tog], and tell me they do it differently. And also:

Cheat 3: Small rotations squared are so much smaller, just drop them.

$$R \approx \begin{bmatrix} 1 & \beta\gamma-\alpha & \beta+\alpha\gamma \\ \alpha & \alpha\beta\gamma+1 & \alpha\beta-\gamma \\ -\beta & \gamma & 1 \end{bmatrix} \approx \begin{bmatrix} 1 & -\alpha & \beta \\ \alpha & 1 & -\gamma \\ -\beta & \gamma & 1 \end{bmatrix}$$

I am super suprised that this works in practice, but apparently it does. And now we re-visit $$\sum_i \left ( (R\cdot\vec{p}_i + \vec{t} - \vec{q}_j) \cdot n_j \right )^2$$ in detail, but without the squares and just for one point; solving by least squares will automatically bring that square back.

$$(R \cdot \vec{p} + \vec{t} - \vec{q}) \cdot \vec{n} \approx \left ( \begin{bmatrix} 1 & -\alpha & \beta \\ \alpha & 1 & -\gamma \\ -\beta & \gamma & 1 \end{bmatrix} \begin{bmatrix} p_x \\ p_y \\ p_z \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \\ t_z \end{bmatrix} - \begin{bmatrix} q_x \\ q_y \\ q_z \end{bmatrix} \right ) \cdot \begin{bmatrix} n_x \\ n_y \\ n_z \end{bmatrix}$$

... and from here into the gory details ...

$$= \begin{bmatrix} p_x - \alpha p_y + \beta p_z + t_x - q_x \\ \alpha p_x + p_y - \gamma p_z + t_y - q_y \\ -\beta p_x + \gamma p_y + p_z + t_z - q_z \end{bmatrix} \cdot \begin{bmatrix} n_x \\ n_y \\ n_z \end{bmatrix} = \begin{array}{rl} & p_x n_x - \alpha p_y n_x + \beta p_z n_x + t_x n_x - q_x n_x \\ + & \alpha p_x n_y + p_y n_y - \gamma p_z n_y + t_y n_y - q_y n_y \\ + & -\beta p_x n_z + \gamma p_y n_z + p_z n_z + t_z n_z - q_z n_z \end{array}$$

... and back to sorting by rotation and translation parameters, remembering that $$r_x = \gamma$$ (thanks again, Wikipedia) ...

$$= \begin{array}{rcl} & \gamma & (-p_z n_y + p_y n_z) \\ + & \beta & (+p_z n_x - p_x n_z) \\ + & \alpha & (-p_y n_x + p_x n_y) \\ + & t_x & n_x \\ + & t_y & n_y \\ + & t_y & n_z \\ \end{array} \begin{array}{rcl} + & p_x n_x - q_x n_x \\ + & p_y n_y - q_y n_y \\ + & p_z n_z - q_z n_z \end{array}$$

... and if that doesnt almost look like $$A\vec{x}-b$$ ...

$$= \begin{bmatrix} (p_y n_z - p_z n_y) & (p_z n_x - p_x n_z) & (p_x n_y - p_y n_x) & n_x & n_y & n_z \\ & & \vdots & & & \\ \end{bmatrix} \begin{bmatrix} \gamma \\ \beta \\ \alpha \\ t_x \\ t_y \\ tz \end{bmatrix} - \begin{bmatrix} (\vec{q} - \vec{p}) \cdot \vec{n} \\ \vdots \end{bmatrix}$$

... and whale oil beef hooked, that reminds me of the cross product ...

$$= \begin{bmatrix} \larr & \vec{p} \times \vec{n} & \rarr & \larr & \vec{n} & \rarr \\ & & \vdots & & & \\ \end{bmatrix} \begin{bmatrix} \gamma \\ \beta \\ \alpha \\ t_x \\ t_y \\ tz \end{bmatrix} - \begin{bmatrix} (\vec{q} - \vec{p}) \cdot \vec{n} \\ \vdots \end{bmatrix}$$

There is probably some deeper insight behind $$a_{11}..a_{13}$$ being a cross product, but I dont have the slightest intuition. In any case, we are now ready to express $$A\vec{x}-\vec{b}$$ in code:

def axb1(j, dist, pp_, q_, nq_, i, p_, np_):
    """One row of A and b, as in Ax-b."""
    pxn = np.cross(p_, nq_)
    b   = (q_ - p_).dot(nq_)
    rowAb = np.hstack([pxn, nq_, b])
    return rowAb

def axb2(j, dist, pp_, q_, nq_, i, p_, np_, faxb=axb1):
    """Additional z-stabilization for axb1."""
    rowAb1 = axb1(j, dist, pp_, q_, nq_, i, p_, np_)
    q2_ = q_ + [0, 0, 1] # shift along z
    p2_ = p_ + [0, 0, 1] # source point too
    rowAb2 = faxb(j, dist, pp_, q2_, nq_, i, p2_, np_)
    return np.vstack([rowAb1, rowAb2])

def axb(R, t_, faxb1=axb1):
    """A and b, as in Ax-b."""
    X = (R.dot(P).T + t_).T
    dists = [getj(X, i) for i in i_]
    Ab = np.vstack([axb2(*d, faxb1) for d in dists if d[0] >= 0])
    return Ab[:,:-1], Ab[:,-1]

# testing
axb1(*getj(P, i_[0]))
axb(R, t_)

... and that gives us the linear formulation $$A\vec{x}-\vec{b}$$ that we now want to minimize by finding the $$\vec{x}$$ that does so.

Linear Least Squares

Solving $$A\vec{x}=\vec{b}$$ is a pretty standard optimization problem, so we might as well use numpy to do it.

A, b = axb(R, t_)
rt2_ = np.linalg.lstsq(A, b)[0]
print('rx=%.1f, ry=%.1f, rz=%.1f and tx=%.1f, ty=%.1f, tz=%.1f' % tuple(rt2_))

... which recommends a clockwise rotation of 0.4 radians around the z-axis (ok), an x-shift of -0.1 (left, ok) and a y-shift of -0.6 (down, ok); the rest is near zero. This counteracts what we did before and generally looks good.

def rtmat(rt_, R0=np.identity(3), t0_=np.zeros(3)):
    """R and t matrixes from solution x, plus previous R/t."""
    R  = rotmat(*rt_[:3][::-1]) # reverse rx,ry,rz order
    t_ = rt_[3:]
    return R.dot(R0), t_ + t0_

R2, t2_ = rtmat(rt2_)
lines(R2, t2_, i_)

i12_icp_least_squares1 — ICP: surfaces after one least squares iteration

Time to tighten the (distance and normal cone) thresholds and go again:

# finding good thresholds is advanced magic
max_normal_angle_cos = np.cos(np.radians(10))
max_point_distance   = 1.
lines(R2, t2_, i_)
icparrows(R2, t2_)

i13_point_to_plane2 — ICP: Point-to-plane after iteration 1

... and with the renewed point-to-plane correspondences, repeat the linearization.

# iteration 2
A, b = axb(R2, t2_)
rt3_ = np.linalg.lstsq(A, b)[0]
print('rx=%.1f, ry=%.1f, rz=%.1f and tx=%.1f, ty=%.1f, tz=%.1f' % tuple(rt3_))

# combine the two iteration results:
R3, t3_ = rtmat(rt3_, R2, t2_)
lines(R3, t3_, i_); icparrows(R3, t3_)

This time, the console says that there is no rotation, an x-shift of -0.3 (left, ok) and a y-shift of -0.1 (down, neutral).

i14_point_to_plane3 — ICP: Point-to-plane after iteration 2

... which looks kinda good. Again!

# iteration 3
A, b = axb(R3, t3_)
rt4_ = np.linalg.lstsq(A, b)[0]
print('rx=%.1f, ry=%.1f, rz=%.1f and tx=%.1f, ty=%.1f, tz=%.1f' % tuple(rt4_))

# combine with previous iteration:
R4, t4_ = rtmat(rt4_, R3, t3_)
lines(R4, t4_, i_); icparrows(R4, t4_)

i15_point_to_plane4 — ICP: Point-to-plane after iteration 3

... and I think you got the idea.

Point-to-Curve

One relatively recent idea [rusinkiewicz2019tog] [issue] is to use both $$n_p$$ and $$n_q$$ in the linear equations so that the penality-free sliding is not on a plane but rather on some curve. The symmetric formulation [src] looks like this:

$$\begin{bmatrix} \larr & (\vec{p} + {\color{green} \vec{q}}) \times ({\color{green} \vec{n_p}} + \vec{n_q}) & \rarr & \larr & ({\color{green} \vec{n_p}} + \vec{n_q}) & \rarr \\ & & \vdots & & & \\ \end{bmatrix} \begin{bmatrix} \gamma \\ \beta \\ \alpha \\ t_x \\ t_y \\ tz \end{bmatrix} - \begin{bmatrix} (\vec{q} - \vec{p}) \cdot ({\color{green} \vec{n_p}} + \vec{n_q}) \\ \vdots \end{bmatrix}$$

... and we need to interpret the solution $$\vec{x}$$ differently, with the transformation being $$R \cdot \vec{t} \cdot R$$ instead of $$\vec{t} \cdot R$$ (first rotation, then translation), and with $$\gamma, \beta, \alpha$$ comprising a Rodrigues vector [src] because Rusinkiewicz felt like it. So first we switch to 4x4 transformation matrixes, which work with homogeneous coordinates $$[x, y, z, 1]^T$$:

$$S = \vec{t} \cdot R = {\color{orange} \underbrace{\color{black} \begin{bmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \\ 0 & 0 & 0 & 1 \\ \end{bmatrix}}_{T}} \cdot {\color{orange} \underbrace{\color{black} \begin{bmatrix} r_{11} & r_{12} & r_{13} & 0 \\ r_{21} & r_{22} & r_{23} & 0 \\ r_{31} & r_{32} & r_{33} & 0 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix}}_{R}} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \\ 0 & 0 & 0 & 1 \\ \end{bmatrix}$$

def rodrigues4x4(angle, axis):
    """Rotation matrix from angle around axis."""
    l = np.linalg.norm(axis)
    x, y, z = axis / l
    s, c = np.sin(angle), np.cos(angle)
    xs, ys, zs, c1 = x*s, y*s, z*s, 1-c
    xx, yy, zz = c1*x*x, c1*y*y, c1*z*z
    xy, xz, yz = c1*x*y, c1*x*z, c1*y*z
    R = [[xx+c,  xy+zs, xz-ys, 0],
         [xy-zs, yy+c,  yz+xs, 0],
         [xz+ys, yz-xs, zz+c , 0],
         [0    , 0    , 0    , 1]]
    return np.array(R)

def t4x4(t_):
    """Translation matrix from translation vector."""
    T = np.identity(4); T[:3,3] = t_; return T

def splitRt(S):
    """Rotation and translation matrix from 4x4 combo."""
    return S[:3,:3], S[:3,3]

# reset thresholds and estimate
max_normal_angle_cos = np.cos(np.radians(20))
max_point_distance   = 1.5
S0 = np.identity(4)
R, t_ = splitRt(S0)

... and then we build the new $$A, \vec{b}$$ as well as solution $$\vec{x}$$ interpretation. Unfortunately since the paper omitted a lot of steps and thus was not very clear to me, I cant really tell you why it works like this.

def symaxb1(j, dist, pp_, q_, nq_, i, p_, np_):
    """One row of A and b, as in Ax-b."""
    p2_ = p_ + q_
    n2_ = np_ + nq_
    pxn = np.cross(p2_, n2_)
    b   = (q_ - p_).dot(n2_)
    rowAb = np.hstack([pxn, n2_, b])
    return rowAb

def symrtmat(rt_, S0=np.identity(4)):
    """S (R/t) matrix from solution x, plus previous R/t."""
    r_, t_ = rt_[:3], rt_[3:] # r_ = rodriguez rotation
    rot_angle = np.arctan(np.linalg.norm(r_))
    t_ = t_ * np.cos(rot_angle)
    R = rodrigues4x4(rot_angle, r_).T
    T = t4x4(t_)
    S = R.dot(T).dot(R)
    return S.dot(S0) # plus previous transformation

# compute (point-to-curve) transformation update
A, b = axb(*splitRt(S0), symaxb1)
rt5_ = np.linalg.lstsq(A, b)[0] # no direct interpretation
S5 = symrtmat(rt5_)

# adjust thresholds, draw result of iteration 1
R5, t5_ = splitRt(S5)
lines(R5, t5_, i_)
max_normal_angle_cos = np.cos(np.radians(10))
max_point_distance   = 1.
icparrows(R5, t5_)

i21_point_to_curve2 — ICP: Point-to-curve after one iteration

... which does a lot more shifting and rotation compared to point-to-plane. Again?

A, b = axb(*splitRt(S5), symaxb1)
S6 = symrtmat(np.linalg.lstsq(A, b)[0], S5)
R6, t6_ = splitRt(S6)
lines(R6, t6_, i_); icparrows(R6, t6_)

i22_point_to_curve3 — ICP: Point-to-curve after iteration 2

... and again?

A, b = axb(*splitRt(S6), symaxb1)
S7 = symrtmat(np.linalg.lstsq(A, b)[0], S6)
R7, t7_ = splitRt(S7)
lines(R7, t7_, i_); icparrows(R7, t7_)

i23_point_to_curve4 — ICP: Point-to-curve after iteration 3

So yes, I guess that counts as faster convergence!

Epilogue

While covering the mathematical and numerical basics, we left out a lot of considerations because I think other articles out there cover them quite well. One of them is step size: We applied each rotation and translation update rt fully, but we might as well multiply them with 0..1 before doing so. Levenberg-Marquardt and others do so, and also switch the step size between iterations.

Another consideration is the choice of samples and correspondences: We just took regular samples and tested all possible correspondences; but sampling could be less ambiguous by using nooks and crannies [gelfand2003tdim], and for runtime reasons alone some spatial partitioning to identify "near vertexes" on the other surface more quickly is useful; we also gave no relative weights to the samples. And we did not do any outlier detection whatsoever; surfaces from 3D scanners are usually noisy and benefit from such filtering.

A final note on solving $$A\vec{x}=\vec{b}$$ is that larger matrixes are usually sparse so that a method called Cholesky decomposition can be faster. In a nutshell, the math goes like this:

$$\begin{array}{rcl} A\vec{x} & = & \vec{b}\\ A^T A \vec{x} & = & A^T \vec{b}\\ \vec{x} & = & ({\color{orange} \underbrace{\color{black} A^T A}_{\textrm{cov}}})^{-1} A^T \vec{b}\\ \end{array}$$

With $$C=A^T A$$ being the 6x6 covariance matrix [geo] of the point entries; magically its condition number (biggest by smallest Eigenvalue) can inform us of "slidiness" when big, and the Eigenvalues + vectors can tell us which degrees of freedom are not fixed. As far as I know, nobody knows exactly why this is so. Back at Cholesky we define the 6x1 vector $$\vec{d}=A^T \vec{b}$$ which alongside $$C$$ being also smaller (6x6) than $$A$$ ($$m$$x6) makes the original problem smaller too. Now we solve:

$$\begin{array}{rcl} C \vec{x} & = & \vec{d}\\ L L^T \vec{x} & = & \vec{d}\\ \end{array}$$

... where Cholesky decomposes $$C$$ into $$L L^T$$. $$L$$ is a lower triangular matrix (and $$L^T$$ upper triangular) [inv], both of which mean we can just start solving the unknowns by forward and backward substition, starting with the $$L$$ row that has only one entry and working from there. In numpy:

# reset thresholds and estimate
max_normal_angle_cos = np.cos(np.radians(20))
max_point_distance   = 1.5
R, t_ = np.identity(3), np.zeros(3)

# compute Cholesky decomposition
A, b = axb(R, t_)
C  = A.T.dot(A)
d_ = A.T.dot(b)
L  = np.linalg.cholesky(C)

... but for some reason, the last row of $$C$$ is zero (maybe because we had no correspondence in z-direction?) so it does not work in our case. Thought you'd like to know anyway, even if this is quite the anticlimactic ending. But then again, so is every epilogue, aint it?

Conclusion

Whew! For just one algorithm, ICP was quite the ride. I hope you could get a little bit of in-depth mathematical intuition for the problem formulation, and for the cheats, ahm sorry approximations, used for solving it. Have a sunny and cloud-less, and generally nice, day!

EOF (Apr:2021)