Basha MVSF

Tali Basha (now Dekel) published Multi-View Scene Flow (MVSF) at CVPR 2010 and in IJCV 2013 [home] (with code) [author] [tr]. We look for possibilities for user interaction.

Algorithm

Given n cameras C_i with frames I_i at t=0 and t=1, and an initial estimate of Z0 in camera C₀, the objective is to determine the true depth Z0 and the 3D motion U,V,W. Three nested mechanisms deal with the nonlinearities: a coarse-to-fine pyramid with η=.9 removes the nonlinearity in the data term, an outer loop removes the nonlinearity from the perspective division, and an inner loop removes the nonlinearity from the nonquadratic penalizer ψ(s²)=[s²+ε²]^.5. In the algorithm sketch below, only U is used for illustration; V, W and Z0 are handled analogously:

`U0`	for pyramid level "L"=14..0:	unknowns	helpers	images	maps	coordinates
`U1`	`U`=0 or `U`=`U_L+1`	`U`		`I_0t`, `I_0t_XY`	`MapX_Id`
`U1`	for outer iteration "O"=0..3:
`U2`	keep `U` fixed; rewarp I_i,0 and I_i,1 with current `U`; `dU`=0	`dU`, `U_XY`	`occl_0t`	`I_0tw`, `I_0tw_XY`	`MapX_I_0t`	`X`, `Xt`, `X_xyz`, `Xt_xy`, `proj_xy0t,xyz`, `proj_xyt,uvw`
`U2`	for inner iteration "I"=0..10:
`U3`	keep `dU` fixed; build M`x`=B with current `dU`; `x`=0	`dU_XY`	`occl_0t`, `div_0wz`		`MapX_I_0t_in`	`Z_in`, `X_in`, `Xt`_in
`U3`	for SOR iteration "S"=0..40:
`U4`	keep ψ_{{flow,stereo1,stereo2}} fixed; solve M`x`=B	`x`
`U3`	`dU` = `x`
`U2`	`U` += `dU`
`U1`	`U_L-1` = upsample(`U`)
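
To make the nesting of the sketch above concrete, here is a minimal Python sketch that lists the pyramid resolutions and counts the SOR sweeps. It assumes that the ranges in the sketch are inclusive (15 levels, 4 outer, 11 inner and 41 SOR iterations) and that each level is obtained by scaling the full resolution with η^L and rounding; the rounding rule and the function names `pyramid_shapes` and `count_sor_sweeps` are illustrative assumptions, not taken from the published code.

```python
def pyramid_shapes(height, width, eta=0.9, levels=15):
    """Return (level, rows, cols) from the coarsest level to the finest (level 0).

    Assumption: level L has the full resolution scaled by eta**L, rounded.
    """
    shapes = []
    for level in range(levels - 1, -1, -1):
        scale = eta ** level
        rows = max(1, round(height * scale))
        cols = max(1, round(width * scale))
        shapes.append((level, rows, cols))
    return shapes


def count_sor_sweeps(levels=15, outer=4, inner=11, sor=41):
    """Walk the four nested loops of the sketch and count the SOR sweeps."""
    sweeps = 0
    for _level in range(levels - 1, -1, -1):   # U0: pyramid, coarse to fine
        for _o in range(outer):                # U1: rewarp images, reset dU
            for _i in range(inner):            # U2: rebuild M and B
                for _s in range(sor):          # U3: one SOR sweep on M x = B
                    sweeps += 1
    return sweeps


shapes = pyramid_shapes(480, 640)
total_sweeps = count_sor_sweeps()
```

For a 480×640 image the coarsest level comes out at roughly a quarter of the full resolution, which shows how mild the downsampling factor η=.9 is per level.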

Variables

Because the computation projects 3D points to 2D, many coordinates and their derivatives are in play. The list below gives the variables of the MVSF algorithm in the order in which they are computed. All equation numbers refer to the technical report [tr], which has the most detail. There, BC_{m,s1,s2} → BC_{{flow,st.init,stereo}} refers to brightness constancy and S_{m,s} → S_{{flow,stereo}} to smoothness.

`U1`	`MapX_Id`	=	Id(N_y,N_x)
`U2`	`X`	=	ComputeXY(`Z0`, `MapX_Id`)					(eq.2)
	`Xt`	=	`X`+`U`	,	`Zt`	=	`Z0`+`W`	(eq.3,4)
	`MapX_I_0`	=	ComputeProjMaps(`X`, `Z0`)					(eq.5,16-17)
	`MapX_I_t`	=	ComputeProjMaps(`Xt`, `Zt`)
	`occl`₀	=	ComputeOcclusion(`X`, `Z0`, `MapX_I_0`)					(eq.7, alg.1)
	`occl`_t	=	ComputeOcclusion(`Xt`, `Zt`, `MapX_I_t`)					(eq.7)
	`I_0w`	=	Smooth(Remap(`I_0`, `MapX_I_0`))					(eq.15)
	`I_tw`	=	Smooth(Remap(`I_t`, `MapX_I_t`))
	`I_0w_X`	=	Derivative(`I_0w`)
	`I_tw_X`	=	Derivative(`I_tw`)
	`Z0_X`	=	GradientFast(`Z0`)
	`U_X`	=	GradientFast(`U`)
	`X_xyz`	=	Gradient3DPoints(`Z0`, `Z0_X`, `MapX_Id`)	,	`Xt_xy`	=	`X_xy`+`U_XY`
	`proj_x0,xyz`	=	GradientProjCoor(`X`, `Z0`, `X_xyz`, `Z0_X`)
	`proj_xt,xyz`	=	GradientProjCoor(`Xt`, `Zt`, `Xt_xyz`, `Zt_X`)
	`proj_xt,uvw`	=	GradientProjCoor(`Xt`, `Zt`, 01, 01)
`U3`	`Z_in`	=	`Z0`+`dZ0`	,	`Zt_in`	=	`Z_in`+`W`
	`X_in`	=	ComputeXY(`Z_in`, `MapX_Id`)	,	`Xt_in`	=	`X_in`+`U`+`dU`
	`MapX_I_0_in`	=	ComputeProjMaps(`X_in`, `Z_in`)
	`MapX_I_t_in`	=	ComputeProjMaps(`Xt_in`, `Zt_in`)
	`dZ0_X`	=	GradientFast(`dZ0`)
	`dU_X`	=	GradientFast(`dU`)
	`div_0wz`	=	divergence(`Z0_X`, `dZ0_X`, `U_X`, `dU_X`)					(eq.12,13)
	`M`, `B`, `occl_0t`	=	MB(`Z0`, `dZ0`, `U`, `dU`, `div_0wz`, `occl_0t`, `MapX_I_0t_in`, `I_0_(X)`, `I_0tw_(X)`, `proj_x0t,xy`, `proj_xt,uvw`)
`U4`	`x`	=	SOR(`M`, `B`)
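
The chain `X` → `Xt` → `MapX_I_t` in the list above can be illustrated with a single pixel. The following is a minimal sketch only: it assumes a plain pinhole camera with focal length `f` and principal point (`cx`, `cy`), and a hypothetical second camera shifted by `tx` along the x axis. The exact forms of ComputeXY and ComputeProjMaps are given by eq.2 and eq.5 of [tr] and may differ from this.

```python
def back_project(x, y, z, f, cx, cy):
    """Pixel (x, y) with depth z -> 3D point (X, Y, Z); pinhole assumption."""
    return ((x - cx) * z / f, (y - cy) * z / f, z)


def project(point, f, cx, cy, tx=0.0):
    """3D point -> pixel in a camera shifted by tx along x (hypothetical setup)."""
    X, Y, Z = point
    return (f * (X - tx) / Z + cx, f * Y / Z + cy)


f, cx, cy = 500.0, 320.0, 240.0
X0 = back_project(320.0, 240.0, 2.0, f, cx, cy)   # point on the optical axis
U, V, W = 0.1, 0.0, 0.0                           # 3D motion of that point
Xt = (X0[0] + U, X0[1] + V, X0[2] + W)            # Xt = X + U, Zt = Z0 + W
pixel_t = project(Xt, f, cx, cy)                  # where the point lands at time t
```

The division by Z in `project` is the perspective division that the outer loop of the algorithm relinearizes.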

Mx=B

This is the system of linear equations to be solved, and the most promising place to insert user input into the optimization. Green entries mark optional parts (used when solving for UVWZ instead of UVW). Note that the full M is actually a sparse (N_x*N_y*|uvwz|)² matrix; the M shown here holds only the diagonal and neighbour locations. In the following, one 4-row block for UVWZ is shown. As the SOR section after this one shows, each of U, V, W, Z uses one row of M: U uses line₀, V line₁, W line₂, Z line₃.

M ⋅ x = B

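$$
\begin{bmatrix}
a_{u1} & a_{v1} & a_{w1} & a_{z1} & b_{u\uparrow} & b_{u\downarrow} & b_{u\leftarrow} & b_{u\rightarrow} \\
a_{u2} & a_{v2} & a_{w2} & a_{z2} & b_{v\uparrow} & b_{v\downarrow} & b_{v\leftarrow} & b_{v\rightarrow} \\
a_{u3} & a_{v3} & a_{w3} & a_{z3} & b_{w\uparrow} & b_{w\downarrow} & b_{w\leftarrow} & b_{w\rightarrow} \\
a_{u4} & a_{v4} & a_{w4} & a_{z4} & b_{z\uparrow} & b_{z\downarrow} & b_{z\leftarrow} & b_{z\rightarrow}
\end{bmatrix}
\cdot
\begin{bmatrix} dU \\ dV \\ dW \\ dZ \end{bmatrix}
=
\begin{bmatrix} b_u \\ b_v \\ b_w \\ b_z \end{bmatrix}
$$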

`M`_0..3
a_u1	=	α*μ_u	* [ ∑_i=1..4(b_fi) ]	,	b_f2	=	.5 * (`div₀`[j,i] + `div₀`[j,i-1])	, where b_f1,b_f2,b_f3,b_f4 are right₍₁₎,left₍₂₎,down₍₃₎,up₍₄₎ (S_s or S_m, eq.12,13)
	+	ψ_flow	* [ itw_U ]²	,	ψ_flow	=	(1² + [it+(itw_U`dU`)+(itw_V`dV`)+(itw_W`dW`) - i0 + (itw_Z-i0w_Z)`dZ`]²)^-.5 (BC_m, Δ^t_i=1+, eq.11)
	+	ψ_stereo	* [ itw_U - iltw_U ]²	,	ψ_stereo	=	(1² + [it+(itw_U`dU`)+(itw_V`dV`)+(itw_W`dW`)+(itw_Z`dZ`) - ilt-(iltw_U`dU`)-(iltw_V`dV`)-(iltw_W`dW`)-(iltw_Z*`dZ`)*]²)^-.5 (BC_s2, Δ^{^}_i, eq.11)
	+	ψ_flowL	* [ iltw_U ]²	,	ψ_flowL	=	(1² + [ilt+(iltw_U`dU`)+(iltw_V`dV`)+(iltw_W`dW`)+(iltw_Z`dZ`) - il0]²)^-.5 (BC_m, Δ^t_i=0, eq.11)
a_w2	=	ψ_flow	* [ itw_W * itw_V ]	,	itw_U	=	(it_X`proj_xt,u`)+(it_Y`proj_yt,u`)	, it_X=(itw_X`proj_yt,y` - itw_Y`proj_yt_,_x`)/det	, det=`proj_xt,x``proj_yt,y` - `proj_xt,y``proj_yt,x` (J, eq.22)
	+	ψ_stereo	* [ (itw_W - iltw_W) * (itw_V - iltw_V) ]	,	iltw_U	=	(ilt_X`proj_xlt,u`)+(ilt_Y`proj_ylt,u`)	, ilt_X=(iltw_X`proj_ylt,y` - iltw_Y`proj_ylt_,_x`)/det	, det=`proj_xlt,x``proj_ylt,y` - `proj_xlt,y``proj_ylt,x` (J, eq.22)
	+	ψ_flowL	* [ iltw_W * iltw_V ]	,	it	=	`Images_tw`_[1+], ilt=`Images_tw`_[0]	, i0=`Images_0w`_[1+], il0=`Images_0`_[0]	, itw_X=`Images_tw_X`_[1+], iltw_X=`Images_tw_X`_[0]
a_z4	=	α_zμ_z*	* [ ∑_i=1..4(b_fzi) ]	,	i0w_Z	=	(i0_X`proj_x0,z`)+(i0_Y`proj_y0,z`) (eq.18)	, i0_X=(i0w_X`proj_y0,y` - i0w_Y`proj_y0_,_x`)/det	, det=`proj_x0,x``proj_y0,y` - `proj_x0,y``proj_y0,x` (J, eq.22)
	+	ψ_flow	* [ itw_Z - i0w_Z ]²	,	itw_Z	=	(it_X`proj_xt,z`)+(it_Y`proj_yt,z`) (eq.20)	, it_X=(itw_X`proj_yt,y` - itw_Y`proj_yt_,_x`)/det	, det=`proj_xt,x``proj_yt,y` - `proj_xt,y``proj_yt,x` (J, eq.22)
	+	ψ_st.init	* [ i0w_Z ]²	,	ψ_st.init	=	(1² + [i0+(i0w_Z`dZ`) - il0]²)^-.5* (BC_s1, Δ_i=1+, eq.11)
	+	ψ_stereo	* [ itw_Z - iltw_Z ]²
`M`_4..7
b_u_↑	=	-α*μ_u	* b_f4	,	b_f4	=	.5 * (`div₀`[j,i] + `div₀`[j-1,i])	, `div₀`=(.001² + [(`U_X`+`dU_X`)²+(`U_Y`+`dU_Y`)²+(`V_X`+`dV_X`)²+(`V_Y`+`dV_Y`)²])^-.5 (S_m(uv), eq.13)
b_w_←	=	-α_w*μ_w	* b_fw2	,	b_fw2	=	.5 * (`div_w`[j,i] + `div_w`[j,i-1])	, `div_w`=(.001² + [μ_w * ((`W_X`+`dW_X`)²+(`W_Y`+`dW_Y`)²)])^-.5 (S_m(w), eq.13)
b_z_→	=	-α_zμ_z*	* b_fz1	,	b_fz1	=	.5 (`div_z`[j,i] + `div_z`[j,i+1])*	, `div_z`=(.001² + [μ_z * ((`Z0_X`+`dZ0_X`)²+(`Z0_Y`+`dZ0_Y`)²)])^-.5 (S_s, eq.12)
`B`
b_u	=	-α*μ_u	* [ ∑_i=1..4(b_fib_u_i) - ∑_i=1..4(b_fi) b_u₀ ]	,	b_u₀	=	`U`[j,i], b_u₁=`U`[j,i+1], ...
	-	ψ_flow	* [ itw_U * (it-i0) ]
	-	ψ_stereo	* [ (it-ilt) * (itw_U-iltw_U) ]
	-	ψ_flowL	* [ iltw_U * (ilt-il0) ]
b_w	=	-α_w*μ_w	* [ ∑_i=1..4(b_fwib_wi) - ∑_i=1..4(b_fwi) b_w0 ]	,	b_w0	=	`W`[j,i], b_w1=`W`[j,i+1], b_w2=`W`[j,i-1], ...
	-	ψ_flow	* [ itw_W * (it-i0) ]
	-	ψ_stereo	* [ (it-ilt) * (itw_W-iltw_W) ]
	-	ψ_flowL	* [ iltw_W * (ilt-il0) ]
b_z	=	-α_zμ_z*	* [ ∑_i=1..4(b_fzib_z_i) - ∑_i=1..4(b_fzi) b_z₀ ]	,	b_z₀	=	`Z0`[j,i], ..., b_z₄=`Z0`[j-1,i]
	-	ψ_flow	* [ (itw_Z-i0w_Z) * (i0-it) ]
	-	ψ_st.init	* [ (i0-il0) * i0w_Z ]
	-	ψ_stereo	* [ (it-ilt) * (itw_Z-iltw_Z) ]
	-	ψ_flowL	* [ iltw_Z * (ilt-il0) ]
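
As a reading aid for the data terms in the table, the following minimal sketch evaluates the ψ_flow weight and its contribution to a_u1 and b_u for one pixel and one camera. It transcribes the ψ_flow, a_u1 and b_u lines above, with ε=1 read from the "1²" in the ψ formulas. The function names and the tuple layout of the arguments are assumptions made for illustration.

```python
def psi_flow(it, i0, grads_t, delta, i0w_z=0.0, eps=1.0):
    """Robust weight (eps^2 + r^2)^-0.5 of the linearized flow residual r.

    grads_t = (itw_U, itw_V, itw_W, itw_Z), delta = (dU, dV, dW, dZ).
    """
    itw_u, itw_v, itw_w, itw_z = grads_t
    du, dv, dw, dz = delta
    r = it + itw_u * du + itw_v * dv + itw_w * dw - i0 + (itw_z - i0w_z) * dz
    return (eps ** 2 + r * r) ** -0.5


def flow_terms_u(it, i0, itw_u, psi):
    """Flow data-term contributions to the diagonal a_u1 and to b_u."""
    a_u1 = psi * itw_u ** 2
    b_u = -psi * itw_u * (it - i0)
    return a_u1, b_u


# Perfect match and no increment: residual 0, weight 1.
w_match = psi_flow(1.0, 1.0, (2.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0))
# Residual 3: the weight drops to 1/sqrt(10), so outliers count less.
w_outlier = psi_flow(4.0, 1.0, (2.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0))
terms = flow_terms_u(3.0, 1.0, 2.0, 1.0)
```

The weight is recomputed from the current increment in every inner iteration and then kept fixed while SOR solves the resulting linear system.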

Neighbours: The order 1..4 is always right₍₁₎,left₍₂₎,down₍₃₎,up₍₄₎, and the variables b_f[i] and b_u_[i] always run in parallel, e.g. where b_f2 refers to the left neighbour, b_u₂ refers to the left neighbour too. In M_4..7, that order is reversed (↑,↓,←,→).
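
The neighbour bookkeeping can be sketched as follows for the smoothness part of one pixel. The sketch computes b_f1..b_f4 in the order right, left, down, up and returns the diagonal contribution together with the four off-diagonal entries in the reversed order used by M_4..7. The b_f2 (left), b_f4 (up) and b_fz1 (right) formulas are taken from the table; the "down" case and the clamping at the image border are assumptions, and the function names are illustrative.

```python
def diffusivity(ux, uy, vx, vy, eps=0.001):
    """div0 = (eps^2 + |grad U|^2 + |grad V|^2)^-0.5, gradients include the increments."""
    return (eps ** 2 + ux * ux + uy * uy + vx * vx + vy * vy) ** -0.5


def neighbour_weights(div, j, i):
    """Return (b_f1, b_f2, b_f3, b_f4) = (right, left, down, up) at pixel [j, i]."""
    rows, cols = len(div), len(div[0])

    def at(jj, ii):
        # Assumption: indices are clamped at the image border.
        return div[min(max(jj, 0), rows - 1)][min(max(ii, 0), cols - 1)]

    c = div[j][i]
    return (0.5 * (c + at(j, i + 1)), 0.5 * (c + at(j, i - 1)),
            0.5 * (c + at(j + 1, i)), 0.5 * (c + at(j - 1, i)))


def smoothness_entries(div, j, i, alpha, mu):
    """Diagonal part alpha*mu*sum(b_f) and off-diagonals in M_4..7 order (up, down, left, right)."""
    right, left, down, up = neighbour_weights(div, j, i)
    diag = alpha * mu * (right + left + down + up)
    off = tuple(-alpha * mu * b for b in (up, down, left, right))
    return diag, off


div = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
weights = neighbour_weights(div, 1, 1)
diag, off = smoothness_entries(div, 1, 1, alpha=1.0, mu=1.0)
flat = diffusivity(0.0, 0.0, 0.0, 0.0)
```

In a flat region `diffusivity` returns about 1/ε = 1000, so smoothing is strong there and weak across motion edges.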

SOR

Successive over-relaxation is an accelerated variant of the Gauss-Seidel iteration (itself a refinement of the Jacobi iteration), using the diagonal and the upper and lower triangular submatrices. The row index i ∈ N_x*N_y*|uvwz| has separate entries for uvwz.

r_new	=	(1.-Ω_SOR)r_old + Ω_SOR (b - sum)/diag	=	(1.-Ω_SOR)r_old + Ω_SOR (b_u - sum_U)/a_u1
diag	=	`M`_i,_0..3 where uvwz=0123	=	a_u1, a_v2, a_w3, or a_z4
b	=	`B`_i	=	b_u, b_v, b_w, or b_z
sum_U	=	`M`_i,1`SOR`_i+1 + `M`_i,2`SOR`_i+2 + `M`_i,3`SOR`_i+3*	=	a_v1`SOR`_v + a_w1`SOR`_w + a_z1*`SOR`_z
	+	`M`_i,4`SOR`_i,1↑ + `M`_i,5`SOR`_i,1_↓ + `M`_i,6`SOR`_i,1_← + `M`_i,7`SOR`_i,1_→	+	b_u_↑`SOR`_u,1↑ + b_u_↓`SOR`_u,1_↓ + b_u_←`SOR`_u,1_← + b_u_→`SOR`_u,1_→
sum_Z	=	`M`_i,0`SOR`_i-3 + `M`_i,1`SOR`_i-2 + `M`_i,2*`SOR`_i-1	=	a_u4`SOR`_u + a_v4`SOR`_v + a_w4*`SOR`_w
	+	`M`_i,4`SOR`_i,1↑ + `M`_i,5`SOR`_i,1_↓ + `M`_i,6`SOR`_i,1_← + `M`_i,7`SOR`_i,1_→	+	b_z_↑`SOR`_z,1↑ + b_z_↓`SOR`_z,1_↓ + b_z_←`SOR`_z,1_← + b_z_→`SOR`_z,1_→
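
The update rule above can be sketched in a few lines of Python. This is a minimal illustration, not the published solver: the data layout (`x[c][j][i]`, `a[j][i][c][k]`, `nb[j][i][c]`), the value Ω_SOR=1.2 and the rule of skipping neighbours outside the image are all assumptions. The loop over `k != c` corresponds to the a-terms in sum_U and sum_Z, the four neighbour terms to M_i,4..7.

```python
def sor_sweep(x, a, nb, b, omega=1.2):
    """One in-place SOR sweep.

    x[c][j][i]    current solution per channel c (u, v, w, z)
    a[j][i][c][k] coupling of channel c with channel k at the same pixel
    nb[j][i][c]   neighbour coefficients (up, down, left, right), as in M_4..7
    b[j][i][c]    right-hand side
    """
    channels, rows, cols = len(x), len(x[0]), len(x[0][0])
    for j in range(rows):
        for i in range(cols):
            for c in range(channels):
                # Other channels at the same pixel (a_v1, a_w1, a_z1 for U).
                s = sum(a[j][i][c][k] * x[k][j][i] for k in range(channels) if k != c)
                up, down, left, right = nb[j][i][c]
                # Assumption: neighbours outside the image are skipped.
                if j > 0:
                    s += up * x[c][j - 1][i]
                if j < rows - 1:
                    s += down * x[c][j + 1][i]
                if i > 0:
                    s += left * x[c][j][i - 1]
                if i < cols - 1:
                    s += right * x[c][j][i + 1]
                diag = a[j][i][c][c]
                x[c][j][i] = (1.0 - omega) * x[c][j][i] + omega * (b[j][i][c] - s) / diag
    return x


# Toy problem: one channel on a 1x3 image, tridiagonal system with solution (1, 1, 1).
a = [[[[4.0]] for _ in range(3)]]
nb = [[[(-1.0, -1.0, -1.0, -1.0)] for _ in range(3)]]
b = [[[3.0], [2.0], [3.0]]]
x = [[[0.0, 0.0, 0.0]]]
for _ in range(41):          # S = 0..40, starting from x = 0 as in the sketch
    sor_sweep(x, a, nb, b)
```

Because the sweep overwrites `x` in place, pixels visited later already see the updated values of earlier ones, which is what distinguishes this scheme from a Jacobi iteration.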
