3. Computer Vision 2

E-mail: [email protected] http://web.yonsei.ac.kr/hgjung

3.1. Corner


Corner Detection [1]
• Basic idea: find points where two edges meet, i.e., where the gradient is high in two directions
• “Cornerness” is undefined at a single pixel, because there is only one gradient per point
  – Look at the gradient behavior over a small window
• Categorize image windows based on gradient statistics
  – Constant: little or no brightness change
  – Edge: strong brightness change in a single direction
  – Flow: parallel stripes
  – Corner/spot: strong brightness changes in orthogonal directions


Corner Detection: Analyzing Gradient Covariance
• Intuitively, in corner windows both Ix and Iy should be high
  – We can’t just set a threshold on them directly, because we want rotational invariance

• Analyze distribution of gradient components over a window to differentiate between types from previous slide:

• The two eigenvectors and eigenvalues λ1, λ2 of C (Matlab: eig(C)) encode the predominant directions and magnitudes of the gradient, respectively, within the window
• Corners are thus where min(λ1, λ2) is over a threshold
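A minimal MATLAB/Octave sketch of this test (assumes a grayscale image im in double format; the window size and threshold are illustrative, not values from the lecture):

dx = [-1 0 1; -1 0 1; -1 0 1];           % derivative masks
dy = dx';
Ix = conv2(im, dx, 'same');
Iy = conv2(im, dy, 'same');
w  = ones(5);                            % 5x5 summation window
Sxx = conv2(Ix.^2,  w, 'same');          % entries of C accumulated over the window
Syy = conv2(Iy.^2,  w, 'same');
Sxy = conv2(Ix.*Iy, w, 'same');
tr  = Sxx + Syy;                         % closed-form min eigenvalue of the 2x2 matrix C
dt  = Sxx.*Syy - Sxy.^2;
minEig  = tr/2 - sqrt(max((tr/2).^2 - dt, 0));
corners = minEig > 1000;                 % illustrative threshold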

The Derivation of the Harris Corner Detector [2]
• The Harris corner detector is based on the local auto-correlation function of a signal, where the local auto-correlation function measures the local changes of the signal with patches shifted by a small amount in different directions.
• Given a shift (Δx, Δy) and a point (x, y), the auto-correlation function is defined as

  c(x, y) = Σ_W [ I(xi, yi) − I(xi + Δx, yi + Δy) ]²   (1)

where I(·, ·) denotes the image function and (xi, yi) are the points in the window W (Gaussian¹) centered on (x, y).
¹ For clarity of exposition, the Gaussian weighting factor has been omitted from the derivation.


The Derivation of the Harris Corner Detector [2]
• The shifted image is approximated by a Taylor expansion truncated to the first-order terms,

  I(xi + Δx, yi + Δy) ≈ I(xi, yi) + [Ix(xi, yi)  Iy(xi, yi)] [Δx; Δy]   (2)

where Ix(·, ·) and Iy(·, ·) denote the partial derivatives in x and y, respectively.


The Derivation of the Harris Corner Detector [2]
• Substituting approximation Eq. (2) into Eq. (1) yields

  c(x, y) ≈ [Δx  Δy] C(x, y) [Δx; Δy],   C(x, y) = Σ_W [ Ix²  IxIy ; IxIy  Iy² ]

where the matrix C(x, y) captures the intensity structure of the local neighborhood.

Harris Detector: Mathematics (M denotes the matrix C from the previous slide)

Classification of image points using eigenvalues of M:

2

“Edge” 2 >> 1

“Corner” 1 and 2 are large, 1 ~ 2; E increases in all directions

1 and 2 are small; E is almost constant in all directions

“Flat” region

“Edge” 1 >> 2 1 E-mail: [email protected] http://web.yonsei.ac.kr/hgjung

Harris Detector: Mathematics

Measure of corner response:

R = det M − k (trace M)²

det M = λ1 λ2,   trace M = λ1 + λ2   (k is an empirical constant, k = 0.04-0.06)

• The trace of a matrix is the sum of its eigenvalues, making it an invariant with respect to a change of basis.

Harris Detector: Mathematics 2 • R depends only on eigenvalues of M

“Edge” R0

• R is negative with large magnitude for an edge • |R| is small for a flat region

“Flat” |R| small

“Edge” R

Harris Detector: Workflow

Harris Detector: Workflow Compute corner response R

Harris Detector: Workflow Find points with large corner response: R>threshold

Harris Detector: Workflow Take only the points of local maxima of R

Harris Detector: Workflow

Harris Detector Code [3]

dx = [-1 0 1; -1 0 1; -1 0 1];   % Derivative masks
dy = dx';
Ix = conv2(im, dx, 'same');      % Image derivatives
Iy = conv2(im, dy, 'same');

% Generate Gaussian filter of size 6*sigma (+/- 3sigma) and of
% minimum size 1x1.
g = fspecial('gaussian', max(1,fix(6*sigma)), sigma);

Ix2 = conv2(Ix.^2, g, 'same');   % Smoothed squared image derivatives
Iy2 = conv2(Iy.^2, g, 'same');
Ixy = conv2(Ix.*Iy, g, 'same');

cim = (Ix2.*Iy2 - Ixy.^2)./(Ix2 + Iy2 + eps);      % My preferred measure.

% k = 0.04;
% cim = (Ix2.*Iy2 - Ixy.^2) - k*(Ix2 + Iy2).^2;    % Original Harris measure.

Intermediate results on the test image “shapessm”: Ix, Iy, Ix2, Iy2, Ixy, and the corner measure cim.

Harris Detector Code [3]


References
1. Chandra Kambhamettu, “Corners+Ransac,” University of Delaware Lecture Material of Computer Vision (CISC 4/689), 2007.
2. Konstantinos G. Derpanis, “The Harris Corner Detector,” available at http://www.cse.yorku.ca/~kosta/CompVis_Notes/harris_detector.pdf
3. Peter Kovesi, “MATLAB and Octave Functions for Computer Vision and Image Processing,” available at http://www.csse.uwa.edu.au/~pk/Research/MatlabFns/index.html


3.2. Edge


Why extract edges? [1]

Where do edges come from? [1]

Images as functions [1]

Images as functions [1]

The Discrete Gradient [1]

The Sobel Operator [1]

Edge detection using the Sobel operator [1]
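A small illustrative sketch of Sobel-based edge detection in MATLAB/Octave (assumes a grayscale image im in double format; the threshold is arbitrary, not a value from the lecture):

sx = [-1 0 1; -2 0 2; -1 0 1];        % Sobel kernels
sy = sx';
Gx = conv2(im, sx, 'same');
Gy = conv2(im, sy, 'same');
G  = sqrt(Gx.^2 + Gy.^2);             % gradient magnitude
edges = G > 0.5 * max(G(:));          % illustrative threshold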

Effects of noise [1]

Solution: smooth first [1]

Derivative theorem of convolution [1]

Laplacian of Gaussian [1]

2D edge detection filters [1]

LoG filter [1]

Canny edge detector [1]

Canny edge detector: step 4 [1]

Canny edge detector [1]

LoG result

Canny edge detector [1]

3.3. Binocular Stereo


Stereo Vision [1]

(Figure: a scene point projected through the optical centers onto the image planes of two cameras)

Stereo Vision [1]

• Basic Principle: Triangulation
  – Gives reconstruction as intersection of two rays
  – Requires
    • calibration
    • point correspondence

Point Correspondence [1]

(Figure: point p in the left image and its unknown correspondence p′ in the right image)

Given p in the left image, where can the corresponding point p′ in the right image be?

The Simplest Case: Recti-linear Configuration [1]
• Image planes of the cameras are parallel.
• Focal points are at the same height.
• Focal lengths are the same.
• Then, epipolar lines are horizontal scan lines.

The Simplest Case: Recti-linear Configuration [3]

(Figure: pinhole camera with optical center O, axes x, y, z, focal length f, image plane, and scene point P)

A scene point P = (X, Y, Z) projects to P′ = (X′, Y′, Z′) on the image plane Z′ = f:

X′ = f X/Z,   Y′ = f Y/Z,   x = X′,   y = Y′

so that (X, Y, Z) → (x, y, 1) = (f X/Z, f Y/Z, 1).

The Simplest Case: Recti-linear Configuration [3]

(Figure: two cameras with optical centers O1 and O2 separated by baseline B, both with focal length f, observing the scene point P1 = (X, Y, Z) at image points p1 and p2)

Derive an expression for Z as a function of x1, x2, f, B.

The Simplest Case: Recti-linear Configuration [3]

(Figure: the same two-camera configuration as on the previous slide)

x1 = f X1/Z1,   x2 = f (X1 − B)/Z1   ⇒   x1 − x2 = f B/Z1

Z1 = f B / (x1 − x2)
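A tiny numeric sketch of this depth-from-disparity relation (all values illustrative):

f  = 700;          % focal length in pixels
B  = 0.12;         % baseline in meters
x1 = 350.0;        % x-coordinate of the point in the left image (pixels)
x2 = 320.0;        % x-coordinate of the same point in the right image (pixels)
d  = x1 - x2;      % disparity
Z  = f * B / d;    % depth: 700*0.12/30 = 2.8 meters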

The Simplest Case: Recti-linear Configuration [1]

(Figure: point P observed by the left and right cameras with optical centers Ol and Or, baseline T, focal length f, and image points pl, pr at coordinates xl, xr)

(xl, yl) = (f X/Z, f Y/Z),   (xr, yr) = (f (X − T)/Z, f Y/Z)

Disparity: d = xl − xr = f X/Z − f (X − T)/Z = f T/Z

T is the stereo baseline; d measures the difference in retinal position between corresponding points. Then, given Z, we can compute X and Y.

Stereo Constraints [1]

(Figure: scene point M imaged at p and p′ by two cameras with centers O1 and O2, axes (X1, Y1, Z1) and (X2, Y2, Z2), focal planes, the epipolar line, and the epipole)

Epipolar Geometry and Fundamental Matrix [1]
• The geometry of two different images of the same scene is called the epipolar geometry.
• The geometric information that relates two different viewpoints of the same scene is entirely contained in a mathematical construct known as the fundamental matrix.


Baseline and Epipolar Plane [1]
• Baseline: line joining the camera centers C, C′
• Epipolar plane π: defined by the baseline and the scene point X

(Figure from Hartley & Zisserman)

Epipoles and Epipolar Lines [1]
• Epipolar lines l, l′: intersection of the epipolar plane π with the image planes
• Epipoles e, e′: where the baseline intersects the image planes
  – Equivalently, the image in one view of the other camera center

(Figure from Hartley & Zisserman)

Epipolar Pencil [1]
• As the position of X varies, the epipolar planes “rotate” about the baseline (like the pages of a book)
  – This set of planes is called the epipolar pencil
• Epipolar lines “radiate” from the epipole; this is the pencil of epipolar lines

(Figure from Hartley & Zisserman)

Epipolar Constraints [1]
• Camera center C and image point x define a ray in 3-D space that projects to the epipolar line l′ in the other view (since it is on the epipolar plane)
• The 3-D point X is on this ray, so the image of X in the other view, x′, must be on l′
• In other words, the epipolar geometry defines a mapping x → l′ of points in one image to lines in the other

(Figure from Hartley & Zisserman)

The Fundamental Matrix [1]
• The mapping of a point in one image to the epipolar line in the other image, x → l′ (point → line), is expressed algebraically by the fundamental matrix F
• Write this as l′ = F x
• Since x′ is on l′, by the point-on-line definition we know that x′T l′ = 0
• Substituting l′ = F x, we can thus relate corresponding points in the camera pair (P, P′) to each other with: x′T F x = 0
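A small sketch of how this relation is used in practice (the matrix and point values here are illustrative only, not from the lecture; homogeneous pixel coordinates):

% Given a fundamental matrix F and a point x1 in image 1 (homogeneous coords),
% l2 = F*x1 is the epipolar line in image 2 on which the match x2 must lie.
F  = [0 -1e-6 1e-3; 1e-6 0 -2e-3; -1e-3 2e-3 1];    % illustrative values only
x1 = [120; 85; 1];                                  % point in image 1
l2 = F * x1;                                        % epipolar line a*u + b*v + c = 0 in image 2
x2 = [130; 90; 1];                                  % candidate match in image 2
residual = x2' * F * x1;                            % close to 0 only if x2 lies on that line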


The Fundamental Matrix [2]
In computer vision, the fundamental matrix F is a 3×3 matrix which relates corresponding points in stereo images. In epipolar geometry, with homogeneous image coordinates x and x′ of corresponding points in a stereo image pair, Fx describes a line (an epipolar line) on which the corresponding point x′ in the other image must lie. That means that for all pairs of corresponding points, x′T F x = 0 holds.

Being of rank two and determined only up to scale, the fundamental matrix can be estimated given at least seven point correspondences. Its seven parameters represent the only geometric information about cameras that can be obtained through point correspondences alone. The above relation which defines the fundamental matrix was published in 1992 by both Faugeras and Hartley.

The Fundamental Matrix [2]
Longuet-Higgins' essential matrix satisfies a similar relationship. The essential matrix is a metric object pertaining to calibrated cameras, while the fundamental matrix describes the correspondence in the more general and fundamental terms of projective geometry. This is captured mathematically by the relationship between a fundamental matrix F and its corresponding essential matrix E, which is E = K′T F K, with K and K′ being the intrinsic calibration matrices of the two images involved.


Computing the Fundamental Matrix [1]
• The fundamental matrix is singular, with rank 2.
• In principle F has 7 parameters up to scale and can be estimated from 7 point correspondences.
• A direct, simpler method requires 8 correspondences.


Pseudo Inverse-Based [1]

u′T F u = 0

Each point correspondence can be expressed as a linear equation:

[u′ v′ 1] [F11 F12 F13; F21 F22 F23; F31 F32 F33] [u; v; 1] = 0

which expands to

[u′u  u′v  u′  v′u  v′v  v′  u  v  1] [F11; F12; F13; F21; F22; F23; F31; F32; F33] = 0

Pseudo Inverse-Based [1]


The Eight-Point Algorithm [1]
• Input: n point correspondences (n ≥ 8)
  – Construct the homogeneous system Ax = 0 from prT F pl = 0
    • x = (f11, f12, f13, f21, f22, f23, f31, f32, f33): entries of F
    • Each correspondence gives one equation
    • A is an n×9 matrix (in homogeneous form)
  – Obtain the estimate F^ by SVD of A: A = U D VT
    • x (up to scale) is the column of V corresponding to the smallest singular value
  – Enforce the singularity constraint, since rank(F) = 2:
    • Compute the SVD of F^: F^ = U D VT
    • Set the smallest singular value to 0: D → D′
    • Corrected estimate of F: F′ = U D′ VT
• Output: the estimate of the fundamental matrix, F′
• Similarly we can compute E given the intrinsic parameters
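A minimal sketch of these steps in MATLAB/Octave (assumes x1, x2 are 3×n homogeneous coordinates of n ≥ 8 matched points; coordinate normalization is omitted for clarity):

n = size(x1, 2);
A = zeros(n, 9);
for i = 1:n
    u = x1(1,i); v = x1(2,i); up = x2(1,i); vp = x2(2,i);
    A(i,:) = [up*u up*v up vp*u vp*v vp u v 1];
end
[~, ~, V] = svd(A);
F = reshape(V(:,9), 3, 3)';          % column of V for the smallest singular value
[U, D, V] = svd(F);                  % enforce the rank-2 constraint
D(3,3) = 0;
F = U * D * V';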

Locating the Epipoles from F [1]

(Figure: epipolar geometry with optical centers Ol and Or, scene point P, image points pl and pr, epipolar plane, epipolar lines, and epipoles el, er)

prT F pl = 0

el lies on all the epipolar lines of the left image, so prT F el = 0 is true for every pr. Since F is not identically zero, it follows that F el = 0.

• Input: fundamental matrix F
  – Find the SVD of F: F = U D VT
  – The epipole el is the column of V corresponding to the null singular value (as shown above)
  – The epipole er is the column of U corresponding to the null singular value
• Output: epipoles el and er

Corner Detection [5]   (Im1.jpg, Im2.jpg)

thresh = 500;   % Harris corner threshold

% Find Harris corners in image1 and image2
[cim1, r1, c1] = harris(im1, 1, thresh, 3);
[cim2, r2, c2] = harris(im2, 1, thresh, 3);

Correlation-Based Matching [5]

dmax = 50;    % Maximum search distance for matching
w    = 11;    % Window size for correlation matching

% Use normalised correlation matching
[m1, m2] = matchbycorrelation(im1, [r1'; c1'], im2, [r2'; c2'], w, dmax);

% Display putative matches
show(im1,3), set(3,'name','Putative matches')
for n = 1:length(m1)
    line([m1(2,n) m2(2,n)], [m1(1,n) m2(1,n)])
end

(Figure: putative matches)

RANSAC-Based Fundamental Matrix Estimation [5]

% Assemble homogeneous feature coordinates for fitting of the
% fundamental matrix, note that [x,y] corresponds to [col, row]
x1 = [m1(2,:); m1(1,:); ones(1,length(m1))];
x2 = [m2(2,:); m2(1,:); ones(1,length(m1))];

t = .002;  % Distance threshold for deciding outliers
[F, inliers] = ransacfitfundmatrix(x1, x2, t, 1);

(Figure: inlying matches)

Epipolar Lines [5]

% Step through each matched pair of points and display the
% corresponding epipolar lines on the two images.
l2 = F*x1;     % Epipolar lines in image2
l1 = F'*x2;    % Epipolar lines in image1

% Solve for epipoles
[U,D,V] = svd(F);
e1 = hnormalise(V(:,3));
e2 = hnormalise(U(:,3));

for n = inliers
    figure(1), clf, imshow(im1), hold on, plot(x1(1,n),x1(2,n),'r+');
    hline(l1(:,n)); plot(e1(1), e1(2), 'g*');
    figure(2), clf, imshow(im2), hold on, plot(x2(1,n),x2(2,n),'r+');
    hline(l2(:,n)); plot(e2(1), e2(2), 'g*');
end

Epipolar Lines [5]


Estimation of Camera Matrix [6]
• Once the essential matrix is known, the camera matrices may be retrieved from E. In contrast with the fundamental matrix case, where there is a projective ambiguity, the camera matrices may be retrieved from the essential matrix up to scale and a four-fold ambiguity.
• A 3×3 matrix is an essential matrix if and only if two of its singular values are equal and the third is zero.
• We may assume that the first camera matrix is P = [I | 0]. In order to compute the second camera matrix P′, it is necessary to factor E into the product SR of a skew-symmetric matrix and a rotation matrix.
• Suppose that the SVD of E is U diag(1,1,0) VT. Using the matrices W and Z below, there are (ignoring signs) two possible factorizations E = SR:

  S = U Z UT,   R = U W VT or U WT VT

  W = [0 −1 0; 1 0 0; 0 0 1],   Z = [0 1 0; −1 0 0; 0 0 0]

  Check: Z W = diag(1,1,0), so E = SR = (U Z UT)(U W VT) = U Z (UT U) W VT = U Z W VT = U diag(1,1,0) VT

Estimation of Camera Matrix [6]

S = U Z UT,   R = U W VT or U WT VT

• For a given essential matrix E = U diag(1,1,0) VT and first camera matrix P = [I | 0], there are four possible choices for the second camera matrix P′, namely
  P′ = [U W VT | u3], [U W VT | −u3], [U WT VT | u3], or [U WT VT | −u3]
  where u3 is the last column of U.

(Figure: the four possible solutions for calibrated reconstruction from E; the epipole is marked.)

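A compact sketch of forming these four candidates (assumes E is already available, e.g., from E = K2' * F * K1; variable names are illustrative):

% Four possible second-camera matrices from an essential matrix E.
[U, ~, V] = svd(E);
W  = [0 -1 0; 1 0 0; 0 0 1];
u3 = U(:,3);
R1 = U * W  * V';  if det(R1) < 0, R1 = -R1; end   % keep proper rotations
R2 = U * W' * V';  if det(R2) < 0, R2 = -R2; end
P1 = [R1  u3];     % the four candidates; the correct one is chosen by
P2 = [R1 -u3];     % requiring that triangulated points lie in front of
P3 = [R2  u3];     % both cameras (cheirality test)
P4 = [R2 -u3];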

Linear triangulation methods [7]
• In each image we have a measurement x = PX, x′ = P′X, and these equations can be combined into a form AX = 0, which is an equation linear in X.
• The homogeneous scale factor is eliminated by a cross product to give three equations for each image point, of which two are linearly independent.
• For the first image, x × (PX) = 0, and writing this out gives

  x (p3T X) − (p1T X) = 0
  y (p3T X) − (p2T X) = 0
  x (p2T X) − y (p1T X) = 0

  where piT are the rows of P.

Linear triangulation methods [7]
• An equation of the form AX = 0 can then be composed, with

  A = [ x p3T − p1T ;  y p3T − p2T ;  x′ p′3T − p′1T ;  y′ p′3T − p′2T ]

where two equations have been included from each image, giving a total of four equations in four homogeneous unknowns. This is a redundant set of equations, since the solution is determined only up to scale. X can be calculated by the SVD of A.
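A minimal sketch of this linear triangulation for one point (assumes 3×4 camera matrices P1 and P2 and a matched pixel pair (u1, v1) ↔ (u2, v2)):

% Linear (DLT) triangulation of one point from two views.
A = [ u1*P1(3,:) - P1(1,:)
      v1*P1(3,:) - P1(2,:)
      u2*P2(3,:) - P2(1,:)
      v2*P2(3,:) - P2(2,:) ];
[~, ~, V] = svd(A);
X = V(:,4);          % homogeneous 3D point (singular vector for the smallest singular value)
X = X / X(4);        % normalize so that X = [X; Y; Z; 1]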


Fundamental matrix estimation methods

(Figure: scene point X observed by two cameras with coordinate frames (X, Y, Z) and (X′, Y′, Z′), giving image measurements xcam and x′cam)

xcam = P X,    xcam × (P X) = 0
x′cam = P′ X,   x′cam × (P′ X) = 0

x (p3T X) − (p1T X) = 0        x′ (p′3T X) − (p′1T X) = 0
y (p3T X) − (p2T X) = 0        y′ (p′3T X) − (p′2T X) = 0
x (p2T X) − y (p1T X) = 0      x′ (p′2T X) − y′ (p′1T X) = 0

AX = 0,   where A = [ x p3T − p1T ;  y p3T − p2T ;  x′ p′3T − p′1T ;  y′ p′3T − p′2T ]

Stereo Rectification [1]
How can we make images as in the recti-linear configuration? → Stereo rectification

• Image reprojection
  – Reproject the image planes onto a common plane parallel to the line between the optical centers
• Notice: only the focal point of the camera really matters

References
1. Chandra Kambhamettu, “Multipleview1,” University of Delaware Lecture Material of Computer Vision (CISC 4/689), 2007.
2. Wikipedia, “Fundamental Matrix (Computer Vision),” http://en.wikipedia.org/wiki/Fundamental_matrix_(computer_vision)
3. Sebastian Thrun, Rick Szeliski, Hendrik Dahlkamp and Dan Morris, “Stereo1,” Stanford University Lecture Material of Computer Vision (CS223B), Winter 2005.
4. Daniel Wedge, “The Fundamental Matrix Song,” http://danielwedge.com/fmatrix/
5. Peter Kovesi, “Example of finding the fundamental matrix using RANSAC,” available at http://www.csse.uwa.edu.au/~pk/Research/MatlabFns/Robust/example/index.html
6. Richard Hartley and Andrew Zisserman, “8.6 The essential matrix,” Multiple View Geometry in Computer Vision, Cambridge, pp. 238-241.
7. Richard Hartley and Andrew Zisserman, “11.2 Linear triangulation methods,” Multiple View Geometry in Computer Vision, Cambridge, pp. 238-241.


3.4. Stereo Matching


Stereo Vision [1]


Reduction of Searching by Epipolar Constraint [1]


Photometric Constraint [1]
Same world point has the same intensity in both images.
– True for Lambertian surfaces
  • A Lambertian surface has a brightness that is independent of viewing angle
– Violations:
  • Noise
  • Specularity
  • Non-Lambertian materials
  • Pixels that contain multiple surfaces


Photometric Constraint [1]

For each epipolar line
  For each pixel in the left image
    • compare with every pixel on the same epipolar line in the right image
    • pick the pixel with minimum match cost
This leaves too much ambiguity, so:
Improvement: match windows (see the sketch below)
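A minimal sketch of window-based matching along one scan line with an SSD cost (assumes rectified grayscale images imL, imR in double format; the parameters are illustrative):

% SSD block matching along one scan line of a rectified pair.
half = 3;  maxd = 50;                 % window half-size, max disparity
cols = size(imL, 2);
r = 100;                              % an example scan line
disp_row = zeros(1, cols);
for c = half+1+maxd : cols-half
    patchL = imL(r-half:r+half, c-half:c+half);
    best = inf;
    for d = 0:maxd
        patchR = imR(r-half:r+half, c-d-half:c-d+half);
        cost = sum((patchL(:) - patchR(:)).^2);   % SSD cost
        if cost < best
            best = cost;
            disp_row(c) = d;
        end
    end
end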

Correspondence Using Correlation [1]


Sum of Squared Difference (SSD) [1]


Image Normalization [1]


Images as Vectors [1]


Image Metrics [1]


Stereo Result [1]

(Figure: left image and disparity map; images courtesy of Point Grey Research)

Window Size [1]

Better results with an adaptive window:
• T. Kanade and M. Okutomi, “A Stereo Matching Algorithm with an Adaptive Window: Theory and Experiment,” Proc. International Conference on Robotics and Automation, 1991.
• D. Scharstein and R. Szeliski, “Stereo matching with nonlinear diffusion,” International Journal of Computer Vision, 28(2):155-174, July 1998.

Ordering Constraint [3]


Smooth Surface Problem [3]


Occlusion [1]

(Figure: left occlusion and right occlusion)

Search over Correspondence [1]


Stereo Matching with Dynamic Programming [1]


Dynamic Programming [3]

Local errors may be propagated along a scan-line and no inter scan-line consistency is enforced.

Correspondence by Feature [3]


Segment-Based Stereo Matching [3]
Assumptions
• Depth discontinuities tend to correlate well with color edges
• Disparity variation within a segment is small
• The scene is approximated with piecewise planar surfaces


Segment-Based Stereo Matching [3]
• A plane equation is fitted in each segment based on an initial disparity estimation obtained by SSD or correlation (a plane-fitting sketch follows below)
• Global matching criterion: if a depth map is good, warping the reference image to the other view according to this depth will render an image that matches the real view
• Optimization by iterative neighborhood depth hypothesizing
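A small illustrative sketch of the per-segment plane fit (the disparity-plane model d = a*x + b*y + c and the variable names are assumptions of this sketch, not the authors' exact formulation):

% Least-squares fit of a disparity plane d = a*x + b*y + c over one segment.
% xs, ys, ds are column vectors of pixel coordinates and initial disparities.
A   = [xs, ys, ones(size(xs))];
abc = A \ ds;                          % [a; b; c] in the least-squares sense
d_fitted = A * abc;                    % smoothed disparities for the segment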


Segment-Based Stereo Matching [3]

Hai Tao and Harpreet W. Sawhney


Segment-Based Stereo Matching [3]


Smoothing by MRF [2]


Smoothing by MRF [4]


Smoothing by MRF [4]


Stereo Testing and Comparison [1]


Stereo Testing and Comparison [1]


Stereo Testing and Comparison [1]

(Figure: window-based matching with the best window size vs. ground truth)

Stereo Testing and Comparison [1]

(Figure: state-of-the-art method vs. ground truth)

Boykov et al., “Fast Approximate Energy Minimization via Graph Cuts,” International Conference on Computer Vision, September 1999.

Intermediate View Reconstruction [1]

(Figure: left image, right image, and disparity)

Intermediate View Reconstruction [1]


References
1. David Lowe, “Stereo,” UBC (Univ. of British Columbia) Lecture Material of Computer Vision (CPSC 425), Spring 2007.
2. Sebastian Thrun, Rick Szeliski, Hendrik Dahlkamp and Dan Morris, “Stereo 2,” Stanford Lecture Material of Computer Vision (CS 223B), Winter 2005.
3. Chandra Kambhamettu, “Multiple Views1” and “Multiple View2,” University of Delaware Lecture Material of Computer Vision (CISC 4/689), Spring 2007.
4. J. Diebel and S. Thrun, “An Application of Markov Random Fields to Range Sensing,” Proc. Neural Information Processing Systems (NIPS), Cambridge, MA, 2005, MIT Press.


3.5. Optical Flow


Motion in Computer Vision [1]
Motion
• Structure from motion
• Detection/segmentation with direction


Motion Field vs. Optical Flow [2], [3]
Motion Field: an ideal representation of 3D motion as it is projected onto a camera image.

Optical Flow: the approximation (or estimate) of the motion field which can be computed from time-varying image sequences, under the simplifying assumptions of 1) Lambertian surface, 2) pointwise light source at infinity, and 3) no photometric distortion.

Motion Field [2]
• An ideal representation of 3D motion as it is projected onto a camera image.
• The time derivative of the image position of all image points, given that they correspond to fixed 3D points (“field: position → vector”).
• The motion field v is defined in terms of a scene point P, where Z is the distance to that scene point, V is the relative motion between the camera and the scene, T is the translational component of the motion, and ω is the angular velocity of the motion (the explicit expression is derived on the following slides).


Motion Field [5]
3D point P = (X, Y, Z) and 2D point p = (x, y), focal length f

(Figure: perspective projection of P onto p at focal length f)

x = f X/Z,   y = f Y/Z   (1)

The motion field v can be obtained by taking the time derivative of (1):

vx = f (VX Z − X VZ) / Z²,   vy = f (VY Z − Y VZ) / Z²   (2)

Motion Field [5]
The motion V of the 3D point P (relative to the camera) is defined as

V = −T − ω × P:
  VX = −TX − ωY Z + ωZ Y
  VY = −TY − ωZ X + ωX Z
  VZ = −TZ − ωX Y + ωY X   (3)

By substituting (3) into (2), the basic equations of the motion field are obtained:

vx = (TZ x − TX f)/Z − ωY f + ωZ y + ωX x y / f − ωY x² / f
vy = (TZ y − TY f)/Z + ωX f − ωZ x − ωY x y / f + ωX y² / f   (4)
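A tiny numeric sketch of evaluating (4) at one pixel (all values illustrative):

% Motion field at one image point from camera translation T and rotation w (eq. 4).
f = 500;                    % focal length in pixels
T = [0.1; 0; 1];            % translational velocity (Tx, Ty, Tz)
w = [0; 0.01; 0];           % angular velocity (wx, wy, wz)
x = 50; y = -20; Z = 10;    % image point and its depth
vx = (T(3)*x - T(1)*f)/Z - w(2)*f + w(3)*y + w(1)*x*y/f - w(2)*x^2/f;
vy = (T(3)*y - T(2)*f)/Z + w(1)*f - w(3)*x - w(2)*x*y/f + w(1)*y^2/f;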

Motion Field [5]
The motion field is the sum of two components, one of which depends on translation only, the other on rotation only. Equation (4) splits into:

Translational components (5):
vx = (TZ x − TX f) / Z
vy = (TZ y − TY f) / Z

Rotational components (6):
vx = −ωY f + ωZ y + ωX x y / f − ωY x² / f
vy = ωX f − ωZ x − ωY x y / f + ωX y² / f

Motion Field: Pure Translation [2]
If there is no rotational motion, the resulting motion field has a peculiar spatial structure. Regarding (5) as a function of the 2D point position,

vx = (TZ x − TX f)/Z,   vy = (TZ y − TY f)/Z   (5)

can be rewritten as

vx = (TZ/Z)(x − f TX/TZ),   vy = (TZ/Z)(y − f TY/TZ).

If (x0, y0) is defined as

x0 = f TX / TZ,   y0 = f TY / TZ   (6)

then

vx = (TZ/Z)(x − x0),   vy = (TZ/Z)(y − y0)   (7)

Motion Field: Pure Translation [2]
Equation (7) says that the motion field of a pure translation is radial. In particular, if TZ < 0, the motion field vectors point towards p0, which is called the focus of contraction. If TZ = 0, from (5), all the motion field vectors are parallel:

vx = −f TX / Z,   vy = −f TY / Z   (8)

Motion Field: Motion Parallax [6]
Equation (8) says that the vectors' lengths are inversely proportional to the depth of the corresponding 3D points:

vx = −f TX / Z,   vy = −f TY / Z   (8)

http://upload.wikimedia.org/wikipedia/commons/a/ab/Parallax.gif
This animation is an example of parallax. As the viewpoint moves side to side, the objects in the distance appear to move more slowly than the objects close to the camera [6].

Motion Field: Motion Parallax [2]
If two 3D points are projected onto one image point, i.e., they are instantaneously coincident, the rotational component of their motion fields will be the same (see Eq. (4)). Notice that the motion vector V is about the camera motion.

Motion Field: Motion Parallax [2]
The difference of the two points' motion fields depends only on the translational components, and it is radial with respect to the FOE or FOC:

Δvx = (TZ x − TX f)(1/Z1 − 1/Z2) = (x − x0) TZ (1/Z1 − 1/Z2)
Δvy = (TZ y − TY f)(1/Z1 − 1/Z2) = (y − y0) TZ (1/Z1 − 1/Z2)

Motion Field: Motion Parallax [2]
Motion Parallax: the relative motion field of two instantaneously coincident points
1. does not depend on the rotational component of motion;
2. points towards (away from) the point p0, the vanishing point of the translation direction.

Motion Field: Pure Rotation w.r.t. the Y-axis [7]
If there is no translational motion and no rotation about the x- and z-axes, from (4):

vx = −ωY f − ωY x² / f
vy = −ωY x y / f

Motion Field: Pure Rotation w.r.t. the Y-axis [7]

(Figure comparing the two cases)
• Translational motion: the distance to the point, Z, is constant.
• Rotational motion: the distance to the point, Z, is changing; according to Z, y changes too.

Motion Field: Optical Flow [4]

The Image Brightness Constancy Equation [2]

Motion Field: Optical Flow [2]

(∇I)T v = −It

where ∇I is the spatial image gradient and It is the temporal derivative of the image brightness.

Assumption: the image brightness is continuous and differentiable as many times as needed in both the spatial and temporal domains. The image brightness can be regarded as a plane in a small area.

Motion Field: Lucas-Kanade Method [8]
Assuming that the optical flow (Vx, Vy) is constant in a small window of size m×m with m > 1, centered at (x, y), and numbering the pixels within it as 1…n, n = m², a set of equations can be found:

Ix(qi) Vx + Iy(qi) Vy = −It(qi),   i = 1, …, n

which is solved for (Vx, Vy) in the least-squares sense (a sketch follows below).
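A minimal sketch of solving this least-squares system for one window (assumes precomputed gradient images Ix, Iy, It; the window location and size are illustrative):

% Lucas-Kanade flow for one window centered at (r, c).
r = 100;  c = 120;                    % example window center
half = 2;                             % 5x5 window
rr = r-half:r+half;  cc = c-half:c+half;
A = [reshape(Ix(rr, cc), [], 1), reshape(Iy(rr, cc), [], 1)];
b = -reshape(It(rr, cc), [], 1);
v = A \ b;                            % least-squares solution [Vx; Vy]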

Motion Field: Aperture Problem [2], [9]
The component of the motion field in the direction orthogonal to the spatial image gradient is not constrained by the image brightness constancy equation. That is, given local information, one can determine the component of the optical flow vector only in the direction of the brightness gradient.

The aperture problem: the grating appears to be moving down and to the right, perpendicular to the orientation of the bars. But it could be moving in many other directions, such as only down, or only to the right. It is impossible to determine unless the ends of the bars become visible in the aperture.
http://upload.wikimedia.org/wikipedia/commons/f/f0/Aperture_problem_animated.gif


References
1. Richard Szeliski, “Dense motion estimation,” Computer Vision: Algorithms and Applications, 19 June 2009 (draft), pp. 383-426.
2. Emanuele Trucco, Alessandro Verri, “8. Motion,” Introductory Techniques for 3-D Computer Vision, Prentice Hall, New Jersey, 1998, pp. 177-218.
3. Wikipedia, “Motion field,” available on www.wikipedia.org.
4. Wikipedia, “Optical flow,” available on www.wikipedia.org.
5. Alessandro Verri, Emanuele Trucco, “Finding the Epipole from Uncalibrated Optical Flow,” BMVC 1997, available at http://www.bmva.ac.uk/bmvc/1997/papers/052/bmvc.html.
6. Wikipedia, “Parallax,” available on www.wikipedia.org.
7. Jae Kyu Suhr, Ho Gi Jung, Kwanghyuk Bae, Jaihie Kim, “Outlier rejection for cameras on intelligent vehicles,” Pattern Recognition Letters 29 (2008) 828-840.
8. Wikipedia, “Lucas-Kanade Optical Flow Method,” available on www.wikipedia.org.
9. Wikipedia, “Aperture Problem,” available on www.wikipedia.org.
