Why can't Direct Linear Transformation (DLT) provide optimal camera extrinsics?

Question

I read the source code of the solvePnP() function in OpenCV. When the flags parameter uses the default SOLVEPNP_ITERATIVE, it calls cvFindExtrinsicCameraParams2, which FIRST uses DLT (if the set of 3D points is non-planar) to initialize the 6DOF camera pose, and then uses the CvLevMarq solver to minimize the reprojection error.

My question is: DLT poses the problem as a linear least squares problem and solves it with SVD, which seems to give an optimal solution. Why do we still need the iterative Levenberg-Marquardt method afterwards?

Or, what is the problem / limitation of the DLT algorithm? Why does the closed-form solution only reach a LOCAL minimum of the cost function?

+7

opencv extrinsic-parameters least-squares opencv-solvepnp

zhangxaochen May 07 '17 at 4:27

1 answer

AldurDisciple · Answer 1 · 2017-05-13T14:11:36+0000

When you want to find a solution to a problem, the first step is to express the problem in mathematical terms, so that you can use existing mathematical tools to solve the resulting equations. However, interesting problems can usually be expressed in several different mathematical ways, each of which may lead to a slightly different solution. It then takes work to analyze the different methods and understand which one provides the most stable / accurate / efficient / etc. solution.

In the case of the PnP problem, we want to find the camera pose given associations between 3D points and their projections in the image plane.

The first way to express this problem mathematically is to cast it as a linear least squares problem. This approach is known as the DLT approach, and it is interesting because linear least squares has a closed-form solution that can be found robustly using the singular value decomposition. However, this approach assumes that the camera pose P has 12 degrees of freedom (the 12 entries of the 3x4 matrix), when in reality it has only 6 (3 for the 3D rotation plus 3 for the 3D translation). To obtain a 6DOF camera pose from the result of this approach, an approximation is needed (one that is not covered by the linear cost function of the DLT), which leads to an inaccurate solution.
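To make the approximation step concrete, here is a minimal NumPy sketch of the DLT idea. It is not OpenCV's implementation: the function name `dlt_pose`, the synthetic data, and the assumption that image points are already in normalized coordinates (pixels multiplied by the inverse intrinsic matrix) are all illustrative choices.

```python
import numpy as np

def dlt_pose(X, x):
    """Sketch of DLT pose estimation (hypothetical helper, not OpenCV code).

    X: (n, 3) non-coplanar 3D points, n >= 6.
    x: (n, 2) normalized image coordinates (assumed already K^-1 * pixels).
    Returns R, t and the singular values of the unconstrained 3x3 block.
    """
    n = X.shape[0]
    Xh = np.hstack([X, np.ones((n, 1))])      # homogeneous 3D points
    A = np.zeros((2 * n, 12))
    A[0::2, 0:4] = Xh                         # p1.X - u * p3.X = 0
    A[0::2, 8:12] = -x[:, [0]] * Xh
    A[1::2, 4:8] = Xh                         # p2.X - v * p3.X = 0
    A[1::2, 8:12] = -x[:, [1]] * Xh
    # Linear least squares: minimize ||A p|| subject to ||p|| = 1.
    # All 12 entries of P are treated as free unknowns here.
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1].reshape(3, 4)
    # Separate approximation step, outside the linear cost:
    # replace the 3x3 block by the nearest rotation matrix.
    U, S, Vt2 = np.linalg.svd(P[:, :3])
    R = U @ Vt2
    scale = S.mean()
    if np.linalg.det(R) < 0:                  # P is only known up to sign
        R, scale = -R, -scale
    t = P[:, 3] / scale
    return R, t, S

# Synthetic example (all values are made up for illustration).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, (20, 3))
a = 0.3
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
t_true = np.array([0.1, -0.2, 5.0])
Xc = X @ R_true.T + t_true
x = Xc[:, :2] / Xc[:, 2:]

R_est, t_est, sv = dlt_pose(X, x)             # noise-free observations
x_noisy = x + 1e-3 * rng.standard_normal(x.shape)
R_n, t_n, sv_n = dlt_pose(X, x_noisy)         # noisy observations

# For a true scaled rotation the three singular values are equal.
print("relative singular values, noise-free:", sv / sv.mean())
print("relative singular values, noisy:     ", sv_n / sv_n.mean())
```

With noise-free data the 3x3 block is a scaled rotation, so the projection onto a rotation changes nothing. With noisy data the singular values differ, and forcing them to be equal moves the estimate away from the minimizer of the linear cost, which is the approximation the paragraph above refers to.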

A second way to express the PnP problem mathematically is to use the geometric error as the cost function and to find the camera pose that minimizes it. Because the geometric error is non-linear, this approach estimates the solution using iterative solvers such as the Levenberg-Marquardt algorithm. Such algorithms can take into account the 6 degrees of freedom of the camera pose, which leads to accurate solutions. However, since they are iterative, they need to be given an initial estimate of the solution, which in practice is often obtained using the DLT approach.
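The iterative approach can be sketched in a few lines of NumPy. This is a bare-bones Levenberg-Marquardt loop with a finite-difference Jacobian, not OpenCV's CvLevMarq; the helper names (`rodrigues`, `residuals`, `refine_pose`), the damping schedule, and the synthetic data are assumptions made for illustration. The pose is parameterized by a rotation vector plus a translation, so exactly 6 numbers are optimized.

```python
import numpy as np

def rodrigues(w):
    """Rotation vector (axis * angle) -> 3x3 rotation matrix."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def residuals(pose, X, x):
    """Stacked reprojection errors for pose = (rotation vector, translation)."""
    Xc = X @ rodrigues(pose[:3]).T + pose[3:]
    return (Xc[:, :2] / Xc[:, 2:] - x).ravel()

def refine_pose(pose0, X, x, iters=50, lam=1e-3, eps=1e-6):
    """Basic Levenberg-Marquardt with a forward-difference Jacobian."""
    pose = np.asarray(pose0, dtype=float).copy()
    r = residuals(pose, X, x)
    for _ in range(iters):
        J = np.empty((r.size, 6))
        for j in range(6):
            step = np.zeros(6)
            step[j] = eps
            J[:, j] = (residuals(pose + step, X, x) - r) / eps
        # Damped normal equations: (J^T J + lam I) delta = -J^T r
        delta = np.linalg.solve(J.T @ J + lam * np.eye(6), -J.T @ r)
        r_new = residuals(pose + delta, X, x)
        if r_new @ r_new < r @ r:             # accept step, trust the model more
            pose, r, lam = pose + delta, r_new, lam * 0.1
        else:                                 # reject step, increase damping
            lam *= 10.0
    return pose

# Synthetic example (all values are made up for illustration).
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, (20, 3))
pose_true = np.array([0.1, -0.2, 0.3, 0.1, -0.2, 5.0])
x = residuals(pose_true, X, np.zeros((20, 2))).reshape(-1, 2)

# A perturbed pose stands in for the rough initial estimate (e.g. from DLT).
pose_init = pose_true + np.array([0.05, -0.05, 0.05, 0.1, 0.1, -0.3])
pose_hat = refine_pose(pose_init, X, x)

cost_init = np.sum(residuals(pose_init, X, x) ** 2)
cost_hat = np.sum(residuals(pose_hat, X, x) ** 2)
print("reprojection cost before / after:", cost_init, cost_hat)
```

The loop only converges to the minimum nearest its starting point, which is why the quality of the initial estimate matters.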

Now, to answer the title of your question: the DLT algorithm does give optimal camera extrinsics, but they are optimal only in the sense of the linear cost function that the DLT algorithm solves. Over the years, researchers have found more complex cost functions that lead to more accurate solutions but are also harder to solve.
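Written out, the two notions of "optimal" might look as follows. The notation is an assumption for illustration: (u_i, v_i) are normalized image coordinates, X_i are the 3D points, \tilde{X}_i their homogeneous versions, p_k^T the rows of the 3x4 matrix P, and r_k^T the rows of R.

```latex
% DLT: algebraic cost, linear in the 12 entries of P
\hat{P}_{\mathrm{DLT}} = \arg\min_{\|P\|_F = 1} \sum_i
  \left[ \left(p_1^\top \tilde{X}_i - u_i\, p_3^\top \tilde{X}_i\right)^2
       + \left(p_2^\top \tilde{X}_i - v_i\, p_3^\top \tilde{X}_i\right)^2 \right]

% Iterative refinement: geometric (reprojection) cost over a 6DOF pose
(\hat{R}, \hat{t}) = \arg\min_{R \in SO(3),\; t \in \mathbb{R}^3} \sum_i
  \left[ \left(u_i - \frac{r_1^\top X_i + t_1}{r_3^\top X_i + t_3}\right)^2
       + \left(v_i - \frac{r_2^\top X_i + t_2}{r_3^\top X_i + t_3}\right)^2 \right]
```

Each algebraic term equals the corresponding reprojection residual multiplied by the factor p_3^T \tilde{X}_i, so the first cost weights points unevenly and ignores the rotation constraint, while the second measures the image-plane error directly.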
