I am trying to create a static augmented reality scene over a photograph, given 4 defined correspondences between coplanar points on a plane and the image.
Here is the step-by-step flow (a minimal scene-setup sketch follows the list):
- The user adds an image using the device’s camera. Suppose that it contains a rectangle captured with some perspective.
- The user determines the physical size of the rectangle lying in the horizontal plane (XOZ in SceneKit terms). Suppose that the center is the world origin (0, 0, 0), so we can easily find (x, y, z) for each corner.
- The user determines the uv coordinates in the image coordinate system for each corner of the rectangle.
- A SceneKit scene is created with a rectangle of the same size, visible from the same point of view.
- Other nodes can be added and moved in the scene.
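For completeness, here is a minimal sketch of the scene setup described above (the function and parameter names are placeholders, not my exact code):

    import SceneKit
    import UIKit

    // Minimal scene setup: a rectangle of the measured physical size lying in the
    // horizontal plane, centered at the world origin, plus a camera node whose
    // pose still has to be computed (which is the subject of this question).
    func makeScene(rectangleWidth: CGFloat, rectangleHeight: CGFloat) -> (SCNScene, SCNNode) {
        let scene = SCNScene()

        // SCNPlane is created in the XY plane, so rotate it to lie in the horizontal plane.
        let plane = SCNPlane(width: rectangleWidth, height: rectangleHeight)
        plane.firstMaterial?.diffuse.contents = UIColor.blue.withAlphaComponent(0.5)
        let planeNode = SCNNode(geometry: plane)
        planeNode.eulerAngles.x = -.pi / 2
        scene.rootNode.addChildNode(planeNode)

        let cameraNode = SCNNode()
        cameraNode.camera = SCNCamera()
        scene.rootNode.addChildNode(cameraNode)

        return (scene, cameraNode)
    }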

I also measured the position of the iPhone camera relative to the center of the A4 paper. For this shot, the position was (0, 14, 42.5), measured in cm. My iPhone was also slightly tilted towards the table (5-10 degrees).
Using this data, I configured SCNCamera to get the desired perspective of the blue plane in the third image:
    let camera = SCNCamera()
    camera.xFov = 66
    camera.zFar = 1000
    camera.zNear = 0.01
    cameraNode.camera = camera

    cameraAngle = -7 * CGFloat.pi / 180
    cameraNode.rotation = SCNVector4(x: 1, y: 0, z: 0, w: Float(cameraAngle))
    cameraNode.position = SCNVector3(x: 0, y: 14, z: 42.5)
This gives me a reference to compare my result with.
To build the AR with SceneKit, I need to:
- Adjust the SCNCamera fov to match the fov of the real camera (a minimal sketch of this conversion follows this list).
- Calculate the position and rotation of the camera node using the 4 correspondences between world points (x, 0, z) and image points (u, v).
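Assuming a pinhole camera model, the first point boils down to deriving the horizontal fov from fx and the image width; a minimal sketch (the function name is mine):

    import Foundation
    import SceneKit

    // Derive the SceneKit camera's horizontal fov from pinhole intrinsics:
    // xFov = 2 * atan(imageWidth / (2 * fx)), converted to degrees.
    func setHorizontalFov(of camera: SCNCamera, fx: Double, imageWidth: Double) {
        camera.xFov = 2 * atan(imageWidth / (2 * fx)) * 180 / .pi
    }

The second point is what the rest of this question is about.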

For points on the model plane, the homography factors (up to scale) as:

    H ≅ K [r1  r2  t]

where H is the homography, K is the intrinsic matrix, and [R | t] is the extrinsic matrix (r1 and r2 are the first two columns of R).
I tried two approaches to find the transformation matrix for the camera: using solvePnP from OpenCV, and manually decomposing the homography computed from the 4 coplanar points.
Manual approach:
1. Find the homography

This step was successful: the UV coordinates of the world origin look correct.
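For reference, here is a minimal sketch of how such a 4-point homography can be computed with simd (the projective-basis construction and the function name are illustrative, not the exact code I use):

    import CoreGraphics
    import simd

    // 4-point homography via the projective basis: build the matrix that maps the
    // canonical basis e1, e2, e3, (1,1,1) onto each quadruple of points, then compose them.
    func homography(from source: [CGPoint], to target: [CGPoint]) -> matrix_float3x3? {
        guard source.count == 4, target.count == 4 else { return nil }

        func basisMatrix(_ p: [CGPoint]) -> matrix_float3x3? {
            let h = p.map { simd_float3(Float($0.x), Float($0.y), 1) }
            let m = matrix_float3x3(columns: (h[0], h[1], h[2]))
            guard m.determinant != 0 else { return nil }   // first three points must not be collinear
            let c = m.inverse * h[3]                       // coefficients so that c.x*h0 + c.y*h1 + c.z*h2 = h3
            return matrix_float3x3(columns: (c.x * h[0], c.y * h[1], c.z * h[2]))
        }

        guard let a = basisMatrix(source), let b = basisMatrix(target), a.determinant != 0 else { return nil }
        return b * a.inverse   // maps source (x, y, 1) to target, up to scale
    }

Mapping the model-plane origin (0, 0, 1) through the resulting H is exactly the check mentioned above: it should land on the UV coordinates of the world origin.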
2. The intrinsic matrix
To get the intrinsic matrix of the iPhone 6, I used this application, which gave me the following result from 100 images at 640x480 resolution:

Assuming the input image has a 4:3 aspect ratio, I can scale the above matrix according to the resolution:

I'm not sure, but this looks like a potential problem. I used cv::calibrationMatrixValues to check fovx for the calculated intrinsic matrix, and the result was ~50°, while it should be close to 60°.
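For what it is worth, here is a minimal sketch of the scaling I apply (the fx/fy/cx/cy values are placeholders standing in for the calibration output above; the fovx check is essentially 2 * atan(w / (2 * fx)), which is what cv::calibrationMatrixValues reports):

    import CoreGraphics
    import simd

    // Scale the 640x480 calibration result to the actual image resolution.
    // Same 4:3 aspect ratio, so one scale factor applies to the focal lengths
    // and to the principal point. fx/fy/cx/cy below are placeholders.
    func scaledIntrinsicMatrix(for imageSize: CGSize) -> matrix_float3x3 {
        let calibrationWidth: Float = 640
        let fx: Float = 520, fy: Float = 520, cx: Float = 320, cy: Float = 240

        let scale = Float(imageSize.width) / calibrationWidth
        return matrix_float3x3(columns: (
            simd_float3(fx * scale, 0, 0),            // first column: (fx, 0, 0)
            simd_float3(0, fy * scale, 0),            // second column: (0, fy, 0)
            simd_float3(cx * scale, cy * scale, 1)    // third column: (cx, cy, 1)
        ))
    }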
3. Camera view matrix
    func findCameraPose(homography h: matrix_float3x3, size: CGSize) -> matrix_float4x3? {
        guard let intrinsic = intrinsicMatrix(imageSize: size),
              let intrinsicInverse = intrinsic.inverse else { return nil }

        // Scale factors recovered from the first two columns of K^-1 * H
        let l1 = 1.0 / (intrinsicInverse * h.columns.0).norm
        let l2 = 1.0 / (intrinsicInverse * h.columns.1).norm
        let l3 = (l1 + l2) / 2

        // First two columns of the rotation; the third completes the basis
        let r1 = l1 * (intrinsicInverse * h.columns.0)
        let r2 = l2 * (intrinsicInverse * h.columns.1)
        let r3 = cross(r1, r2)

        // Translation from the third column of the homography
        let t = l3 * (intrinsicInverse * h.columns.2)

        return matrix_float4x3(columns: (r1, r2, r3, t))
    }
Result:

Since I measured the approximate position and orientation for this particular image, I know the transformation matrix that would give the expected result, and it is completely different:

I am also slightly concerned about the 2-3 element of the reference rotation matrix, which is -9.1, while it should be close to zero instead, since there is very little rotation.
OpenCV approach:
OpenCV has a solvePnP function, so I tried to use it instead of reinventing the wheel.
OpenCV in Objective-C++:
    typedef struct CameraPose {
        SCNVector4 rotationVector;
        SCNVector3 translationVector;
    } CameraPose;

    + (CameraPose)findCameraPose:(NSArray<NSValue *> *)objectPoints imagePoints:(NSArray<NSValue *> *)imagePoints size:(CGSize)size {
        vector<Point3f> cvObjectPoints = [self convertObjectPoints:objectPoints];
        vector<Point2f> cvImagePoints = [self convertImagePoints:imagePoints withSize:size];

        cv::Mat distCoeffs(4, 1, cv::DataType<double>::type, 0.0);
        cv::Mat rvec(3, 1, cv::DataType<double>::type);
        cv::Mat tvec(3, 1, cv::DataType<double>::type);
        cv::Mat cameraMatrix = [self intrinsicMatrixWithImageSize:size];

        cv::solvePnP(cvObjectPoints, cvImagePoints, cameraMatrix, distCoeffs, rvec, tvec);

        SCNVector4 rotationVector = SCNVector4Make(rvec.at<double>(0), rvec.at<double>(1), rvec.at<double>(2), norm(rvec));
        SCNVector3 translationVector = SCNVector3Make(tvec.at<double>(0), tvec.at<double>(1), tvec.at<double>(2));
        CameraPose result = CameraPose{rotationVector, translationVector};

        return result;
    }

    + (vector<Point2f>)convertImagePoints:(NSArray<NSValue *> *)array withSize:(CGSize)size {
        vector<Point2f> points;
        for (NSValue *value in array) {
            CGPoint point = [value CGPointValue];
            points.push_back(Point2f(point.x - size.width / 2, point.y - size.height / 2));
        }
        return points;
    }

    + (vector<Point3f>)convertObjectPoints:(NSArray<NSValue *> *)array {
        vector<Point3f> points;
        for (NSValue *value in array) {
            CGPoint point = [value CGPointValue];
            points.push_back(Point3f(point.x, 0.0, -point.y));
        }
        return points;
    }

    + (cv::Mat)intrinsicMatrixWithImageSize:(CGSize)imageSize {
        double f = 0.84 * max(imageSize.width, imageSize.height);
        Mat result(3, 3, cv::DataType<double>::type);
        cv::setIdentity(result);
        result.at<double>(0) = f;
        result.at<double>(4) = f;
        return result;
    }
Usage in Swift:
    func testSolvePnP() {
        let source = modelPoints().map { NSValue(cgPoint: $0) }
        let destination = perspectivePicker.currentPerspective.map { NSValue(cgPoint: $0) }
        let cameraPose = CameraPoseDetector.findCameraPose(source, imagePoints: destination, size: backgroundImageView.size)

        cameraNode.rotation = cameraPose.rotationVector
        cameraNode.position = cameraPose.translationVector
    }
Output:

The result is better, but far from my expectations.
Some other things I also tried:
- This question is very similar, although I do not understand how the accepted answer works without the intrinsics.
- decomposeHomographyMat also did not give me the expected result
I am really stuck with this problem, so any help would be greatly appreciated.