Pinhole camera model coordinate system

I recently studied the pinhole camera model, but I was confused by the model provided by OpenCV and the textbook "Multiple View Geometry in Computer Vision."

I know that the following figure is a simplified model that swaps the positions of the image plane and the camera frame. For easier illustration, and taking the principal point (u0, v0) into account, the relation between the two frames is x=f(X/Z)+u0 and y=f(Y/Z)+v0.

enter image description here

However, I was really confused, because the image coordinate system is normally drawn like a fourth-quadrant coordinate system, with the origin at the top-left corner, as in the figure below.

Can I directly substitute the (x, y) from that definition into the "equivalent" pinhole model above? That does not seem convincing to me.

enter image description here

In addition, if an object lies in the (+X, +Y) quadrant of the camera coordinates (with Z > f, of course), then in the equivalent model it should appear on the right half of the image coordinates. However, in an image taken by an ordinary camera, such an object is supposed to appear on the left half. Therefore, this model does not seem reasonable to me.
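To make the sign argument concrete, here is a minimal Python sketch of the "equivalent" model, using only the equations x=f(X/Z)+u0 and y=f(Y/Z)+v0 quoted above. The function name `project_equivalent` and the sample values for f, u0, and v0 are made up for illustration.

```python
def project_equivalent(X, Y, Z, f, u0, v0):
    """Project a camera-frame point with x = f*(X/Z) + u0, y = f*(Y/Z) + v0."""
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return f * (X / Z) + u0, f * (Y / Z) + v0


# Hypothetical values: focal length 2, principal point at (10, 10).
x, y = project_equivalent(X=1.0, Y=1.0, Z=4.0, f=2.0, u0=10.0, v0=10.0)
print(x, y)  # 10.5 10.5
```

A point with +X and +Y lands at x > u0 and y > v0, that is, on the side of the principal point toward which the image axes grow in this model.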

Finally, I tried to derive the equations from the original model, shown below.

enter image description here

The result is x1=-f(X/Z) and y1=-f(Y/Z).

Then I tried to find the relationship between the (x2, y2) coordinates and the camera coordinates. The result is x2=-f(X/Z)+u0 and y2=-f(Y/Z)+v0.

Between the (x3, y3) coordinates and the camera coordinates, the result is x3=-f(X/Z)+u0 and y3=f(Y/Z)+v0.

No matter which coordinate system I tried, none of them has the form x=f(X/Z)+u0 and y=f(Y/Z)+v0 given in some textbooks.

In addition, the projected results in the (x2, y2) or (x3, y3) coordinates are also not reasonable, for the same reason: an object in the (+X, +Y, +Z) region of the camera coordinates should "appear" on the left half of the image taken by the camera.

Can someone point out what I have misunderstood?

2 answers

I finally figured out this problem and confirmed that my interpretation was correct by working through Z. Zhang's paper "Flexible Camera Calibration By Viewing a Plane From Unknown Orientations", International Conference on Computer Vision (ICCV'99), Corfu, Greece, pp. 666-673, September 1999.

Let me explain everything from scratch. The following figure shows the original pinhole camera model and the projected result on the image sensor. However, this is not what we are supposed to see in the "image".

figure1

What we are supposed to see is

figure2

Comparing Figures 1 and 2, we notice that the picture on the sensor is flipped both upside down and left to right. A friend of mine who works for a CMOS sensor company told me that sensors have built-in functions that automatically flip the captured image.

Since we want to model the relationship between the image coordinates and the world coordinates, we should treat the image sensor directly as the projection plane. What confused me earlier was that I always pictured the projection from the side it is projected onto, and this misled my geometric understanding of the derivation.

Now we have to look at the image sensor from its "back side", along the blue (View Perspective) arrow.

The result is shown in Figure 2. The x1 and y1 axes now point to the right and down, respectively, so the equations are

 x1=-f(X/Z)
 y1=-f(Y/Z)

Now, in terms of the x-y coordinates, the equations are

 x=f(X/Z)+u0
 y=f(Y/Z)+v0

which is the form given in the paper.

Now let's look at the equivalent model, which does not exist in the real world but makes the geometry easier to visualize.

figure 3

The principle is the same. Look from the center of projection toward the image plane. The result is

figure 4

where the projected "F" is mirrored left to right. The equations are

 x1=f(X/Z)
 y1=f(Y/Z)

Now, in terms of the x-y coordinates, the equations are

 x=f(X/Z)+u0
 y=f(Y/Z)+v0

which is the form given in the paper.
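The agreement between the two models can be checked numerically. The sketch below assumes that the flip applied to the sensor image (upside down plus left to right) is a 180-degree rotation about the optical axis; the function names and sample numbers are hypothetical and chosen only for illustration.

```python
def project_sensor(X, Y, Z, f):
    """Physical pinhole: the image plane is behind the pinhole, so the image is inverted."""
    return -f * (X / Z), -f * (Y / Z)


def rotate_180(x, y):
    """Flip an image point both upside down and left to right."""
    return -x, -y


def project_virtual(X, Y, Z, f):
    """Equivalent model: the image plane is in front of the center of projection."""
    return f * (X / Z), f * (Y / Z)


point = (1.0, 2.0, 4.0)  # hypothetical camera-frame point (X, Y, Z)
f = 2.0                  # hypothetical focal length

x1, y1 = project_sensor(*point, f)
print((x1, y1))  # (-0.5, -1.0)
print(rotate_180(x1, y1) == project_virtual(*point, f))  # True
```

Adding the principal point offset (u0, v0) to either result then gives the same x and y.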

Last but not least, because world coordinates are measured in mm or inches while image coordinates are measured in pixels, there is a scaling factor, which some books write as

 x=a*f(X/Z)+u0
 y=b*f(Y/Z)+v0

or

 x=fx(X/Z)+u0
 y=fy(Y/Z)+v0

where fx=a*f and fy=b*f.
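As a small worked example of the scaling, here is a sketch that treats a and b as pixel densities (pixels per mm) along x and y. The function name `project_pixels` and all numeric values are assumptions made up for illustration, not calibration results.

```python
def project_pixels(X, Y, Z, f, a, b, u0, v0):
    """Project a camera-frame point (in mm) to pixel coordinates."""
    fx = a * f  # focal length expressed in pixels along x
    fy = b * f  # focal length expressed in pixels along y
    return fx * (X / Z) + u0, fy * (Y / Z) + v0


# Hypothetical camera: f = 4 mm, 250 pixels/mm on both axes (fx = fy = 1000),
# principal point at pixel (320, 240).
u, v = project_pixels(X=128.0, Y=-64.0, Z=1024.0,
                      f=4.0, a=250.0, b=250.0, u0=320.0, v0=240.0)
print(u, v)  # 445.0 177.5
```

These are the same four intrinsic values that OpenCV stores in its camera matrix as fx, fy, cx, and cy.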


In fact, it is much simpler: the coordinates of your object should be expressed in the camera coordinate system, which is a coordinate system whose x and y axes are parallel to the corresponding axes of the image plane, as shown for example here: http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/OWENS/LECT9/node2.html
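As a rough sketch of that idea, a point given in some other (world) frame can first be moved into the camera frame with a rotation R and a translation t, and only then projected. The rigid transform p_cam = R * p_world + t is the usual convention, but the helper names and the pose below are hypothetical and only illustrate the order of operations.

```python
def world_to_camera(p_world, R, t):
    """Apply p_cam = R * p_world + t, with R given as a 3x3 nested list."""
    return tuple(
        sum(R[i][j] * p_world[j] for j in range(3)) + t[i]
        for i in range(3)
    )


def project(p_cam, f, u0, v0):
    """Pinhole projection of a point already expressed in the camera frame."""
    X, Y, Z = p_cam
    return f * (X / Z) + u0, f * (Y / Z) + v0


# Hypothetical pose: camera axes aligned with the world axes,
# world origin 4 units in front of the camera along Z.
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
t = (0.0, 0.0, 4.0)

p_cam = world_to_camera((1.0, 1.0, 0.0), R, t)
print(p_cam)                                    # (1.0, 1.0, 4.0)
print(project(p_cam, f=2.0, u0=10.0, v0=10.0))  # (10.5, 10.5)
```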
