First, you seem confused about the overall role of the camera / projector model: its purpose is to map 3D points of the world onto 2D image points. This sounds obvious, but it means that given the extrinsics R, t (orientation and position), the distortion function D(.) and the intrinsics K of a device, you can compute the 2D projection m of a 3D point M for that particular device as: m = K * D(R * M + t). The projectPoints function does exactly this (i.e. 3D-to-2D projection) for each input 3D point, so you need to feed it the parameters of the device into which you want to project your 3D points (K & D of the projector if you want the 2D coordinates in the projector image, K & D of the camera if you want the 2D coordinates in the camera image).
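For example, here is a minimal sketch of that 3D-to-2D mapping with projectPoints; every numeric value below (K, D, R, t and the 3D point) is a placeholder that you would replace with your own calibration data:

```cpp
#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main()
{
    // One 3D point expressed in the reference coordinate frame (placeholder)
    std::vector<cv::Point3f> objectPoints(1, cv::Point3f(0.1f, -0.2f, 1.5f));

    // Extrinsics R, t of the device you project into (rotation as a Rodrigues vector)
    cv::Mat rvec = cv::Mat::zeros(3, 1, CV_64F);   // identity rotation in this sketch
    cv::Mat tvec = cv::Mat::zeros(3, 1, CV_64F);   // zero translation in this sketch

    // Intrinsics K and distortion coefficients D of that same device (placeholders)
    cv::Mat K = (cv::Mat_<double>(3, 3) << 800, 0, 512,
                                             0, 800, 384,
                                             0,   0,   1);
    cv::Mat D = cv::Mat::zeros(5, 1, CV_64F);

    // m = K * D(R * M + t) for each input point
    std::vector<cv::Point2f> imagePoints;
    cv::projectPoints(objectPoints, rvec, tvec, K, D, imagePoints);

    std::cout << "2D projection: " << imagePoints[0] << std::endl;
    return 0;
}
```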
Secondly, when you calibrate the camera and the projector together, you do not estimate one set of extrinsics R, t for the camera and another for the projector, but only a single R and a single t, which represent the rotation and translation between the camera's and the projector's coordinate systems. For instance, this means that your camera is assumed to have rotation = identity and translation = zero, while the projector has rotation = R and translation = t (or the other way around, depending on how you did the calibration).
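To make this concrete, here is a tiny sketch (with placeholder values, assuming the camera frame is taken as the reference) of what this single R, t pair means: it transforms a point expressed in camera coordinates into projector coordinates:

```cpp
#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    // R, t between the camera and projector coordinate systems (placeholders)
    cv::Mat R = cv::Mat::eye(3, 3, CV_64F);
    cv::Mat t = (cv::Mat_<double>(3, 1) << 0.12, 0.0, 0.03);

    // A 3D point expressed in the camera coordinate frame (placeholder)
    cv::Mat M_cam = (cv::Mat_<double>(3, 1) << 0.1, -0.2, 1.5);

    // The same physical point expressed in the projector coordinate frame
    cv::Mat M_proj = R * M_cam + t;
    std::cout << "Point in projector frame: " << M_proj.t() << std::endl;
    return 0;
}
```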
Now, with respect to the application you mentioned, the real problem is: how do you estimate the 3D coordinates of the point of interest?
Using two cameras and one projector, this would be easy: you could track the objects of interest in the two camera images, triangulate their 3D positions from the two 2D observations using the triangulatePoints function, and finally project each 3D point into the projector 2D coordinates with projectPoints, to find out where to display things with your projector.
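A rough sketch of that two-camera pipeline could look like the following. It assumes the first camera is the reference frame, the second camera and the projector are calibrated with respect to it, and the tracked 2D points are already undistorted; every calibration value and pixel coordinate below is a placeholder:

```cpp
#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main()
{
    // Intrinsics of the two cameras and of the projector (placeholders)
    cv::Mat K1 = (cv::Mat_<double>(3, 3) << 800, 0, 320, 0, 800, 240, 0, 0, 1);
    cv::Mat K2 = K1.clone();
    cv::Mat Kp = (cv::Mat_<double>(3, 3) << 1000, 0, 512, 0, 1000, 384, 0, 0, 1);
    cv::Mat Dp = cv::Mat::zeros(5, 1, CV_64F);

    // Extrinsics of camera 2 and of the projector w.r.t. camera 1 (placeholders)
    cv::Mat R2 = cv::Mat::eye(3, 3, CV_64F), t2 = (cv::Mat_<double>(3, 1) << -0.1, 0, 0);
    cv::Mat Rp = cv::Mat::eye(3, 3, CV_64F), tp = (cv::Mat_<double>(3, 1) << -0.2, 0, 0);

    // 3x4 projection matrices P = K * [R | t] for the two cameras
    cv::Mat Rt1 = cv::Mat::eye(3, 4, CV_64F);
    cv::Mat Rt2(3, 4, CV_64F);
    R2.copyTo(Rt2(cv::Rect(0, 0, 3, 3)));
    t2.copyTo(Rt2(cv::Rect(3, 0, 1, 3)));
    cv::Mat P1 = K1 * Rt1, P2 = K2 * Rt2;

    // Matching (undistorted) 2D observations of one tracked object in both cameras
    cv::Mat pts1 = (cv::Mat_<double>(2, 1) << 300.0, 250.0);
    cv::Mat pts2 = (cv::Mat_<double>(2, 1) << 280.0, 250.0);

    // Triangulate to a homogeneous 4D point, then normalize to 3D
    cv::Mat pts4D;
    cv::triangulatePoints(P1, P2, pts1, pts2, pts4D);
    cv::Mat M = pts4D.rowRange(0, 3) / pts4D.at<double>(3, 0);

    // Project the 3D point into the projector image to know where to display content
    std::vector<cv::Point3f> obj(1, cv::Point3f((float)M.at<double>(0),
                                                (float)M.at<double>(1),
                                                (float)M.at<double>(2)));
    cv::Mat rvecP;
    cv::Rodrigues(Rp, rvecP);
    std::vector<cv::Point2f> projPix;
    cv::projectPoints(obj, rvecP, tp, Kp, Dp, projPix);
    std::cout << "Projector pixel: " << projPix[0] << std::endl;
    return 0;
}
```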
With only one camera and one projector, this is still possible but harder, because you cannot triangulate the tracked points from a single observation. The main idea is to approach the problem like a sparse stereo disparity estimation problem. One possible method:
project a non-ambiguous pattern (for example, black-and-white random noise) with the projector, in order to texture the scene observed by the camera.
as before, track objects of interest in the camera image
for each object of interest, match a small window around its location in the camera image against the projector image (for example with template matching), to find where it projects in the projector 2D coordinates (see the sketch after this list).
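For that last step, a rough sketch using plain cv::matchTemplate could look like this. It assumes the pattern appears at a roughly similar scale and orientation in both images; the file names, tracked position and window size are all placeholders:

```cpp
#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    // The pattern image sent to the projector and the current camera frame (placeholders)
    cv::Mat projected_pattern = cv::imread("pattern.png", 0);   // 0 = load as grayscale
    cv::Mat camera_frame = cv::imread("camera.png", 0);
    if (projected_pattern.empty() || camera_frame.empty()) return 1;

    // Small window around the tracked object's location in the camera image
    cv::Point tracked(300, 200);             // placeholder tracked position
    int half = 16;                           // half-size of the matching window
    cv::Rect win(tracked.x - half, tracked.y - half, 2 * half, 2 * half);
    win &= cv::Rect(0, 0, camera_frame.cols, camera_frame.rows);
    cv::Mat templ = camera_frame(win);

    // Find where this textured window appears in the projector image
    cv::Mat result;
    cv::matchTemplate(projected_pattern, templ, result, cv::TM_CCOEFF_NORMED);
    cv::Point best;
    cv::minMaxLoc(result, 0, 0, 0, &best);

    // Center of the best match ~ projector 2D coordinates of the tracked object
    cv::Point projector_pix(best.x + half, best.y + half);
    std::cout << "Projector coordinates: " << projector_pix << std::endl;
    return 0;
}
```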
Another approach, which, unlike the one above, would make use of the calibration parameters, is to do a dense 3D reconstruction using stereoRectify and StereoBM::operator() (or gpu::StereoBM_GPU::operator() for the GPU implementation), map the tracked 2D positions to 3D using the estimated scene depth, and finally project them into the projector image using projectPoints.
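Below is a rough sketch of that dense pipeline, written with the newer StereoBM::create()/compute() interface (StereoBM::operator() and gpu::StereoBM_GPU::operator() are the older OpenCV 2.4 equivalents). All calibration values, file names and the tracked pixel are placeholders, and it glosses over the practical difficulty of getting good block-matching results between a camera image and the projected pattern:

```cpp
#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main()
{
    // Calibration of the camera (1) and the projector (2): placeholder values only
    cv::Mat K1 = (cv::Mat_<double>(3, 3) << 800, 0, 320, 0, 800, 240, 0, 0, 1);
    cv::Mat K2 = (cv::Mat_<double>(3, 3) << 1000, 0, 512, 0, 1000, 384, 0, 0, 1);
    cv::Mat D1 = cv::Mat::zeros(5, 1, CV_64F), D2 = cv::Mat::zeros(5, 1, CV_64F);
    cv::Mat R = cv::Mat::eye(3, 3, CV_64F);
    cv::Mat T = (cv::Mat_<double>(3, 1) << -0.2, 0.0, 0.0);
    cv::Size imgSize(640, 480);

    // "Left" view = camera image, "right" view = pattern as seen from the projector
    cv::Mat camImg = cv::imread("camera.png", 0);    // 0 = load as grayscale
    cv::Mat projImg = cv::imread("pattern.png", 0);
    if (camImg.empty() || projImg.empty()) return 1;

    // 1. Rectify the camera-projector pair
    cv::Mat R1, R2, P1, P2, Q;
    cv::stereoRectify(K1, D1, K2, D2, imgSize, R, T, R1, R2, P1, P2, Q);
    cv::Mat map1x, map1y, map2x, map2y, rect1, rect2;
    cv::initUndistortRectifyMap(K1, D1, R1, P1, imgSize, CV_32FC1, map1x, map1y);
    cv::initUndistortRectifyMap(K2, D2, R2, P2, imgSize, CV_32FC1, map2x, map2y);
    cv::remap(camImg, rect1, map1x, map1y, cv::INTER_LINEAR);
    cv::remap(projImg, rect2, map2x, map2y, cv::INTER_LINEAR);

    // 2. Dense disparity map via block matching
    cv::Ptr<cv::StereoBM> bm = cv::StereoBM::create(64, 21);
    cv::Mat disp16, disp;
    bm->compute(rect1, rect2, disp16);
    disp16.convertTo(disp, CV_32F, 1.0 / 16.0);      // StereoBM outputs fixed-point disparities

    // 3. Back-project disparities to 3D points (in the rectified camera frame)
    cv::Mat xyz;
    cv::reprojectImageTo3D(disp, xyz, Q);

    // 4. Look up the tracked pixel's 3D position, undo the rectification rotation,
    //    and project the point into the projector image
    cv::Point tracked(300, 200);                     // placeholder tracked position
    cv::Vec3f Mrect = xyz.at<cv::Vec3f>(tracked);
    cv::Mat Mrect3 = (cv::Mat_<double>(3, 1) << Mrect[0], Mrect[1], Mrect[2]);
    cv::Mat Mcam = R1.t() * Mrect3;                  // back to the original camera frame
    std::vector<cv::Point3f> obj(1, cv::Point3f((float)Mcam.at<double>(0),
                                                (float)Mcam.at<double>(1),
                                                (float)Mcam.at<double>(2)));
    cv::Mat rvec;
    cv::Rodrigues(R, rvec);
    std::vector<cv::Point2f> projPix;
    cv::projectPoints(obj, rvec, T, K2, D2, projPix);
    std::cout << "Projector pixel: " << projPix[0] << std::endl;
    return 0;
}
```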
In any case, this is easier and more accurate with two cameras.
Hope this helps.