Lowest overhead camera to CPU to GPU approach on Android

My application needs to do some real-time processing on camera frames on the CPU before rendering them on the GPU. The GPU also renders some other things that depend on the results of the CPU processing, so it's important to keep everything synchronized and not render a frame on the GPU until the CPU's results for that frame are available.

The question is: what is the lowest-overhead approach for this on Android?

The CPU processing in my case only needs a grayscale image, so a YUV format where the Y plane is packed is ideal (and tends to match the native format of camera devices). NV12, NV21, or fully planar YUV would all give ideal low-overhead access to grayscale, so that would be preferred on the CPU side.
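To make the layout point concrete, here is a minimal sketch (assuming an NV21 preview buffer of known width and height, names illustrative) of CPU-side grayscale access — the Y samples are simply the first width*height bytes, so no color conversion or copy is needed to read them:

```java
// Illustrative example: CPU-side grayscale processing directly on an NV21
// preview buffer. In NV21/NV12 the Y plane is packed at the start of the
// buffer, followed by interleaved chroma, so the grayscale image is just the
// first width*height bytes and can be read in place.
static long sumLuma(byte[] nv21, int width, int height) {
    long sum = 0;
    for (int i = 0; i < width * height; i++) {
        sum += nv21[i] & 0xFF;   // Y samples are unsigned bytes
    }
    return sum;
}
```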

In the original camera API, setPreviewCallbackWithBuffer() was the only sensible way to get data onto the CPU for processing. The data arrived with a separate, packed Y plane, so it was ideal for the CPU processing. Getting that frame into OpenGL for low-overhead rendering was the more complicated part. In the end I wrote a NEON color-conversion routine to output RGB565 and just used glTexSubImage2D to get it onto the GPU. That was first implemented in the Nexus 1 timeframe, where even a 320x240 glTexSubImage2D call took 50ms of CPU time (presumably poor drivers attempting texture swizzling — this improved significantly in a later system update).
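For reference, a rough sketch of the shape of that legacy pipeline (names and sizes are illustrative, the NEON conversion is a stub, and synchronization between the callback thread and the GL thread is omitted):

```java
import android.graphics.ImageFormat;
import android.hardware.Camera;
import android.opengl.GLES20;
import java.nio.ByteBuffer;

// Sketch of the old Camera API pipeline: NV21 frames arrive in a pre-allocated
// buffer, get converted to RGB565 on the CPU, and are uploaded with
// glTexSubImage2D into a texture previously allocated as GL_RGB / RGB565.
class LegacyPreviewPipeline implements Camera.PreviewCallback {
    private final int width = 320, height = 240;           // assumed preview size
    private final ByteBuffer rgb565 =
            ByteBuffer.allocateDirect(width * height * 2); // 16 bits per pixel

    void start(Camera camera) {
        Camera.Parameters params = camera.getParameters();
        params.setPreviewFormat(ImageFormat.NV21);
        params.setPreviewSize(width, height);
        camera.setParameters(params);

        int bufSize = width * height * ImageFormat.getBitsPerPixel(ImageFormat.NV21) / 8;
        camera.addCallbackBuffer(new byte[bufSize]);        // pre-allocated, reused
        camera.setPreviewCallbackWithBuffer(this);
        // Note: most devices also need setPreviewTexture()/setPreviewDisplay()
        // before startPreview() will deliver frames (omitted here).
        camera.startPreview();
    }

    @Override
    public void onPreviewFrame(byte[] nv21, Camera camera) {
        // CPU work: the Y plane is the first width*height bytes of nv21.
        // convertNv21ToRgb565(nv21, rgb565);               // NEON routine (not shown)
        camera.addCallbackBuffer(nv21);                     // return buffer to the pool
    }

    // Called on the GL thread with the destination texture already bound.
    void uploadToGl() {
        GLES20.glTexSubImage2D(GLES20.GL_TEXTURE_2D, 0, 0, 0, width, height,
                GLES20.GL_RGB, GLES20.GL_UNSIGNED_SHORT_5_6_5, rgb565);
    }
}
```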

Back then I looked at things like the eglImage extensions, but they didn't seem to be available or well documented enough for user applications. I had a brief look at the internal android GraphicBuffer classes, but ideally I want to stay in the land of supported public APIs.

The android.hardware.camera2 API looked promising, with the ability to attach both an ImageReader and a SurfaceTexture to a capture session. Unfortunately I can't see any way of ensuring the right sequential pipeline here — holding off on calling updateTexImage() until the CPU has finished processing is easy enough, but if another frame arrives during that processing then updateTexImage() will skip straight to the latest frame. It also looks like with multiple outputs there will be independent copies of the frames in each of the queues, which I'd like to avoid.
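For reference, this is roughly how the two outputs get attached (sketch only; camera-open callbacks, error handling, and size selection are omitted, and the names are illustrative). Each output Surface has its own buffer queue, which is where the independent copies come from:

```java
import android.graphics.ImageFormat;
import android.graphics.SurfaceTexture;
import android.hardware.camera2.CameraCaptureSession;
import android.hardware.camera2.CameraDevice;
import android.hardware.camera2.CaptureRequest;
import android.media.ImageReader;
import android.os.Handler;
import android.view.Surface;
import java.util.Arrays;

// Sketch: attach both an ImageReader (CPU access) and a SurfaceTexture
// (GL access) to a single camera2 capture session.
void createDualOutputSession(final CameraDevice device, int oesTextureId,
                             int width, int height, final Handler handler) throws Exception {
    ImageReader reader = ImageReader.newInstance(width, height,
            ImageFormat.YUV_420_888, /*maxImages=*/3);

    SurfaceTexture surfaceTexture = new SurfaceTexture(oesTextureId);
    surfaceTexture.setDefaultBufferSize(width, height);
    Surface textureSurface = new Surface(surfaceTexture);

    final CaptureRequest.Builder request =
            device.createCaptureRequest(CameraDevice.TEMPLATE_PREVIEW);
    request.addTarget(reader.getSurface());
    request.addTarget(textureSurface);

    device.createCaptureSession(
            Arrays.asList(reader.getSurface(), textureSurface),
            new CameraCaptureSession.StateCallback() {
                @Override public void onConfigured(CameraCaptureSession session) {
                    try {
                        session.setRepeatingRequest(request.build(), null, handler);
                    } catch (Exception e) { /* ignored in this sketch */ }
                }
                @Override public void onConfigureFailed(CameraCaptureSession session) { }
            }, handler);
}
```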

Ideally, this is what I would like (a rough sketch of this handshake appears after the list):

  • Camera driver fills a buffer with the latest frame
  • The CPU gets a pointer to the data in memory and can read the Y data without a copy being made
  • The CPU processes the data and sets a flag in my code when the frame is ready
  • When beginning to render a frame, check whether a new frame is ready
  • Call some API to bind that same memory as a GL texture
  • When a newer frame is ready, release the buffer holding the old frame back to the pool
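Purely as an illustration of that handshake — the zero-copy texture bind in the middle is exactly the step I can't find a public API for — something like this (all names hypothetical):

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustration only: the producer/consumer handshake from the list above.
// "Frame" stands for whatever buffer handle the camera/driver hands out;
// bindAsTexture() is the part with no obvious public API.
class FrameGate {
    private final AtomicReference<Frame> ready = new AtomicReference<>();
    private Frame displayed;                         // frame currently bound for GL

    // CPU side: called when processing of a camera frame has finished.
    void onCpuProcessingDone(Frame frame) {
        Frame skipped = ready.getAndSet(frame);
        if (skipped != null) skipped.release();      // processed but never displayed
    }

    // GL side: called at the start of rendering each display frame.
    void onBeginRender() {
        Frame next = ready.getAndSet(null);
        if (next == null) return;                    // no new frame: re-draw the old one
        // next.bindAsTexture();                     // <-- the missing zero-copy step
        if (displayed != null) displayed.release();  // return old buffer to the pool
        displayed = next;
    }

    interface Frame {                                // hypothetical buffer handle
        void release();
    }
}
```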

I don't see a way to do exactly that zero-copy style with the public APIs on Android, but what's the closest I can get?

One crazy thing I tried that seems to work, but is not documented: the ANativeWindow NDK API can accept NV12-format data, even though the appropriate format constant is not among those in the public headers. That allows a SurfaceTexture to be filled with NV12 data by memcpy(), avoiding CPU-side color conversion and any swizzling that happens driver-side in glTexImage2D. That is still an extra copy of the data that feels like it shouldn't be needed, and again, since it's undocumented, it might not work on all devices. A supported sequential zero-copy Camera -> ImageReader -> SurfaceTexture (or equivalent) would be perfect.

android android-ndk opengl-es android-camera
1 answer

The most efficient way to process video is to avoid the CPU altogether, but it sounds like that's not an option for you. The public APIs are generally geared toward doing everything in hardware, since that's what the framework itself needs, although there are some paths through RenderScript. (I'm assuming you've seen the Grafika filter demo, which uses fragment shaders.)
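As a sketch of that GPU-only route (illustrative only; the varying and uniform names assume a typical external-texture vertex shader along the lines of Grafika's), the grayscale step itself is trivial in a fragment shader that samples the camera's SurfaceTexture as an external texture, so no pixels ever touch the CPU:

```java
// Fragment shader source (GLSL ES 1.00) for sampling the camera frame as a
// GL_TEXTURE_EXTERNAL_OES texture and emitting luma only.
static final String EXTERNAL_LUMA_FRAGMENT_SHADER =
        "#extension GL_OES_EGL_image_external : require\n" +
        "precision mediump float;\n" +
        "varying vec2 vTexCoord;\n" +
        "uniform samplerExternalOES sTexture;\n" +
        "void main() {\n" +
        "    vec3 rgb = texture2D(sTexture, vTexCoord).rgb;\n" +
        "    float luma = dot(rgb, vec3(0.299, 0.587, 0.114));\n" +  // BT.601 weights
        "    gl_FragColor = vec4(vec3(luma), 1.0);\n" +
        "}\n";
```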

Getting at the data on the CPU used to mean either the slow camera APIs or working with GraphicBuffer and relatively obscure EGL functions (like in this question). The point of ImageReader was to provide zero-copy access to YUV data from the camera.
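For the CPU side, a minimal ImageReader sketch (names illustrative, setup and threading omitted) — the Y plane of a YUV_420_888 Image comes back as a direct ByteBuffer backed by the camera buffer, so your code reads it without an intervening copy; just mind the row stride:

```java
import android.media.Image;
import android.media.ImageReader;
import java.nio.ByteBuffer;

// Sketch: read the grayscale (Y) plane directly from the ImageReader output.
final ImageReader.OnImageAvailableListener onImageAvailable = reader -> {
    Image image = reader.acquireLatestImage();
    if (image == null) return;
    try {
        Image.Plane yPlane = image.getPlanes()[0];   // plane 0 is Y
        ByteBuffer y = yPlane.getBuffer();           // direct buffer, no copy yet
        int rowStride = yPlane.getRowStride();       // may be wider than the image
        int pixelStride = yPlane.getPixelStride();   // guaranteed 1 for the Y plane
        // ... read grayscale samples straight out of 'y' here ...
    } finally {
        image.close();                               // returns the buffer to the camera
    }
};
```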

You can't serialize Camera -> ImageReader -> SurfaceTexture because ImageReader doesn't have a "forward the buffer" API. Which is unfortunate, as that would make this trivial. You could try to replicate what SurfaceTexture does, using the EGL functions to wrap the buffer as an external texture, but again you're into non-public GraphicBuffer territory, and I'd worry about buffer ownership/lifetime issues.

I'm not sure how the parallel paths would help you (Camera2 -> ImageReader, Camera2 -> SurfaceTexture), since what's being sent to the SurfaceTexture won't have your modifications. FWIW, it doesn't involve an extra copy — in Lollipop or thereabouts, BufferQueue was updated to allow individual buffers to move through multiple queues.

It's entirely possible there are some shiny new APIs I haven't seen yet, but from what I know, your ANativeWindow approach is probably the winner. I suspect you'd be better off with one of the camera formats (YV12 or NV21) than NV12, but I don't know for sure.

FWIW, you will drop frames if your processing takes too long, but unless your processing is uneven (some frames take much longer than others) you'll have to drop frames no matter what. Getting into the realm of non-public APIs again, you could switch the SurfaceTexture to "synchronous" mode, but if your buffers fill up you're still dropping frames.
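A rough sketch of deliberate frame dropping with SurfaceTexture in its default (asynchronous) mode (illustrative only; names are made up): count arrivals from onFrameAvailable, and on the GL thread latch whatever is newest — a backlog simply collapses to the latest frame.

```java
import android.graphics.SurfaceTexture;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: drop-to-latest consumption of camera frames via SurfaceTexture.
class FrameCounter implements SurfaceTexture.OnFrameAvailableListener {
    private final AtomicInteger pending = new AtomicInteger();

    @Override public void onFrameAvailable(SurfaceTexture st) {
        pending.incrementAndGet();                   // called on an arbitrary thread
    }

    // GL thread: returns true if a new camera frame was latched for this draw.
    boolean updateIfFrameAvailable(SurfaceTexture st) {
        if (pending.getAndSet(0) == 0) return false;
        st.updateTexImage();                         // skips straight to the newest frame
        return true;
    }
}
```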

