What are bitstream filters in FFmpeg?

After carefully reading the FFmpeg Bitstream Filters documentation, I still don't understand what they are for.

The document states that the filter:

performs stream level modifications without decoding

Can anyone explain this to me? A use case would greatly clarify things. Also, there are several different filters. How do they differ?

1 answer

Let me explain with an example. FFmpeg video decoders typically work by decoding one video frame per call to avcodec_decode_video2. So the input is expected to be "one image" worth of bitstream data. Let's consider for a moment the problem of going from a file (an array of bytes on disk) to images.

For "raw" (Annex B) H.264 files (.h264 / .bin / .264), the individual NAL unit data (SPS / PPS header bitstreams or coded frame data) is concatenated into a sequence of NAL units, with a start code (00 00 01 XX) in between, where XX is the header byte that carries the NAL unit type. (To prevent the NAL data itself from containing 00 00 01, it is RBSP-escaped.) So an H.264 frame parser can simply cut the file at the start code markers. It searches for successive packets that start with, and include, 00 00 01 and run up to, but exclude, the next occurrence of 00 00 01. Then it parses the NAL unit type and the slice header to find which frame each packet belongs to, and returns the set of NAL units making up one frame as input to the H.264 decoder.
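To make the start-code splitting concrete, here is a minimal Python sketch. It is an illustration only, not FFmpeg's parser: the function name `split_annexb` is made up, it ignores RBSP un-escaping and frame grouping, and it returns each NAL unit without its start code.

```python
def split_annexb(data: bytes) -> list[bytes]:
    """Split an Annex B byte stream into NAL units (start codes removed).

    Hypothetical helper for illustration; it only cuts at 00 00 01 markers.
    """
    start = b"\x00\x00\x01"
    units = []
    pos = data.find(start)
    while pos != -1:
        begin = pos + len(start)
        nxt = data.find(start, begin)
        end = len(data) if nxt == -1 else nxt
        # A 4-byte start code (00 00 00 01) leaves a zero byte at the end of
        # the previous unit. A NAL unit never ends in 00, so stripping is safe.
        unit = data[begin:end].rstrip(b"\x00")
        if unit:
            units.append(unit)
        pos = nxt
    return units


def nal_unit_type(unit: bytes) -> int:
    """Return the H.264 NAL unit type: the low 5 bits of the first byte."""
    return unit[0] & 0x1F


if __name__ == "__main__":
    # Made-up payload bytes; only the header bytes 67 / 68 / 65 are meaningful.
    stream = (
        b"\x00\x00\x00\x01\x67\xaa\xbb"  # type 7: SPS
        b"\x00\x00\x01\x68\xcc"          # type 8: PPS
        b"\x00\x00\x01\x65\x11\x22"      # type 5: IDR slice
    )
    for unit in split_annexb(stream):
        print(nal_unit_type(unit), unit.hex())
```

A real parser would go on to read the slice headers to decide which NAL units belong to the same frame; that step is left out here.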

H.264 data in .mp4 files is different. You can imagine that the 00 00 01 start code is redundant if the muxing format already has length markers in it, as is the case for mp4. So, to save 3 bytes per frame, the 00 00 01 prefix is removed. The PPS / SPS are also put in the file header instead of being prepended to the first frame, and they too lack their 00 00 01 prefixes. So if I fed this into the H.264 decoder, which expects the prefixes on all NAL units, it would not work. The h264_mp4toannexb bitstream filter fixes this by identifying the PPS / SPS in the extracted parts of the file header (FFmpeg calls this "extradata"), prepending a start code to these and to each NAL unit from the individual frame packets, and concatenating them back together before feeding them into the H.264 decoder.
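The rewrite described above can be sketched in a few lines of Python. This is a simplified model, not the code of h264_mp4toannexb: the function name is hypothetical, it assumes each NAL unit in the packet is preceded by a 4-byte big-endian length, and it assumes the SPS / PPS have already been pulled out of the extradata as raw NAL bytes (parsing the extradata layout itself is not shown).

```python
START_CODE = b"\x00\x00\x00\x01"


def length_prefixed_to_annexb(packet: bytes,
                              parameter_sets: tuple = (),
                              length_size: int = 4) -> bytes:
    """Turn one length-prefixed (mp4-style) packet into Annex B bytes.

    Hypothetical helper: `parameter_sets` holds raw SPS/PPS NAL units that
    the caller has already extracted from the container's extradata.
    """
    out = bytearray()
    # Put the header NAL units in front, each with a start code.
    for ps in parameter_sets:
        out += START_CODE + ps
    pos = 0
    while pos < len(packet):
        if pos + length_size > len(packet):
            raise ValueError("truncated length field")
        size = int.from_bytes(packet[pos:pos + length_size], "big")
        pos += length_size
        if pos + size > len(packet):
            raise ValueError("NAL unit runs past end of packet")
        # Replace the length prefix with a start code.
        out += START_CODE + packet[pos:pos + size]
        pos += size
    return bytes(out)


if __name__ == "__main__":
    sps, pps = b"\x67\xaa", b"\x68\xcc"            # made-up payloads
    packet = (3).to_bytes(4, "big") + b"\x65\x11\x22"
    print(length_prefixed_to_annexb(packet, (sps, pps)).hex())
```

With the real tool, the same conversion is requested on the command line, for example `ffmpeg -i input.mp4 -c:v copy -bsf:v h264_mp4toannexb output.h264` (the file names are placeholders). Because the video is copied rather than re-encoded, only the packaging of the bitstream changes.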

Now you may feel that there is a very fine line between a parser and a bitstream filter. That is true. I believe the official definition is that a parser takes a sequence of input data and splits it into frames without discarding or adding any data. The only thing a parser does is change packet boundaries. A bitstream filter, on the other hand, is allowed to actually modify the data. I'm not sure this definition is entirely accurate (see, for example, VP9 below), but it is the conceptual reason why mp4toannexb is a BSF and not a parser (it adds the 00 00 01 prefixes).

Other cases where such "bitstream tweaks" help keep decoders simple and uniform, while still letting us support all the file variants that exist in the wild:

  • mpeg4 (divx) B-frame unpacking. To store B-frame sequences such as I-B-P, which are coded as I-P-B, in AVI and still get the timestamps right, people came up with the concept of B-frame packing, where I-B-P / I-P-B is packed into packets as I-(PB)-(), that is, the third packet is empty and the second holds two frames. This means the timestamps associated with the P and B frames at the decoding stage are correct. It also means you have two frames' worth of input data in one packet, which violates FFmpeg's one-frame-in, one-frame-out concept. So we wrote a BSF that splits the packet back into two, and also deletes the marker that says the packet contains two frames (hence a BSF and not a parser), before the data is fed into the decoder. In practice, this solves otherwise hard problems with frame multithreading. VP9 does the same thing (there it is called superframes) but splits the frames in the parser, so the parser / BSF split is not always theoretically perfect; perhaps the VP9 one should be called a BSF.
  • hevc mp4 to Annex B conversion (same story as above, but for HEVC)
  • aac adts to asc conversion (basically the same as H.264 / HEVC Annex B vs. mp4, but for AAC audio)
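The packed B-frame case from the first bullet can be modeled as a toy in Python. This is only a sketch of the idea, not the actual filter: frames are represented by labels such as "I", "P" and "B" instead of real bitstream data, the function name is made up, and the removal of the "packed" marker is not modeled.

```python
def unpack_packed_bframes(packets):
    """Toy model: turn I-(PB)-() packing back into one frame per packet.

    `packets` is a list of lists of frame labels (an assumption made for
    illustration; real packets contain coded bitstream data).
    """
    out = []
    held = None  # second frame of a packed packet, waiting for its slot
    for frames in packets:
        if len(frames) == 2:
            # Packed packet: emit the first frame now, hold the second.
            first, held = frames
            out.append([first])
        elif len(frames) == 0 and held is not None:
            # Empty placeholder packet: emit the held frame in its place.
            out.append([held])
            held = None
        else:
            out.append(list(frames))
    return out


if __name__ == "__main__":
    print(unpack_packed_bframes([["I"], ["P", "B"], []]))
```

The number of packets stays the same on both sides, which is what lets each output packet keep the timestamp of the input packet it replaces.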