Introduction to Video Compression Technology

Is it useful to study the H.261 specification as a basis for implementing modern video compression, or should I start somewhere else? I'm not sure where to begin, but H.261 seems simple enough to make the concepts easier to understand.

2 answers

The specification is not a very good introduction: it is written primarily to be precise, and contains little explanation of why anything is the way it is. H.261 is essentially the same as MPEG-1. One book I have used (and found reasonably well written) is MPEG Video Compression Standard, by Mitchell, Pennebaker, Fogg, and LeGall. FWIW, it covers both MPEG-1 and MPEG-2 (MPEG-1 video is closely related to H.261, and MPEG-2 video is also published as H.262).

I partially agree with Jerry Coffin; I think that H.261 is definitely a good starting point for anyone learning about video compression, but reading the spec directly is not a good idea.

The main building blocks of H.261 that I would focus on are motion compensation, macroblocks, the DCT to reduce spatial redundancy, and differential PCM (DPCM) to reduce temporal redundancy.
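To make the DPCM idea concrete, here is a minimal sketch, assuming each frame is just a flat list of pixel values and that no quantization is applied (so it is lossless). The function names `dpcm_encode` and `dpcm_decode` are made up for illustration; a real encoder predicts from the reconstructed previous frame and adds motion compensation on top.

```python
def dpcm_encode(frames):
    """Keep the first frame as-is (difference from an all-zero frame),
    then store only the per-pixel difference from the previous frame."""
    prev = [0] * len(frames[0])
    residuals = []
    for frame in frames:
        residuals.append([cur - p for cur, p in zip(frame, prev)])
        prev = frame
    return residuals


def dpcm_decode(residuals):
    """Rebuild each frame by adding its residual to the previous frame."""
    prev = [0] * len(residuals[0])
    frames = []
    for res in residuals:
        cur = [p + r for p, r in zip(prev, res)]
        frames.append(cur)
        prev = cur
    return frames


# Two tiny "frames": a bright two-pixel object moves one pixel to the right.
frames = [
    [10, 10, 12, 200, 200, 11],
    [10, 10, 12, 12, 200, 200],
]
residuals = dpcm_encode(frames)
decoded = dpcm_decode(residuals)
```

The second residual is zero wherever nothing changed, which is exactly what makes it cheaper to code than the frame itself.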

If I had to choose one general principle of video compression to learn first, I would start with motion estimation and motion compensation. Try this exercise: imagine two consecutive video frames separated by only 1/30 of a second. They will be very similar, right? Without looking it up on the Internet, how would you use the information already encoded in frame 1 to reduce the size of the code for frame 2? Now look up motion estimation search.
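Once you have tried the exercise, one common answer is block matching. The sketch below is an exhaustive (full) search using the sum of absolute differences (SAD) as the matching cost. The 4x4 block size, the +/-2 pixel search range, and the names `sad` and `full_search` are assumptions chosen to keep the example small; they are not taken from the H.261 specification.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n=4, search=2):
    """Try every displacement within +/- `search` pixels and return
    (best_dx, best_dy, best_sad) for the block of `cur` at (bx, by)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + n <= width):
                continue
            if not (0 <= by + dy and by + dy + n <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Frame 1: a distinctive 8x8 test pattern.
# Frame 2: the same content shifted one pixel to the right.
frame1 = [[(7 * x + 13 * y) % 31 for x in range(8)] for y in range(8)]
frame2 = [[frame1[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]

dx, dy, cost = full_search(frame2, frame1, bx=2, by=2)
```

Here the motion vector comes out as (-1, 0) with a SAD of 0: the block in frame 2 is found one pixel to the left in frame 1, so only the vector needs to be sent for that block.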

Then, how would you reduce spatial redundancy? H.261 does something similar to JPEG and uses the DCT.
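To see why the DCT helps, here is a naive 2-D DCT-II on an 8x8 block. It is a sketch only: the orthonormal scaling and the name `dct_2d` are my choices for illustration, and practical codecs use fast, integer-friendly implementations rather than this four-level loop.

```python
import math


def dct_2d(block):
    """Naive 2-D DCT-II of a square block, with orthonormal scaling."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for v in range(n):          # vertical frequency
        for u in range(n):      # horizontal frequency
            s = 0.0
            for y in range(n):
                for x in range(n):
                    s += (block[y][x]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[v][u] = scale(u) * scale(v) * s
    return out


# A smooth 8x8 block: a gentle horizontal ramp, identical in every row.
block = [[100 + 2 * x for x in range(8)] for y in range(8)]
coeffs = dct_2d(block)
```

Because every row of this block is the same, all coefficients with a nonzero vertical frequency vanish, and the energy collects in the first row of `coeffs`. Smooth image regions behave similarly, which is what makes the later quantization step effective.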

Edit: from Wang, Ostermann, and Zhang (pp. 293-4, on block-based hybrid video coding, which is essentially H.261):

In this coder, each video frame is divided into blocks of a fixed size, and each block is processed more or less independently, hence the name "block-based". The word "hybrid" means that each block is coded using a combination of motion-compensated temporal prediction and transform coding .... A block is first predicted from a previously coded reference frame using block-based motion estimation. The motion vector specifies the displacement between the current block and the best matching block. The predicted block is obtained from the previous frame based on the estimated MV using motion compensation. Then, the prediction error block is coded by transforming it using DCT, quantizing the DCT coefficients, and converting them into binary codewords using variable length coding.
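The last step in that description (quantize the coefficients, then turn them into codewords) can be sketched as follows. This is a simplification under stated assumptions: a single uniform step size, a 4x4 block with made-up coefficient values, and (run, level) pairs standing in for the actual variable-length code tables, which are not reproduced here. The names `zigzag_order`, `quantize`, and `run_level` are illustrative.

```python
def zigzag_order(n):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):
        diag = [(y, s - y) for y in range(n) if 0 <= s - y < n]
        # Alternate the direction on each anti-diagonal.
        order.extend(diag if s % 2 else reversed(diag))
    return order


def quantize(coeffs, step):
    """Uniform quantizer: divide by the step size and round."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


def run_level(levels):
    """Turn scanned levels into (zeros_before, level) pairs; the trailing
    run of zeros is replaced by a single end-of-block marker."""
    pairs, run = [], 0
    for level in levels:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    pairs.append("EOB")
    return pairs


# Hypothetical transform coefficients of a 4x4 prediction-error block.
coeffs = [
    [50.0, -18.0, 9.0, 0.5],
    [11.0, 2.0, -1.0, 0.0],
    [-9.0, 1.0, 0.0, 0.0],
    [1.5, 0.0, 0.0, 0.0],
]
levels = quantize(coeffs, step=8)
scanned = [levels[y][x] for y, x in zigzag_order(4)]
symbols = run_level(scanned)
```

Sixteen coefficients collapse into five (run, level) pairs plus an end-of-block marker. An entropy coder would then assign short codewords to the most common pairs.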

Source: https://habr.com/ru/post/1313101/

