What is the algorithm for choosing the best video scene?

When we upload videos to YouTube or other video sharing sites, the site automatically selects the best or most representative scene from the video to display as the video thumbnail. How is this done? I want to know which smart data or other algorithms to study in order to extract the most relevant scene from a video. Any pointers to literature or implementations would be very helpful.

+4
3 answers

My suggestion:

  1. Set i = 1; the first sequence starts at frame 0.
  2. Compare frame i with frame i-1 (using, for example, the sum of the squared differences in pixel color intensity).
  3. Is the difference > preset_threshold?
     • If yes: the sequence of below-threshold frames has just ended, and a new one starts at frame i. Was the sequence that ended the longest so far?
       • If yes: best = start of that sequence.
  4. i++
  5. If i < length_of_clip: go to 2. Otherwise the final sequence ends with the clip; check it the same way as in step 3.
  6. Select the best frame.

The idea is this: find the longest "scene" (a series of frames whose transitions are below some arbitrary threshold) and show the first frame in this series.
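The steps above could be sketched in Python roughly as follows. This is only an illustration under some assumptions: each frame is represented as a flat sequence of pixel intensities, and the names `frame_difference` and `first_frame_of_longest_scene` are made up for this example. The sketch also closes the last scene when the clip ends, so that a long final scene is not missed.

```python
from typing import Sequence

# Assumption for this sketch: a frame is a flat sequence of pixel intensities.
Frame = Sequence[float]


def frame_difference(a: Frame, b: Frame) -> float:
    """Sum of squared differences in pixel intensity between two frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def first_frame_of_longest_scene(frames: Sequence[Frame], threshold: float) -> int:
    """Return the index of the first frame of the longest run of similar frames."""
    if not frames:
        raise ValueError("clip has no frames")

    best_start, best_length = 0, 0
    scene_start = 0
    for i in range(1, len(frames) + 1):
        # A scene ends at a cut (difference above the threshold)
        # or when the clip itself ends.
        at_end = i == len(frames)
        if at_end or frame_difference(frames[i], frames[i - 1]) > threshold:
            length = i - scene_start
            if length > best_length:
                best_start, best_length = scene_start, length
            scene_start = i
    return best_start
```

For example, with the toy clip `[[0], [0], [10], [10], [10], [50]]` and a threshold of 5, the scenes are frames 0-1, 2-4 and 5, so the function returns index 2. The threshold value has to be tuned for real footage.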

+5

I strongly suspect that the "algorithm" is approximately (in pseudocode):

 Random(0, clip.Length) 
+4

A simple solution is to extract a few frames from the video and display them at random. By tracking how often users click on each one, YouTube can then learn how to rank these frames.
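A minimal sketch of that idea, with no claim that YouTube works this way: show a randomly chosen candidate frame, count impressions and clicks for each one, and rank candidates by click rate. The class and method names (`ThumbnailExperiment`, `pick_to_show`, `record_click`, `best_frame`) are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class CandidateStats:
    impressions: int = 0
    clicks: int = 0

    @property
    def click_rate(self) -> float:
        # Frames that were never shown get a rate of 0.
        return self.clicks / self.impressions if self.impressions else 0.0


class ThumbnailExperiment:
    """Hypothetical helper: shows candidate frames at random and ranks them by clicks."""

    def __init__(self, frame_indices, rng=None):
        self.stats = {i: CandidateStats() for i in frame_indices}
        self.rng = rng or random.Random()

    def pick_to_show(self) -> int:
        """Choose a candidate frame uniformly at random and count the impression."""
        frame = self.rng.choice(list(self.stats))
        self.stats[frame].impressions += 1
        return frame

    def record_click(self, frame: int) -> None:
        """Record that a user clicked the thumbnail showing this frame."""
        self.stats[frame].clicks += 1

    def best_frame(self) -> int:
        """Return the candidate with the highest observed click rate."""
        return max(self.stats, key=lambda f: self.stats[f].click_rate)
```

With few impressions the click rates are noisy, so a real system would need enough traffic per candidate before trusting the ranking.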

+1
