What format is the source (VHS, DVD, stills)? It is possible that the timestamp is encoded in the data itself.
Update with more details:
While I fully understand the desire for an automated end-to-end process (especially if you are selling this application rather than building an in-house tool), it would be more efficient to have someone manually enter the start time of each video (even if there are hundreds of them) than to spend a few weeks coding it to work automatically.
What I would do (short of finding a simple, very fast, highly accurate OCR solution, which I don't think exists):
Create a pair of database tables, for example:

    video            video_group
    -------          -----------
    id               id
    filename         title
    start_time       date_created
    group_id         date_modified
    date_created     date_deleted
    date_modified
    date_deleted

video_group may contain:

    id | title
    -------------------------
    1  | Unassigned
    2  | 711 Mockingbird @ 75
    3  | Kroger storage room

video would be pre-populated with the video file names by an import script. Initially assign every row a group_id of 1 (Unassigned).
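To make the table setup and import step concrete, here is a minimal sketch using Python and SQLite. The database engine, the column types, the list of file extensions, and the function names `create_schema` and `import_videos` are all my assumptions for illustration; use whatever database and language you already have.

```python
import sqlite3
from pathlib import Path

# Column names follow the two tables described above; the types are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS video_group (
    id            INTEGER PRIMARY KEY,
    title         TEXT NOT NULL,
    date_created  TEXT DEFAULT CURRENT_TIMESTAMP,
    date_modified TEXT,
    date_deleted  TEXT
);
CREATE TABLE IF NOT EXISTS video (
    id            INTEGER PRIMARY KEY,
    filename      TEXT NOT NULL UNIQUE,
    start_time    TEXT,
    group_id      INTEGER NOT NULL DEFAULT 1 REFERENCES video_group(id),
    date_created  TEXT DEFAULT CURRENT_TIMESTAMP,
    date_modified TEXT,
    date_deleted  TEXT
);
"""

UNASSIGNED_GROUP_ID = 1
# Hypothetical list of extensions; adjust to whatever your recorder produces.
VIDEO_EXTENSIONS = {".avi", ".mpg", ".mp4", ".wmv"}


def create_schema(conn):
    """Create both tables and make sure the 'Unassigned' group exists."""
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT OR IGNORE INTO video_group (id, title) VALUES (?, ?)",
        (UNASSIGNED_GROUP_ID, "Unassigned"),
    )
    conn.commit()


def import_videos(conn, folder):
    """Insert one row per video file in folder; return how many rows were added."""
    before = conn.total_changes
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in VIDEO_EXTENSIONS:
            # Files that were already imported are skipped by the UNIQUE constraint.
            conn.execute(
                "INSERT OR IGNORE INTO video (filename, group_id) VALUES (?, ?)",
                (path.name, UNASSIGNED_GROUP_ID),
            )
    conn.commit()
    return conn.total_changes - before
```

Running `import_videos` a second time on the same folder adds nothing, so it is safe to re-run when new files arrive.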
Create a simple WinForms or WPF application (excuse my ASCII art):

     - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    | Group: [=========]\/  [New group...]                            |
    |                                                                 |
    | File:  [=========]\/                                            |
    |                                                                 |
    | Preview                                                         |
    | |--------------------------------------|  [Next Video]          |
    | | (first frame of selected video here) |  [Prev]                |
    | |                                      |                        |
    | |                                      |                        |
    | |                                      |                        |
    | |--------------------------------------|                        |
    | Start Time                                                      |
    | [(enter start time value here as displayed on preview frame)]   |
    |                                                                 |
    | [Update]                                                        |
    -------------------------------------------------------------------

The user (anyone can do this: a secretary, the janitor, even a recent CS graduate) only needs to read the time from the preview frame, enter it in the Start Time field, and click "Update" or "Next Video" to update the database and move on to the next video. Keep the group selection from one video to the next unless the user changes it.
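The logic behind the "Update" and "Next Video" buttons is small. Below is a rough sketch in Python with SQLite rather than WinForms code; the names `save_start_time` and `next_video_id` and the `TIME_FORMAT` string are hypothetical, and the format in particular has to match whatever your camera overlay actually displays.

```python
import sqlite3
from datetime import datetime

# Assumption: the on-screen overlay looks like "2010-03-04 13:45:10".
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def save_start_time(conn, video_id, entered_text, group_id):
    """Validate the typed time and store it with the currently selected group.

    Raises ValueError if the text does not match TIME_FORMAT, so the form
    can show an error instead of saving a typo.
    """
    start = datetime.strptime(entered_text.strip(), TIME_FORMAT)
    cur = conn.execute(
        "UPDATE video SET start_time = ?, group_id = ?, "
        "date_modified = CURRENT_TIMESTAMP WHERE id = ?",
        (start.isoformat(sep=" "), group_id, video_id),
    )
    conn.commit()
    return cur.rowcount == 1  # False if no such video row exists


def next_video_id(conn, current_id):
    """Return the id of the next non-deleted video, or None at the end."""
    row = conn.execute(
        "SELECT id FROM video WHERE id > ? AND date_deleted IS NULL "
        "ORDER BY id LIMIT 1",
        (current_id,),
    ).fetchone()
    return row[0] if row else None
```

The form would keep passing the same `group_id` on every save until the user picks a different group, which gives you the "sticky" group selection for free.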
Assuming it takes the user 30 seconds to read, enter, and click, they can complete 100-150 videos in an hour (call it 75 for a more realistic estimate). And a trainee's time is much cheaper than a developer's.
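For what it's worth, the arithmetic is easy to check. The 500-video backlog below is a made-up number purely for illustration; plug in your own count.

```python
def videos_per_hour(seconds_per_video):
    """Ideal throughput if every video takes the same time and nobody takes breaks."""
    return 3600 / seconds_per_video


def hours_needed(video_count, rate_per_hour):
    """Hours of data entry for a backlog at a given hourly rate."""
    return video_count / rate_per_hour


print(videos_per_hour(30))              # 120.0, inside the 100-150 range
print(round(hours_needed(500, 75), 1))  # 6.7 hours at the pessimistic 75/hour
```

Even at the pessimistic rate, a backlog of that size is about one working day of data entry.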
If you really have "hundreds" of videos, this will still get the job done faster than fine-tuning OCR. And even if OCR works for the most part, you will most likely still need someone to check everything manually to make sure the results are correct. Which raises the question: why bother with OCR?