Fine tune image stitching in OpenCV

Question

(New to computer vision)

The goal is to recreate a game level using image stitching or any other method. Someone playing the level is recorded on video, and the frames of that video are the input.

Expected result (Level 4-4 of SMB, from http://www.vgmaps.com/): [image]

This is my first attempt at solving this problem with OpenCV (EmguCV). So far, the results are excellent, but are there more suitable methods, given that my input is strictly 2D?

I am open to trying another framework or technique, as long as it is not too complicated.

Here are the source images:

[image]