For all intents and purposes, I believe the output cache lives entirely in memory - which means that if the application pool is recycled, the image will have to be generated again.
I've had to do something similar in the past, and I actually implemented a two-tier system that primarily used the in-memory cache (HttpContext.Cache) and fell back to the file system. If an image didn't exist, I generated it, saved it to disk, and also put it in the cache. That way, if it gets pushed out of the cache or the application pool is recycled, I only have to load it from disk rather than regenerate it (it looks like you've done the same).
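Roughly, the two-tier lookup looks like the sketch below. It's written in Python only to keep it short and self-contained, not as ASP.NET code: the class name, the `render` callback, the `.img` file naming and the `max_items` limit are all made up for illustration, and it assumes cache keys are safe to use as file names.

```python
import os
from collections import OrderedDict


class TwoTierImageCache:
    """Memory-first cache with a disk fallback (illustrative only)."""

    def __init__(self, disk_dir, render, max_items=64):
        self.disk_dir = disk_dir
        self.render = render          # hypothetical callable: key -> bytes
        self.max_items = max_items    # crude stand-in for memory pressure
        self.memory = OrderedDict()   # stand-in for the in-memory cache
        os.makedirs(disk_dir, exist_ok=True)

    def _path(self, key):
        # Assumes the key is already a safe file name.
        return os.path.join(self.disk_dir, key + ".img")

    def get(self, key):
        # Tier 1: in-memory cache.
        if key in self.memory:
            self.memory.move_to_end(key)  # mark as most recently used
            return self.memory[key]

        # Tier 2: file system.
        path = self._path(key)
        if os.path.exists(path):
            with open(path, "rb") as f:
                data = f.read()
        else:
            # Miss in both tiers: generate once and persist to disk.
            data = self.render(key)
            with open(path, "wb") as f:
                f.write(data)

        self._remember(key, data)
        return data

    def _remember(self, key, data):
        # Put the item in memory, evicting the least recently used entries.
        self.memory[key] = data
        self.memory.move_to_end(key)
        while len(self.memory) > self.max_items:
            self.memory.popitem(last=False)
```

In an ASP.NET application the `memory` dictionary would be replaced by HttpContext.Cache, and "evicted from memory" corresponds to the item being scavenged or the application pool recycling. Either way, the expensive `render` step only runs when the file is missing from disk as well.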
As for "too much memory": if you explicitly use HttpContext.Cache instead of [OutputCache], you can control the priority of each item in the cache. You can then configure the settings on your application pool to limit how much memory it uses as a whole, but beyond that I'm not sure there is much else to do. A couple of images * 12 products doesn't sound like it would need a lot of memory.
Without knowing anything else about your application, it sounds to me like you could get away with just using [OutputCache]. However, if you need something more robust and scalable, I would use the two-tier system I described. Although, if you already have that implemented and working, "if it ain't broke ..."