I have a loop running some computation, for example (pseudocode):

```
for i = 1 to 1000000
    doSomething(i)
```
It will take a long time. I would like to derive some kind of progress indication and, more importantly, an estimate of the time remaining, so that the user can decide whether to sit and twiddle their thumbs, go get a coffee, take a walk, or leave for a week-long vacation while the algorithm crunches its numbers.
To simplify things, you can assume the number of iterations will be large (say, more than 100, so that progress can be printed at each whole percent).
The naive algorithm is to simply measure the time the last iteration took, multiply it by the number of remaining iterations, and report that as the estimate. This breaks down if individual iterations vary greatly in how long they take to complete.
Another approach is to divide the time elapsed since the first iteration by the number of completed iterations, and then multiply that by the remaining iterations. This breaks down if iteration durations are not evenly distributed. For example, if the first few inputs are "difficult" and the inputs get easier toward the end of the array, this will overestimate the remaining time until the run is almost complete (at which point the estimate will only be slightly too high).
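For concreteness, here is a minimal sketch (in Python, since the question is only in pseudocode) of both of these baseline estimators; `doSomething` is just a stub standing in for the real per-iteration work:

```python
import time

def doSomething(i):
    # Stub standing in for the real, expensive per-iteration work.
    pass

def eta_from_last_iteration(last_iter_seconds, iterations_left):
    # Naive estimator: extrapolate from the most recent iteration only.
    return last_iter_seconds * iterations_left

def eta_from_overall_average(elapsed_seconds, iterations_done, iterations_left):
    # Average estimator: extrapolate from the mean of all iterations so far.
    return (elapsed_seconds / iterations_done) * iterations_left

n = 1_000_000
start = time.monotonic()
prev = start
for i in range(1, n + 1):
    doSomething(i)
    now = time.monotonic()
    eta_naive = eta_from_last_iteration(now - prev, n - i)
    eta_average = eta_from_overall_average(now - start, i, n - i)
    prev = now
```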
So, how can one get a better estimate of the remaining time when the time each iteration takes is a non-obvious, arbitrary function of the iteration ordinate (such that analytically deriving and implementing a time-to-completion for each iteration is impractical)?
Two ideas that I imagine could be fruitful avenues of research, but which I cannot fully explore myself right now:
- An exponential moving average of the time taken by each past iteration, multiplied by the number of remaining iterations (see the first sketch after this list).
- Recording the time taken to complete each iteration, then fitting a function to that history and extrapolating (see the second sketch after this list).
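A minimal sketch of the first idea, an exponentially weighted moving average of per-iteration durations; the smoothing factor `alpha` is an assumption you would tune (larger reacts faster to changes, smaller smooths more):

```python
class EmaEstimator:
    """Exponential moving average of per-iteration durations."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha   # assumed smoothing factor, 0 < alpha <= 1
        self.avg = None      # current smoothed per-iteration duration

    def update(self, last_iter_seconds, iterations_left):
        if self.avg is None:
            self.avg = last_iter_seconds
        else:
            self.avg = self.alpha * last_iter_seconds + (1 - self.alpha) * self.avg
        return self.avg * iterations_left
```

Inside the loop you would call `estimator.update(now - prev, n - i)` once per iteration (or once per percent) and display the result.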
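And a sketch of the second idea: fit a curve to the recorded (iterations completed, elapsed seconds) samples and extrapolate it to the final iteration. The polynomial model and `degree=2` are arbitrary assumptions here, not a recommendation; NumPy is assumed to be available.

```python
import numpy as np

def eta_from_curve_fit(iterations_done, elapsed_seconds, total_iterations, degree=2):
    """Fit cumulative elapsed time as a polynomial of iterations completed,
    evaluate the fit at the final iteration count, and subtract the time
    already spent. Needs at least degree + 1 samples."""
    coeffs = np.polyfit(iterations_done, elapsed_seconds, degree)
    predicted_total = np.polyval(coeffs, total_iterations)
    return max(0.0, predicted_total - elapsed_seconds[-1])
```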
Why computationally intensive solutions (e.g., fitting equations) are in order:
First, for the really big tasks where this discussion is worthwhile, run time may be measured in hours or days. Elaborate mathematical operations take milliseconds these days, so the added burden would not be large: in my example above, doSomething clearly takes far more time than a little math would cost, otherwise I would not care about accurately estimating the remaining time in the first place.
Second, one could, for example, bin iterations by percentile. Then, instead of operating on a dataset of "iterations completed vs. time taken", the estimator would operate on a dataset of "percent complete vs. time taken", which has at most 100 data points. This affords further complexity: say your task takes a whole day or more. Estimating the remaining time only once per completed percent means only 100 evaluations of the estimator function. When the run already takes a day, an extra minute and a half spent estimating the remaining time hardly matters, and that already gives you roughly a 1-second budget per estimate for fitting equations and whatnot; 1 second is plenty of time for doing math on a modern system. So solutions that make heavy use of computing resources are welcome.
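A sketch of that binning idea, assuming a hypothetical `work` callable for the loop body and an `estimator` callback that accepts the list of (iterations done, elapsed seconds) samples collected so far:

```python
import time

def run_with_percent_updates(total_iterations, work, estimator):
    # Run the (possibly expensive) estimator at most ~100 times:
    # once each time another whole percent completes.
    samples = []              # (iterations_done, elapsed_seconds) per percent
    start = time.monotonic()
    last_percent = -1
    for i in range(1, total_iterations + 1):
        work(i)
        percent = (i * 100) // total_iterations
        if percent > last_percent:
            last_percent = percent
            samples.append((i, time.monotonic() - start))
            eta = estimator(samples, total_iterations)
            print(f"{percent}% done, roughly {eta:.0f}s remaining")
```

For example, `estimator` could wrap the curve-fitting sketch above, e.g. `lambda s, n: eta_from_curve_fit([x for x, _ in s], [t for _, t in s], n)`, guarding against having too few samples for the chosen degree.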
tl;dr: How to over-engineer an accurate time-remaining estimation function for very long-running tasks.