I have a loop running some computation, for example (pseudocode):

```
for i = 1 to 1000000
    doSomething(i)
```
It will take a long time. I would like to derive some kind of progress indication and, more importantly, an estimate of the time remaining, so that the user can decide whether to sit and twiddle their thumbs, go get a coffee, take a walk, or leave for a week-long vacation while the algorithm crunches its numbers.
To simplify things, you can assume the number of iterations will be large (say, more than 100, so that progress can be printed at each whole percent).
The naive algorithm is to simply measure the time the last iteration took, multiply it by the number of remaining iterations, and report that as the estimate. This breaks down if individual iterations vary greatly in how long they take to complete.
Another approach is to divide the time elapsed since the first iteration by the number of completed iterations, and then multiply that by the remaining iterations. This breaks down if iteration durations are not evenly distributed. For example, if the first few inputs are "difficult" and the inputs get easier toward the end of the array, this will overestimate the remaining time until the run is almost complete (at which point the estimate will only be slightly too high).
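For concreteness, here is a minimal sketch (in Python, since the question is only in pseudocode) of both of these baseline estimators; `doSomething` is just a stub standing in for the real per-iteration work:

```python
import time

def doSomething(i):
    # Stub standing in for the real, expensive per-iteration work.
    pass

def eta_from_last_iteration(last_iter_seconds, iterations_left):
    # Naive estimator: extrapolate from the most recent iteration only.
    return last_iter_seconds * iterations_left

def eta_from_overall_average(elapsed_seconds, iterations_done, iterations_left):
    # Average estimator: extrapolate from the mean of all iterations so far.
    return (elapsed_seconds / iterations_done) * iterations_left

n = 1_000_000
start = time.monotonic()
prev = start
for i in range(1, n + 1):
    doSomething(i)
    now = time.monotonic()
    eta_naive = eta_from_last_iteration(now - prev, n - i)
    eta_average = eta_from_overall_average(now - start, i, n - i)
    prev = now
```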
So, how can one get a better estimate of the remaining time when the time each iteration takes is a non-obvious, arbitrary function of the iteration ordinate (such that analytically deriving and implementing a time-to-completion for each iteration is impractical)?
Two ideas that I imagine could be fruitful avenues of research, but which I cannot fully explore myself right now:
- An exponential moving average of the time taken by each past iteration, multiplied by the number of remaining iterations (see the first sketch after this list).
- Recording the time taken to complete each iteration, then fitting a function to that history and extrapolating (see the second sketch after this list).
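A minimal sketch of the first idea, an exponentially weighted moving average of per-iteration durations; the smoothing factor `alpha` is an assumption you would tune (larger reacts faster to changes, smaller smooths more):

```python
class EmaEstimator:
    """Exponential moving average of per-iteration durations."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha   # assumed smoothing factor, 0 < alpha <= 1
        self.avg = None      # current smoothed per-iteration duration

    def update(self, last_iter_seconds, iterations_left):
        if self.avg is None:
            self.avg = last_iter_seconds
        else:
            self.avg = self.alpha * last_iter_seconds + (1 - self.alpha) * self.avg
        return self.avg * iterations_left
```

Inside the loop you would call `estimator.update(now - prev, n - i)` once per iteration (or once per percent) and display the result.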
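And a sketch of the second idea: fit a curve to the recorded (iterations completed, elapsed seconds) samples and extrapolate it to the final iteration. The polynomial model and `degree=2` are arbitrary assumptions here, not a recommendation; NumPy is assumed to be available.

```python
import numpy as np

def eta_from_curve_fit(iterations_done, elapsed_seconds, total_iterations, degree=2):
    """Fit cumulative elapsed time as a polynomial of iterations completed,
    evaluate the fit at the final iteration count, and subtract the time
    already spent. Needs at least degree + 1 samples."""
    coeffs = np.polyfit(iterations_done, elapsed_seconds, degree)
    predicted_total = np.polyval(coeffs, total_iterations)
    return max(0.0, predicted_total - elapsed_seconds[-1])
```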
Why computationally intensive solutions (e.g., fitting equations) are in order:
First, for the really big tasks where this discussion is worthwhile, run time may be measured in hours or days. Elaborate mathematical operations take milliseconds these days, so the added burden would not be large: in my example above, doSomething clearly takes far more time than a little math would cost, otherwise I would not care about accurately estimating the remaining time in the first place.
Second, one could, for example, bin iterations by percentile. Then, instead of operating on a dataset of "iterations completed vs. time taken", the estimator would operate on a dataset of "percent complete vs. time taken", which has at most 100 data points. This affords further complexity: say your task takes a whole day or more. Estimating the remaining time only once per completed percent means only 100 evaluations of the estimator function. When the run already takes a day, an extra minute and a half spent estimating the remaining time hardly matters, and that already gives you roughly a 1-second budget per estimate for fitting equations and whatnot; 1 second is plenty of time for doing math on a modern system. So solutions that make heavy use of computing resources are welcome.
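A sketch of that binning idea, assuming a hypothetical `work` callable for the loop body and an `estimator` callback that accepts the list of (iterations done, elapsed seconds) samples collected so far:

```python
import time

def run_with_percent_updates(total_iterations, work, estimator):
    # Run the (possibly expensive) estimator at most ~100 times:
    # once each time another whole percent completes.
    samples = []              # (iterations_done, elapsed_seconds) per percent
    start = time.monotonic()
    last_percent = -1
    for i in range(1, total_iterations + 1):
        work(i)
        percent = (i * 100) // total_iterations
        if percent > last_percent:
            last_percent = percent
            samples.append((i, time.monotonic() - start))
            eta = estimator(samples, total_iterations)
            print(f"{percent}% done, roughly {eta:.0f}s remaining")
```

For example, `estimator` could wrap the curve-fitting sketch above, e.g. `lambda s, n: eta_from_curve_fit([x for x, _ in s], [t for _, t in s], n)`, guarding against having too few samples for the chosen degree.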
tl;dr: How to over-engineer an accurate time-remaining estimation function for very long-running tasks.