It can be formulated as a change in the shortest path .
We need to cover a set of ranges of total length M with at most N pages. Pages can have L different lengths, they are not aligned (can be placed at any address). The difference between the total "extra" area and the total page length is equal to the constant M , which minimizes the total page length.
Let us plot the graph associated with this problem. Each memory address in any range has a corresponding vertex on the graph. Each vertex has L outgoing edges corresponding to pages L , starting from a given address. The length of each edge is the length of the page. Each edge comes to a certain vertex of the graph, depending on where the corresponding page ends:
- The page ends in some unoccupied position. The edges will return to the vertex corresponding to the starting address of the first range after this position.
- The page ends within a certain range. The edges will return to the top corresponding to the end address of the page.
- The page ends in an unoccupied position with an address greater than the address of any range. Edge comes to the top of the destination. (The starting address of the very first range corresponds to the starting vertex, and we must find the shortest path between these two vertices).
Since the resulting graph is a DAG , the shortest path can be found in linear time, processing the vertices in topological order (or, even simpler, in the order of the corresponding memory addresses).
For each vertex, store an array of N pairs {path length, reverse pointer}. Although the shortest path algorithm visits any vertex, index this array with the number of transitions in the path, and if the path is shorter than the stored path length, replace {path length, reverse pointer}. When each vertex is processed, find the shortest path in the array of pairs belonging to the target vertex, and then restore the path using the back pointers. This path describes the optimal coverage.
The worst time complexity O (L * M * N) is determined by the maximum number of edges (L * M) and the number of elements in the array belonging to each vertex (N). If the ranges are sparse, most of the edges fall into the starting address of a certain range, most of the vertices corresponding to the internal addresses are not used, and the time complexity is much less.
This algorithm requires O (M * N) space, but for sparse ranges this can be significantly reduced if we put all the vertices of the graph (or, possibly, all length / pointer pairs) in the hash map.