I am trying to sort the list of identifiers based on a "sorting map", which is an array of tuples (ID1, ID2, timestamp) , which determines which identifiers should be sorted before other identifiers. Here are the rules:
ID1 should be sorted to ID2 .- A timestamp can be used to break links with newer timestamps that exceed older timestamps. for example, a given sort key
(C, A, 1/1/1900), (C, B, 1/1/2000) , then B sorted to A - there may be cycles, for example.
(A, B, 1/1/1950), (B, C, 1/1/1980), (C, A, 1/1/1900) . The timestamp can be used to interrupt the cycles, while the records with the old timestamps in the cycle are deleted from the sort card until the cycle disappears. - If the identifier is not on the sort map, it is sorted after any identifier that is present in the sort map
Example: to display the sort (C, A, 1/1/1900), (C, B, 1/1/2000) and the list (A, B, C, D) for sorting, the sorted result will be (C, B, A, D) .
I am puzzled by turning these rules into an algorithm. Here is what I still have:
Get the latest sorting map from the database. I get a maximum of one entry for each unique pair of identifiers.
Removing loops from a sorting chart. How? Or is it easier to just ignore loops as part of step 4?
Convert a sort map into memory for optimal performance. For example, build a hash table whose key is each unique identifier on the sorting map, so I can quickly find all sorting lines containing a specific identifier.
Sort my identifier array using a shared binary sort library using a custom comparison function that accepts any two identifiers ID1 and ID2 . Comparison function:
but. View all sorting records containing ID1 or ID2 using the hash table from step # 3.
b. If I already have an entry containing both ID1 and ID2 on the sorting card, stop - we know which one should be the first!
from. If neither ID1 nor ID2 are found on the sort map, then this is a tie. Return deterministically arbitrary results (e.g. lower id wins).
e. If one identifier is on the sorting map and the other is not, stop. Found should be sorted first.
e. If we get here, both identifiers are on the sort map, but there is no direct comparison on the sort map. Now what?
Performance does not cause much concern, since the maximum size of the sorting map is less than 20 thousand lines, and the maximum number of sorted identifiers is less than 30.
Any ideas?
FWIW, we will use the .NET List<T>.Sort(Comparison<T>) to do the sorting in C #, but the underlying algorithm is obviously a language and agnostic platform.
If you're interested, the real need for this algorithm is needed here:
Our company is building mobile applications for delivery drivers who visit about 20 places every day from the territory of 100-150 of all places for which they are responsible. A list of places every day is dynamically assigned based on the inventory of each location. Places that have a low inventory receive delivery of new inventory, and places where there is still enough inventory are not visited.
Drivers can visit places in any order, but they usually follow the same routes every day (for example, they visit places in the southern part of the city when traffic is shining in the morning, and then they visit places in the northern part of the city when traffic is heavier than the South).
We decided not to use third-party routing software, which automatically determines the most efficient route. Instead, we found it better to allow the driver to choose a route because the routing program has a difficult time with restrictions such as “that the docking station is usually only available until 7AM” or “the guy who needs to sign the delivery receipt, Fridays, which have a big impact on supply schedules.
In any case, we would like to use the historical driver selection to sort the route every day in the same order in which the driver visited the same locations for the last time. This will give the driver a beautifully designed route every day that matches his preferences, without having to manually re-schedule, except in unusual cases. This will save the driver a minute or two every day, which will add up over time.
Each historical route is indeed a list like this (ID1, ID2, ID3, ..., IDN, timestamp), but as an alternative to keeping hundreds of past schedules, I thought it would be easier to put each N-machine historical route into pairs of cars . This means that I have to store at most N * N-1 tuples, because new orders always push the old ones out of the sorting map. If this is a poor simplification, let me know .; -)