If you need to take things into account, such as the skovorodkin example in a comment,
[(1, 4), (4, 8), (8, 10)]
(or even more complex examples), then one of the ways to use them effectively is to use charts.
Let's say you create a digraph (possibly using networkx ), where each pair is a node, and there is an edge from (a, b) to node (c, d) if b == c. Now do a topological sort , iterate in order, and merge accordingly. You must take care to handle nodes with two (or more) outgoing edges properly.
I understand that your question says that you want to avoid loops due to the large size of the list. Conversely, for long lists, I doubt that you will even find an effective linear time solution using list comprehension (or something like that). Note that you cannot sort the list in linear time, for example.
Here is a possible implementation:
Let's say we start with
l = [(1,4), (8,10), (19,25), (10,13), (14,16), (25,30)]
To remove duplicates, the following is simplified:
l = list(set(l))
Now, to build a digraph:
import networkx as nx import collections g = nx.DiGraph()
Peaks are just pairs:
g.add_nodes_from(l)
To build edges, we need a dictionary:
froms = collections.defaultdict(list) for p in l: froms[p[0]].append(p)
Now we can add the edges:
for p in l: for from_p in froms[p[1]]: g.add_edge(p, from_p)
The following two lines are not needed - they are just here to show what the graph looks like at this stage:
>>> g.nodes() [(25, 30), (14, 16), (10, 13), (8, 10), (1, 4), (19, 25)] >>> g.edges() [((8, 10), (10, 13)), ((19, 25), (25, 30))]
Now, sort the pairs by topological view:
l = nx.topological_sort(g)
Finally, the hard part here. The result is a DAG. We have to go recursively, but remember that we have already visited.
Let's create a request for what we visited:
visited = {p: False for p in l}
Now the recursive function that sets the node returns the maximum edge of the range from any node available from it:
def visit(p): neighbs = g.neighbors(p) if visited[p] or not neighbs: visited[p] = True return p[1] mx = max([visit(neighb_p) for neighb_p in neighbs]) visited[p] = True return mx
We are all ready. Let me create a list for the last pairs:
final_l = []
and visit all nodes:
for p in l: if visited[p]: continue final_l.append((p[0], visit(p)))
Here's the end result:
>>> final_l [(1, 4), (8, 13), (14, 16)]