I would like to propose an alternative solution that for me looks much simpler and easier to work with.
This is because it looks exactly like a problem that can be very easily solved using left fold, which is exactly what reduce is in python ( http://en.wikipedia.org/wiki/Fold_%28higher-order_function % 29 )
reduce (function, iterable [, initializer])
Apply the function of two arguments cumulatively to the iteration elements from left to right to reduce iterability to a single value. For example, the abbreviation (lambda x, y: x + y, [1, 2, 3, 4, 5]) computes (((((1 + 2) +3) +4) +5). The left argument x is the accumulated value and the right argument, y, is the update value from the iterable. If there is an optional initializer, it is placed before the iterability elements in the calculation and is used by default when the iterability is empty. If no initializer is specified and the iterability contains only one element, the first element is returned. Approximately equivalent:
Simply put, I would iterable , which would be line_nums one value at a time, using a function that will decide if the value is part of already created sequences or not. So I would end up with a list of lists representing sequences of consecutive numbers. Then I will convert them to a range ( xx-yy ) or just one value ( xx ).
So my solution would look like this:
def make_sequences(sequences, val): if sequences != [] and sequences[-1][-1] == val - 1: return sequences[:-1] + [sequences[-1] + [val]] return sequences + [[val]] def sequence_to_string(s): return '%s-%s' % (s[0], s[-1]) if len(s) > 1 else str(s[0]) def get_line_numbers_concat(line_nums): return ', '.join( sequence_to_string(seq) '%s-%s' % (seq[0], seq[-1]) for seq in reduce(make_sequences, line_nums, []) )
The functions sequence_to_string(..) and get_line_numbers_concat(..) pretty simple, so Iβll just explain what happens inside make_sequences(..) :
def make_sequences(sequences, val):
The first time it calls sequences will be [] (since it was passed to reduce in get_line_numbers_concat(..) ), the next time it calls the sequence list, the results of make_sequences(..) will be passed as sequences to subsequent calls to make_sequences(..) . To make this clear, it would line_nums out called up using the original line_nums :
make_sequences([], 10007) ==> [[10007]] make_sequences([[10007]], 10008) ==> [[10007, 10008]] ... make_sequences([[10007, 10008, 10009, 10010, 10011]], 10013) ==> [[10007, 10008, 10009, 10010, 1011], [10013]] ...
Then it remains for us to decide if val belongs to the last sequence in sequences :
if sequences != [] and sequences[-1][-1] == val - 1: # (1)
This ensures that sequences not empty (otherwise we get an index error), and then we check to see if the last number in the last sequence in the sequence (ie sequences[-1][-1] is equal to val - 1 and therefore, val must be added to this last sequence.
This is done here:
return sequences[:-1] + [sequences[-1] + [val]]
where we take all the sequences except the last ( sequences[:-1] ) and add a new sequence for them, which is the result of adding val to the last Sequence.
If, however, condition (1) is not true, which means either the absence of previous sequences ( seqences == [] ) or the last number of the last sequence is not less than val . In this case, we add a new sequence with only one val value:
return sequences + [[val]]