We can use the basic recursive algorithm from this answer and modify it to create partitions of a certain length without the need to generate and filter unwanted partitions.
def sorted_k_partitions(seq, k): """Returns a list of all unique k-partitions of `seq`. Each partition is a list of parts, and each part is a tuple. The parts in each individual partition will be sorted in shortlex order (ie, by length first, then lexicographically). The overall list of partitions will then be sorted by the length of their first part, the length of their second part, ..., the length of their last part, and then lexicographically. """ n = len(seq) groups = []
Quite a lot is going on here, so let me explain.
First, we start with a bottom-up procedural implementation (teminology?) Of the same recursive algorithm mentioned above :
def partitions(seq): """-> a list of all unique partitions of `seq` in no particular order. Each partition is a list of parts, and each part is a tuple. """ n = len(seq) groups = []
The main algorithm is in the nested function generate_partitions . In principle, it looks through the sequence, and for each element it: 1) puts the element in each of the current groups (aka parts) in the working set and returns; 2) puts the element in its own new group.
When we get to the end of the sequence ( i == n ), we get a (deep) copy of the working set that we collected.
Now, to get partitions of a certain length, we could just filter or group the results for those we are looking for and do with them, but this approach does a lot of unnecessary work (i.e. recursive calls) if we just wanted partitions of some length k .
Please note that in the above function, the section length (i.e. the number of groups) increases whenever:
Performed
.... Thus, we limit the size of the partition by simply placing protection on this block, for example:
def partitions(seq, k): ... def generate_partitions(i): ...
Adding a new parameter and only this line to the partitions function will now lead to the generation of only partitions up to k in length. This is almost what we want. The problem is that the for loop sometimes still generates partitions shorter than k .
To trim those recursive branches, we only need to execute the for loop, when we can be sure that we have enough remaining elements in our sequence to expand the working set of a total of groups k . The number of remaining elements - or elements that have not yet been placed in the group - n - i (or len(seq) - i ). And k - len(groups) is the number of new groups that need to be added to create a valid k-partition. If n - i <= k - len(groups) , we cannot discard the element by adding it to one of the current groups - we must create a new group.
So, we just add one more guard, this time to another recursive branch:
def generate_partitions(i): ... # only add to current groups if the number of remaining items # exceeds the number of required new groups. if n - i > k - len(groups): for group in groups: group.append(seq[i]) yield from generate_partitions(i + 1) group.pop() # only add a new group if the total number would not exceed k if len(groups) < k: groups.append([seq[i]]) yield from generate_partitions(i + 1) groups.pop()
And with that, you have a working k-partition generator. You could probably collapse some of the recursive calls even more (for example, if there are 3 remaining elements and we need 3 more groups, then you already know that you have to separate each element into your own group), but I wanted to show how small modification of the basic algorithm that generates all sections.
It remains only to sort the results. Unfortunately, instead of figuring out how to directly generate sections in the right order (an exercise for a smarter dog), I cheated and just sorted the post-generation.
def sorted_k_partitions(seq, k): ... result = generate_partitions(0)
A few self-evident, with the exception of key features. The first:
key = lambda p: (len(p), p)
says that it sorts the sequence by length, and then by the sequence itself (which by default in Python is ordered by lexicography). p means "part". This is used to sort parts / groups within a section. This key means that, for example, (4,) precedes (1, 2, 3) , so that [(1, 2, 3), (4,)] sorted as [(4,), (1, 2, 3)] .
key = lambda ps: (*map(len, ps), ps) # or for Python versions <3.5: lambda ps: tuple(map(len, ps)) + (ps,)
ps here means "parts", plural. This indicates that the sequence is sorted by the length of each of its elements (which should be the sequences themselves), then (lexicographically) by the sequence itself. This is used to sort the sections relative to each other, so, for example, [(4,), (1, 2, 3)] precedes [(1, 2), (3, 4)] .
Following:
seq = [1, 2, 3, 4] for k in 1, 2, 3, 4: for groups in sorted_k_partitions(seq, k): print(k, groups)
gives:
1 [(1, 2, 3, 4)] 2 [(1,), (2, 3, 4)] 2 [(2,), (1, 3, 4)] 2 [(3,), (1, 2, 4)] 2 [(4,), (1, 2, 3)] 2 [(1, 2), (3, 4)] 2 [(1, 3), (2, 4)] 2 [(1, 4), (2, 3)] 3 [(1,), (2,), (3, 4)] 3 [(1,), (3,), (2, 4)] 3 [(1,), (4,), (2, 3)] 3 [(2,), (3,), (1, 4)] 3 [(2,), (4,), (1, 3)] 3 [(3,), (4,), (1, 2)] 4 [(1,), (2,), (3,), (4,)]