Efficient Power Set Algorithm for Subsets of a Minimum Length

I use the following C# function to get a power set restricted to subsets of a minimum length:

    string[] PowerSet(int min_len, string set)
    {
        IEnumerable<IEnumerable<string>> seed =
            new List<IEnumerable<string>>() { Enumerable.Empty<string>() };

        return set.Replace(" ", "")
                  .Split(',')
                  .Aggregate(seed, (a, b) => a.Concat(a.Select(x => x.Concat(new[] { b }))))
                  .Where(subset => subset.Count() >= min_len)
                  .Select(subset => string.Join(",", subset))
                  .ToArray();
    }

The problem is that when the original set is large, the algorithm has to do a great deal of work, even if the minimum length is also large.

For example:

  PowerSet(27, "1,11,12,17,22,127,128,135,240,254,277,284,292,296,399,309,322,326,333,439,440,442,447,567,580,590,692,697"); 

This case should be very simple, but it is too much for the function above. I am looking for a concise modification of my function that handles such cases efficiently.

1 answer

Looking at your method, one of the inefficiencies is that every possible subset is created, regardless of whether it has enough members to be included in the restricted power set.
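To put numbers on that for the example in the question (28 elements, minimum length 27), here is a small sketch that counts how many subsets actually satisfy the length limit. The class and method names (`SubsetCounts`, `Choose`, `CountAtLeast`) are illustrative and not part of the original code.

```csharp
using System;

public static class SubsetCounts
{
    // Binomial coefficient C(n, k), built up incrementally so that every
    // intermediate value is itself an exact binomial coefficient.
    public static long Choose(int n, int k)
    {
        if (k < 0 || k > n) return 0;
        k = Math.Min(k, n - k);
        long result = 1;
        for (int i = 1; i <= k; i++)
        {
            result = result * (n - k + i) / i;
        }
        return result;
    }

    // Number of subsets of an n-element set that have at least minLen members.
    public static long CountAtLeast(int n, int minLen)
    {
        long total = 0;
        for (int k = Math.Max(minLen, 0); k <= n; k++)
        {
            total += Choose(n, k);
        }
        return total;
    }
}
```

For the question's example, `SubsetCounts.CountAtLeast(28, 27)` is 29 (28 subsets of size 27 plus the full set), while the `Aggregate` pipeline builds all 2^28 = 268,435,456 subsets before the `Where` filter runs.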

Consider the following static method instead. It skips some unnecessary subsets based on their element count, which avoids wasted computation.

    public static List<List<T>> PowerSet<T>(List<T> startingSet, int minSubsetSize)
    {
        List<List<T>> subsetList = new List<List<T>>();

        // The set bits of each intermediate value represent unique
        // combinations from the startingSet.
        // We can start checking for combinations at (1 << minSubsetSize) - 1 since
        // values less than that will not yield large enough subsets.
        // Note: 1 << startingSet.Count only fits in an int for up to 30 elements.
        int iLimit = 1 << startingSet.Count;
        for (int i = (1 << minSubsetSize) - 1; i < iLimit; i++)
        {
            // Get the number of 1 bits in this 'i'.
            int setBitCount = NumberOfSetBits(i);

            // Only include this subset if it will have at least minSubsetSize members.
            if (setBitCount >= minSubsetSize)
            {
                List<T> subset = new List<T>(setBitCount);

                for (int j = 0; j < startingSet.Count; j++)
                {
                    // If the j'th bit in i is set,
                    // then add the j'th element of the startingSet to this subset.
                    if ((i & (1 << j)) != 0)
                    {
                        subset.Add(startingSet[j]);
                    }
                }

                subsetList.Add(subset);
            }
        }

        return subsetList;
    }

The number of set bits in each successive value of i tells you how many members the corresponding subset will have. If there are not enough set bits, there is no point in building the subset that the bit pattern represents. NumberOfSetBits can be implemented in several ways. See "How to count the number of set bits in a 32-bit integer?" for various approaches, explanations and references. Here is one example taken from that SO question.

    public static int NumberOfSetBits(int i)
    {
        i = i - ((i >> 1) & 0x55555555);
        i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
        return (((i + (i >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24;
    }

Now, although this solution works for your example, I think you will run into long run times and memory problems if you lower the minimum subset size too far or keep increasing the size of the startingSet. Without specific requirements in your question, I cannot judge whether this solution will work for you and/or is safe for your range of expected inputs.
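One way to avoid scanning bitmasks altogether, sketched here as an alternative technique rather than as part of the method above, is to build subsets recursively and abandon a branch as soon as the remaining elements cannot bring it up to the minimum size. The names `DirectSubsets`, `SubsetsAtLeast` and `Recurse` are hypothetical, and the sketch assumes the caller enumerates the result sequentially.

```csharp
using System;
using System.Collections.Generic;

public static class DirectSubsets
{
    // Yields every subset of 'items' with at least 'minSize' members,
    // without visiting branches that can never become large enough.
    public static IEnumerable<List<T>> SubsetsAtLeast<T>(IReadOnlyList<T> items, int minSize)
    {
        return Recurse(items, 0, Math.Max(minSize, 0), new List<T>());
    }

    private static IEnumerable<List<T>> Recurse<T>(
        IReadOnlyList<T> items, int index, int minSize, List<T> current)
    {
        // Prune: even taking every remaining item cannot reach minSize.
        if (current.Count + (items.Count - index) < minSize)
        {
            yield break;
        }

        if (index == items.Count)
        {
            // Copy, because 'current' is reused across branches.
            yield return new List<T>(current);
            yield break;
        }

        // Branch 1: include items[index].
        current.Add(items[index]);
        foreach (var subset in Recurse(items, index + 1, minSize, current))
        {
            yield return subset;
        }
        current.RemoveAt(current.Count - 1);

        // Branch 2: skip items[index].
        foreach (var subset in Recurse(items, index + 1, minSize, current))
        {
            yield return subset;
        }
    }
}
```

Every branch that survives the pruning check leads to at least one result, so the work grows with the number of qualifying subsets rather than with 2^n. For a 28-element set and a minimum size of 27 this yields 29 subsets.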

If you find that the bitmask solution is still too slow, the work can be split into parallel computations, possibly using PLINQ.
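A minimal sketch of what a PLINQ version might look like, assuming the same bitmask scheme and the same 30-element limit on the input. `ParallelEnumerable.Range` takes a start and a count; the class name `ParallelPowerSet` is illustrative. The order of the returned subsets is not guaranteed unless `AsOrdered()` is added.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ParallelPowerSet
{
    public static List<List<T>> PowerSet<T>(List<T> startingSet, int minSubsetSize)
    {
        // Assumes 0 <= minSubsetSize and startingSet.Count <= 30,
        // so that both shifts fit in an int.
        int limit = 1 << startingSet.Count;
        int start = (1 << minSubsetSize) - 1;
        int count = Math.Max(0, limit - start);

        return ParallelEnumerable.Range(start, count)
            .Where(mask => NumberOfSetBits(mask) >= minSubsetSize)
            .Select(mask =>
            {
                // startingSet is only read here, never modified.
                var subset = new List<T>();
                for (int j = 0; j < startingSet.Count; j++)
                {
                    if ((mask & (1 << j)) != 0)
                    {
                        subset.Add(startingSet[j]);
                    }
                }
                return subset;
            })
            .ToList();
    }

    private static int NumberOfSetBits(int i)
    {
        i = i - ((i >> 1) & 0x55555555);
        i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
        return (((i + (i >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24;
    }
}
```

Parallelism only spreads the scan across cores; the number of bitmasks examined stays the same, so whether it helps depends on the input sizes.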

Finally, if you want to dress the method up with LINQ, it would look like the following. As written, though, I think you will see slower performance unless you make further changes.

    public static IEnumerable<List<T>> PowerSet<T>(List<T> startingSet, int minSubsetSize)
    {
        var startingSetIndexes = Enumerable.Range(0, startingSet.Count).ToList();

        // Enumerable.Range takes (start, count), not (start, end), so the count
        // is the distance from the first candidate up to 1 << startingSet.Count.
        int first = (1 << minSubsetSize) - 1;
        int count = Math.Max(0, (1 << startingSet.Count) - first);

        var candidates = Enumerable.Range(first, count)
                                   .Where(p => NumberOfSetBits(p) >= minSubsetSize)
                                   .ToList();

        foreach (int p in candidates)
        {
            yield return startingSetIndexes.Where(setInd => (p & (1 << setInd)) != 0)
                                           .Select(setInd => startingSet[setInd])
                                           .ToList();
        }
    }