Iteration to determine combinations of variable number of elements

I am making a program that uses dynamic programming to decide how to distribute some files (movies) among DVDs to use fewer DVDs.

After much thought, I decided that a good way to do this is to look at every possible combination of movies smaller than 4.38 GB (the actual size of the DVD) and select the largest (that is, the one that destroys the least space) and delete the movies in the most effective combination and repeat until you finish the films.

The problem is that I don’t know how to make a loop so that I can find all the possible combinations, given that the films vary in size, so you cannot use a certain number of nested loops.

pseudo code:

Some kind of loop: best_combination=[] best_combination_size=0 if current_combination<4.38 and current_combination>best_combination_size: best_combination=current_combination best_combination_size=best_combination_size print(best_combination) delete best_combination from list_of_movies 

first time to ask a question .. so say goodbye to me guys !! thanks in advance

PS I figured out a way to do this using Dijkstra, which I think will be fast, but not memory friendly. If anyone is interested, I will discuss this with pleasure.

+5
source share
2 answers

You really must adhere to the overall evacuation of the bin packaging . The Wikipedia article gives a good overview of approaches, including links to exact approaches to specific tasks. But always keep in mind: this is an np-complete problem!

I will show you an example confirming my hint that you should adhere to heuristics.

The following python code:

  • creates parametrized random tasks (normal distribution by several means / stds; receive-select to make sure that the movie is not larger than DVD)
  • uses some random binpacking library that implements some greedy heuristic (I have not tried or tested this library before, so no guarantees!) I don’t know which heuristic is used)
  • uses a naive implementation of mixed integer programming (which is solved using a commercial solver ; open source solutions such as cbc, but can be used for good approximate solutions)

Code

 import numpy as np from cvxpy import * from time import time """ Generate some test-data """ np.random.seed(1) N = 150 # movies means = [700, 1400, 4300] stds = [100, 300, 500] DVD_SIZE = 4400 movies = [] for movie in range(N): while True: random_mean_index = np.random.randint(low=0, high=len(means)) random_size = np.random.randn() * stds[random_mean_index] + means[random_mean_index] if random_size <= DVD_SIZE: movies.append(random_size) break """ HEURISTIC SOLUTION """ import binpacking start = time() bins = binpacking.to_constant_volume(movies, DVD_SIZE) end = time() print('Heuristic solution: ') for b in bins: print(b) print('used bins: ', len(bins)) print('used time (seconds): ', end-start) """ Preprocessing """ movie_sizes_sorted = sorted(movies) max_movies_per_dvd = 0 occupied = 0 for i in range(N): if occupied + movie_sizes_sorted[i] <= DVD_SIZE: max_movies_per_dvd += 1 occupied += movie_sizes_sorted[i] else: break """ Solve problem """ # Variables X = Bool(N, N) # N * number-DVDS I = Bool(N) # indicator: DVD used # Constraints constraints = [] # (1) DVDs not overfilled for dvd in range(N): constraints.append(sum_entries(mul_elemwise(movies, X[:, dvd])) <= DVD_SIZE) # (2) All movies distributed exactly once for movie in range(N): constraints.append(sum_entries(X[movie, :]) == 1) # (3) Indicators for dvd in range(N): constraints.append(sum_entries(X[:, dvd]) <= I[dvd] * (max_movies_per_dvd + 1)) # Objective objective = Minimize(sum_entries(I)) # Problem problem = Problem(objective, constraints) start = time() problem.solve(solver=GUROBI, MIPFocus=1, verbose=True) #problem.solve(solver=CBC, CliqueCuts=True)#, GomoryCuts=True, KnapsackCuts=True, verbose=True)#, GomoryCuts=True, MIRCuts=True, ProbingCuts=True, #CliqueCuts=True, FlowCoverCuts=True, LiftProjectCuts=True, #verbose=True) end = time() """ Print solution """ for dvd in range(N): movies_ = [] for movie in range(N): if np.isclose(X.value[movie, dvd], 1): movies_.append(movies[movie]) if movies_: print('DVD') for movie in movies_: print(' movie with size: ', movie) print('Distributed ', N, ' movies to ', int(objective.value), ' dvds') print('Optimizatio took (seconds): ', end-start) 

Partial exit

 Heuristic solution: ------------------- ('used bins: ', 60) ('used time (seconds): ', 0.0045168399810791016) MIP-approach: ------------- Root relaxation: objective 2.142857e+01, 1921 iterations, 0.10 seconds Nodes | Current Node | Objective Bounds | Work Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time 0 0 21.42857 0 120 106.00000 21.42857 79.8% - 0s H 0 0 68.0000000 21.42857 68.5% - 0s H 0 0 63.0000000 21.42857 66.0% - 0s 0 0 21.42857 0 250 63.00000 21.42857 66.0% - 1s H 0 0 62.0000000 21.42857 65.4% - 2s 0 0 21.42857 0 256 62.00000 21.42857 65.4% - 2s 0 0 21.42857 0 304 62.00000 21.42857 65.4% - 2s 0 0 21.42857 0 109 62.00000 21.42857 65.4% - 3s 0 2 21.42857 0 108 62.00000 21.42857 65.4% - 4s 40 2 27.61568 20 93 62.00000 27.61568 55.5% 110 5s H 156 10 61.0000000 58.00000 4.92% 55.3 8s 262 4 59.00000 84 61 61.00000 59.00000 3.28% 44.2 10s 413 81 infeasible 110 61.00000 59.00000 3.28% 37.2 15s H 417 78 60.0000000 59.00000 1.67% 36.9 15s 1834 1212 59.00000 232 40 60.00000 59.00000 1.67% 25.7 20s ... ... 57011 44660 infeasible 520 60.00000 59.00000 1.67% 27.1 456s 57337 44972 59.00000 527 34 60.00000 59.00000 1.67% 27.1 460s 58445 45817 59.00000 80 94 60.00000 59.00000 1.67% 26.9 466s 59387 46592 59.00000 340 65 60.00000 59.00000 1.67% 26.8 472s 

Analysis

Some notes regarding the above example:

  • heuristic gets some solution of value 60 instantly
  • the commercial solver takes longer, but also find a solution to 60 (15 seconds)
    • also trying to find the best solution or proof that it does not exist (MIP solvers are full = find the optimal solution or proof that there is no infinite time!)
    • No progress for some time!
    • but: we got proof that at best there is a solution of size 59
    • = maybe you save one DVD, solving the problem optimally; but it’s hard to find a solution, and we don’t know if this solution exists (yet)!

Notes

  • The above observations are largely dependent on data statistics.
  • It’s easy to try other problems (maybe less), where the commercial MIP solver finds a solution that uses 1 less DVD (for example, 49 vs 50)
    • It's not worth it (remember: open source solvers are even more afraid)
    • The formula is very simple and not configured at all (don't blame solvers only)
  • There are precise algorithms (which can be much more difficult to implement) that may be suitable.

Conclusion

Heuristics are very easy to implement and provide very good solutions in general. Most of them also come with very good theoretical guarantees (for example, no more than 11/9 opt + 1 #DVD are used compared to the optimal solution = first approach to reducing heuristics) . Despite the fact that I'm interested in optimization in general, I will probably use the heuristic approach.

A common problem is also very popular, so in many programming languages ​​there is a good library for this problem!

+1
source

Without claiming that the solution given in this answer is optimized, optimal, or possesses any other great qualities, here is a greedy approach to the dvd packaging problem.

 import System.Random import Data.List import Data.Ord -- F# programmers are so used to this operator, I just cannot live without it ...yet. (|>) ab = ba data Dvd = Dvd { duration :: Int, movies :: [Int] } deriving (Show,Eq) dvdCapacity = 1000 :: Int -- a dvd has capacity for 1000 time units - arbitrary unit -- the duration of a movie is between 1 and 1000 time units r = randomRs (1,1000) (mkStdGen 42) :: [Int] -- our test data set of 1000 movies, random movie durations drawn from r allMovies = zip [1..1000] (take 1000 r) allMoviesSorted = reverse $ sortBy (comparing snd) allMovies remainingCapacity dvd = dvdCapacity - duration dvd emptyDvd = Dvd { duration = 0, movies = [] } -- from the remaining movies, pick the largest one with at most maxDuration length. pickLargest remaining maxDuration = let (left,right) = remaining |> break (\ (a,b) -> b <= maxDuration) (h,t) = case right of [] -> (Nothing,[]) otherwise -> (Just (head right), right |> tail) in (h,[left, t] |> concat) -- add a track (movie) to a dvd addTrack dvd track = Dvd {duration = (duration dvd) + snd track, movies = fst track : (movies dvd) } -- pick dvd from dvds with largest remaining capacity -- and add the largest remaining fitting track greedyPack movies dvds | movies == [] = (dvds,[]) | otherwise = let dvds' = reverse $ sortBy (comparing remainingCapacity) dvds (picked,movies') = case dvds' of [] -> (Nothing, movies) (x:xs) -> pickLargest movies (remainingCapacity x) in case picked of Nothing -> -- None of the current dvds had enough capacity remaining -- tp pick another movie and add it. -> Add new empty dvd -- and run greedyPack again. greedyPack movies' (emptyDvd : dvds') Just p -> -- The best fitting movie could be added to the -- dvd with the largest remaining capacity. greedyPack movies' (addTrack (head dvds') p : tail dvds') (result,residual) = greedyPack allMoviesSorted [emptyDvd] usedDvdsCount = length result totalPlayTime = allMovies |> foldl (\ s (i,d) -> s + d) 0 optimalDvdsCount = round $ 0.5 + fromIntegral totalPlayTime / fromIntegral dvdCapacity solutionQuality = length result - optimalDvdsCount 

Compared to the theoretical optimal number of dvds, it sends 4 additional dvds to this dataset.

0
source

All Articles