I wrote a little here related to the project that I was trying to work on, and I continue to encounter design problems and have to design from scratch. So I wonder if I can post what I'm trying to do, and someone can help me figure out how I can get the result I want.
BackGround:
I am new to programming and try to learn. Therefore, I took the project that interests me, which includes basically collecting the list and breaking down each number using only the numbers from the list. I know that I could easily use this command (which I did), but I also wanted to learn about Hbase, Hadoop and parallel processing, so I wanted to do this so that I could split the process on different machines. I thought that for this you need to use dynamic programming and recursion to create a feature table that can be broken down even further.
Example:
If I send the list: [1,2, 4] , I should get {1: [[1]], 2: [[1, 1]], 4: [[2, 2]]} . Which basically says 2 + 2 = 4, 1 + 1 = 2 and 1 = 1..so trying to see all the ways to make 4, I can just look at this list (which will be in the database) and see 2 + 2 = 4 then beat 2. etc. I have a search job, but with a breakdown I am having problems. Using brute force will not allow me to use large numbers (say, a million, with a thousand numbers on the list) in such a way that I can use hadoop or some other tool to scale it. Here are some more examples of possible results:
[1,2, 3, 4] = {1: [[1]], 2: [[1, 1]], 3: [[1, 2]], 4: [[1, 3], [2, 2]]} [1,2, 3, 4, 6, 10] = {1: [[1]], 2: [[1, 1]], 3: [[1, 2]], 4: [[1, 3], [2, 2]], 6: [[2, 4], [3, 3]], 10: [[4, 6], [2, 2, 2, 2, 2]]} [10, 30, 50] = 50: [[10, 10, 30]], 30: [[10, 10, 10]]}
The logic with this approach is that it does not take time to calculate the next possible data in the list, so if I send a list with a million numbers, it can do it quickly and even scale it to the hadoop cluster.
The code I created to make it work is here , but this question was about how to fix the design problem. I got advice that this is a partition problem, and looked around and found much simpler versions of what I was trying to do ( activestate ), but this is not exactly what I am trying to do, because it breaks the number and does not use a specific data set for of this.
Question:
So, hopefully, I have clearly described what I'm trying to do. What do I need to read / study / learn to create a list section table in python using dynamic programming so that it can scale? This is just a hobby, not a time, but I feel like I have been working on it for more than 3 months, and every time I come across design problems that make me start from scratch. How can I build this correctly (to see my wrong way, see My link above)? I found googled and found solutions to the backpack problem and partition problems, but they seem to be more suitable for school work and not designed to scale with large datasets.
Hope someone can give me an understanding, but regardless, thank you for that.