I am trying to do a dataframe transformation that I cannot solve. I tried several approaches from stackoverflow and the pandas documentation: apply, apply (lambda: ...), swaps and unions. There are too many attempts to list here, but not sure which approach is best, or maybe I tried the right approach with the wrong syntax.
Basically, I have a data framework, and I need 1) to shift the columns, 2) the number of columns to be shifted varies and depends on the variable in the data frame, 3) creates columns at the end of the data frame where you need to place the offset and 4) place zeros to newly created intervals.
df1 = pd.DataFrame({'first' : ['John', 'Mary', 'Larry', 'jerry'], '1' : [5.5, 6.0,10,20], '2' : [100, 200, 300, 400], '3' : [150, 100, 240, 110], 'offset' : ([1,0,2,1])}) goal_df = pd.DataFrame({'first' : ['John', 'Mary', 'Larry', 'jerry'], '1' : [0.0, 6.0, 0.0, 0], '2' : [5.5, 200, 0.0, 20], '3' : [100, 100, 10, 400], '4' : [150, 0.0, 300, 110], '5' : [0.0, 0.0, 240, 0.0]}) df1 1 2 3 first offset 5.5 100 150 John 1 6.0 200 100 Mary 0 10.0 300 240 Larry 2 20.0 400 110 jerry 1 goal_df 1 2 3 4 5 first 0 5.5 100 150 0 John 6 200.0 100 0 0 Mary 0 0.0 10 300 240 Larry 0 20.0 400 110 0 jerry
This dataset will have c. 500 lines and c. 120 columns. The amount of bias will be very close to 0-12. I thought about this with the basic functions of Python, but I also found that a complex and temporary consumer of the program would win in the ultimate goal, which is to remove some tasks performed in Microsoft Excel.
I complain a lot about how Excel is inferior to such large tasks, but it seems that the current offset () function in Excel outperforms this very simple use, but with thousands of formulas, very slowly. I sold my workplace on the advantages of Python over Excel, and this is my first real test, so speed is very important to me because I try to convince my colleagues that Python can gobble up this table much faster than the current limit the file weighs in file size 96 MB in size.
I came pretty close to the melt () function, and then took the old column numbers and added the offset to them. However, I had a lot of problems trying to reform the dataframe with a hinge. No luck with the application or application (lambda)!
Thanks for any help anyone can give!