The code below suggests that pandas can be much slower than numpy, at least in the specific case of the clip() function. Surprisingly, making the round trip from pandas to numpy and back to pandas, with the calculation done in numpy, is still much faster than doing it in pandas directly.
Shouldn't the pandas function have been implemented in this roundabout way?
    In [49]: arr = np.random.randn(1000, 1000)
    In [50]: df = pd.DataFrame(arr)
    In [51]: %timeit np.clip(arr, 0, None)
    100 loops, best of 3: 8.18 ms per loop
    In [52]: %timeit df.clip_lower(0)
    1 loops, best of 3: 344 ms per loop
    In [53]: %timeit pd.DataFrame(np.clip(df.values, 0, None))
    100 loops, best of 3: 8.4 ms per loop
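For reference, here is a minimal sketch of the round trip written as a reusable function. The name `clip_via_numpy` is made up for illustration and is not part of pandas. It assumes the frame holds a single numeric dtype and that at least one bound is given. Unlike the one-liner in the timings, it passes the original index and columns back in, since `pd.DataFrame(np.clip(df.values, 0, None))` silently replaces them with default integer labels. Note also that `clip_lower` was deprecated in pandas 0.24 and removed in 1.0; `df.clip(lower=0)` is the current spelling.

```python
import numpy as np
import pandas as pd


def clip_via_numpy(df, lower=None, upper=None):
    """Clip a numeric DataFrame by round-tripping through NumPy.

    Hypothetical helper: assumes a homogeneous numeric frame and that
    at least one of `lower` / `upper` is not None.
    """
    # Do the actual clipping on the underlying ndarray.
    clipped = np.clip(df.to_numpy(), lower, upper)
    # Rebuild the frame, keeping the original row and column labels.
    return pd.DataFrame(clipped, index=df.index, columns=df.columns)


arr = np.random.randn(1000, 1000)
df = pd.DataFrame(arr)

result = clip_via_numpy(df, lower=0)
expected = df.clip(lower=0)  # the pure-pandas equivalent
```

Timing `clip_via_numpy(df, lower=0)` against `df.clip(lower=0)` with `%timeit` on your own pandas version should show whether the gap reported above still exists there.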
python numpy pandas
Soldalma