Workaround
If your dates are continuous, with gaps small enough to be computed, as in your example, you can sort the series and then use cumsum to get around this problem. For example:
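    import pandas as pd

    # Monthly dates starting in 1700 ('M' = month end; spelled 'ME' in pandas 2.2+)
    dates = pd.Series(pd.date_range('1700-01-01', periods=4500, freq='M'))
    dates = dates.sort_values()   # was dates.sort() in older pandas
    dateshift = dates.shift(1)
    # Gap between consecutive dates, then a running total in days
    (dates - dateshift).fillna(pd.Timedelta(0)).dt.days.cumsum().describe()

    count      4500.000000
    mean      68466.072444
    std       39543.094524
    min           0.000000
    25%       34233.250000
    50%       68465.500000
    75%      102699.500000
    max      136935.000000
    dtype: float64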
Note that neither min nor max is negative, so nothing has overflowed.
Failaround
If the gaps between your dates are too large, this workaround does not work. For example:
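    dates = pd.Series(pd.to_datetime(['2016-06-06', '1700-01-01', '2200-01-01']))
    dates = dates.sort_values()   # was dates.sort() in older pandas
    dateshift = dates.shift(1)
    (dates - dateshift).fillna(pd.Timedelta(0)).dt.days.cumsum()

    # Output as originally reported; the running total has gone negative
    1        0
    0   -97931
    2   -30883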
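This is because we calculate the step between each pair of consecutive dates and then add the steps up. Sorting guarantees the smallest possible steps, but each step still has to fit in a timedelta, which at nanosecond resolution holds only about 292 years. Here the step from 1700 to 2016 alone is longer than that.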
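The numbers in the Failaround output can be reproduced by hand. The sketch below uses only the standard library and assumes that a step is stored as a signed 64-bit count of nanoseconds and that an overflow wraps silently, which is what the output above suggests. The helper `wrap_int64` is a hypothetical name made up for this illustration, not a pandas function.

```python
from datetime import date

NS_PER_DAY = 86_400 * 10**9   # nanoseconds in one day
INT64_MAX = 2**63 - 1         # largest signed 64-bit value


def wrap_int64(n):
    """Wrap an integer the way a silent signed 64-bit overflow would (illustrative helper)."""
    return (n + 2**63) % 2**64 - 2**63


# Longest step, in whole days, that fits in 64-bit nanoseconds
limit_days = INT64_MAX // NS_PER_DAY

# The two steps between the sorted dates of the Failaround example
step1_days = (date(2016, 6, 6) - date(1700, 1, 1)).days
step2_days = (date(2200, 1, 1) - date(2016, 6, 6)).days

# The first step exceeds the limit, so it wraps around to a negative value
wrapped_step1_days = wrap_int64(step1_days * NS_PER_DAY) // NS_PER_DAY
running_total = wrapped_step1_days + step2_days

print(limit_days)          # 106751
print(step1_days)          # 115573
print(wrapped_step1_days)  # -97931
print(running_total)       # -30883
```

The wrapped first step and the running total match the two negative values in the output above, so the second step (67048 days) is fine on its own; only the 1700 to 2016 step is too large.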
Reset order
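As the Failaround example shows, sorting leaves the index out of order (1, 0, 2). Correct this by calling .reset_index(drop=True, inplace=True) on the series. The drop=True is needed because a Series cannot be reset in place while keeping the old index as a column.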
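A minimal sketch of the reset, assuming a reasonably recent pandas where `Series.sort_values` is available; the dates are the ones from the Failaround example:

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(['2016-06-06', '1700-01-01', '2200-01-01']))

# Sorting reorders the values but keeps the original labels: 1, 0, 2
dates = dates.sort_values()
index_after_sort = list(dates.index)

# drop=True discards the old labels and renumbers from 0
dates.reset_index(drop=True, inplace=True)
index_after_reset = list(dates.index)

print(index_after_sort)
print(index_after_reset)
```

If you would rather not modify the series in place, `dates = dates.reset_index(drop=True)` does the same thing and returns a new Series.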
firelynx