Pandas: replace values in a DataFrame time series

I have a pandas dataframe df with pandas.tseries.index.DatetimeIndex as an index.

The data is as follows:

    Time                  Open     High     Low      Close    Volume
    2007-04-01 21:02:00   1.968    2.389    1.968    2.389    18.300000
    2007-04-01 21:03:00   157.140  157.140  157.140  157.140  2.400000

....

I want to replace one data point, say the value 2.389 in the Close column, with NaN:

    In:  df["Close"].replace(2.389, np.nan)
    Out:
    2007-04-01 21:02:00      2.389
    2007-04-01 21:03:00    157.140

Replace did not change 2.389 to NaN. What's wrong?

2 answers

replace may not work with floats because the floating-point representation you see in the DataFrame's repr may not be the same as the underlying float. For example, the actual Close value might be:

 In [141]: df = pd.DataFrame({'Close': [2.389000000001]}) 

but the DataFrame displays it as:

    In [142]: df
    Out[142]:
       Close
    0  2.389

Therefore, instead of testing floats for equality, it is usually better to test for closeness:

    In [150]: import numpy as np
    In [151]: mask = np.isclose(df['Close'], 2.389)
    In [152]: mask
    Out[152]: array([ True], dtype=bool)

Then you can use the boolean mask to select and change the desired values:

    In [145]: df.loc[mask, 'Close'] = np.nan
    In [146]: df
    Out[146]:
       Close
    0    NaN
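
Putting the pieces together, here is a minimal sketch of the same idea on a frame with a DatetimeIndex like the one in the question. The sample values and the tolerance atol=1e-6 are assumptions chosen for illustration, and Series.mask is used here as an alternative to the df.loc assignment above; it returns a new Series rather than modifying df.

```python
import numpy as np
import pandas as pd

# Hypothetical data mirroring the question; the stored value is slightly
# off from the 2.389 that the default display shows.
idx = pd.to_datetime(["2007-04-01 21:02:00", "2007-04-01 21:03:00"])
df = pd.DataFrame({"Close": [2.389000000001, 157.14]}, index=idx)

# Exact matching finds nothing, so replace leaves the column untouched.
exact = df["Close"].replace(2.389, np.nan)

# Tolerance-based matching; atol=1e-6 is an illustrative choice, not a rule.
mask = np.isclose(df["Close"].to_numpy(), 2.389, rtol=0.0, atol=1e-6)

# Series.mask sets the entries where the condition is True to NaN.
cleaned = df["Close"].mask(mask)
```

Pick the tolerance to match the precision of your data: too loose and neighbouring prices get wiped out as well, too tight and the target value is missed again.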

You need to assign the result back to df['Close'] or pass the parameter inplace=True: df['Close'].replace(2.389, np.nan, inplace=True)

For example:

    In [5]: df['Close'] = df['Close'].replace(2.389, np.nan)
            df['Close']
    Out[5]:
    0      2.389
    1    157.140
    Name: Close, dtype: float64

Most pandas operations return a copy, and some accept the inplace parameter.
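
A small sketch of that copy behaviour, using made-up values that match exactly so the replacement itself succeeds. Assigning the result back is probably the safer of the two options, since in recent pandas versions calling inplace=True on a column selected with df['Close'] may act on a temporary object rather than on df.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose first value is exactly the literal 2.389.
df = pd.DataFrame({"Close": [2.389, 157.14]})

# replace returns a new Series; df is unchanged until the result is assigned.
result = df["Close"].replace(2.389, np.nan)
still_original = df["Close"].iloc[0]

# Assigning the result back is what actually updates the column.
df["Close"] = result
```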

Check out the docs: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.replace.html#pandas.Series.replace


Source: https://habr.com/ru/post/1211286/

