How to determine if a column in a pandas DataFrame has a datetime type? How to determine if a column is numeric?

I am trying to filter the columns of a pandas DataFrame based on whether or not they have a date type. I can see which ones do from the dtypes output, but then I have to parse that output or select the columns manually. I want to select the date columns automatically. Here is an example of what I have; in this case I would like to select only the "date_col" column.

    import pandas as pd

    df = pd.DataFrame([['Feb-2017', 1, 2],
                       ['Mar-2017', 1, 2],
                       ['Apr-2017', 1, 2],
                       ['May-2017', 1, 2]],
                      columns=['date_str', 'col1', 'col2'])
    df['date_col'] = pd.to_datetime(df['date_str'])
    df.dtypes

Out:

    date_str            object
    col1                 int64
    col2                 int64
    date_col    datetime64[ns]
    dtype: object
3 answers

Pandas has a handy method called select_dtypes, which takes include or exclude (or both) as parameters and filters the DataFrame's columns by dtype. In this case you want to include the columns of dtype np.datetime64. To filter by integers, use [np.int64, np.int32, np.int16]; for floats, use [np.float64, np.float32, np.float16]; to keep only numeric columns, use [np.number]. The aliases np.int and np.float, which older versions of these lists included, were removed in NumPy 1.24 and should be left out.

    import numpy as np

    df.select_dtypes(include=[np.datetime64])

Out:

        date_col
    0 2017-02-01
    1 2017-03-01
    2 2017-04-01
    3 2017-05-01

In:

    df.select_dtypes(include=[np.number])

Out:

       col1  col2
    0     1     2
    1     1     2
    2     1     2
    3     1     2
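If only the column names are needed, or if include and exclude have to be combined, something along these lines should work. This is a minimal sketch: the frame from the question is rebuilt so the snippet runs on its own, float_col is a made-up extra column, and the explicit format argument is my addition to keep the date parsing unambiguous.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [["Feb-2017", 1, 2], ["Mar-2017", 1, 2], ["Apr-2017", 1, 2], ["May-2017", 1, 2]],
    columns=["date_str", "col1", "col2"],
)
df["date_col"] = pd.to_datetime(df["date_str"], format="%b-%Y")
df["float_col"] = df["col1"] / 2  # hypothetical extra column, not in the question

# Column names only, which is often all that is needed for later selection.
date_cols = df.select_dtypes(include=[np.datetime64]).columns.tolist()
# ['date_col']

# include and exclude combined: numeric columns, minus the floating-point ones.
int_cols = df.select_dtypes(include=[np.number], exclude=[np.floating]).columns.tolist()
# ['col1', 'col2']

# Everything that is not a (timezone-naive) datetime column.
non_date_cols = df.select_dtypes(exclude=[np.datetime64]).columns.tolist()
# ['date_str', 'col1', 'col2', 'float_col']
```

Here np.floating is the abstract NumPy parent of all float widths, so it saves listing np.float16, np.float32 and np.float64 one by one.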

A slightly uglier NumPy alternative:

    In [102]: df.loc[:, [np.issubdtype(t, np.datetime64) for t in df.dtypes]]
    Out[102]:
        date_col
    0 2017-02-01
    1 2017-03-01
    2 2017-04-01
    3 2017-05-01

    In [103]: df.loc[:, [np.issubdtype(t, np.number) for t in df.dtypes]]
    Out[103]:
       col1  col2
    0     1     2
    1     1     2
    2     1     2
    3     1     2
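One caveat with this approach: np.issubdtype expects something NumPy can interpret as a dtype, and pandas extension dtypes (categoricals, timezone-aware datetimes) are not plain NumPy dtypes, so the call may raise a TypeError on frames that contain them. A possible guard is sketched below; columns_of_kind is a hypothetical helper name and the small frame is invented for illustration.

```python
import numpy as np
import pandas as pd


def columns_of_kind(frame, kind):
    """Return the names of columns whose NumPy dtype is a subtype of `kind`."""
    return [
        name
        for name, dtype in frame.dtypes.items()
        # Extension dtypes are not np.dtype instances, so they are skipped
        # here instead of being handed to np.issubdtype.
        if isinstance(dtype, np.dtype) and np.issubdtype(dtype, kind)
    ]


# Made-up frame with one categorical (extension dtype) column.
df = pd.DataFrame(
    {"col1": [1, 2], "col2": [0.5, 1.5], "label": pd.Categorical(["a", "b"])}
)
df["date_col"] = pd.to_datetime(["2017-02-01", "2017-03-01"])

datetime_names = columns_of_kind(df, np.datetime64)  # ['date_col']
numeric_names = columns_of_kind(df, np.number)       # ['col1', 'col2']
```

Note that the guard also skips timezone-aware datetime columns, so this variant only finds timezone-naive ones.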

I just ran into this problem and found that @charlie-haley's answer is not general enough for my use case. In particular, np.datetime64 does not seem to match datetime64[ns, UTC].

    df['date_col'] = pd.to_datetime(df['date_str'], utc=True)
    print(df.date_col.dtype)  # datetime64[ns, UTC]

You could also extend the list of dtypes to cover other types, but that does not look like a future-proof solution, so I used the is_datetime64_any_dtype function from the pandas API instead.

In:

    from pandas.api.types import is_datetime64_any_dtype as is_datetime

    df[[column for column in df.columns if is_datetime(df[column])]]

Out:

                       date_col
    0 2017-02-01 00:00:00+00:00
    1 2017-03-01 00:00:00+00:00
    2 2017-04-01 00:00:00+00:00
    3 2017-05-01 00:00:00+00:00
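The same pattern seems to cover the numeric half of the question, since pandas.api.types also provides is_numeric_dtype. Below is a rough sketch, assuming a small frame with both a timezone-naive and a UTC date column; select_columns is a hypothetical helper, the column names naive_date and utc_date are made up, and the explicit format argument is added only to keep the parsing unambiguous.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_numeric_dtype


def select_columns(frame, predicate):
    """Keep the columns whose Series satisfy `predicate`."""
    return frame[[name for name in frame.columns if predicate(frame[name])]]


df = pd.DataFrame(
    {"date_str": ["Feb-2017", "Mar-2017"], "col1": [1, 1], "col2": [2, 2]}
)
df["naive_date"] = pd.to_datetime(df["date_str"], format="%b-%Y")
df["utc_date"] = pd.to_datetime(df["date_str"], format="%b-%Y", utc=True)

# Matches both the timezone-naive and the timezone-aware column.
date_part = select_columns(df, is_datetime64_any_dtype)

# Matches the integer columns; the string column is left out.
numeric_part = select_columns(df, is_numeric_dtype)
```

Passing the predicate as an argument keeps the column loop in one place, so other checks from pandas.api.types can be swapped in without rewriting it.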