I have a pandas DataFrame with some categorical predictors (variables coded as 0 and 1) and some numerical variables. When I fit it with statsmodels like this:
import statsmodels.api as sm
est = sm.OLS(y, X).fit()
it throws:
Pandas data cast to numpy dtype of object. Check input data with np.asarray(data).
I converted all the DataFrame dtypes using df.convert_objects(convert_numeric=True).
After that, the dtypes of all the DataFrame columns are shown as int32 or int64. But the last line of the output still shows dtype: object, like this:
4516        int32
4523        int32
4525        int32
4531        int32
4533        int32
4542        int32
4562        int32
sex         int64
race        int64
dispstd     int64
age_days    int64
dtype: object
Here 4516, 4523, etc. are variable labels.
Any ideas? I need to build a multiple regression model with more than a hundred variables. To do this, I combined 3 pandas DataFrames into the final DataFrame used to build the model.
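A note on the listing above: the trailing dtype: object in df.dtypes output describes the Series that holds the per-column dtypes, not any column, so by itself it does not indicate a problem. What the error message asks for is the dtype of the whole array. Below is a minimal sketch of that check, assuming a small made-up frame (df, df_bad and the values in them are hypothetical stand-ins, not the real data) in which one column of strings is enough to turn the whole array into object:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the combined frame: integer and string column labels.
df = pd.DataFrame({
    4516: np.array([0, 1, 1], dtype="int32"),
    4523: np.array([1, 0, 1], dtype="int32"),
    "sex": [0, 1, 0],
    "age_days": [100, 250, 400],
})

# df.dtypes is itself a Series whose values are dtype objects, so the
# trailing "dtype: object" belongs to that Series, not to a column.
dtypes_series_dtype = df.dtypes.dtype

# The check suggested by the error message: dtype of the whole array.
all_numeric_array_dtype = np.asarray(df).dtype

# Assumed failure case: one column stored as Python strings (object dtype).
df_bad = df.assign(race=pd.Series(["1", "2", "x"], dtype=object))
bad_array_dtype = np.asarray(df_bad).dtype

# List the columns that are still object so they can be inspected.
object_columns = [c for c in df_bad.columns if df_bad[c].dtype == object]

# One possible fix: coerce the column; unparseable entries become NaN.
df_fixed = df_bad.assign(race=pd.to_numeric(df_bad["race"], errors="coerce"))
fixed_array_dtype = np.asarray(df_fixed).dtype

print(dtypes_series_dtype, all_numeric_array_dtype, bad_array_dtype)
print(object_columns, fixed_array_dtype)
```

If np.asarray(X).dtype and np.asarray(y).dtype are both numeric, the frame itself is probably not the cause, and y or a column introduced when combining the three DataFrames would be the next thing to check. Coercing with errors="coerce" introduces NaN for unparseable values, which would then need handling before fitting.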
Sanoj