I can't compare a DataFrame with a string, but I can compare its transpose

Consider a DataFrame df

import pandas as pd

df = pd.DataFrame({
    1: [1, 2],
    2: ['a', 3],
    3: [None, 7]
})

df

   1  2    3
0  1  a  NaN
1  2  3  7.0

When I compare it with a string, I get an error:

df == 'a'
TypeError: Could not compare ['a'] with block values

However, taking the transpose fixes the problem?!

(df.T == 'a').T

       1      2      3
0  False   True  False
1  False  False  False

What is this error? Can I fix it in the way I build my DataFrame? What is different about comparing with the transpose?

1 answer

When creating a data frame, declare dtype=object:

In [1013]: df = pd.DataFrame({
      ...:     1: [1, 2],
      ...:     2: ['a', 3],
      ...:     3: [None, 7]
      ...: }, dtype=object)

In [1014]: df
Out[1014]: 
   1  2     3
0  1  a  None
1  2  3     7

Now you can compare without transposing:

In [1015]: df == 'a'
Out[1015]: 
       1      2      3
0  False   True  False
1  False  False  False
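If the DataFrame already exists and cannot be rebuilt, casting it with astype(object) before comparing should have the same effect. This is a minimal sketch, assuming the same df as in the question; the names as_object and result are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    1: [1, 2],
    2: ['a', 3],
    3: [None, 7]
})

# Cast every column to object so the whole frame shares a single dtype
as_object = df.astype(object)

# Element-wise comparison against the string
result = as_object == 'a'
print(result)
```

Only the cell that actually holds 'a' should compare as True.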

My belief is that, to begin with, your columns are not all of object dtype (pandas coerces each column to a more specific dtype wherever possible), but transposing forces the change to object because each transposed column holds mixed values.
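One way to check this is to look at the dtypes before and after transposing. This is a minimal sketch, assuming the same df as in the question; it compares only the dtype kind ('i' integer, 'f' float, 'O' object), since the exact integer width can vary by platform:

```python
import pandas as pd

df = pd.DataFrame({
    1: [1, 2],
    2: ['a', 3],
    3: [None, 7]
})

# Original frame: pandas infers a dtype per column where it can
col_kinds = {col: dtype.kind for col, dtype in df.dtypes.items()}
print(col_kinds)

# Transposed frame: each column now mixes ints, strings and floats,
# so every column falls back to object
transposed_kinds = {col: dtype.kind for col, dtype in df.T.dtypes.items()}
print(transposed_kinds)
```

In the original frame only column 2 is object, while the transposed frame is object throughout, which is why the comparison behaves differently.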


I found this in the source code, pandas/internals.py:

if not isinstance(result, np.ndarray):
    # differentiate between an invalid ndarray-ndarray comparison
    # and an invalid type comparison
    ...
    raise TypeError('Could not compare [%s] with block values' %
                    repr(other))

If the value being compared does not match the dtype of the block, this error is raised.
