Use first row as column names? Pandas read_html

Question

I have this simple script:

    from pandas import read_html
    print(read_html('http://money.cnn.com/data/hotstocks/', flavor='bs4'))

This works, but the columns have no names; they are labeled 0, 1, 2, 3. Is there an easy way to tell pandas to use the first row as the column names? I know I can save the names in a list, assign them, and then skip the first row, but I wonder whether there is a simpler or better way.

It currently prints:

                           0       1       2         3
 0                   Company   Price  Change  % Change
 1            AAPL Apple Inc  115.31   +6.17    +5.65%
 2  BAC Bank of America Corp   15.20   -0.43    -2.75%
 3           YHOO Yahoo! Inc   46.46   -1.53    -3.19%
 4       MSFT Microsoft Corp   41.19   -1.47    -3.45%
 5           FB Facebook Inc   76.24   +0.46    +0.61%
 6    GE General Electric Co   23.84   -0.54    -2.21%
 7                T AT&T Inc   32.68   -0.13    -0.40%
 8           F Ford Motor Co   14.46   -0.24    -1.63%
 9           INTC Intel Corp   33.78   -0.41    -1.20%
10    CSCO Cisco Systems Inc   26.80   -0.09    -0.35%

python pandas parsing

mobone Jan 29 '15 at 3:31

1 answer

Jab · Accepted Answer · 2015-01-29T03:54:55+0000

`read_html` takes a `header` parameter. You can pass the index of the row to use as the column names:

    read_html('http://money.cnn.com/data/hotstocks/', header=0, flavor='bs4')
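The effect of `header=0` can be checked offline. The snippet below is only a sketch: the inline HTML is a made-up stand-in for the CNN page (its header cells are plain `<td>` cells, which is an assumption about why the live page came back unnamed), and it relies on whichever HTML parser pandas finds installed (lxml, or bs4 with html5lib).

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the real page: the first row holds the
# column names, but as ordinary <td> cells rather than <th> cells.
HTML = """
<table>
  <tr><td>Company</td><td>Price</td><td>Change</td><td>% Change</td></tr>
  <tr><td>AAPL Apple Inc</td><td>115.31</td><td>+6.17</td><td>+5.65%</td></tr>
  <tr><td>BAC Bank of America Corp</td><td>15.20</td><td>-0.43</td><td>-2.75%</td></tr>
</table>
"""

# read_html returns a list of DataFrames, one per <table> it finds,
# so [0] selects the only table in this snippet.
raw = pd.read_html(StringIO(HTML))[0]              # no header: integer column labels
named = pd.read_html(StringIO(HTML), header=0)[0]  # row 0 becomes the column names

print(list(raw.columns))
print(list(named.columns))
print(named)
```

Since `read_html` always returns a list, the original one-liner prints a list of DataFrames; indexing with `[0]` gives the table itself.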

It is worth noting this caveat in the documentation:

For example, you might need to manually assign column names if the column names are converted to NaN when you pass the header=0 argument.

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.html.read_html.html
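If `header=0` does produce NaN column names, the manual route mentioned in the question still works. Here is a minimal sketch, assuming a DataFrame shaped like the unnamed output above; the `raw` frame is built by hand for illustration rather than scraped, and the names `raw` and `table` are arbitrary.

```python
import pandas as pd

# Hand-built frame shaped like the unnamed read_html result in the
# question: integer column labels, with the real names sitting in row 0.
raw = pd.DataFrame([
    ["Company", "Price", "Change", "% Change"],
    ["AAPL Apple Inc", "115.31", "+6.17", "+5.65%"],
    ["BAC Bank of America Corp", "15.20", "-0.43", "-2.75%"],
])

# Drop the first row from the data and renumber the remaining rows...
table = raw.iloc[1:].reset_index(drop=True)
# ...then use the values of that first row as the column names.
table.columns = raw.iloc[0].tolist()

print(table)
```

The values stay as strings with this approach, so numeric columns such as Price would still need converting, for example with `pd.to_numeric`.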
