A DataFrame is primarily a column-based data structure. Under the hood, its data is stored in blocks, roughly one block per dtype. Each column has a single dtype, so accessing a column amounts to selecting that column from one block. Selecting a row, by contrast, requires pulling the corresponding row out of every block, creating a new Series, and copying the data from each block's row into it. So iterating through the rows of a DataFrame is, under the hood, not as natural a process as iterating through its columns.
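A small sketch of the difference, using a made-up three-column frame (the column names and values are only for illustration). Selecting a column gives back a Series with that column's own dtype, while selecting a row has to combine values of different dtypes into one new Series:

```python
import pandas as pd

# Hypothetical example frame with three columns of different dtypes
df = pd.DataFrame({
    "a": [1, 2, 3],          # integers
    "b": [0.5, 1.5, 2.5],    # floats
    "c": ["x", "y", "z"],    # strings
})

# Column access: the result keeps the column's single dtype
col = df["a"]
col_is_integer = pd.api.types.is_integer_dtype(col.dtype)

# Row access: a new Series is built from one value per column,
# so the mixed values end up under a common object dtype
row = df.iloc[0]
row_dtype = row.dtype
```

Here `row` is a new Series indexed by the column names, and its dtype is `object` because it has to hold an integer, a float and a string at once.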
If you need to iterate through the rows, you still can by calling df.iterrows(). You should avoid df.iterrows() where possible, for the same reason that it is unnatural: it requires copying, which makes it slower than iterating through the columns.
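As a rough illustration (the frame and the computation are invented for this example), here is the same sum of products computed once row by row with df.iterrows() and once with whole-column operations. Both give the same result, but the row-wise loop builds a new Series on every iteration:

```python
import pandas as pd

# Hypothetical example data
df = pd.DataFrame({"x": [1, 2, 3], "y": [0.5, 1.5, 2.5]})

# Row-wise: iterrows() yields (index, Series) pairs, one new Series per row
total_by_rows = 0.0
for idx, row in df.iterrows():
    total_by_rows += row["x"] * row["y"]

# Column-wise: operate on whole columns at once
total_by_columns = (df["x"] * df["y"]).sum()

# Iterating over columns is also available via df.items(),
# which yields (column name, Series) pairs
column_names = [name for name, column in df.items()]
```

On a frame this small the difference does not matter; the cost of the per-row copy only shows up as the number of rows grows.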
unutbu