What is the performance difference between DBI fetchall_hashref and fetchall_arrayref?

I am writing some Perl scripts to manage a large amount of data (about 42 million rows in total, though not all handled in one go) in two PostgreSQL databases.

For some of my queries it makes sense to use fetchall_hashref because I have synthetic keys. In other cases, however, the unique key is a combination of three columns.

This made me wonder about the performance difference between fetchall_arrayref and fetchall_hashref. I know that both load the whole result set into memory, so selecting a few GB of data is probably not a good idea, but beyond that the documentation offers very little guidance on performance.

My googling was unsuccessful, so if someone could point me to some general performance benchmarks, I would be grateful.

(I know I could benchmark this myself, but unfortunately I do not have access to a development machine with the same hardware as production, so I am looking for general recommendations or best practices.)

+5
2 answers

The first question is whether you really need to use fetchall in the first place. If you do not need all 42 million rows in memory at once, then do not read them all at once! bind_columns and fetchrow_arrayref are generally the way to go whenever possible, as ysth already pointed out.
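A minimal sketch of that row-at-a-time approach follows. To keep it self-contained it assumes DBD::SQLite is installed and uses an in-memory SQLite database in place of PostgreSQL; the items table, its columns, and the sample rows are hypothetical and only there for illustration.

```perl
use strict;
use warnings;
use DBI;

# Assumption: in-memory SQLite stands in for PostgreSQL so the sketch runs
# anywhere DBD::SQLite is installed; with DBD::Pg only the DSN would differ.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1, AutoCommit => 1 });

# Hypothetical table and rows, for illustration only.
$dbh->do('CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, qty INTEGER)');
my $ins = $dbh->prepare('INSERT INTO items (id, name, qty) VALUES (?, ?, ?)');
$ins->execute(@$_) for [1, 'apple', 3], [2, 'pear', 5], [3, 'plum', 7];

my $sth = $dbh->prepare('SELECT id, name, qty FROM items ORDER BY id');
$sth->execute;

# Bind each result column to a scalar; every fetch refreshes these variables.
my ($id, $name, $qty);
$sth->bind_columns(\$id, \$name, \$qty);

my ($rows_seen, $total_qty) = (0, 0);
while ($sth->fetchrow_arrayref) {
    # The script keeps only the current row's values, not the whole result.
    $rows_seen++;
    $total_qty += $qty;
}

print "rows=$rows_seen total_qty=$total_qty\n";   # rows=3 total_qty=15
$dbh->disconnect;
```

The loop body works on $id, $name and $qty directly, so no per-row data structure is built on the Perl side.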

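Assuming fetchall really is needed, my guess is that fetchall_arrayref will be slightly faster, since an array is a simpler structure than a hash, but the difference is unlikely to be significant.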

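Memory requirements are another matter. With fetchall_hashref, the result is a hash of id => row, and each row is itself a hash of field name => field value. For 42 million rows, that means the field names are repeated 42 million times... which will take considerably more memory than fetchall_arrayref. (Unless DBI does something clever with tie for fetchall_hashref, which I doubt.)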
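To make the shapes of the two results concrete, here is a small sketch. As an assumption for the sake of a runnable example, it uses an in-memory SQLite database via DBD::SQLite rather than PostgreSQL, and the items table and its data are hypothetical. The last call passes an array ref of key columns to fetchall_hashref, which DBI supports for multi-column keys and which returns nested hashes.

```perl
use strict;
use warnings;
use DBI;

# Assumption: in-memory SQLite instead of PostgreSQL, hypothetical table/data.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '', { RaiseError => 1 });
$dbh->do('CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, qty INTEGER)');
$dbh->do(q{INSERT INTO items (id, name, qty) VALUES (1, 'apple', 3)});
$dbh->do(q{INSERT INTO items (id, name, qty) VALUES (2, 'pear', 5)});

my $sth = $dbh->prepare('SELECT id, name, qty FROM items ORDER BY id');

# fetchall_arrayref: an array of rows, each row an array addressed by position.
$sth->execute;
my $as_arrays = $sth->fetchall_arrayref;

# fetchall_hashref: a hash keyed by the id column, each row a hash of
# field name => field value (so every row carries its own set of keys).
$sth->execute;
my $as_hashes = $sth->fetchall_hashref('id');

# Multi-column key: an array ref of key columns yields nested hashes.
$sth->execute;
my $by_pair = $sth->fetchall_hashref([ 'name', 'qty' ]);

print $as_arrays->[1][1], "\n";        # pear
print $as_hashes->{2}{name}, "\n";     # pear
print $by_pair->{pear}{5}{id}, "\n";   # 2

$dbh->disconnect;
```

Which form is more convenient depends on how the rows are looked up afterwards; the array form is the more compact of the two.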

+3

For performance, fetchrow_arrayref with bind_columns is the fastest way to fetch data (according to the DBI documentation).

+5
