Is it better to have multiple arrays or one multidimensional array?

I am working on a project where I have to perform calculations on data arrays in PHP. Some of these calculations involve several arrays at once, all of which have the same length.

Question: is it more efficient (in memory and processor usage) to put the data in one multidimensional array or to keep it in two separate arrays?

Keep in mind that some of these arrays can have thousands of values.

Example: to clarify, here is some sample data and how it is used:

X = 1,2,3,4,5

Y = 2,3,3,4,4

Calculate the correlation between X and Y.

For this:

  • Get the sums of the X and Y columns
  • Get the sums of the X ^ 2 and Y ^ 2 columns
  • Then apply the correlation formula
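
For reference, and assuming the Pearson correlation coefficient is what is meant, the sums above plug into the usual computational form shown below. Note that this form also needs the sum of the products X·Y, which the list above does not mention. The worked numbers use the sample X and Y from the question.

```latex
r = \frac{n \sum x_i y_i - \sum x_i \sum y_i}
         {\sqrt{\left(n \sum x_i^2 - \left(\sum x_i\right)^2\right)
                \left(n \sum y_i^2 - \left(\sum y_i\right)^2\right)}}

% With X = 1,2,3,4,5 and Y = 2,3,3,4,4 (n = 5):
% \sum x = 15, \sum y = 16, \sum xy = 53, \sum x^2 = 55, \sum y^2 = 54
r = \frac{5 \cdot 53 - 15 \cdot 16}{\sqrt{(5 \cdot 55 - 15^2)(5 \cdot 54 - 16^2)}}
  = \frac{25}{\sqrt{50 \cdot 14}} \approx 0.945
```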

My thoughts: combining the two arrays into one multidimensional array would mean fewer iterations for the calculation, but the arrays would first have to be combined.

So my main concern, and the reason for asking, is whether it takes fewer resources to build a multidimensional array and iterate over it once, or to keep the arrays separate and iterate over each of them (two iterations).

Or is there a better way to do calculations on arrays that does not involve iteration?

+4
5 answers

If you already have the data as two separate arrays, merging them first would be a waste of time and resources, as far as I can see.

There are two kinds of array access in PHP: iterative access, which uses the array's internal pointer, and keyed access through the associated key/index, which goes through a hash map rather than a sequence. If you are going to visit every element of an array and can do so in order, use iterative access, via either the built-in array_ functions or the iterator functions reset(), next(), current(), end() and each(). Note that each() was deprecated in PHP 7.2 and removed in PHP 8.0.

Take a look at PHP's array_reduce() function, which can help you do this kind of thing quickly. In this simple case, though, you might be better off with a plain for() loop, using the reset(), next() and current() iterator functions to get the values from each array. Or, if both arrays use the same keys, you can simply foreach() over one and use its key to index the other.

    // create_function() was removed in PHP 8.0, so closures are used here instead.
    $sum_x  = array_reduce($x, function ($carry, $v) { return $carry + $v; }, 0);
    $sum_y  = array_reduce($y, function ($carry, $v) { return $carry + $v; }, 0);
    $sum_x2 = array_reduce($x, function ($carry, $v) { return $carry + $v * $v; }, 0);
    $sum_y2 = array_reduce($y, function ($carry, $v) { return $carry + $v * $v; }, 0);

or

    $sum_x = 0;
    $sum_y = 0;
    $sum_x2 = 0;
    $sum_y2 = 0;
    foreach (array_keys($x) as $i) {
        $sum_x += $x[$i];
        $sum_y += $y[$i];
        $sum_x2 += $x[$i] * $x[$i];
        $sum_y2 += $y[$i] * $y[$i];
    }
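
To carry the loop above through to an actual result, here is a minimal sketch that gathers all the sums in one pass over two separate arrays and then applies the Pearson formula. The function name pearson_correlation() is hypothetical, the sketch assumes both arrays share the same keys, and it adds a sum of products ($sum_xy) that the snippets above do not compute.

```php
<?php
// Hypothetical helper (not from the answer above): Pearson correlation of two
// equal-length numeric arrays, computed in a single pass.
// Assumes $x and $y use the same keys.
function pearson_correlation(array $x, array $y): float
{
    $n = count($x);
    if ($n === 0 || $n !== count($y)) {
        throw new InvalidArgumentException('Arrays must be non-empty and of equal length.');
    }

    $sum_x = $sum_y = $sum_xy = $sum_x2 = $sum_y2 = 0.0;
    foreach ($x as $i => $xi) {
        $yi = $y[$i];          // same key in the second array
        $sum_x  += $xi;
        $sum_y  += $yi;
        $sum_xy += $xi * $yi;
        $sum_x2 += $xi * $xi;
        $sum_y2 += $yi * $yi;
    }

    $denominator = sqrt(($n * $sum_x2 - $sum_x ** 2) * ($n * $sum_y2 - $sum_y ** 2));
    if ($denominator == 0.0) {
        throw new DomainException('Correlation is undefined when either array has zero variance.');
    }

    return ($n * $sum_xy - $sum_x * $sum_y) / $denominator;
}

$r = pearson_correlation([1, 2, 3, 4, 5], [2, 3, 3, 4, 4]);
printf("%.4f\n", $r); // prints 0.9449
```

Because the loop indexes the second array by the first array's key, no merged structure is needed to get everything in a single iteration.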
+3

Given that all arrays in PHP are hash tables and are associative, I would suggest that the greatest performance gain will come from fewer iterations. I would use a multidimensional array.
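
As a rough sketch of what this approach could look like, the two arrays can be zipped into rows with array_map() and a null callback, then walked once. The variable names are illustrative, and the foreach destructuring syntax assumes PHP 7.1 or later.

```php
<?php
$x = [1, 2, 3, 4, 5];
$y = [2, 3, 3, 4, 4];

// array_map() with a null callback zips its inputs into rows:
// [[1, 2], [2, 3], [3, 3], [4, 4], [5, 4]].
// Building $rows is itself a full pass over both arrays.
$rows = array_map(null, $x, $y);

$sum_x = 0;
$sum_y = 0;
foreach ($rows as [$xi, $yi]) {   // array destructuring in foreach: PHP 7.1+
    $sum_x += $xi;
    $sum_y += $yi;
}

echo $sum_x, ' ', $sum_y, "\n"; // prints 15 16
```

Whether the single loop pays for the cost of building $rows first is exactly what would need to be measured.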

+1

Why not write a test case? You can use the PEAR Benchmark package to measure this: http://pear.php.net/package/Benchmark
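
If installing a package is not an option, a rough timing comparison can also be sketched with microtime() alone. This is not the PEAR Benchmark API; time_it() is a hypothetical helper, the test data is made up, and the numbers it prints will vary by machine and PHP version.

```php
<?php
// Hypothetical helper: average seconds per call of $fn over $runs runs.
function time_it(callable $fn, int $runs = 20): float
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $fn();
    }
    return (microtime(true) - $start) / $runs;
}

// Made-up test data; substitute your real arrays.
$n = 10000;
$x = range(1, $n);
$y = range(2, $n + 1);

// Variant 1: keep the arrays separate and index the second by key.
$separate = function () use ($x, $y) {
    $sx = 0;
    $sy = 0;
    foreach ($x as $i => $xi) {
        $sx += $xi;
        $sy += $y[$i];
    }
    return [$sx, $sy];
};

// Variant 2: merge into rows first, then iterate once (PHP 7.1+ syntax).
$combined = function () use ($x, $y) {
    $sx = 0;
    $sy = 0;
    foreach (array_map(null, $x, $y) as [$xi, $yi]) {
        $sx += $xi;
        $sy += $yi;
    }
    return [$sx, $sy];
};

printf("separate arrays: %.6f s\n", time_it($separate));
printf("merge + iterate: %.6f s\n", time_it($combined));
```

Both closures return the same sums, so the only thing being compared is the cost of the two layouts.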

+1

This is not specific to PHP. Locality of reference for data is often important, as cache misses are expensive.

For example, if you process the elements of parallel arrays together (all the ?1 elements, then all the ?2 elements, and so on), it is more efficient to organize them in memory as:

 A1 B1 C1 ... A2 B2 C2 ... A3 B3 C3 ... 

rather than the typical layout:

 A1 A2 A3 ... B1 B2 B3 ... C1 C2 C3 ... 

Of course, it depends on your specific calculation. Converting your data into the first layout can take considerable time. In the end, profiling is the only way to be sure.

+1

I do not see how there would be any difference in processor or memory usage between a two-dimensional array and two one-dimensional arrays. You would be using the same amount of memory. Will they both have the same number of elements?
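
One way to check the memory side of this on your own setup is to measure it with memory_get_usage(). The sketch below is only an illustration with made-up data: it builds two flat arrays, then one array of [x, y] rows holding the same values, and reports how much memory each took. The results depend on the PHP version and build.

```php
<?php
$n = 10000; // made-up size; the question mentions "thousands of values"

// Layout 1: two flat arrays.
$before = memory_get_usage();
$x = range(1, $n);
$y = range(1, $n);
$two_arrays = memory_get_usage() - $before;

unset($x, $y);

// Layout 2: one array of [x, y] rows holding the same values.
$before = memory_get_usage();
$rows = [];
for ($i = 1; $i <= $n; $i++) {
    $rows[] = [$i, $i];
}
$one_array = memory_get_usage() - $before;

printf("two flat arrays:   %d bytes\n", $two_arrays);
printf("one array of rows: %d bytes\n", $one_array);
```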

0
