It seems you cannot decide whether performance or floating point precision is more important.

If floating point precision were paramount, you would separate the positive and negative elements, sorting each segment, then sum each in ascending order of absolute value. Yes, I know it is more work than anyone will bother with, and it will probably be a waste of time.

Instead, use adequate precision so that any errors made are irrelevant, and use good numerical practices so that there are no problems.
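(For reference, here is a minimal sketch of that segregation trick, with variable names of my own choosing; the same idea is demonstrated with real numbers further down.)

% A is whatever array you are summing.
% Separate negatives and non-negatives, sum each piece in order of
% increasing magnitude, then combine the two partial sums.
neg = sort(A(A < 0), 'descend');   % negatives: smallest magnitude first
pos = sort(A(A >= 0), 'ascend');   % non-negatives: smallest magnitude first
totalSum = sum(neg) + sum(pos);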
Regarding time, for an N-by-M array:

sum(A(:)) requires N*M - 1 additions.

sum(sum(A)) requires (N-1)*M + M - 1 = N*M - 1 additions.

Either way requires the same number of additions, so for a large array, even if the interpreter is not smart enough to recognize that they are the same thing, who cares?
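(If you want to verify the timing yourself, a quick check along these lines will do; timeit needs a reasonably recent MATLAB release, otherwise use tic/toc. The numbers are machine dependent and are not from the original test.)

A = randn(1000);                 % a 1000x1000 test array, as used below
timeit(@() sum(A(:)))            % one pass over all N*M elements
timeit(@() sum(sum(A)))          % M column sums, then a sum of those results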
This is simply not a problem. Don't make a mountain out of a molehill worrying about it.
Edit: In response to Amro's comment about the errors of one method versus the other, there is little you can control. The additions will be done in a different order, but there is no guarantee which sequence will be better.
A = randn(1000);
format long g
The two solutions are pretty close. In fact, compared to eps, the difference is hardly significant.
sum(A(:))
ans =
          945.760668102446

sum(sum(A))
ans =
          945.760668102449

sum(sum(A)) - sum(A(:))
ans =
      2.72848410531878e-12

eps(sum(A(:)))
ans =
      1.13686837721616e-13
Suppose you tried the segregation trick I mentioned. Note that the negative and positive parts are each large enough that precision is lost when they are summed separately and then combined.
sum(sort(A(A<0),'descend'))
ans =
         -398276.24754782

sum(sort(A(A<0),'descend')) + sum(sort(A(A>=0),'ascend'))
ans =
            945.7606681037
So you would really need to accumulate the pieces of the array in order of increasing absolute value. We can try the following:
[~,tags] = sort(abs(A(:)));
sum(A(tags))
ans =
          945.760668102446
An interesting issue arises even in these tests: is there a problem because the tests are run on a random (normal) array? Essentially, we can view sum(A(:)) as a random walk, a drunkard's walk. But consider sum(sum(A)). Each element of sum(A) (i.e., the inner sum) is itself the sum of 1000 normal deviates. Look at a few of them:
sum(A)
ans =
  Columns 1 through 6
        -32.6319600960983          36.8984589766173          38.2749084367497          27.3297721091922          30.5600109446534          -59.039228262402
  Columns 7 through 12
         3.82231962760523          4.11017616179294         -68.1497901792032          35.4196443983385          7.05786623564426         -27.1215387236418
  Columns 13 through 18
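(Those magnitudes are about what you should expect: each column sum is the sum of 1000 independent standard normal deviates, so its standard deviation is sqrt(1000), roughly 31.6. A quick check, not part of the original output:)

std(sum(A))     % empirical spread of the 1000 column sums
sqrt(1000)      % theoretical standard deviation, about 31.6228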
When we add these up, there will be some loss of precision. So perhaps doing the operation as sum(A(:)) will be a little more accurate. Is that true? What if we use higher precision for the accumulation? First, I will form the sums over the columns in doubles, then convert to 25 decimal digits and sum the rows. (I display only 20 digits here, leaving the other 5 as guard digits.)
sum(hpf(sum(A)))
ans =
945.76066810244807408
Or instead, convert the array immediately to 25 digits of precision, and then sum the result:
sum(hpf(A(:)))
ans =
945.76066810244749807
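(hpf here appears to be a high precision decimal class, such as the HPF toolbox on the MATLAB File Exchange. If you don't have it, roughly the same experiment can be run with vpa, assuming the Symbolic Math Toolbox is installed; the digits will not necessarily match the hpf output above.)

digits(25)              % work with 25 significant decimal digits
sum(vpa(sum(A)))        % column sums in double, then a high precision total
sum(vpa(A(:)))          % convert every element first, then sum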
So both forms in double precision were comparably in error here, in opposite directions. In the end, all of this is a moot point, since any of the alternatives I showed takes far more time than a simple sum(A(:)) or sum(sum(A)). Just pick one of the two and don't worry about it.