Why don't most languages optimize "0 * ...", and are there any that do?

I was writing a 2D curve algorithm and had some code that effectively performed a summation along the lines of:

    for (i = 0, end = ...; i < end; i++) {
        value += coefficients[i] * expensiveToCalculateValue(i);
    }

where coefficients[i] turns out to be zero for some of the iteration steps. Since zero times anything is zero anyway (at least under plain arithmetic rules), I figured I could significantly optimize this code by first checking whether coefficients[i] is zero, and if so, simply continuing to the next iteration. Added it, sorted, works brilliantly.
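As a rough sketch of that hand-written version in Java (the coefficients values and the body of expensiveToCalculateValue below are invented stand-ins for the real curve code):

    public class SummationSketch {
        // Stand-in for whatever the real, expensive per-index computation is.
        static double expensiveToCalculateValue(int i) {
            double v = 0;
            for (int k = 0; k < 1_000_000; k++) {
                v += Math.sin(i + k);
            }
            return v;
        }

        public static void main(String[] args) {
            double[] coefficients = {0.0, 1.5, 0.0, -2.0, 0.0, 3.25};
            double value = 0;
            for (int i = 0; i < coefficients.length; i++) {
                if (coefficients[i] == 0) {
                    continue; // skip the expensive call entirely; the language won't do this for us
                }
                value += coefficients[i] * expensiveToCalculateValue(i);
            }
            System.out.println(value);
        }
    }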

But that leaves the question: why isn't this done for me? This isn't some creative, niche flavour of multiplication, it's plain arithmetic. Almost all languages short-circuit binary OR and AND operations once an operand is found that makes the result invariant from that point on, so why isn't arithmetic multiplication by zero short-circuited in the same way?

I tried running this code (adjusted for syntax) in Java, PHP, JavaScript, Perl, Python, C++, and even had a look at what Prolog does, but none of them realize that once they see "zero times ..." they don't need to evaluate the potentially expensive second (or third, fourth, etc.) operand:

    printed = 0;
    function veryExpensive() {
        print "oh god this costs so much, x" + (printed++);
        return 0;
    }
    value = 0 * veryExpensive() * veryExpensive() * veryExpensive();

They all just run veryExpensive() three times.
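For contrast, a small hypothetical Java version of the same experiment shows the short-circuiting these languages do perform for the boolean operators, while the multiplication still makes the call:

    public class ShortCircuitDemo {
        static boolean expensiveBool() {
            System.out.println("expensiveBool() was called");
            return true;
        }

        static int expensiveInt() {
            System.out.println("expensiveInt() was called");
            return 42;
        }

        public static void main(String[] args) {
            boolean a = false && expensiveBool(); // prints nothing: && short-circuits
            boolean b = true  || expensiveBool(); // prints nothing: || short-circuits
            int     c = 0      * expensiveInt();  // prints: the call still happens
            System.out.println(a + " " + b + " " + c);
        }
    }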

Now, I understand that you can, if you're that kind of person, write your veryExpensive function to do administrative overhead work on the assumption that it can be relied on to run even though its result contributes nothing to the arithmetic expression (if you do that, you're probably abusing the language, but everyone loves sneaky ninja code at some point in their programming life), but you only do that because you know the language happens not to optimize for this case. The expressiveness of your code wouldn't suffer if the language did optimize your arithmetic evaluation.

So: is there some historical precedent that has caused the boatload of languages in use today to optimize "true OR ..." and "false AND ..." but not "zero TIMES ..."? Why do we optimize those binary operations, but not MUL 0? (And, if we're lucky, someone has a fascinating tale to tell about why we don't short-circuit it now.)

Update

Both Jon Skeet and Nick Bugalis give good reasons why optimizing this in an existing language would lead to problems, but Nick's answer lines up with the question much more closely, so I marked his answer as "correct". However, they cover different aspects of the same problem, so the real answer is a combination of the two.
+6
2 answers

Whether a compiler automatically adding such a runtime check makes sense can go either way. In your particular case it may well be true that the check improves performance, but your particular case isn't the be-all and end-all against which compilers evaluate optimizations.

If multiplication were super-expensive, and the microprocessor had no internal optimization for multiplications by zero, and the result of a multiplication by zero were guaranteed to be zero (which it isn't, e.g. 0 * NaN != 0), then the check might make sense. But that's a lot of ands, and, as you know, you can short-circuit ands.
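A quick Java illustration of why the "guaranteed zero" part fails for floating point (the particular values are just examples):

    public class ZeroTimesNaN {
        public static void main(String[] args) {
            // IEEE 754: multiplying zero by NaN or infinity does not give zero.
            System.out.println(0.0 * Double.NaN);               // NaN
            System.out.println(0.0 * Double.POSITIVE_INFINITY); // NaN
            // Signed zeros mean even "ordinary" cases aren't always +0.0:
            System.out.println(0.0 * -5.0);                     // -0.0
            System.out.println(0.0 * 5.0);                      // 0.0
        }
    }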

Imagine you have a quasi-random distribution of numbers and some unknown ratio of zeros to non-zero values. Depending on that ratio, the run lengths of zero and non-zero sequences, and the processor's branch-prediction algorithms, the check can actually cause problems (pipeline stalls, for instance).
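To make that concrete, here is a deliberately naive Java micro-benchmark sketch; the array size, the roughly 25% zero ratio, and the timing via System.nanoTime() are arbitrary choices, and real results depend heavily on JIT warm-up, the data distribution, and the hardware's branch predictor, so treat it only as a way to observe that the check is not free:

    import java.util.Random;

    public class BranchCheckSketch {
        public static void main(String[] args) {
            // Cheap multiplications, unpredictably placed zeros.
            Random rng = new Random(42);
            double[] xs = new double[10_000_000];
            double[] ys = new double[xs.length];
            for (int i = 0; i < xs.length; i++) {
                xs[i] = rng.nextInt(4) == 0 ? 0.0 : rng.nextDouble(); // ~25% zeros
                ys[i] = rng.nextDouble();
            }

            long t0 = System.nanoTime();
            double plain = 0;
            for (int i = 0; i < xs.length; i++) {
                plain += xs[i] * ys[i];     // always multiply
            }
            long t1 = System.nanoTime();
            double checked = 0;
            for (int i = 0; i < xs.length; i++) {
                if (xs[i] == 0.0) continue; // branch on an unpredictable condition
                checked += xs[i] * ys[i];
            }
            long t2 = System.nanoTime();

            System.out.printf("plain:   %d ms (%f)%n", (t1 - t0) / 1_000_000, plain);
            System.out.printf("checked: %d ms (%f)%n", (t2 - t1) / 1_000_000, checked);
        }
    }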

Do you still think that the compiler should insert such a check on your behalf?

+2

They all just run veryExpensive() three times.

And so they should.

but you only do that because you know the language happens not to optimize for this case.

No, it's not a matter of the optimizer "happening" not to handle this case. It's a matter of an optimization not being allowed to violate the language specification.

If the language specifies that the operation X * Y first evaluates X, then evaluates Y, then multiplies the two values, then it is simply an invalid optimization to skip the evaluation of Y when X is 0.
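As a small illustration (the names here are made up for the example), the side effect of evaluating the right-hand operand is observable, so a conforming Java implementation cannot silently drop it:

    public class EvaluationOrder {
        static int calls = 0;

        static int tracked() {
            calls++;  // observable side effect of evaluating the operand
            return 7;
        }

        public static void main(String[] args) {
            int zero = 0;
            int product = zero * tracked();
            // The JLS requires both operands of * to be evaluated (left to right),
            // so calls must be 1 here even though the product is 0.
            System.out.println(product + ", calls = " + calls);
        }
    }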

There are often operators that do behave like this, of course, particularly in C-like languages:

  • The conditional operator a ? b : c, which will only evaluate either b or c
  • x || y, which will only evaluate y if x is false
  • x && y, which will only evaluate y if x is true

And C# has the null-coalescing operator x ?? y, which only evaluates y if x is null.
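As a sketch of the same idea in Java (which has the first three operators but not ??), only one branch of the conditional operator is ever evaluated; the helper below is invented for the demo:

    public class ConditionalDemo {
        static String loud(String name) {
            System.out.println("evaluating " + name);
            return name;
        }

        public static void main(String[] args) {
            boolean flag = true;
            // Only "evaluating b" is printed; the c branch is never evaluated.
            String picked = flag ? loud("b") : loud("c");
            System.out.println("picked " + picked);
        }
    }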

Multiplication could be defined this way, but I suspect that:

  • In most multiplication operations, the branching hit of conditional execution isn't worth it. But you would want the behaviour to be well defined: either it short-circuits or it doesn't; it shouldn't (IMO) be undefined.
  • It makes the language more complicated both to define and to use. You'd probably want an "unconditional multiplication" operator that always evaluates both operands (just as x | y and x & y don't short-circuit).

Basically, I don't think it would be worth the extra complexity for everyone involved.
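For what it's worth, here is a purely hypothetical sketch of what opting into short-circuiting multiplication by hand could look like in Java, using a lazily evaluated right-hand operand; nothing like this exists in the language itself:

    import java.util.function.DoubleSupplier;

    public class LazyMultiply {
        // Hypothetical helper: only evaluates the right operand if the left is non-zero.
        static double mulShortCircuit(double left, DoubleSupplier right) {
            return left == 0.0 ? 0.0 : left * right.getAsDouble();
        }

        static double veryExpensive() {
            System.out.println("oh god this costs so much");
            return 0;
        }

        public static void main(String[] args) {
            // Prints nothing from veryExpensive(): the supplier is never invoked.
            double value = mulShortCircuit(0.0, () -> veryExpensive());
            System.out.println(value);
        }
    }

Note that this deliberately trades away the IEEE 754 behaviour mentioned in the other answer: with this helper, 0 * NaN comes out as 0.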

+9
