(This is longer than I intended; please bear with me.)
Most languages are built on what is called a syntax: the language consists of several well-defined keywords, and the full range of expressions you can construct in that language is built up from that syntax.
For example, say you have a simple four-function arithmetic “language” that accepts only single-digit integers as input and completely ignores the order of operations (I told you it was a simple language). This language can be defined with the syntax:
// The | means "or" and the := represents definition
$expression := $number | $expression $operator $expression
$number := 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
$operator := + | - | * | /
From these three rules, you can build any number of arithmetic expressions with single-digit inputs. You can then write a parser for this syntax that breaks any valid input into its component types ($expression, $number, or $operator) and deals with the result. For example, the expression 3 + 4 * 5 can be broken down as follows:
// Parentheses used for ease of explanation; they have no true syntactical meaning
$expression = 3 + 4 * 5
            = $expression $operator (4 * 5) // Expand into $exp $op $exp
            = $number $operator $expression // Rewrite: $exp -> $num
            = $number $operator $expression $operator $expression // Expand again
            = $number $operator $number $operator $number // Rewrite again
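The breakdown above can be sketched in Python. This is only a minimal illustration of the toy arithmetic language, not of how PHP parses anything; the names `classify`, `NUMBERS`, and `OPERATORS` are made up for the example, and it assumes whitespace between symbols is optional and meaningless.

```python
NUMBERS = set("0123456789")   # the ten alternatives of $number
OPERATORS = set("+-*/")       # the four alternatives of $operator


def classify(source):
    """Label each symbol of the input as a $number or an $operator."""
    labelled = []
    for char in source:
        if char.isspace():
            continue  # assumption: whitespace carries no meaning here
        if char in NUMBERS:
            labelled.append(("$number", char))
        elif char in OPERATORS:
            labelled.append(("$operator", char))
        else:
            raise ValueError(f"{char!r} is not part of the language")

    # A fully expanded $expression alternates $number, $operator, $number, ...
    kinds = [kind for kind, _ in labelled]
    expected = ["$number" if i % 2 == 0 else "$operator" for i in range(len(kinds))]
    if not kinds or len(kinds) % 2 == 0 or kinds != expected:
        raise ValueError("input is not a valid $expression")
    return labelled


print(classify("3 + 4 * 5"))
```

For 3 + 4 * 5 the labels come out in the same order as the last line of the breakdown: $number $operator $number $operator $number.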
We now have a fully parsed syntax, in our defined language, for the original expression. From there, we can write a parser that finds the result of each $number $operator $number combination and spits out the result once only one $number is left.
Note that the final parsed version of our original expression contains no $expression constructs. This is because $expression can always be reduced to a combination of other things in our language.
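Here is a rough Python sketch of that last step: collapsing $number $operator $number combinations until a single $number remains. It rests on a few assumptions that the text does not spell out: symbols are separated by spaces, combinations are reduced strictly left to right (one way of "ignoring the order of operations"), and / is true division. The names `evaluate`, `to_number`, and `OPERATIONS` are invented for the example.

```python
import operator

OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,  # assumption: "/" means true division
}


def to_number(token):
    """Convert a $number token (exactly one digit) to an int."""
    if len(token) != 1 or token not in "0123456789":
        raise ValueError(f"{token!r} is not a $number")
    return int(token)


def evaluate(source):
    """Reduce $number $operator $number pairs left to right."""
    tokens = source.split()  # assumption: symbols are space-separated
    if len(tokens) % 2 == 0:
        raise ValueError("input is not a valid $expression")

    result = to_number(tokens[0])
    for i in range(1, len(tokens), 2):
        symbol, right = tokens[i], tokens[i + 1]
        if symbol not in OPERATIONS:
            raise ValueError(f"{symbol!r} is not an $operator")
        # Collapse "$number $operator $number" into a single $number.
        result = OPERATIONS[symbol](result, to_number(right))
    return result


print(evaluate("3 + 4 * 5"))  # 35, because (3 + 4) is reduced first
```

Under the left-to-right assumption the result is 35 rather than the 23 that ordinary precedence would give, which is exactly what ignoring the order of operations means for this language.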
PHP is much the same: language constructs are recognized as the equivalent of our $number or $operator. They cannot be reduced to other language constructs; instead, they are the base units from which the language is built. The key difference between functions and language constructs is this: the parser deals directly with language constructs. It simplifies functions into language constructs.
The reason that language constructs may or may not require parentheses, and the reason some have return values while others do not, depends entirely on the specific technical details of the PHP parser implementation. I am not well versed in how the parser works, so I can't address these questions specifically, but imagine for a second a language that starts with this:
$expression := ($expression) | ...
Effectively, this language is free to take any expression it finds and get rid of the surrounding parentheses. PHP (and here I am using pure guesswork) may use something similar for its language constructs: print("Hello") might be reduced to print "Hello" before it is parsed, or vice versa (language definitions can add parentheses as well as remove them).
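To make the $expression := ($expression) rule concrete, here is a small Python sketch that peels off parentheses wrapping a whole expression. It illustrates the guesswork above and says nothing about PHP's real parser. The helper names are invented, and the sketch deliberately ignores parentheses that appear inside string literals.

```python
def closing_index(text):
    """Return the index of the ")" that closes the "(" at position 0."""
    depth = 0
    for index, char in enumerate(text):
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth == 0:
                return index
    return None  # unbalanced input: nothing closes the first "("


def strip_outer_parentheses(source):
    """Apply $expression := ($expression) until no wrapping pair is left."""
    text = source.strip()
    # Strip only when the first "(" is closed by the very last ")".
    while text.startswith("(") and closing_index(text) == len(text) - 1:
        text = text[1:-1].strip()
    return text


argument = 'print("Hello")'[len("print"):]          # the part after the keyword
print("print " + strip_outer_parentheses(argument))
print(strip_outer_parentheses("(3) + (4)"))          # left alone: no wrapping pair
```

The check against the last character matters: in (3) + (4) the first parenthesis closes early, so removing the outermost characters would change the meaning of the expression.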
This is the root of why you cannot redefine language constructs such as echo or print: they are effectively hardcoded into the parser, whereas functions are mapped to a set of language constructs, and the parser lets you change that mapping at compile time or run time to substitute your own set of language constructs or expressions.
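The contrast between "hardcoded" and "mapped" can be shown with a toy interpreter in Python. Everything here is hypothetical: the one-line statement format, the `run` function, and the `functions` table are invented for illustration and are not how PHP is implemented. The only point is that a construct lives in the interpreter's own code, while a function is looked up in a table that can change.

```python
def run(statement, functions):
    """Execute one 'name argument' statement of a hypothetical toy language."""
    name, _, argument = statement.partition(" ")
    if name == "echo":
        # Construct: this branch is part of the interpreter itself.
        return argument
    if name in functions:
        # Function: resolved through a table that can change at run time.
        return functions[name](argument)
    raise NameError(f"unknown function {name!r}")


functions = {"shout": str.upper}
print(run("echo hello", functions))   # hello
print(run("shout hello", functions))  # HELLO

functions["shout"] = lambda text: text + "!"  # rebinding a function works
functions["echo"] = str.upper                 # this entry is never consulted
print(run("shout hello", functions))  # hello!
print(run("echo hello", functions))   # hello
```

Adding an "echo" entry to the table changes nothing, because the interpreter never consults the table for that name.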
At the end of the day, the internal difference between constructs and expressions is this: language constructs are understood and dealt with by the parser. Built-in functions, while provided by the language, are mapped and simplified to a set of language constructs before parsing.
Edit: Reading through some of the other answers, people make good points. Among them:
- A language construct is faster to call than a function. This is true, if only marginally, because the PHP interpreter does not need to map that function to its language-construct equivalents before parsing. On a modern machine, though, the difference is negligible.
- A language construct bypasses error checking. This may or may not be true, depending on the internal PHP implementation of each construct. It is certainly true that, more often than not, functions will have more advanced error checking and other functionality that constructs lack.
- Language constructs cannot be used as function callbacks. This is true because a construct is not a function. They are separate entities. When you write a construct, you are not writing a function that takes arguments: the syntax of the construct is handled directly by the parser and is recognized as a construct, not as a function. (This may be easier to understand if you consider languages with first-class functions: effectively, you can pass functions around as objects. You cannot do that with constructs.)
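The callback point can be seen by analogy in Python, a language with first-class functions. This is not PHP: the sketch assumes only that Python's del statement plays a role comparable to a PHP construct, and the helper names `shout` and `compiles` are made up. A function is an object that can be passed around, while a statement keyword is pure syntax and is rejected before anything runs.

```python
def shout(text):
    """An ordinary function: a value that can be passed as a callback."""
    return text.upper()


def compiles(source):
    """Report whether the parser accepts the given expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# Passing a function as a callback is fine.
print(list(map(shout, ["a", "b"])))

# Passing a statement keyword is rejected by the parser itself.
print(compiles("list(map(shout, ['a', 'b']))"))  # True
print(compiles("list(map(del, ['a', 'b']))"))    # False
```

The second call never gets as far as looking anything up: the parser refuses the source text outright, because to it the keyword is not a value at all.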