A shell is an interface to the operating system. It is usually a more or less robust programming language in its own right, but with features designed to make it easy to interact with the operating system and file system. The semantics of the POSIX shell (hereafter just “the shell”) are a bit of a mutt, combining some features of LISP (s-expressions have a lot in common with shell word splitting) and C (much of the shell's arithmetic syntax and semantics comes from C).
The other root of the shell's syntax is its upbringing as a mishmash of individual UNIX utilities. Most of what are usually builtins in the shell can actually be implemented as external commands. It throws many shell neophytes for a loop when they realize that /bin/[ exists on many systems.
$ if '/bin/[' -f '/bin/[' ']'; then echo t; fi
t
Wat?
This makes a lot more sense if you look at how a shell is implemented. Here is an implementation I wrote as an exercise. It is in Python, but I hope that is not a hangup for anyone. It is not terribly robust, but it is instructive:
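#!/usr/bin/env python
'''Hacky barebones shell.'''
from __future__ import print_function
import os, sys

try:
    input = raw_input  # Python 2 spells it raw_input
except NameError:
    pass               # Python 3 already has input

def main():
    while True:
        cmd = input('prompt> ')
        args = cmd.split()          # crude stand-in for word splitting
        if not args:
            continue
        cpid = os.fork()
        if cpid == 0:
            # Child: replace this process with the command (assumed completion).
            os.execvp(args[0], args)
        else:
            # Parent: wait for the child to finish (assumed completion).
            os.waitpid(cpid, 0)

if __name__ == '__main__':
    main()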
I hope that the above makes it clear that the shell execution model is pretty much:
1. Expand words.
2. Assume the first word is a command.
3. Execute that command with the following words as arguments.
Expansion, command resolution, execution. All of the shell's semantics are bound up in one of these three things, although they are far richer than the implementation above.
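The middle step, deciding what the first word actually refers to, can be poked at from inside bash with the type builtin. This is only a small sketch: classify and greet are names made up for the demo, and the last line assumes an sh binary is somewhere on PATH.

```shell
#!/usr/bin/env bash
# classify is a hypothetical helper: print how bash would resolve each word.
classify() {
  local word kind
  for word in "$@"; do
    kind=$(type -t "$word") || kind='not found'
    printf '%s: %s\n' "$word" "$kind"
  done
}

greet() { echo hello; }   # a shell function, defined only for this demo

classify if cd greet      # a keyword, a builtin, and a function
classify sh               # normally an external file found via PATH
```

Keywords, functions, builtins, and files on disk all look the same at the call site, which is part of why the /bin/[ surprise is possible.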
Not all commands fork. In fact, there are a handful of commands that do not make much sense implemented as externals (since they would then have to fork), but even those are often available as externals for strict POSIX compliance.
Bash builds on this base, adding new features and keywords to enhance the POSIX shell. It is nearly compatible with sh, and bash is so ubiquitous that some script authors go years without realizing that a script may not actually work on a strictly POSIX system. (I also wonder how people can care so much about the semantics and style of one programming language and so little about the semantics and style of the shell, but I digress.)
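cd is the classic case: a child process cannot change the working directory of its parent, so a forked cd has no effect on the shell that started it. A minimal sketch, using a subshell to stand in for the forked child (the variable names are illustrative):

```shell
#!/usr/bin/env bash
start=$PWD

( cd / )                 # subshell: a forked copy of the shell changes directory
after_subshell=$PWD      # the parent has not moved

cd /                     # builtin: runs inside the current shell process
after_builtin=$PWD       # now the shell itself is in /

cd "$start"              # go back to where we began
printf 'after subshell: %s\nafter builtin:  %s\n' "$after_subshell" "$after_builtin"
```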
Order of Evaluation
This is a bit of a trick question: bash interprets expressions in its primary syntax from left to right, but in its arithmetic syntax it follows C precedence. Expressions are different from expansions, though. From the EXPANSION section of the bash manual:
The order of expansions is: brace expansion; tilde expansion, parameter and variable expansion, arithmetic expansion, and command substitution (done in a left-to-right fashion); word splitting; and pathname expansion.
If you understand word splitting, pathname expansion, and parameter expansion, you are well on your way to understanding most of what bash does. Note that pathname expansion coming after word splitting is critical, because it ensures that a file with whitespace in its name can still be matched by a glob. That is why good use of glob expansions is better than parsing command output in general.
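Two consequences of that ordering are easy to check in a scratch directory. The sketch below is illustrative only; the file name and variable names are invented for the demo.

```shell
#!/usr/bin/env bash
# Work in a scratch directory so the demo does not touch real files.
tmp=$(mktemp -d)
touch "$tmp/two words.txt"

# Pathname expansion runs after word splitting, so the match stays one word.
globbed=( "$tmp"/*.txt )

# Command substitution output is word-split, so the same name breaks in two.
parsed=( $(ls "$tmp") )

# Brace expansion runs before parameter expansion, so $n arrives too late
# to form a sequence and the braces are left as literal text.
n=3
braces=$(echo {1..$n})

printf 'glob: %d word(s)\nls:   %d word(s)\nbraces: %s\n' \
  "${#globbed[@]}" "${#parsed[@]}" "$braces"

rm -r "$tmp"
```

The glob yields one word, the parsed ls output yields two, and the brace expression comes out as the literal text {1..3}.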
Scope
Function Scope
Much like old ECMAScript, the shell has dynamic scope unless you explicitly declare names within a function.
$ foo() { echo $x; }
$ bar() { local x; echo $x; }
$ foo

$ bar

$ x=123
$ foo
123
$ bar

$ …
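Dynamic scope also means that a function sees the locals of whoever called it, not just globals. A small sketch with made-up function names:

```shell
#!/usr/bin/env bash
x=global

show() { echo "$x"; }        # no local x here: the name is looked up at call time

call_with_local() {
  local x=local-to-caller
  show                       # show sees the caller's local, not the global
}

direct=$(show)
nested=$(call_with_local)
printf 'direct: %s\nnested: %s\n' "$direct" "$nested"
```

Called directly, show prints global; called through call_with_local, it prints local-to-caller.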
Environment and Process Scope
Subshells inherit the variables of their parent shells, but other kinds of processes do not inherit unexported names.
$ x=123
$ ( echo $x )
123
$ bash -c 'echo $x'

$ export x
$ bash -c 'echo $x'
123
$ y=123 bash -c 'echo $y'
123
You can combine these rules:
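$ foo() {
>   local -x bar=123   # exported, but only within this function's scope
>   bash -c 'echo $bar'
> }
Calling foo prints 123 from the child bash, while bar remains unset in the calling shell once foo returns.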
Typing Discipline
Um, types. Yeah. Bash really has no types, and everything expands to a string (or perhaps "word" would be more appropriate). But let's examine the different types of expansions.
Strings
Pretty much anything can be treated as a string. Barewords in bash are strings whose meaning depends entirely on the expansion applied to them.
No Expansion
It may be worthwhile to demonstrate that a bare word is really just a word, and that quotes do not change anything about it.
$ echo foo
foo
$ 'echo' foo
foo
$ "echo" foo
foo
Substring Expansion
$ fail='echoes'
$ set -x
For more on expansions, read the Parameter Expansion section of the manual. It is quite powerful.
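As a rough illustration of substring expansion and a couple of its relatives, here is a sketch using the same fail variable. The other variable names are invented, and the final line only shows that the result of an expansion can sit in command position.

```shell
#!/usr/bin/env bash
fail='echoes'

prefix=${fail:0:4}      # offset 0, length 4
suffix=${fail:4}        # from offset 4 to the end
trimmed=${fail%es}      # drop the shortest trailing match of the pattern "es"
length=${#fail}         # length in characters

printf '%s %s %s %s\n' "$prefix" "$suffix" "$trimmed" "$length"

# The expanded word is just a word, so it can name the command to run.
greeting=$("${fail:0:4}" Hello World)
printf '%s\n' "$greeting"
```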
Integers and Arithmetic Expressions
You can imbue names with the integer attribute to tell the shell to treat the right-hand side of assignment expressions as arithmetic. From then on, each assignment is evaluated as integer math, and the parameter still expands to ... a string.
$ foo=10+10
$ echo $foo
10+10
$ declare -i foo
$ foo=$foo
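The whole sequence can be traced in a script. This is a sketch with illustrative variable names; it relies on declare -i changing how later assignments are handled rather than rewriting the value already stored.

```shell
#!/usr/bin/env bash
foo=10+10
before=$foo             # a plain string: 10+10

declare -i foo          # the attribute affects later assignments only
still=$foo              # the stored value is untouched: 10+10

foo=$foo                # re-assigning evaluates the right side as arithmetic
after=$foo              # 20

first=${foo:0:1}        # the expansion is still a string, so slicing works: 2

printf '%s %s %s %s\n' "$before" "$still" "$after" "$first"
```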
Arrays
Arguments and Positional Parameters
Before talking about arrays, it might be worth discussing positional parameters. The arguments to a shell script can be accessed using the numbered parameters $1, $2, $3, etc. You can access all of these parameters at once using "$@", whose expansion has a lot in common with arrays. You can set and change the positional parameters using the set or shift builtins, or simply by invoking the shell or a shell function with these parameters:
$ bash -c 'for ((i=1;i<=$#;i++)); do
>   printf "\$%d => %s\n" "$i" "${@:i:1}"
> done' -- foo bar baz
$1 => foo
$2 => bar
$3 => baz
$ showpp() {
>   local i
>   for ((i=1;i<=$#;i++)); do
>     printf '$%d => %s\n' "$i" "${@:i:1}"
>   done
> }
$ showpp foo bar baz
$1 => foo
$2 => bar
$3 => baz
$ showshift() {
>   shift 3
>   showpp "$@"
> }
$ showshift foo bar baz biz quux xyzzy
$1 => biz
$2 => quux
$3 => xyzzy
The bash manual also sometimes refers to $0 as a positional parameter. I find this confusing because it does not include it in the argument count $#, but it is a numbered parameter, so meh. $0 is the name of the shell or current shell script.
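The set builtin mentioned earlier can replace the positional parameters of the current shell directly. A short sketch (the variable names are illustrative):

```shell
#!/usr/bin/env bash
set -- foo bar baz            # replace the positional parameters
count_before=$#               # 3
first_before=$1               # foo

shift                         # drop $1; everything else moves down one slot
count_after=$#                # 2
first_after=$1                # bar

set -- "$@" 'two words'       # append an argument, keeping each word intact
last=${@: -1}                 # slice out the final parameter

printf '%s %s %s %s <%s>\n' \
  "$count_before" "$first_before" "$count_after" "$first_after" "$last"
```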
Arrays
The syntax of arrays is modeled after positional parameters, so it is mostly healthy to think of arrays as a named kind of "external positional parameters", if you like. Arrays can be declared using the following approaches:
$ foo=( element0 element1 element2 )
$ bar[3]=element3
$ baz=( [12]=element12 [0]=element0 )
You can access array elements by index:
$ echo "${foo[1]}"
element1
You can slice arrays:
$ printf '"%s"\n' "${foo[@]:1}"
"element1"
"element2"
If you treat an array as a normal parameter, you get the element at index zero.
$ echo "$baz"
element0
$ echo "$bar"

If you use quotes or backslashes to prevent word splitting, the array keeps the words exactly as you specified them:
$ foo=( 'elementa bc' 'def' )
$ echo "${#foo[@]}"
2
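How the array is expanded matters just as much as how it is declared. The sketch below counts the words each form produces; count is a helper invented for the demo.

```shell
#!/usr/bin/env bash
foo=( 'elementa bc' 'def' )

count() { echo "$#"; }             # hypothetical helper: how many words arrived?

quoted_at=$(count "${foo[@]}")     # 2: one word per element
unquoted=$(count ${foo[@]})        # 3: elements are split again on whitespace
quoted_star=$(count "${foo[*]}")   # 1: all elements joined into a single word

printf '%s %s %s\n' "$quoted_at" "$unquoted" "$quoted_star"
```

Only the quoted "${foo[@]}" form keeps the element boundaries intact, the same way "$@" does for positional parameters.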
The main differences between arrays and positional parameters are:
- Positional parameters are not sparse. If $12 is set, you can be sure that $11 is set too. (It may be set to the empty string, but $# will not be less than 12.) If "${arr[12]}" is set, there is no guarantee that "${arr[11]}" is set, and the length of the array could be as small as 1.
- The zeroth element of an array is unambiguously the zeroth element of that array. In positional parameters, the zeroth element is not the first argument but the name of the shell or shell script.
- To shift an array, you have to slice and reassign it, for example arr=( "${arr[@]:1}" ). You could also do unset arr[0], but that would leave the first element at index 1.
- Arrays can be shared implicitly between shell functions as globals, but you have to pass positional parameters to a shell function explicitly for it to see them.
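The sparseness and shifting points are easy to see by listing an array's indices with ${!name[*]}. A sketch with invented array names:

```shell
#!/usr/bin/env bash
arr=( [12]=twelve )
length=${#arr[@]}                 # 1: only one element is set
indices="${!arr[*]}"              # 12

letters=( a b c )
unset 'letters[0]'                # removes the element but leaves a hole
after_unset="${!letters[*]}"      # 1 2

letters=( "${letters[@]}" )       # reassigning packs the elements from index 0
after_repack="${!letters[*]}"     # 0 1

set -- a b c
shift                             # positional parameters renumber on their own
first_param=$1                    # b

printf 'length=%s indices=%s\n' "$length" "$indices"
printf 'after unset: %s, after repack: %s\n' "$after_unset" "$after_repack"
printf 'first positional parameter after shift: %s\n' "$first_param"
```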
It is often convenient to use pathname expansion to create arrays of file names:
$ dirs=( */ )
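One thing to watch for is a glob that matches nothing: by default bash leaves the pattern in place as a literal word. The sketch below works in a scratch directory with invented names and uses the nullglob shell option to get an empty array instead.

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)
mkdir "$tmp/plain" "$tmp/with space"
touch "$tmp/a-file"

shopt -s nullglob            # an unmatched glob expands to nothing, not itself
dirs=( "$tmp"/*/ )           # trailing slash: match directories only
none=( "$tmp"/*.missing )    # nothing matches this pattern
shopt -u nullglob

printf 'dirs: %d, none: %d\n' "${#dirs[@]}" "${#none[@]}"
for d in "${dirs[@]}"; do
  printf '<%s>\n' "${d#"$tmp"/}"   # strip the scratch prefix for display
done

rm -r "$tmp"
```

The directory with a space in its name arrives as a single array element, which is the payoff of pathname expansion happening after word splitting.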
Commands
Commands are key, but they are also covered in better depth than I can manage by the manual. Read the SHELL GRAMMAR section. The different kinds of commands are:
- Simple commands (e.g. $ startx)
- Pipelines (e.g. $ yes | make config) (lol)
- Lists (e.g. $ grep -qF foo file && sed 's/foo/bar/' file > newfile)
- Compound commands (e.g. $ ( cd -P /var/www/webroot && echo "webroot is $PWD" ))
- Coprocesses (complex, no example)
- Functions (a named compound command that can be treated as a simple command)
Execution model
Of course, the execution model involves both a heap and a stack. This is endemic to all UNIX programs. Bash also has a call stack for shell functions, visible through nested use of the caller builtin.
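A minimal sketch of walking that call stack follows. The function names trace, inner, and outer are invented for the demo; caller N prints the line number, function name, and source file for frame N, and returns non-zero once N runs past the bottom of the stack.

```shell
#!/usr/bin/env bash
# Walk the call stack from the innermost frame outward.
trace() {
  local frame=0
  while caller "$frame"; do
    frame=$((frame + 1))
  done
}

inner() { trace; }
outer() { inner; }

stack=$(outer)
printf '%s\n' "$stack"
```

Each output line names one calling frame, starting with inner and then outer; the exact line numbers and file name depend on how the script is run.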
Please comment if you want me to expand further in a certain direction.