Semantics for bash scripts?

Question

More than any other language I know, I have "learned" Bash by Googling every time I need some little thing. As a result, I can patch together small scripts that appear to work. However, I don't really know what is going on, and I was hoping for a more formal introduction to Bash as a programming language. For example: What is the evaluation order? What are the scoping rules? What is the typing discipline, e.g. is everything a string? What is the state of the program: is it a key-value assignment of strings to variable names, or is there more than that, e.g. a stack? Is there a heap? And so on.

I thought to consult the GNU Bash manual for this kind of insight, but it doesn't seem to be what I want; it is more a laundry list of syntactic sugar than an explanation of the core semantic model. The countless "bash tutorials" online are only worse. Perhaps I should first study sh, and understand Bash as syntactic sugar on top of it? I don't know whether that is an accurate model, though.

Any suggestions?

EDIT: I have been asked to give examples of what I am ideally looking for. A rather extreme example of what I would consider a "formal semantics" is the paper on "the essence of JavaScript". Perhaps a slightly less formal example is the Haskell 2010 report.

+82

bash formal-semantics

jameshfisher Apr 21 '14 at 22:32

3 answers

To answer your question "What is the typing discipline, e.g. is everything a string": Bash variables are character strings. However, Bash permits arithmetic operations and comparisons on variables when their values are integers. The exception to the rule that Bash variables are character strings is when those variables are typeset or declared otherwise.

    $ A=10/2
    $ echo "A = $A"           # Variable A acts like a string.
    A = 10/2

    $ B=1
    $ let B="$B+1"            # let is a bash builtin.
    $ echo "B = $B"           # B behaved as an integer: one was added.
    B = 2

    $ A=1024                  # A defaults to a string.
    $ B=${A/24/STRING01}      # Substitute "24" with "STRING01".
    $ echo "B = $B"           # B is a string.
    B = 10STRING01

    $ B=${A/24/STRING01}      # Substitute "24" with "STRING01".
    $ declare -i B
    $ echo "B = $B"           # Declaring a variable that holds non-integers doesn't change its contents.
    B = 10STRING01

    $ B=${B/STRING01/24}      # Substitute "STRING01" with "24".
    $ echo "B = $B"
    B = 1024

    $ declare -i B=10/2       # Declare B and assign it an integer value.
    $ echo "B = $B"           # Variable B behaves as an integer.
    B = 5

Meanings of the declare options:

-a The variable is an array.
-f Use function names only.
-i The variable is to be treated as an integer; arithmetic evaluation is performed when the variable is assigned a value.
-p Display the attributes and values of each variable. When -p is used, additional options are ignored.
-r Make variables read-only. These variables cannot then be assigned values by subsequent assignment statements, nor can they be unset.
-t Give each variable the trace attribute.
-x Mark each variable for export to subsequent commands via the environment.
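As a rough illustration of the attributes listed above, the following sketch sets a few of them and prints the effect. The variable names (count, limit, items, greeting) are invented for this example and do not come from the answer.

```shell
#!/usr/bin/env bash
# Hypothetical variable names, chosen only for this illustration.

declare -i count=10/2        # -i: the right-hand side is evaluated arithmetically
echo "count = $count"        # count = 5

declare -r limit=3           # -r: read-only
( limit=4 ) 2>/dev/null || echo "limit is read-only"

declare -a items=(one two)   # -a: indexed array
echo "items has ${#items[@]} elements"

declare -x greeting=hello    # -x: exported to child processes
bash -c 'echo "child sees: $greeting"'

declare -p count             # -p: print the attributes and value of count
```

The read-only assignment is attempted inside a subshell so that the failed assignment cannot affect the rest of the script.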

+5

Keith Reynolds Apr 22

The bash manpage has quite a bit more information than most manpages, and includes some of what you are asking for. My assumption, after more than a decade of scripting in bash, is that because of its history as an extension of sh, it has some funky syntax (to maintain backward compatibility with sh).

FWIW, my experience has been like yours; although various books (for example, O'Reilly's "Learning the bash Shell") do help with the syntax, there are many strange ways of solving various problems, and some of them are not in the books and have to be googled.

+1

philwalk Apr 21 '14 at 23:59

kojiro · Accepted Answer · 2014-04-21 23:59

A shell is an interface to the operating system. It is usually a more or less robust programming language in its own right, but with features designed to make it easy to interact specifically with the operating system and filesystem. The semantics of the POSIX shell (hereafter referred to simply as "the shell") are a bit of a mutt, combining some features of LISP (s-expressions have a lot in common with shell word splitting) and C (much of the shell's arithmetic syntax and semantics comes from C).

The other root of the shell's syntax is its upbringing as a mishmash of individual UNIX utilities. Most of what are often builtins in the shell can actually be implemented as external commands. It throws many shell neophytes for a loop when they realize that /bin/[ exists on many systems.

    $ if '/bin/[' -f '/bin/['; then echo t; fi # Tested as-is on OS X, without the `]`
    t

Wat?

This makes much more sense if you look at how a shell is implemented. Here is an implementation I wrote as an exercise. It is in Python, but I hope that is not a hangup for anyone. It is not terribly robust, but it is instructive:

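    #!/usr/bin/env python
    '''Hacky barebones shell.'''

    from __future__ import print_function
    import os

    # Python 2 compatibility: use raw_input where it exists.
    try:
        input = raw_input
    except NameError:
        pass

    def main():
        while True:
            cmd = input('prompt> ')
            args = cmd.split()          # crude word splitting on whitespace
            if not args:
                continue
            cpid = os.fork()
            if cpid == 0:
                # We're in a child process: replace it with the command.
                # No PATH lookup is done, so args[0] must be a full path.
                os.execl(args[0], *args)
            else:
                # We're in the parent: wait for the child to finish.
                os.waitpid(cpid, 0)

    if __name__ == '__main__':
        main()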

I hope the above makes it clear that the execution model of a shell is pretty much:

    1. Expand words.
    2. Assume the first word is a command.
    3. Execute that command with the following words as arguments.

Expansion, command resolution, execution. All of the shell's semantics are bound up in one of these three things, although they are far richer than the implementation I wrote above.
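A small sketch of those three steps, under the assumption that the command name and its arguments are held in variables: the shell expands the words first, and only then treats the first resulting word as the command to run. The names verb and args are made up for this example.

```shell
#!/usr/bin/env bash
# A command line is just words after expansion: the first word names the
# command, the rest are its arguments. The names below are hypothetical.

verb=printf
args=('%s-%s\n' left right)

# Expansion happens first; only then does the shell look up "printf"
# and run it with the remaining words as arguments.
"$verb" "${args[@]}"          # prints: left-right
```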

Not all commands fork. In fact, there are a handful of commands that don't make a ton of sense implemented as externals (because they would have to fork), but even those are often available as externals for strict POSIX compliance.
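One way to see why some commands only make sense as builtins is cd: it has to change the working directory of the shell itself, and a forked child cannot do that for its parent. The sketch below imitates the forked case with a subshell; the variable name start is just for illustration.

```shell
#!/usr/bin/env bash
# cd has to change the current shell's own working directory, so it only
# makes sense as a builtin. Running it in a subshell (a forked child)
# shows why: the change is lost when the child exits.

start=$PWD
( cd / && echo "inside subshell: $PWD" )
echo "after subshell:  $PWD"      # unchanged

# type -t reports how bash resolves a word used as a command.
type -t cd        # builtin
type -t if        # keyword
```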

Bash builds on this base by adding new features and keywords to enhance the POSIX shell. It is nearly compatible with sh, and bash is so ubiquitous that some script authors go for years without realizing that a script may not actually work on a strictly POSIX system. (I also wonder how people can care so much about the semantics and style of one programming language and so little about the semantics and style of the shell, but I digress.)

Order of evaluation

This is a bit of a trick question: bash interprets expressions in its primary syntax from left to right, but in its arithmetic syntax it follows C precedence. Expressions differ from expansions, though. From the EXPANSION section of the bash manual:

"The order of expansions is: brace expansion; tilde expansion, parameter and variable expansion, arithmetic expansion, and command substitution (done in a left-to-right fashion); word splitting; and pathname expansion."

If you understand word splitting, pathname expansion, and parameter expansion, you are well on your way to understanding most of what bash does. Note that it is critical that pathname expansion comes after word splitting, because that ensures a file with whitespace in its name can still be matched by a glob. This is why good use of glob expansions is, in general, better than parsing the output of commands.
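To make the ordering concrete, here is a minimal sketch that compares a glob with parsed command output in a throwaway directory. The file names and the variables scratch, globbed, and parsed are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Work in a scratch directory so nothing outside it is touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch 'two words.txt' plain.txt

# Pathname expansion runs after word splitting, so each match stays one word.
globbed=( *.txt )
echo "glob:   ${#globbed[@]} words"      # 2

# Command substitution output is word-split, so the space breaks a name.
parsed=( $(ls) )
echo "parsed: ${#parsed[@]} words"       # 3

cd / && rm -rf "$scratch"
```

The glob yields one word per file, while the unquoted command substitution yields one word per whitespace-separated token.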

Scope

Function scope

Much like old ECMAScript, the shell has dynamic scope unless you explicitly declare names as local within a function.

    $ foo() { echo $x; }
    $ bar() { local x; echo $x; }
    $ foo

    $ bar

    $ x=123
    $ foo
    123
    $ bar

    $ …
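Dynamic scope also means that a function sees the local variables of whichever function called it. The following sketch shows this with two made-up functions, show and outer.

```shell
#!/usr/bin/env bash
# Function names here are hypothetical.
show() { echo "x is ${x-unset}"; }

outer() {
  local x=from-outer
  show              # show sees outer's local: dynamic, not lexical, scope
}

x=global
show                # x is global
outer               # x is from-outer
show                # x is global again; outer's local is gone
```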

Environment and process "scope"

Subshells inherit the variables of their parent shells, but other kinds of processes do not inherit unexported names.

    $ x=123
    $ ( echo $x )
    123
    $ bash -c 'echo $x'

    $ export x
    $ bash -c 'echo $x'
    123
    $ y=123 bash -c 'echo $y' # another way to transiently export a name
    123

You can combine these scoping rules:

    $ foo() {
    >   local -x bar=123 # Export bar, but only in this scope
    >   bash -c 'echo $bar'
    > }
    $ foo
    123
    $ echo $bar

    $
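Inheritance only goes one way: a subshell gets a copy of its parent's variables, and changes made in the copy do not travel back. A minimal sketch, assuming bash's default settings (in particular, the lastpipe option is off); the names counter and piped are invented for the example.

```shell
#!/usr/bin/env bash
counter=1
( counter=2; echo "in subshell: $counter" )   # 2
echo "in parent:   $counter"                  # still 1

# By default bash runs each part of a pipeline in a subshell too,
# so a variable set by `read` at the end of a pipe is lost.
echo 5 | read -r piped
echo "piped is '${piped-}'"                   # empty
```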

Typing discipline

Um, types. Yeah. Bash really doesn't have types, and everything expands to a string (or perhaps "word" would be more appropriate). But let's examine the different kinds of expansions.

Strings

Pretty much anything can be treated as a string. Barewords in bash are strings whose meaning depends entirely on the expansion applied to them.

No expansion

It may be worthwhile to demonstrate that a bare word really is just a word, and that quotes change nothing about that.

    $ echo foo
    foo
    $ 'echo' foo
    foo
    $ "echo" foo
    foo

Substring expansion

    $ fail='echoes'
    $ set -x # So we can see what's going on
    $ "${fail:0:-2}" Hello World
    + echo Hello World
    Hello World

For more on expansions, read the Parameter Expansion section of the manual. It is quite powerful.

Integers and Arithmetic Expressions

You can imbue names with the integer attribute to tell the shell to treat the right-hand side of assignment expressions as arithmetic. Then, when the parameter expands, it will have been evaluated as integer math before expanding to ... a string.

    $ foo=10+10
    $ echo $foo
    10+10
    $ declare -i foo
    $ foo=$foo # Must re-evaluate the assignment
    $ echo $foo
    20
    $ echo "${foo:0:1}" # Still just a string
    2
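Arithmetic expansion with $(( )) behaves the same way without needing the integer attribute: the expression is evaluated with C-style precedence and the result is substituted as a string. A short sketch, with result and plain as made-up names:

```shell
#!/usr/bin/env bash
# $(( )) evaluates with C operator precedence, then expands to a string.
result=$(( 2 + 3 * 4 ))
echo "$result"              # 14, not 20
echo "${#result}"           # 2: the length of the string "14"

# Integer division truncates, as in C.
echo $(( 7 / 2 ))           # 3

# Without the integer attribute or $(( )), the same text is just text.
plain=2+3*4
echo "$plain"               # 2+3*4
```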

Arrays

Arguments and Positional Parameters

Before talking about arrays, it might be worth discussing positional parameters. The arguments to a shell script can be accessed using the numbered parameters $1, $2, $3, etc. You can access all of these parameters at once using "$@", whose expansion has a lot in common with arrays. You can set and change the positional parameters using the set or shift builtins, or simply by invoking the shell or a shell function with these parameters:

    $ bash -c 'for ((i=1;i<=$#;i++)); do
    >   printf "\$%d => %s\n" "$i" "${@:i:1}"
    > done' -- foo bar baz
    $1 => foo
    $2 => bar
    $3 => baz
    $ showpp() {
    >   local i
    >   for ((i=1;i<=$#;i++)); do
    >     printf '$%d => %s\n' "$i" "${@:i:1}"
    >   done
    > }
    $ showpp foo bar baz
    $1 => foo
    $2 => bar
    $3 => baz
    $ showshift() {
    >   shift 3
    >   showpp "$@"
    > }
    $ showshift foo bar baz biz quux xyzzy
    $1 => biz
    $2 => quux
    $3 => xyzzy

The bash manual also sometimes refers to $0 as a positional parameter. I find this confusing because it does not include it in the argument count $# , but it is a numbered parameter, so meh. $0 is the name of the shell or current shell script.
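The set builtin mentioned above can be sketched as follows: `set --` replaces the positional parameters of the current shell, and shift then drops them from the front. The helper count_args and the variables that record intermediate values are invented for this example.

```shell
#!/usr/bin/env bash
# `set --` replaces the positional parameters of the current shell.
set -- alpha 'beta gamma' delta
echo "$# parameters"              # 3
echo "second: $2"                 # beta gamma

shift                             # drop $1; everything moves down by one
n_after_shift=$#
first_after_shift=$1
echo "$n_after_shift parameters, first: $first_after_shift"

# "$@" keeps each parameter as one word; unquoted $@ is split again.
count_args() { echo "$#"; }
quoted=$(count_args "$@")         # 2
unquoted=$(count_args $@)         # 3
echo "quoted: $quoted, unquoted: $unquoted"
```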

Arrays

The syntax of arrays is modeled after positional parameters, so it is mostly healthy to think of arrays as a named kind of "external positional parameters", if you like. Arrays can be declared using the following approaches:

    $ foo=( element0 element1 element2 )
    $ bar[3]=element3
    $ baz=( [12]=element12 [0]=element0 )

You can access array elements by index:

    $ echo "${foo[1]}"
    element1

You can slice arrays:

    $ printf '"%s"\n' "${foo[@]:1}"
    "element1"
    "element2"

If you treat an array as a normal parameter, you will get the element at index zero.

    $ echo "$baz"
    element0
    $ echo "$bar" # Even if the zeroth index isn't set

    $ …

If you use quotes or backslashes to prevent word splitting, the array will maintain the word splitting you specified:

    $ foo=( 'elementa bc' 'def' )
    $ echo "${#foo[@]}"
    2

The main differences between arrays and positional parameters are:

Positional parameters are not sparse. If $12 is set, you can be sure $11 is set, too. (It could be set to the empty string, but $# will not be smaller than 12.) If "${arr[12]}" is set, there is no guarantee that "${arr[11]}" is set, and the length of the array could be as small as 1.
The zeroth element of an array is unambiguously the zeroth element of that array. In positional parameters, the zeroth element is not the first argument, but the name of the shell or shell script.
To shift an array, you have to slice and reassign it, for example arr=( "${arr[@]:1}" ). You could also do unset arr[0], but that would leave the first element at index 1.
Arrays can be shared implicitly between shell functions as globals, but you have to pass positional parameters to a shell function explicitly for it to see them.
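The sparseness and the effect of unset described in the list above can be checked with a short sketch. The array names arr and letters are placeholders chosen for the example.

```shell
#!/usr/bin/env bash
arr=()
arr[12]=twelve
echo "length:  ${#arr[@]}"        # 1
echo "indices: ${!arr[@]}"        # 12

letters=(a b c)
unset 'letters[0]'                # quoted so the brackets are not treated as a glob
echo "indices after unset: ${!letters[@]}"   # 1 2

letters=( "${letters[@]}" )       # reassigning packs the indices again
echo "indices after repack: ${!letters[@]}"  # 0 1
echo "first element: ${letters[0]}"          # b
```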

It is often convenient to use pathname expansion to create arrays of filenames:

 $ dirs=( */ )
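A slightly fuller sketch of the same idea, run in a throwaway directory: each matched name stays a single array element even if it contains a space. The directory names are invented, and the nullglob part is an addition that shows what happens when nothing matches.

```shell
#!/usr/bin/env bash
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir 'my docs' src

dirs=( */ )
for d in "${dirs[@]}"; do
  printf 'dir: %s\n' "$d"
done

# With no match, a glob is left as the literal pattern unless nullglob is set.
shopt -s nullglob
none=( *.nomatch )
echo "no matches: ${#none[@]} elements"     # 0
shopt -u nullglob

cd / && rm -rf "$scratch"
```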

Commands

Commands are key, but they are also covered in more depth by the manual than I can manage here. Read the SHELL GRAMMAR section. The different kinds of commands are:

Simple commands (e.g. $ startx )
Pipelines (e.g. $ yes | make config ) (lol)
Lists (e.g. $ grep -qF foo file && sed 's/foo/bar/' file > newfile )
Compound commands (e.g. $ ( cd -P /var/www/webroot && echo "webroot is $PWD" ) )
Coprocesses (complex, no example)
Functions (Named compound command that can be thought of as a simple command)
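As a rough sketch of how a few of these kinds fit together, the example below defines a function and uses it in a pipeline, then shows lists and a subshell group. The function shout is hypothetical and exists only for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical function: a named compound command.
shout() { tr '[:lower:]' '[:upper:]'; }

# A function can appear anywhere a simple command can, e.g. in a pipeline.
echo hello | shout                      # HELLO

# Lists: && runs the right side only on success, || only on failure.
true  && echo "ran after success"
false || echo "ran after failure"

# Compound command: a ( ) group runs in a subshell and has one exit status.
( exit 3 ); echo "group exited with $?"   # 3
```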

Execution model

The execution model of course involves both a heap and a stack. This is endemic to all UNIX programs. Bash also has a call stack for shell functions, visible via nested use of the caller builtin.
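A minimal sketch of inspecting that call stack, using the caller builtin together with bash's FUNCNAME array. The functions inner and outer are made up, and the exact line numbers and file name that caller prints depend on how the script is run.

```shell
#!/usr/bin/env bash
# Hypothetical functions used only to build a small call stack.
inner() {
  echo "stack: ${FUNCNAME[0]} <- ${FUNCNAME[1]}"
  local frame=0
  # caller prints "line function file" for each frame until it runs out.
  while caller "$frame"; do
    frame=$((frame + 1))
  done
}
outer() { inner; }

outer
```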

References:

The SHELL GRAMMAR section of the bash manual
XCU Shell Command Language documentation
The Bash Guide on Greycat's wiki
Advanced UNIX Programming

Please comment if you want me to expand further in a certain direction.
