How does a conditional expression compare strings?

    #!/usr/bin/env bash

    echo 'Using conditional expression:'
    [[ ' ' < '0' ]] && echo ok || echo not ok
    [[ ' a' < '0a' ]] && echo ok || echo not ok

    echo 'Using test:'
    [ ' ' \< '0' ] && echo ok || echo not ok
    [ ' a' \< '0a' ] && echo ok || echo not ok

Output:

    Using conditional expression:
    ok
    not ok
    Using test:
    ok
    ok

bash --version : GNU bash, version 4.2.45(1)-release (x86_64-pc-linux-gnu)

uname -a : Linux linuxmint 3.8.0-19-generic

3 answers

The Bash manual says:

When used with [[, the < and > operators sort lexicographically using the current locale. The test command sorts using ASCII ordering.

This comes down to using strcoll(3) or strcmp(3), respectively.

Use the following program (strcoll_strcmp.c) to verify this:

    #include <stdio.h>
    #include <string.h>
    #include <locale.h>

    int main(int argc, char **argv)
    {
        setlocale(LC_ALL, "");

        if (argc != 3) {
            fprintf(stderr, "Usage: %s str1 str2\n", argv[0]);
            return 1;
        }

        printf("strcoll('%s', '%s'): %d\n", argv[1], argv[2], strcoll(argv[1], argv[2]));
        printf("strcmp('%s', '%s'): %d\n", argv[1], argv[2], strcmp(argv[1], argv[2]));

        return 0;
    }

Please note the difference:

    $ LC_ALL=C ./strcoll_strcmp ' a' '0a'
    strcoll(' a', '0a'): -16
    strcmp(' a', '0a'): -16
    $ LC_ALL=en_US.UTF-8 ./strcoll_strcmp ' a' '0a'
    strcoll(' a', '0a'): 10
    strcmp(' a', '0a'): -16

Why exactly the strings compare this way in the en_US.UTF-8 locale, I'm not sure. It is probably due to some English lexicographic sorting rules. I think the exact rules are described in ISO 14651 (Method for comparing character strings and description of the common template tailorable ordering) and its accompanying template table. glibc keeps this data in its source tree under libc/localedata/locales.
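The same locale dependence can be checked from within bash itself. The following is only a sketch: `ascii_lt` is a hypothetical helper name, not something bash provides, and it assumes that assigning `LC_ALL` inside a function makes bash switch its collation locale for the rest of that call.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not a bash builtin): compare two strings bytewise
# by forcing the C locale for the duration of the function call.
ascii_lt() {
    # Assumption: bash re-reads the locale when LC_ALL is assigned,
    # so [[ ... < ... ]] below collates in the C locale.
    local LC_ALL=C
    [[ "$1" < "$2" ]]
}

# Same comparison as in the question, now independent of the caller's locale.
if ascii_lt ' a' '0a'; then
    echo ok
else
    echo not ok
fi
```

Running the whole script with `LC_ALL=C bash script` should have a similar effect without a helper, at the cost of changing the locale for everything else in the script.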


The behavior you observe is explained by the following passage from the manual:

 bash-4.1 and later use the current locale's collation sequence and strcoll(3). 

You seem to be looking for an ASCII-based comparison. You can change the behavior by setting either the compat32 or the compat40 shell option.

    $ cat test
    shopt -s compat40
    echo 'Using conditional expression:'
    [[ ' ' < '0' ]] && echo ok || echo not ok
    [[ ' a' < '0a' ]] && echo ok || echo not ok
    echo 'Using test:'
    [ ' ' \< '0' ] && echo ok || echo not ok
    [ ' a' \< '0a' ] && echo ok || echo not ok
    $ bash test
    Using conditional expression:
    ok
    ok
    Using test:
    ok
    ok

From the manual:

    compat32
        If set, Bash changes its behavior to that of version 3.2 with
        respect to locale-specific string comparison when using the '[['
        conditional command's '<' and '>' operators. Bash versions prior
        to bash-4.0 use ASCII collation and strcmp(3); bash-4.1 and later
        use the current locale's collation sequence and strcoll(3).

    compat40
        If set, Bash changes its behavior to that of version 4.0 with
        respect to locale-specific string comparison when using the '[['
        conditional command's '<' and '>' operators (see previous item)
        and the effect of interrupting a command list.

The < operator, when used inside [ ] or [[ ]], compares two strings in ASCII alphabetical order. This means that a is less than b. The caveat is that, since [ ] is a bit tricky and cryptic, you need to escape the <; otherwise bash thinks you want to do a redirection.

However, these two tests are equivalent:

    [ 'a' \< 'b' ] && echo ok
    [[ 'a' < 'b' ]] && echo ok

In your example, ' a' is definitely less than '0a', since the space has a decimal value of 32 (0x20 in hexadecimal) and '0' has a value of 48.
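As a quick check of those character codes, here is a small sketch. `char_code` is a hypothetical helper name; it relies on the printf convention that a numeric argument starting with a quote is replaced by the code of the character that follows.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the numeric code of a single character.
# A leading single quote in a printf numeric argument yields the value
# of the character after it.
char_code() {
    printf '%d\n' "'$1"
}

char_code ' '   # space: prints 32
char_code '0'   # digit zero: prints 48
```

So a bytewise comparison, as done by strcmp(3), puts the space before '0'.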

So, I think you have found a bug there.

 [ ' a' \< '0a' ] 

and

 [[ ' a' < '0a' ]] 

should be equivalent, and the result given by [ ] is the correct one.
