How to sort an array in Bash

I have an array in Bash, for example:

array=(acbf 3 5) 

I need to sort an array. Not just showing the contents in a sorted way, but also getting a new array with sorted elements. A new sorted array can be brand new or old.

+121
sorting arrays bash shell
Sep 16 2018-11-11T00: 00Z
source share
15 answers

You do not need a lot of code:

 IFS=$'\n' sorted=($(sort <<<"${array[*]}")) unset IFS 

Supports spaces in elements (unless it is a newline) and works in Bash 3.x.

eg.:

 $ array=("ac" bf "3 5") $ IFS=$'\n' sorted=($(sort <<<"${array[*]}")); unset IFS $ printf "[%s]\n" "${sorted[@]}" [3 5] [ac] [b] [f] 

Note. @sorontar pointed out that caution should be exercised if elements contain wildcards such as * or ? :

The sorted = ($ (...)) part uses the split and glob operator. You must turn off glob: set -f or set -o noglob or shopt -op noglob , or an array element, for example * , will be expanded to a list of files.

What's happening:

The result is six things that happen in the following order:

  1. IFS=$'\n'
  2. "${array[*]}"
  3. <<<
  4. sort
  5. sorted=($(...))
  6. unset IFS

First, IFS=$'\n'

This is an important part of our operation, which affects results 2 and 5 as follows:

Given:

  • "${array[*]}" expands to each element bounded by the first IFS character
  • sorted=() creates elements by dividing each IFS character

IFS=$'\n' sets up the elements to expand using a new line as a separator, and then create them so that each line becomes an element. (i.e. split on a new line.)

Newline separation is important because sort works like this (sorting by line). Separating only with a new line is not so important, but it is necessary to keep elements containing spaces or tabs.

The default IFS is a space, a tab, followed by a new line, and it is not suitable for our work.

The rest of the sort <<<"${array[*]}"

<<< , called here strings , takes the extension "${array[*]}" as described above, and feeds it to standard sort input.

In our sort example, the following line is passed:

 ac b f 3 5 

Since sort sorts, it produces:

 3 5 ac b f 

Next, the sorted=($(...))

The $(...) , called command substitution , causes its contents ( sort <<<"${array[*]} ) to be run as a normal command, while the standard output received is output as a literal that goes anywhere $(...) .

In our example, this produces something similar to simple spelling:

 sorted=(3 5 ac b f ) 

sorted then becomes an array, which is created by breaking this literal on each new line.

Finally, unset IFS

This resets the IFS value to its default value, and this is just good practice.

This is done so that we do not create problems with anything that relies on IFS later in our script. (Otherwise, we need to remember that we have all changed - which may be impractical for complex scenarios.)

+174
Aug 03 '12 at 5:17
source share

Original answer:

 array=(acb "ff" 3 5) readarray -t sorted < <(for a in "${array[@]}"; do echo "$a"; done | sort) 

exit:

 $ for a in "${sorted[@]}"; do echo "$a"; done 3 5 a b c ff 

Note This version matches values ​​containing special characters or spaces (except for newlines)

Note readarray is supported in bash 4+.




Change Based on the @Dimitre suggestion, I updated it to:

 readarray -t sorted < <(printf '%s\0' "${array[@]}" | sort -z | xargs -0n1) 

which has an advantage even in understanding sorting elements with newline characters that are embedded correctly. Unfortunately, as @ruakh correctly signaled, this did not mean that the result of readarray would be correct, because readarray not able to use NUL instead of regular readarray as line separators.

+34
Sep 16 '11 at 9:23 a.m.
source share

Here's a clean implementation of Bash quicksort:

 #!/bin/bash # quicksorts positional arguments # return is in array qsort_ret qsort() { local pivot i smaller=() larger=() qsort_ret=() (($#==0)) && return 0 pivot=$1 shift for i; do if [[ $i < $pivot ]]; then smaller+=( "$i" ) else larger+=( "$i" ) fi done qsort "${smaller[@]}" smaller=( "${qsort_ret[@]}" ) qsort "${larger[@]}" larger=( "${qsort_ret[@]}" ) qsort_ret=( "${smaller[@]}" "$pivot" "${larger[@]}" ) } 

Use as, for example,

 $ array=(acbf 3 5) $ qsort "${array[@]}" $ declare -p qsort_ret declare -a qsort_ret='([0]="3" [1]="5" [2]="a" [3]="b" [4]="c" [5]="f")' 

This implementation is recursive ... so here's an iterative quicksort:

 #!/bin/bash # quicksorts positional arguments # return is in array qsort_ret # Note: iterative, NOT recursive! :) qsort() { (($#==0)) && return 0 local stack=( 0 $(($#-1)) ) beg end i pivot smaller larger qsort_ret=("$@") while ((${#stack[@]})); do beg=${stack[0]} end=${stack[1]} stack=( "${stack[@]:2}" ) smaller=() larger=() pivot=${qsort_ret[beg]} for ((i=beg+1;i<=end;++i)); do if [[ "${qsort_ret[i]}" < "$pivot" ]]; then smaller+=( "${qsort_ret[i]}" ) else larger+=( "${qsort_ret[i]}" ) fi done qsort_ret=( "${qsort_ret[@]:0:beg}" "${smaller[@]}" "$pivot" "${larger[@]}" "${qsort_ret[@]:end+1}" ) if ((${#smaller[@]}>=2)); then stack+=( "$beg" "$((beg+${#smaller[@]}-1))" ); fi if ((${#larger[@]}>=2)); then stack+=( "$((end-${#larger[@]}+1))" "$end" ); fi done } 

In both cases, you can change the order used: I used string comparisons, but you can use arithmetic comparisons, compare wrt file modification time, etc., just use the appropriate test; you can even make it more general and use the first argument, which is used to use the test function, for example,

 #!/bin/bash # quicksorts positional arguments # return is in array qsort_ret # Note: iterative, NOT recursive! :) # First argument is a function name that takes two arguments and compares them qsort() { (($#<=1)) && return 0 local compare_fun=$1 shift local stack=( 0 $(($#-1)) ) beg end i pivot smaller larger qsort_ret=("$@") while ((${#stack[@]})); do beg=${stack[0]} end=${stack[1]} stack=( "${stack[@]:2}" ) smaller=() larger=() pivot=${qsort_ret[beg]} for ((i=beg+1;i<=end;++i)); do if "$compare_fun" "${qsort_ret[i]}" "$pivot"; then smaller+=( "${qsort_ret[i]}" ) else larger+=( "${qsort_ret[i]}" ) fi done qsort_ret=( "${qsort_ret[@]:0:beg}" "${smaller[@]}" "$pivot" "${larger[@]}" "${qsort_ret[@]:end+1}" ) if ((${#smaller[@]}>=2)); then stack+=( "$beg" "$((beg+${#smaller[@]}-1))" ); fi if ((${#larger[@]}>=2)); then stack+=( "$((end-${#larger[@]}+1))" "$end" ); fi done } 

Then you can use this comparison function:

 compare_mtime() { [[ $1 -nt $2 ]]; } 

and use:

 $ qsort compare_mtime * $ declare -p qsort_ret 

so that the files in the current folder are sorted by modification time (first the newest).

Note. These features are pure bash! no external utilities and no subshells! they are safe for any funny characters you may have (spaces, newlines, globe characters, etc.).

+33
Jun 01 '15 at 14:38
source share

If you do not need to handle special shell characters in array elements:

 array=(acbf 3 5) sorted=($(printf '%s\n' "${array[@]}"|sort)) 

In bash, you'll need an external sorting program.

With zsh, external programs are not required and special shell characters are easily processed:

 % array=('aa' cbf 3 5); printf '%s\n' "${(o)array[@]}" 3 5 aa b c f 

ksh has set -s to sort ASCIIbetically.

+27
Sep 16 '11 at 9:29 a.m.
source share

On a 3-hour train ride from Munich to Frankfurt, which was hard for me to get to, because Oktoberfest starts tomorrow), I thought about my first post. Using a global array is a much better idea for a general sorting function. The following function handles harsh lines (newlines, spaces, etc.):

 declare BSORT=() function bubble_sort() { # # @param [ARGUMENTS]... # # Sort all positional arguments and store them in global array BSORT. # Without arguments sort this array. Return the number of iterations made. # # Bubble sorting lets the heaviest element sink to the bottom. # (($# > 0)) && BSORT=("$@") local j=0 ubound=$((${#BSORT[*]} - 1)) while ((ubound > 0)) do local i=0 while ((i < ubound)) do if [ "${BSORT[$i]}" \> "${BSORT[$((i + 1))]}" ] then local t="${BSORT[$i]}" BSORT[$i]="${BSORT[$((i + 1))]}" BSORT[$((i + 1))]="$t" fi ((++i)) done ((++j)) ((--ubound)) done echo $j } bubble_sort acb 'zy' 3 5 echo ${BSORT[@]} 

Fingerprints:

 3 5 abczy 

The same output is created from

 BSORT=(acb 'zy' 3 5) bubble_sort echo ${BSORT[@]} 

Please note that probably Bash internally uses smart pointers, so a swap operation can be cheap (although I doubt it). However, bubble_sort demonstrates that more advanced features, such as merge_sort , are also available for the shell language.

+8
Sep 16 2018-11-11T00:
source share

tl; dr :

Sorting the a_in array and storing the result in a_out (elements must not have inline newline [1] ):

Bash v4 +:

 readarray -t a_out < <(printf '%s\n' "${a_in[@]}" | sort) 

Bash v3:

 IFS=$'\n' read -d '' -r -a a_out < <(printf '%s\n' "${a_in[@]}" | sort) 

Benefits for Antique Solution :

  • You do not need to worry about random rotation (random interpretation of array elements as file name patterns), therefore, to disable drag and drop ( set -f and set +f to restore later), you do not need an additional command.

  • You do not need to worry about resetting IFS using unset IFS . [2]




Additional reading: explanation and code sample

The above combines Bash with an external sort utility for a solution that works with arbitrary single-line elements and lexical or numerical sorting (optional by field) :

  • Performance . For about 20 elements or more, it will be faster than a pure Bash solution - significantly and more often, when you get about 100 elements.
    (The exact thresholds will depend on your specific input, machine, and platform.)

    • The reason is that it avoids bash loops .
  • printf '%s\n' "${a_in[@]}" | sort printf '%s\n' "${a_in[@]}" | sort performs sorting (lexically, by default - see sort POSIX spec ):

    • "${a_in[@]}" safely expands the elements of the a_in array as separate arguments, regardless of what they contain (including spaces).

    • printf '%s\n' then prints each argument, i.e. each element of the array is on its own line, as it is.

  • Note the use of process substitution ( <(...) ) to provide a sorted result as input to read / readarray (by redirecting to stdin, < ), since read / readarray must execute in the current shell (should not run in a subshell) so that the output variable a_out visible to the current shell (so that the variable remains defined in the rest of the script).

  • Reading sort output into array variable:

    • Bash v4 +: readarray -t a_out reads individual lines readarray -t a_out sort to the elements of the variable of the a_out array, not including the trailing \n in each element ( -t ).

    • Bash v3: readarray does not exist, so read must be used:
      IFS=$'\n' read -d '' -r -a a_out tells read to read the a_out variable into array ( -a ), reading all the input, line by line ( -d '' ), but breaking it into array elements into lines new line ( IFS=$'\n' . $'\n' , which creates a literal new line (LF), is a so-called line with quoting ANSI C ).
      ( -r , the parameter that should always be used with read disables unexpected character processing \ .)

Annotated code example:

 #!/usr/bin/env bash # Define input array `a_in`: # Note the element with embedded whitespace ('a c')and the element that looks like # a glob ('*'), chosen to demonstrate that elements with line-internal whitespace # and glob-like contents are correctly preserved. a_in=( 'ac' bf 5 '*' 10 ) # Sort and store output in array `a_out` # Saving back into `a_in` is also an option. IFS=$'\n' read -d '' -r -a a_out < <(printf '%s\n' "${a_in[@]}" | sort) # Bash 4.x: use the simpler `readarray -t`: # readarray -t a_out < <(printf '%s\n' "${a_in[@]}" | sort) # Print sorted output array, line by line: printf '%s\n' "${a_out[@]}" 

Due to the use of sort without parameters, this gives lexical sorting (numbers are sorted before letters, and digital sequences are processed lexically, and not as numbers):

 * 10 5 ac b f 

If you need digital sorting by 1st field, you should use sort -k1,1n instead of sort , which gives (sorting not numbers for numbers and numbers is sorted correctly):

 * ac b f 5 10 



[1] To process elements with embedded newlines use the following option (Bash v4 +, with GNU sort ):
readarray -d '' -t a_out < <(printf '%s\0' "${a_in[@]}" | sort -z) .
Useful answer Michał Górny contains a solution to Bash v3.

[2] As long as IFS installed in Bash v3, this change applies to the team.
On the contrary, what follows IFS=$'\n' in the answer to antak is the assignment, not the command, in which case the IFS change is global.

+8
Oct 27 '15 at 2:34
source share

Another solution using external sort , and copes with any special characters (except NULs :)). Should work with bash -3.2 and GNU or BSD sort (unfortunately, POSIX does not include -z ).

 local e new_array=() while IFS= read -r -d '' e; do new_array+=( "${e}" ) done < <(printf "%s\0" "${array[@]}" | LC_ALL=C sort -z) 

First look at the input redirection at the end. We use the built-in printf to write zero-terminated array elements. Quoting ensures that the elements of the array are passed as is, and the printf shell specification forces it to reuse the last part of the format string for each remaining parameter. That is, this is equivalent to something like:

 for e in "${array[@]}"; do printf "%s\0" "${e}" done 

A list of null-terminated items is then passed to sort . The -z option forces it to read elements with a null character, sort them, and also output zero-end. If you need to get only unique elements, you can pass -u as it is more portable than uniq -z . LC_ALL=C provides a stable sort order regardless of the locale - sometimes useful for scripts. If you want sort respect the locale, delete it.

The <() construct receives a descriptor for reading from the generated pipeline, and < redirects the standard input of the while . If you need access to the standard inputs inside the channel, you can use a different descriptor - an exercise for the reader :).

Now back to the beginning. The read built-in module reads the output from the redirected stdin. Setting an empty IFS disables word separation, which is not necessary here - as a result, read reads the entire "line" of input into one provided variable. The -r option also disables unwanted processing. Finally, -d '' sets the line separator to NUL, i.e. tells read to read lines with zero termination.

As a result, the loop is executed once for each subsequent element of the array with zero completion, and the value is stored in e . The example simply puts the elements in another array, but you can process them directly :).

Of course, this is only one of many ways to achieve the same goal. As I see it, it's easier than implementing a complete sorting algorithm in bash, and in some cases it will be faster. It handles all special characters, including newlines, and should work on most common systems. Most importantly, it can teach you something new and amazing about bash :).

+6
Nov 22 '14 at 22:58
source share

try the following:

 echo ${array[@]} | awk 'BEGIN{RS=" ";} {print $1}' | sort 

The output will be:

 3
 5
 a
 b
 c
 f

The problem is resolved.

+2
Feb 14 2018-12-12T00:
source share

min sort:

 #!/bin/bash array=(.....) index_of_element1=0 while (( ${index_of_element1} < ${#array[@]} )); do element_1="${array[${index_of_element1}]}" index_of_element2=$((index_of_element1 + 1)) index_of_min=${index_of_element1} min_element="${element_1}" for element_2 in "${array[@]:$((index_of_element1 + 1))}"; do min_element="'printf "%s\n%s" "${min_element}" "${element_2}" | sort | head -n+1'" if [[ "${min_element}" == "${element_2}" ]]; then index_of_min=${index_of_element2} fi let index_of_element2++ done array[${index_of_element1}]="${min_element}" array[${index_of_min}]="${element_1}" let index_of_element1++ done 
+2
Mar 07 '18 at 21:35
source share

If you can calculate a unique integer for each element of the array, for example:

 tab='0123456789abcdefghijklmnopqrstuvwxyz' # build the reversed ordinal map for ((i = 0; i < ${#tab}; i++)); do declare -g ord_${tab:i:1}=$i done function sexy_int() { local sum=0 local i ch ref for ((i = 0; i < ${#1}; i++)); do ch="${1:i:1}" ref="ord_$ch" (( sum += ${!ref} )) done return $sum } sexy_int hello echo "hello -> $?" sexy_int world echo "world -> $?" 

then you can use these integers as indexes of arrays because Bash always uses a sparse array, so no need to worry about unused indexes:

 array=(acbf 3 5) for el in "${array[@]}"; do sexy_int "$el" sorted[$?]="$el" done echo "${sorted[@]}" 
  • Pros. Quickly.
  • Cons. Duplicate elements are combined and it is not possible to match content with 32-bit unique integers.
+1
Apr 03 '13 at 2:42 on
source share
 array=(acbf 3 5) new_array=($(echo "${array[@]}" | sed 's/ /\n/g' | sort)) echo ${new_array[@]} 

The echo content of new_array will be:

 3 5 abcf 
+1
Feb 06 '15 at 18:12
source share

I'm not sure if you need an external sorting program in Bash.

Here is my implementation for a simple bubble sorting algorithm.

 function bubble_sort() { # # Sorts all positional arguments and echoes them back. # # Bubble sorting lets the heaviest (longest) element sink to the bottom. # local array=($@) max=$(($# - 1)) while ((max > 0)) do local i=0 while ((i < max)) do if [ ${array[$i]} \> ${array[$((i + 1))]} ] then local t=${array[$i]} array[$i]=${array[$((i + 1))]} array[$((i + 1))]=$t fi ((i += 1)) done ((max -= 1)) done echo ${array[@]} } array=(acbf 3 5) echo " input: ${array[@]}" echo "output: $(bubble_sort ${array[@]})" 

This will print:

  input: acbf 3 5 output: 3 5 abcf 
0
Sep 16 '11 at 10:27
source share
 a=(eb 'c d') shuf -e "${a[@]}" | sort >/tmp/f mapfile -tg </tmp/f 
0
Oct 23 '15 at 21:13
source share

There is a workaround for the usual spaces and strings problem:

Use a character that is not in the source array (for example, $'\1' or $'\4' or similar).

This function performs the task:

 # Sort an Array may have spaces or newlines with a workaround (wa=$'\4') sortarray(){ local wa=$'\4' IFS='' if [[ $* =~ [$wa] ]]; then echo "$0: error: array contains the workaround char" >&2 exit 1 fi set -f; local IFS=$'\n' x nl=$'\n' set -- $(printf '%s\n' "${@//$nl/$wa}" | sort -n) for x do sorted+=("${x//$wa/$nl}") done } 

This will sort the array:

 $ array=( ab 'cd' $'e\nf' $'g\1h') $ sortarray "${array[@]}" $ printf '<%s>\n' "${sorted[@]}" <a> <b> <c d> <e f> <gh> 

This will complain that the source array contains a workaround character:

 $ array=( ab 'cd' $'e\nf' $'g\4h') $ sortarray "${array[@]}" ./script: error: array contains the workaround char 

Description

  • We set two local variables wa (workaround char) and zero IFS
  • Then (with ifs null) we check that the entire array is $* .
  • Does not contain any problems char [[ $* =~ [$wa] ]] .
  • If so, raise a message and report the error: exit 1
  • Avoid file name extensions: set -f
  • Set the new IFS value ( IFS=$'\n' ) to the loop variable x and the new line var ( nl=$'\n' ).
  • We print all the values ​​of the received arguments (input array $@ ).
  • but we replace any new line in a roundabout way char "${@//$nl/$wa}" .
  • send these values ​​to sort sort -n .
  • and put all the sorted values ​​in the set -- positional arguments.
  • Then we assign each argument one by one (to save new lines).
  • in a for x loop
  • into a new array: sorted+=(…)
  • inside quotation marks to save any existing newline.
  • restoring a workaround to the new line "${x//$wa/$nl}" .
  • done
0
Oct 25 '16 at 3:45
source share

sorted=($(echo ${array[@]} | tr " " "\n" | sort))

In the spirit of bash / linux, I will use the best command line tool for each step. sort performs the main task, but requires input separated by a new line, not space, so a simple laid pipeline just does:

The contents of the Echo array -> replace space with a new line -> sort

$() should reflect the result

($()) is to put the "echo result" in an array

Note : as @sorontar is mentioned in comment to another question:

The sorted = ($ (...)) part uses the split and glob operator. You must disable glob: set -f or set -o noglob or shopt -op noglob, or an array element like * will be expanded to a list of files.

-2
Mar 07 '16 at 19:20
source share



All Articles