Extract all substrings in bash

Question

Looking for a solution in bash (will be part of a larger script).

Given a variable containing text of the following form:

 diff -r efb93662e8a7 -r 53784895c0f7 diff.txt
 --- diff.txt Fri Jan 23 14:48:30 2009 +0000
 +++ b/diff.txt Fri Jan 23 14:49:58 2009 +0000
 @@ -1,9 +0,0 @@
 -diff -r 9741ec300459 myfile.c
 ---- myfile.c Thu Aug 21 18:22:17 2008 +0000
 -+++ b/myfile.c Thu Aug 21 18:22:17 2008 +0000
 -@@ -1,4 +1,4 @@
 - int myfunc()
 - {
 -- return 1;
 -+ return 10;
 - }

I want to extract both file names (here diff.txt and myfile.c, but future cases will not be limited to two) into a line of the form "edited: filename1 filename2 ... filenameN".

To clarify: I want to extract several matching file names into a single string.

The command "$(expr "$editing" : '.*---[[:space:]]\([[:graph:]]*\)[[:space:]]')" returns the last file name correctly, but not the earlier ones.
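The reason only the last name comes back is that expr anchors the pattern at the start of the string and the leading `.*` is greedy, so it runs up to the final "---". A minimal sketch on a shortened, made-up string (the names `text` and `last` are illustrative, and `[^ ]*` stands in for `[[:graph:]]*`):

```bash
#!/bin/bash
# Made-up, shortened input with two "--- name" markers.
text='--- diff.txt Fri Jan 23 ---- myfile.c Thu Aug 21'

# The pattern is matched from the start of the string; the leading ".*"
# swallows as much as it can, so the group captures the LAST name only.
last=$(expr "$text" : '.*--- \([^ ]*\) ')
echo "$last"   # prints: myfile.c
```

A single expr call can return only one captured group, so collecting every name needs a loop or a line-oriented tool.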

EDIT: The solution needs to identify the edited file names (possibly containing spaces), that is, the names that appear after "---" and before the day of the week ("Fri", "Thu", ...).

Thanks for the help (and to the many people who have answered so far).

+4

bash shell

anon Jan 23 '09 at 18:36

4 answers

I would suggest using an external tool for this. Here is one way with perl:

 $(echo "$variable" | perl -e 'print "edited:"; while (<>) { while (/--- (\S+)/g) { print " $1"; } }')

I'm sure this could be done more elegantly, but right now I can't think of a way that wouldn't take a more substantial program.

+1

David z Jan 23 '09 at 18:57

Here is a simple, working solution:

 txt=$(cat)
 str="edited: "
 for word in $txt; do
     if echo $word | grep -qi '^[a-z0-9-_]*\.[a-z]*$'; then
         str="$str $word"
     fi
 done
 echo $str

Sample run:

 anton@CAPTAIN-FALCON ~/Desktop $ bash sol.sh
 diff -r efb93662e8a7 -r 53784895c0f7 diff.txt
 --- diff.txt Fri Jan 23 14:48:30 2009 +0000
 +++ b/diff.txt Fri Jan 23 14:49:58 2009 +0000
 @@ -1,9 +0,0 @@
 -diff -r 9741ec300459 myfile.c
 ---- myfile.c Thu Aug 21 18:22:17 2008 +0000
 -+++ b/myfile.c Thu Aug 21 18:22:17 2008 +0000
 -@@ -1,4 +1,4 @@
 - int myfunc()
 - {
 -- return 1;
 -+ return 10;
 - }
 edited: diff.txt diff.txt myfile.c myfile.c

Edit: Playing around with grep for a while led to the following script, but I'm starting to wonder whether pure bash is the right tool for the job. There seem to be a lot of corner cases where you either miss some files or pick up erroneous file names.

 #!/bin/bash
 rawFiles=`cat | grep -ioz ' -* [a-z0-9-_\ ]*\.[a-z]*'`
 for file in $rawFiles; do
     if ! echo $file | grep -q '^-*$'; then
         files="$files${file} "
     fi
 done
 echo "edited: $files"

+1

user50264 Jan 23 '09 at 19:18

Could you do your processing before the text is stored in the variable? Then you might still have the line breaks.

With line breaks intact, some sed could probably extract the file names.
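As a rough sketch of that idea, assuming the diff text still has its line breaks and every header looks like "--- NAME", whitespace, then a day of the week: the function name `edited_files` and the `sample` variable are made up for illustration, and `sed -E` (extended regular expressions) is assumed to be available, as it is in GNU and BSD sed.

```bash
#!/bin/bash
# Print "edited: name1 name2 ..." for diff text read from stdin.
# Assumes one header per line: optional extra "-" characters (a diff of a
# diff), then "--- NAME", whitespace, and a day-of-week abbreviation.
edited_files() {
    local names
    names=$(sed -n -E 's/^-*--- (.*)[[:space:]]+(Mon|Tue|Wed|Thu|Fri|Sat|Sun) .*$/\1/p' |
            paste -sd ' ' -)
    echo "edited: $names"
}

sample='diff -r efb93662e8a7 -r 53784895c0f7 diff.txt
--- diff.txt Fri Jan 23 14:48:30 2009 +0000
+++ b/diff.txt Fri Jan 23 14:49:58 2009 +0000
@@ -1,9 +0,0 @@
-diff -r 9741ec300459 myfile.c
---- myfile.c Thu Aug 21 18:22:17 2008 +0000
-+++ b/myfile.c Thu Aug 21 18:22:17 2008 +0000'

printf '%s\n' "$sample" | edited_files   # prints: edited: diff.txt myfile.c
```

Because the name is everything between "--- " and the day of the week, a name containing spaces is captured whole, although it can no longer be told apart from two names once everything is joined with spaces.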

0

Douglas leeder Jan 23 '09 at 18:46

Colas nahaboo · Accepted Answer · 2009-01-23T22:44:29+0000

A solution using only bash built-ins, no external programs:

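 res="edited:"
 var="${var#*--- } --- "     # trim up to the first "--- ", append a sentinel
 while test -n "$var"; do
     res="$res ${var%% *}"   # first word of the remainder is a file name
     var="${var#*--- }"      # jump past the next "--- "
 done
 echo "$res"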

It iterates over all occurrences of "---". The trick is to prepare the string by first trimming the garbage at the beginning (up to the first "---") and appending a "---" at the end, so that the logic inside the while loop stays simple.
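To make the role of the appended "---" concrete, here is the same loop wrapped in a function and run on a shortened, made-up input. The function name `list_edited` and the sample string are illustrative only, and the pattern is written as `*--- ` with no space required before the dashes, an assumption made so that the nested "---- myfile.c" header is found as well.

```bash
#!/bin/bash
# Collect the word after every "--- " in the given string.
list_edited() {
    local var=$1 res="edited:"
    # Drop everything up to the first "--- ", then append a sentinel "--- ".
    var="${var#*--- } --- "
    while test -n "$var"; do
        res="$res ${var%% *}"   # keep the first word of what is left
        # Skip past the next "--- "; on the last pass this hits the
        # sentinel and leaves var empty, which ends the loop.
        var="${var#*--- }"
    done
    echo "$res"
}

sample='diff -r 1 -r 2 diff.txt --- diff.txt Fri Jan 23 +++ b/diff.txt Fri Jan 23 -diff -r 3 myfile.c ---- myfile.c Thu Aug 21'
list_edited "$sample"   # prints: edited: diff.txt myfile.c
```

Without the sentinel, the last expansion would find nothing to trim and leave var unchanged, so the loop would never end. Note that `${var%% *}` cuts at the first space, so this sketch truncates file names that contain spaces.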

For me this is the most useful feature of bash: trimming strings with the # and % operators.
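For reference, a small sketch of the four trimming forms on a made-up string (the variable names are illustrative):

```bash
#!/bin/bash
# Made-up sample string with two "--- " markers.
s='--- diff.txt Fri --- myfile.c Thu'

shortest_prefix=${s#*--- }    # "#"  removes the shortest matching prefix
longest_prefix=${s##*--- }    # "##" removes the longest matching prefix
shortest_suffix=${s% *}       # "%"  removes the shortest matching suffix
longest_suffix=${s%% *}       # "%%" removes the longest matching suffix

printf '%s\n' "$shortest_prefix" "$longest_prefix" "$shortest_suffix" "$longest_suffix"
# prints:
#   diff.txt Fri --- myfile.c Thu
#   myfile.c Thu
#   --- diff.txt Fri --- myfile.c
#   ---
```

The patterns are shell globs, not regular expressions, so `*` alone matches any run of characters.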
