Text specification for a file tree?

I am looking for examples of specifying files in a tree structure, for example to specify the set of files to search with grep. I would like to be able to include and exclude files and directories by matching on their names. I am sure there are examples out there, but I am having a hard time finding them.

Here is an example of a possible syntax:

*.py *.html *.txt *.js -*.pyc -.svn/ -*combo_*.js 

(This would mean: include files with the extensions .py, .html, .txt, and .js; exclude .pyc files, anything in the .svn directory, and any file matching *combo_*.js.)
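
One way to read the syntax above, sketched in Python under assumptions the question does not settle: tokens are separated by whitespace, a leading `-` marks an exclusion, a trailing `/` marks a directory name, exclusions always win over inclusions, and patterns are matched against single path components with `fnmatch`. The names `parse_spec`, `matches`, and `SPEC` are made up for illustration.

```python
import fnmatch


def parse_spec(spec):
    """Split a spec string into include patterns, exclude patterns, and excluded directory patterns."""
    includes, excludes, excluded_dirs = [], [], []
    for token in spec.split():
        if token.startswith("-"):
            pattern = token[1:]
            if pattern.endswith("/"):
                # Assumption: a trailing slash means "a directory with this name".
                excluded_dirs.append(pattern.rstrip("/"))
            else:
                excludes.append(pattern)
        else:
            includes.append(token)
    return includes, excludes, excluded_dirs


def matches(path, spec):
    """Return True if a '/'-separated relative path is selected by the spec."""
    includes, excludes, excluded_dirs = parse_spec(spec)
    *dirs, name = path.split("/")
    # Any excluded directory along the path rejects the file.
    if any(fnmatch.fnmatchcase(d, pat) for d in dirs for pat in excluded_dirs):
        return False
    # Assumption: exclusions take priority over inclusions.
    if any(fnmatch.fnmatchcase(name, pat) for pat in excludes):
        return False
    return any(fnmatch.fnmatchcase(name, pat) for pat in includes)


SPEC = "*.py *.html *.txt *.js -*.pyc -.svn/ -*combo_*.js"
```

With these assumptions, `matches("src/app.py", SPEC)` is true, while `matches(".svn/entries.txt", SPEC)` and `matches("static/my_combo_1.js", SPEC)` are false.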

I know I have seen similar specifications in other tools before. Does this ring any bells for anyone?

+4
7 answers

There is no single standard format for this kind of thing, but if you want to copy something that is widely recognized, have a look at the rsync documentation. See the "INCLUDE/EXCLUDE PATTERN RULES" section.
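
rsync checks its include/exclude rules in order and acts on the first one that matches; a file that matches no rule is included. The sketch below imitates only that ordering idea on bare file names, in Python. The `+ pattern` / `- pattern` strings mirror rsync's filter-rule prefixes, but the function `first_match_wins` and the rule list are hypothetical, and real rsync patterns have more features (anchoring, directory-only patterns, `**`) that are not modeled here.

```python
import fnmatch


def first_match_wins(name, rules, default=True):
    """Return True if name is included under ordered '+ pat' / '- pat' rules."""
    for rule in rules:
        action, pattern = rule.split(None, 1)
        if fnmatch.fnmatchcase(name, pattern):
            # The first matching rule decides; later rules are ignored.
            return action == "+"
    # No rule matched: fall back to the default (included).
    return default


# Illustrative rule list: specific exclusions first, then inclusions, then a catch-all exclude.
RULES = ["- *.pyc", "- *combo_*.js", "+ *.py", "+ *.js", "- *"]
```

Because order matters, putting `- *combo_*.js` before `+ *.js` is what keeps the combo files out; swapping the two would let them through.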

+4

Apache Ant provides "ant globs", or patterns, where:

 **/foo/**/*.java 

means "any file ending in '.java' in a directory that includes a directory named 'foo' in its path", including ./foo/X.java
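
Python's `pathlib` supports a similar `**` wildcard, so the pattern can be tried out without Ant. This is only a sketch of the matching behaviour, not Ant's implementation; the helper `glob_relative` and the file names in the throwaway tree are made up for illustration.

```python
import tempfile
from pathlib import Path


def glob_relative(root, pattern):
    """Return sorted '/'-separated paths of files under root that match a '**'-style pattern."""
    root = Path(root)
    return sorted({p.relative_to(root).as_posix() for p in root.glob(pattern) if p.is_file()})


# Build a small throwaway tree to try the pattern on.
with tempfile.TemporaryDirectory() as tmp:
    for rel in ["foo/X.java", "a/foo/b/Y.java", "bar/Z.java", "foo/notes.txt"]:
        target = Path(tmp, rel)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    found = glob_relative(tmp, "**/foo/**/*.java")

print(found)
```

Only the two .java files that have a `foo` directory somewhere in their path are selected; `bar/Z.java` and `foo/notes.txt` are not.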

+2

How about find on Unix-like systems?

find can, of course, do more than build a list of files, but that is one of the common ways to use it. From the man page:

NAME
     find -- walk a file hierarchy

SYNOPSIS
     find [-H | -L | -P] [-EXdsx] [-f pathname] pathname ... expression
     find [-H | -L | -P] [-EXdsx] -f pathname [pathname ...] expression

DESCRIPTION
     The find utility recursively descends the directory tree for each pathname listed, evaluating an expression (composed of the "primaries" and "operands" listed below) in terms of each file in the tree.

To achieve your goal I would write something like this (formatted for readability):

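 # Prune .svn, keep the wanted extensions, drop *.pyc and *combo_*.js
 find . -name .svn -type d -prune -o \
     -type f \( -name '*.py' -o -name '*.html' -o -name '*.txt' -o -name '*.js' \) \
     ! -name '*.pyc' ! -name '*combo_*.js' \
     -print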

There is also an idiomatic pattern using xargs, which makes find suitable for passing the whole list built this way to an arbitrary command, as in:

 find /path -type f -print0 | xargs -0 rm 
+1

Does your example syntax implicitly assume an escape character, so that you can explicitly include a file whose name starts with a dash? (The same question applies to any other wildcard character, but I expect to see more dashes than asterisks in file names.)

Various shells use * (and possibly ? to match a single character), as in your example, but these usually match only strings of characters that do not include a path component separator (for example, '\' on Windows systems, '/' elsewhere). I have also seen source control applications such as Perforce use additional patterns that can match path component separators. For example, in Perforce the pattern "foo/...ext" (without quotes) matches all files under the foo/ directory structure that end with "ext", whether they are in foo/ itself or in one of its descendant directories. This seems like a useful pattern.
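
The distinction described above can be sketched in Python by translating a pattern into a regular expression in which `*` stops at '/' while `...` crosses it. This is an illustration of the idea only, not Perforce's actual matcher; the function name `to_regex` is hypothetical, and only the `*` and `...` wildcards are handled.

```python
import re


def to_regex(pattern):
    """Compile a pattern where '*' stays within one path component and '...' spans components."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("...", i):
            parts.append(".*")      # may match '/' as well
            i += 3
        elif pattern[i] == "*":
            parts.append("[^/]*")   # never matches '/'
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


deep = to_regex("foo/...ext")
shallow = to_regex("foo/*ext")
```

Here `deep.fullmatch("foo/bar/baz/b.ext")` succeeds, whereas `shallow.fullmatch("foo/bar/baz/b.ext")` returns `None`, because `*` cannot cross the directory separators.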

+1

If you use bash, you can turn on the extglob option to get some nice globbing features. Enable it as follows:

 shopt -s extglob 

Then you can do the following:

 # everything but .html, .jpg or .gif files
 ls -d !(*.html|*gif|*jpg)
 # list file9, file22 but not fileit
 ls file+([0-9])
 # begins with apl or un only
 ls -d +(apl*|un*)

See also this page.

+1

find(1) is a great tool, as described in a previous answer, but if things get more complex you should consider either writing your own script in one of the usual suspects (Ruby, Perl, Python, etc.) or using one of the more powerful shells such as zsh, which has ** globbing and lets you specify things to exclude. The latter is probably more complicated, though.
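
As a rough idea of what the "write your own script" route could look like, here is a minimal Python sketch that walks a tree, skips unwanted directories entirely, and filters file names with `fnmatch`. The function `select_files`, its parameters, and the demo file names are assumptions made for illustration.

```python
import fnmatch
import os
import tempfile


def select_files(root, include, exclude, skip_dirs):
    """Yield '/'-separated paths relative to root for files that pass the filters."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            if any(fnmatch.fnmatchcase(name, pat) for pat in exclude):
                continue
            if any(fnmatch.fnmatchcase(name, pat) for pat in include):
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                yield rel.replace(os.sep, "/")


# Try it on a small throwaway tree.
with tempfile.TemporaryDirectory() as tmp:
    for rel in ["app.py", "app.pyc", ".svn/entries.txt", "js/site_combo_1.js", "js/main.js"]:
        path = os.path.join(tmp, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
    selected = sorted(select_files(
        tmp,
        include=["*.py", "*.html", "*.txt", "*.js"],
        exclude=["*.pyc", "*combo_*.js"],
        skip_dirs={".svn"},
    ))

print(selected)
```

Pruning `dirnames` in place plays the same role as `-prune` in find: the skipped directory is never read, rather than being read and filtered afterwards.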

0

You might want to check out ack, which lets you specify the types of files to search with options such as --perl.

It also ignores .svn directories by default, as well as core dumps, editor cruft, binaries, and so on.

0
