Recursively "normalize" file names

I mean getting rid of special characters in file names, etc.

I made a script that can rename files recursively [http://pastebin.com/raw.php?i=kXeHbDQw]:

for example: before:

THIS i.s my file (1).txt

after running the script:

This-i-s-my-file-1.txt

OK, so there it is.

But when I wanted to test it "thoroughly" with these file names:

¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÂÃÄÅÆÇÈÊËÌÎÏÐÑÒÔÕרÙUÛUÝÞßàâãäåæçèêëìîïðñòôõ÷øùûýþÿ.txt
áíüűúöőóéÁÍÜŰÚÖŐÓÉ!"#$%&'()*+,:;<=>?@[\]^_`{|}~€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’""•–—˜™š›œžŸ¡¢£.txt

it does not work [http://pastebin.com/raw.php?i=iu8Pwrnr]:

$ sh renamer.sh directorythathasthefiles
mv: cannot stat `./áíüűúöőóéÁÍÜŰÚÖŐÓÉ!"#$%&\'()*+,:;<=>?@[]^_`{|}~€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’""•–—˜™š›œžŸ¡¢£': No such file or directory
mv: cannot stat `./áíüűúöőóéÁÍÜŰÚÖŐÓÉ!"#$%&\'()*+,:;<=>?@[]^_`{|}~€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’""•–—˜™š›œžŸ¡¢£': No such file or directory
mv: cannot stat `./áíüűúöőóéÁÍÜŰÚÖŐÓÉ!"#$%&\'()*+,:;<=>?@[]^_`{|}~€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’""•–—˜™š›œžŸ¡¢£': No such file or directory
mv: cannot stat `./áíüűúöőóéÁÍÜŰÚÖŐÓÉ!"#$%&\'()*+,:;<=>?@[]^_`{|}~€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’""•–—˜™š›œžŸ¡¢£': No such file or directory
mv: cannot stat `./áíüűúöőóéÁÍÜŰÚÖŐÓÉ!"#$%&\'()*+,:;<=>?@[]^_`{|}~€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’""•–—˜™š›œžŸ¡¢£': No such file or directory
mv: cannot stat `./áíüűúöőóéÁÍÜŰÚÖŐÓÉ!"#$%&\'()*+,:;<=>?@[]^_`{|}~€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’""•–—˜™š›œžŸ¡¢£': No such file or directory
mv: cannot stat `./áíüűúöőóéÁÍÜŰÚÖŐÓÉ!"#$%&\'()*+,:;<=>?@[]^_`{|}~€‚ƒ„…†....and so on
$

So it seems "mv" cannot handle special characters... :\

I worked on it for many hours.

Does anyone have a working script, one that can handle the characters (file names) in these two lines?

+5
4 answers

mv handles special characters just fine. Your script doesn't.


Some notes:

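  • Let find do the listing instead of ls.

    • Rather than looping with for DEPTH in..., why not a single call?

      find -maxdepth 100 -type d
      
    • Or simply, with no depth limit:

      find -type d
      
    • Instead of ls, use find for the files as well:

      find -not -type d
      
    • To cope with any file name, read NUL-separated names:

      find -not -type d -print0 | while IFS= read -r -d '' FILENAME; do
      

      Without -r, read treats backslashes as escape characters.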

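  • Instead of ls | replace, the substitutions can be done with sed, several per call:

    sed 's/á/a/g; s/í/i/g; ...'
    

    (sed 'y/áí/ai/' would be shorter but, as far as I can tell, it can have trouble with Unicode. perl -CS -Mutf8 -pe 'y/áí/ai/' is an alternative.)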

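  • About ASCII: there is no such thing as "ASCII 33....255".

    • In Unicode encoded as UTF-8, a "character" is not the same thing as a byte. (For example, "e" is one byte, "ė" is two.)

    • True ASCII has only 128 characters. Anything above that belongs to some other character set, often one of the ISO 8859 family (sometimes called ANSI), which runs from ISO 8859-1 to 8859-16. Calling those "ASCII" is incorrect.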

  • echo -n $(command) is unnecessary.

  • To split a path into its directory and file name, use dirname and basename:

    directory=$(dirname "$path")
    oldname=$(basename "$path")
    # filter $oldname into $newname here
    mv "$path" "$directory/$newname"
    
  • egrep . . ( cd.)

  • Before renaming, check whether the target name already exists:

    if [[ -e $directory/$newname ]]; then
        echo "target already exists, skipping: $oldname -> $newname"
        continue
    else
        mv "$path" "$directory/$newname"
    fi
    
  • Instead of sed 's/------------/-/g', use:

    sed -r 's/-{2,}/-/g'
    
  • The [ ] in tr [foo] [bar] are unnecessary: tr just translates [ to [ and ] to ].

  • Why this?

    echo "$FOLDERNAME" | sed "s/$/\//g"
    

    Why not simply this?

    echo "$FOLDERNAME/"
    

Finally, instead of writing all this yourself, consider using detox.
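
The notes above can be pulled together into one loop. What follows is only a minimal sketch, assuming bash plus a find and sed that support -print0 and -E. The function names normalize_name and rename_tree are hypothetical, and the two transliteration rules merely stand in for a longer table.

```shell
#!/usr/bin/env bash
# Sketch only: normalize_name and rename_tree are hypothetical names.

# Map one file name (no directory part) to a plain ASCII name.
normalize_name() {
    local name=$1 ext=''
    # Split off a final ".ext" so that its dot is not turned into "-".
    if [[ $name == ?*.* ]]; then
        ext=.${name##*.}
        name=${name%.*}
    fi
    # LC_ALL=C makes sed work on bytes, so the result does not depend on the locale.
    name=$(printf '%s' "$name" | LC_ALL=C sed -E \
        -e 's/á/a/g; s/í/i/g' \
        -e 's/[^A-Za-z0-9]+/-/g' \
        -e 's/^-+//; s/-+$//')
    printf '%s%s\n' "$name" "$ext"
}

# Rename every regular file below the directory given as $1.
rename_tree() {
    local root=$1 path directory oldname newname
    # -print0 with read -d '' copes with any file name; -r keeps backslashes.
    find "$root" -type f -print0 |
    while IFS= read -r -d '' path; do
        directory=$(dirname "$path")
        oldname=$(basename "$path")
        newname=$(normalize_name "$oldname")
        [[ $oldname == "$newname" ]] && continue
        if [[ -e $directory/$newname ]]; then
            echo "target already exists, skipping: $oldname -> $newname" >&2
            continue
        fi
        mv -- "$path" "$directory/$newname"
    done
}
```

It would be called as rename_tree directorythathasthefiles. Directories themselves are left alone here, and the original case of the name is kept.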

+17

A one-liner:

find . -type f -print0 | awk 'BEGIN {RS="\0"; ORS="\0"} { print; dir = $0; sub("/[^/]*$", "", dir); base = substr($0, length(dir) + 2); gsub("[^[:alnum:]]", "-", base); print dir "/" base }' | xargs -0 -n 2 mv

xargs(1) takes the names two at a time and runs mv on each pair. awk(1) builds the new name for each file.

Note: an extra sed -e 's/--*/-/g' step would collapse repeated "-" characters.
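
Before letting a pipeline like this rename anything, it can help to see how xargs pairs the names up. A small sketch, assuming an xargs that supports -0; the helper name dry_run_pairs is hypothetical. It prints the mv commands instead of running them.

```shell
# Sketch only: dry_run_pairs is a hypothetical helper name.
# stdin: NUL-separated names, old and new alternating.
# Prints one "mv" line per pair instead of renaming anything.
dry_run_pairs() {
    xargs -0 -n 2 printf "mv '%s' '%s'\n"
}

# Example input: two renames, written as four NUL-terminated names.
printf '%s\0' 'a b' 'a-b' 'c d' 'c-d' | dry_run_pairs
# prints:
# mv 'a b' 'a-b'
# mv 'c d' 'c-d'
```

Because -n 2 hands mv exactly two arguments per call, an odd number of names on the input would leave the last mv with only one argument.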

+6

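As for your script, the problem is that it uses read rather than read -r. Without -r, backslashes are swallowed, as the difference between the real name and what mv receives shows: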

áíüűúöőóéÁÍÜŰÚÖŐÓÉ!"#$%&'()*+,:;<=>?@[\]^_`{|}~€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’""•–—˜™š›œžŸ¡¢£.txt
áíüűúöőóéÁÍÜŰÚÖŐÓÉ!"#$%&\'()*+,:;<=>?@[]^_`{|}~€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’""•–—˜™š›œžŸ¡¢£
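
The effect is easy to reproduce in isolation. A minimal sketch with a short made-up name containing a backslash; the variable names are only for illustration.

```shell
# A made-up name containing a backslash, like the [\] in the test file name.
name='a[\]b'

# Plain read treats the backslash as an escape character and drops it.
plain=$(printf '%s\n' "$name" | { read line; printf '%s' "$line"; })

# read -r takes the line literally; IFS= also keeps surrounding whitespace.
raw=$(printf '%s\n' "$name" | { IFS= read -r line; printf '%s' "$line"; })

printf '%s\n' "$plain"   # prints a[]b
printf '%s\n' "$raw"     # prints a[\]b
```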
+4

...

About your script:

** sed can apply several expressions in one call, for example:

dev:~$ echo 'áàaieeé!.txt' | sed -e 's/[áàã]/a/g; s/[éè]/e/g'
aaaieee!.txt

** You can also keep the name in a variable and filter it step by step:

$ NEWNAME='áàaieeé!.txt'
$ NEWNAME="$(echo "$NEWNAME" | sed -e 's/[áàã]/a/g; s/[éè]/e/g')"
$ NEWNAME="$(echo "$NEWNAME" | sed -e 's/aa*/a/g')"
$ echo "$NEWNAME"
aieee!.txt

** Instead of ls | read ..., loop over a glob:

for OLDNAME in "$DIR"/*; do
  blah
  blah
  blah
done

** Separate the path traversal and the rename logic into two scripts: one finds the files that need renaming, the other normalizes a single file. Once you get to know the find command, you will realize that you can drop the first script :)
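
As a rough illustration of that last point, find can hand each file straight to the per-file logic. This is only a sketch, assuming a POSIX sh and find with -exec ... {} +. The name normalize_tree is hypothetical, $1 is assumed to be a directory, and the sed rules are a simplified stand-in for whatever the single-file script would do.

```shell
# Sketch only: normalize_tree is a hypothetical name.
normalize_tree() {
    # find does the traversal; the inline sh script handles one file at a time.
    find "$1" -type f -exec sh -c '
        for path do
            dir=${path%/*}     # directory part
            old=${path##*/}    # file name part
            # Replace everything except ASCII letters, digits and dots, then squeeze dashes.
            new=$(printf "%s" "$old" |
                LC_ALL=C sed -e "s/[^A-Za-z0-9.]/-/g" -e "s/--*/-/g")
            [ "$old" = "$new" ] && continue
            if [ -e "$dir/$new" ]; then
                echo "target already exists, skipping: $path" >&2
                continue
            fi
            mv -- "$path" "$dir/$new"
        done
    ' sh {} +
}
```

Since find passes the names as arguments, nothing is parsed from ls output and no file name has to go through read.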

+1