Sunday, May 1, 2011

rm fails to delete files by wildcard from a script, but works from a shell prompt

I've run into a really silly problem with a Linux shell script. I want to delete all files with the extension ".bz2" in a directory. In the script I call

rm "$archivedir/*.bz2"

where $archivedir is a directory path. Should be pretty simple, shouldn't it? Somehow, it manages to fail with this error:

rm: cannot remove `/var/archives/monthly/April/*.bz2': No such file or directory

But there is a file in that directory called test.bz2 and if I change my script to

echo rm "$archivedir/*.bz2"

and copy/paste the output of that line into a terminal window, the file is removed successfully. What am I doing wrong?

From stackoverflow
  • The quotes are causing the string to be interpreted as a string literal; try removing them.

  • Just to expand on this a bit, bash has fairly complicated rules for dealing with metacharacters in quotes. In general:

    • almost nothing is interpreted in single-quotes:

       echo '$foo/*.c'                  => $foo/*.c
       echo '\\*'                       => \\*
      
    • shell substitution is done inside double quotes, but file metacharacters aren't expanded:

       foo=hello; echo "$foo/*.c"       => hello/*.c
      
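    • everything inside backquotes is run in a subshell, and its output replaces the backquoted text. The substitution is expanded before a variable assignment prefixed to the same command takes effect, and the subshell sees the current shell's variables whether or not they are exported. So, assuming BAR was not set beforehand, the first command echoes blank, but the second and third echo "bye":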

      BAR=bye echo `echo $BAR`
      BAR=bye; echo `echo $BAR`
      export BAR=bye; echo `echo $BAR`
      

    (And getting this to print the way you want it in SO is apparently impossible...)
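The quoting rules above can be tried out with a small sketch like the following. The scratch directory, the file names a.c and b.c, and the variable names are assumptions made up for the demo; nothing here comes from the original question.

```shell
#!/bin/bash
# Hypothetical scratch directory and file names, used only for this demo.
demo=$(mktemp -d)
touch "$demo/a.c" "$demo/b.c"
cd "$demo"

foo=.

single=$(echo '$foo/*.c')   # single quotes: nothing is expanded
double=$(echo "$foo/*.c")   # double quotes: variable expanded, wildcard left alone
bare=$(echo $foo/*.c)       # no quotes: variable expanded, then wildcard expanded

echo "$single"   # prints: $foo/*.c
echo "$double"   # prints: ./*.c
echo "$bare"     # prints: ./a.c ./b.c
```

Only the unquoted form hands echo a list of file names; in the other two cases echo receives a single argument with the * still in it.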

    Charlie Martin : Jonathan, you're such a showoff. ;-) Thanks.
  • To expand a bit more:

    • In Unix, programs generally do not interpret wildcards themselves. The shell interprets unquoted wildcards, and replaces each wildcard argument with a list of matching file names.

    • You can disable this process by quoting the wildcard character, using double or single quotes, or a backslash before it. However, that's not what you want here - you do want the wildcard expanded to the list of files that it matches.

    • Be careful about writing rm $archivedir/*.bz2 (without quotes). Word splitting (i.e., breaking the command line up into arguments) happens after $archivedir is substituted. So if $archivedir contains spaces, you'll get extra arguments that you weren't intending. Say archivedir is "/var/archives/monthly/April to June". Then you'll get the equivalent of writing rm /var/archives/monthly/April to June/*.bz2, which tries to delete the files "/var/archives/monthly/April", "to", and all files matching "June/*.bz2", which isn't what you want.

    The correct solution is to write:

    rm "$archivedir"/*.bz2
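To see the difference the quotes make, here is a minimal sketch that counts the arguments a command would receive in each case. The directory "April to June" is created under a scratch directory from mktemp (assumed to have no spaces in its own path), and count_args is a helper defined only for this demo, not a standard utility.

```shell
#!/bin/bash
# Hypothetical directory with spaces in its name, under a scratch directory.
base=$(mktemp -d)
archivedir="$base/April to June"
mkdir -p "$archivedir"
touch "$archivedir/one.bz2" "$archivedir/two.bz2" "$archivedir/keep.txt"

# count_args is a demo helper: it prints how many arguments it was given.
count_args() { echo "$#"; }

# Unquoted: the path is split at the spaces into ".../April", "to", "June/*.bz2".
unquoted=$(count_args $archivedir/*.bz2)

# Quoted directory, unquoted wildcard: one argument per matching file.
quoted=$(count_args "$archivedir"/*.bz2)

# The recommended form removes only the .bz2 files.
rm "$archivedir"/*.bz2
remaining=$(ls "$archivedir")

echo "unquoted: $unquoted, quoted: $quoted, remaining: $remaining"
```

With the two .bz2 files above, the unquoted form yields three unrelated arguments, while the quoted form yields exactly the two file names, and keep.txt is left in place after the rm.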
  • More info about the internal workings: Try typing:

    echo "$archivedir"/*.bz2
    

    at the shell prompt. You'll see that the shell expands the wildcard before echo ever runs. The same happens with rm: it never sees the * at all; instead, it is the list of matching file names that is passed to it.

    Edit: I see you basically tried that. What you want to do then, is try it when there are several bz2 files in that directory. Then you'll see the effect.
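A rough way to check what a command actually receives is to print each argument on its own line. In this sketch archivedir points at a scratch directory standing in for the real one, the three file names are made up, and show_args is a hypothetical helper written for the demo.

```shell
#!/bin/bash
# Scratch directory standing in for the real archive directory (assumption).
archivedir=$(mktemp -d)
touch "$archivedir/a.bz2" "$archivedir/b.bz2" "$archivedir/c.bz2"

# show_args is a demo helper: it prints each argument in brackets, one per line.
show_args() { printf '[%s]\n' "$@"; }

# Wildcard outside the quotes: the shell passes one argument per file.
expanded=$(show_args "$archivedir"/*.bz2)

# Wildcard inside the quotes: the command gets one argument with a literal *.
literal=$(show_args "$archivedir/*.bz2")

echo "$expanded"
echo "$literal"
```

The second call prints a single bracketed path ending in *.bz2, which is the same literal name that rm complained about in the question.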

  • Your original line

    rm "$archivedir/*.bz2"
    

    can be rewritten as

    rm "$archivedir"/*.bz2
    

    to achieve the effect you intended. The wildcard is not expanded in your original line because it sits inside the double quotes. By moving the closing double quote so that it ends before /*.bz2 (which is legitimate), you leave the wildcard unquoted and the shell expands it.
