Monday, July 23, 2007

Troublesome Filenames

smith: In my public_html directory there are subdirectories called "~backup" and "~smith". "Backup" is the name we use for file archival of old work. When I cd to them I'm taken to some home directory. What's going on?

me: That looked funny at first, but I think it's ok.

A ~ (tilde) character is shell-ese for a home directory: by itself it means your own, and "~username" means that of the given user. So when you entered "cd ~smith", it actually took you to your home directory. The public_html/~smith thing is a real directory with a tilde as the first character of its name. You can create a directory with that name by typing

        mkdir public_html/~smith

The tilde is only interpreted as a home directory when it's the first character of a word, which is why the tilde in the middle of public_html/~smith is left alone. Tilde expansion is done by csh, tcsh, ksh, zsh, and bash. Shell scripts are typically interpreted by the Bourne shell (/bin/sh), and the traditional Bourne shell doesn't know what "~smith" means, so sometimes shell scripts will create files named like that.

Entering

        cd ~smith/public_html/~smith

took me to a directory that didn't look like your home directory. I renamed public_html/~backup to public_html/backup and public_html/~smith to public_html/smith.
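
A rename like that can be sketched as follows. This is only an illustration in a throwaway directory, not necessarily the commands that were typed. The scratch directory and the use of mktemp are assumptions made for the example. Prefixing the name with "./" (or quoting it) keeps the shell from treating the leading tilde as a home directory.

```shell
# Work in a scratch directory (assumed for illustration) so nothing real is touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# A tilde in the middle of a word is not expanded, so this creates a literal "~smith".
mkdir -p 'public_html/~smith'
cd public_html || exit 1

# With a ./ prefix the tilde is no longer the first character of the word,
# so the shell leaves it alone and we enter the literal directory.
cd './~smith' && pwd
cd ..

# Rename it to something that needs no special handling.
mv './~smith' smith
ls
```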

There are other misbehaving filenames. Usually they come about the same way "public_html/~smith" did: UNIX file systems allow filenames to contain characters that are special to some command interpreters. For example, a mistaken cut-and-paste operation can generate several spurious commands, perhaps creating files with random names. Misbehaved filenames can make it hard to work with files, even in the newer shell programs.

For instance, files beginning with a hyphen ("-") will confound some commands. If you have a file named "-h" and try to do anything to it, the usual commands such as mv, cp, or rm will interpret the name as a command line switch and tell you you're typing the command incorrectly:

~/temp me@server 8:48> ls -l
total 0
-rw-r--r-- 1 me mygroup 0 Feb 20 08:46 --help
-rw-r--r-- 1 me mygroup 0 Feb 20 08:46 -h

~/temp me@server 8:48> rm -h
rm: illegal option -- h
usage: rm [-fiRr] file ...
zsh: exit 2 rm -h

~/temp me@server 8:48> mv -h goodname
mv: illegal option -- h
mv: Insufficient arguments (1)
Usage: mv [-f] [-i] f1 f2
mv [-f] [-i] f1 ... fn d1
mv [-f] [-i] d1 d2
zsh: exit 2 mv -h goodname

The solution is to specify a path name for the file, either an absolute path as in /usr/local/bin/--badfile, or a relative path beginning with "./" (dot forward-slash, meaning the current directory), so that the argument no longer starts with a hyphen.

~/temp me@server 8:48> rm ./-h
~/temp me@server 8:48> mv ./--help goodname
~/temp me@server 8:48> ls -l
total 0
-rw-r--r-- 1 me mygroup 0 Feb 20 08:46 goodname
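
Another common way around a leading hyphen, not used in the transcript above, is the "--" end-of-options marker: everything after it is treated as a file name. The sketch below assumes your rm and mv follow the POSIX utility syntax conventions. Older implementations may not, in which case the "./" form is the safer choice. The scratch directory is again just an assumption for the example.

```shell
# Scratch directory (assumed for illustration).
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Create the awkward names with a ./ prefix so touch does not parse them as options.
touch ./-h ./--help

# "--" marks the end of options; what follows is taken as a file name.
rm -- -h
mv -- --help goodname
ls -l
```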

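Other inconvenient filenames can require a "\" (backslash) to quote characters that are special to the shell, such as "~" (tilde) or " " (space). A leading "-" (hyphen) is different: it is special to the command rather than to the shell, so a backslash does not help and the "./" form is needed. The newer shells offering command-line completion will automatically quote the characters for you.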
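
As a small sketch of backslash quoting (the file names here are made up for the example, and the scratch directory is an assumption):

```shell
# Scratch directory (assumed for illustration).
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Single quotes create the awkward names literally.
touch 'my file' '~notes'

# A backslash before the space keeps "my file" together as one argument.
mv my\ file my_file

# A backslash before the tilde stops the shell from expanding it.
mv \~notes notes
ls
```

Single or double quotes around the whole name do the same job and are often easier to read.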

Advanced users may wish to apply the stream editor sed to the problem.
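
One way that might look, as a rough sketch only: loop over the files and let sed compute a tamer name for each. The particular rules here (drop leading hyphens and tildes, turn spaces into underscores) are assumptions chosen for the example, and the loop does nothing to handle two files that would end up with the same cleaned name.

```shell
# Scratch directory with a few awkward names (all assumed for illustration).
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch ./-h './~backup' './my file'

for f in ./*; do
    # Strip the leading "./", then any leading hyphens or tildes,
    # then replace every space with an underscore.
    new=$(printf '%s\n' "$f" | sed -e 's|^\./||' -e 's/^[-~]*//' -e 's/ /_/g')

    # Skip names that would become empty or that are already fine.
    if [ -n "$new" ] && [ "./$new" != "$f" ]; then
        mv "$f" "./$new"
    fi
done
ls
```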

For a list of names of punctuation marks, I recommend the Wikipedia entry.
