
32.22. File::Glob

use File::Glob ':glob';      # Override glob built-in.
@list = <*.[Cchy]>;          # Now uses POSIX glob, not csh glob.

use File::Glob qw(:glob csh_glob);
@sources = bsd_glob("*.{C,c,h,y,pm,xs}", GLOB_CSH);
@sources = csh_glob("*.{C,c,h,y,pm,xs}");  # (same thing)

use File::Glob ':glob';
# call glob with extra arguments
$homedir = bsd_glob('~jrhacker', GLOB_TILDE | GLOB_ERR);
if (GLOB_ERROR) {
    # An error occurred expanding the home directory.
}

The File::Glob module's bsd_glob function implements the glob(3) routine from the C library. An optional second argument contains flags governing additional matching properties. The :glob import tag imports both the function and the necessary flags.

The module also implements a csh_glob function. This is what the built-in Perl glob and <GLOBPAT> fileglobbing operators really call. Calling csh_glob is (mostly) like calling bsd_glob this way:

bsd_glob(@_ ? $_[0] : $_,
    GLOB_BRACE | GLOB_NOMAGIC | GLOB_QUOTE | GLOB_TILDE);

If you import the :glob tag, then all calls to the built-in fileglobbing operators in the current package will really call the module's bsd_glob function instead of its csh_glob function. One reason you might want to do this is that, although bsd_glob handles patterns with whitespace in them correctly, csh_glob handles them, um, in the historical fashion: it treats whitespace as a separator between patterns. Old scripts would write <*.c *.h> to glob both patterns at once. Neither function is bothered by whitespace in the actual filenames, however.
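The whitespace difference can be seen side by side. The following is a minimal sketch, not part of the original synopsis: it assumes a scratch directory created with File::Temp, uses the made-up filenames a.c and b.h, and imports the two functions by name so that the built-in glob is left alone.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);
use File::Glob qw(bsd_glob csh_glob);

# Scratch directory holding one .c file and one .h file (made-up names).
my $orig_dir = getcwd();
my $dir      = tempdir(CLEANUP => 1);
chdir $dir or die "chdir $dir: $!";
for my $name ('a.c', 'b.h') {
    open my $fh, '>', $name or die "create $name: $!";
    close $fh;
}

# csh_glob splits the string on whitespace and globs each piece in turn.
my @split = csh_glob('*.c *.h');

# bsd_glob sees a single pattern that happens to contain a space,
# and no file in this directory has a space in its name.
my @single = bsd_glob('*.c *.h');

chdir $orig_dir or die "chdir $orig_dir: $!";

print "csh_glob: @split\n";
print "bsd_glob: ", scalar(@single), " match(es)\n";
```

With bsd_glob, a pattern such as 'My Documents/*' can therefore be passed as is, whereas csh_glob would need the space protected by quotes inside the pattern.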

The bsd_glob function takes an argument containing the fileglobbing pattern (not a regular expression pattern) plus an optional flags argument. Filenames with a leading dot are not matched unless specifically requested. The return value is influenced by the flags in the second argument, which should be bitwise ORed together:[3]

[3] Due to restrictions in the syntax of the built-in glob operator, you may need to call the function as bsd_glob if you want to pass it the second argument.

GLOB_BRACE

Preprocess the string to expand {pat,pat,...} strings as csh(1) would. The pattern {} is left unexpanded for historical reasons, mostly to ease typing of find(1) patterns.

GLOB_CSH

Synonym for GLOB_BRACE | GLOB_NOMAGIC | GLOB_QUOTE | GLOB_TILDE.

GLOB_ERR

Return an error when bsd_glob encounters a directory it cannot open or read. Ordinarily, bsd_glob skips over the error, looking for more matches.

GLOB_MARK

Append a slash to each returned path that is a directory.

GLOB_NOCASE

By default, filenames are case sensitive; this flag makes bsd_glob treat case differences as insignificant. (But see below for exceptions on MS-DOSish systems.)

GLOB_NOCHECK

If the pattern does not match any pathname, make bsd_glob return a list consisting of only the pattern, as /bin/sh does. If GLOB_QUOTE is set, its effect is present in the pattern returned.

GLOB_NOMAGIC

Same as GLOB_NOCHECK, but the pattern is returned only if it contains none of the special characters *, ?, or [. GLOB_NOMAGIC is provided to simplify implementing the historic csh(1) globbing behavior and should probably not be used anywhere else.

GLOB_NOSORT

By default, the pathnames are sorted in ascending order (using normal character comparisons irrespective of locale setting). This flag prevents that sorting for a small increase in speed.

GLOB_QUOTE

Use the backslash character \ for quoting: every occurrence of a backslash followed by a character in the pattern is replaced by that character, avoiding any special interpretation of the character. (But see below for exceptions on MS-DOSish systems.)

GLOB_TILDE

Allow patterns whose first path component is ~USER. If USER is omitted, the tilde by itself (or followed by a slash) represents the current user's home directory.
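As an illustration of how a few of these flags change the returned list, here is a small sketch. It assumes a scratch directory created with File::Temp and made-up file and directory names (main.c, main.h, notes.txt, src); the flags themselves are the ones described above, imported by name rather than through the :glob tag.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);
use File::Glob qw(bsd_glob GLOB_BRACE GLOB_MARK GLOB_NOCHECK GLOB_NOMAGIC);

# Scratch directory with three files and one subdirectory (made-up names).
my $orig_dir = getcwd();
my $dir      = tempdir(CLEANUP => 1);
chdir $dir or die "chdir $dir: $!";
mkdir 'src' or die "mkdir src: $!";
for my $name ('main.c', 'main.h', 'notes.txt') {
    open my $fh, '>', $name or die "create $name: $!";
    close $fh;
}

# GLOB_BRACE: {c,h} is expanded into two patterns before matching.
my @sources = bsd_glob('*.{c,h}', GLOB_BRACE);

# GLOB_MARK: directories come back with a trailing slash.
my @marked = bsd_glob('*', GLOB_MARK);

# GLOB_NOCHECK: a pattern that matches nothing is returned as is.
my @unmatched = bsd_glob('*.xs', GLOB_NOCHECK);

# GLOB_NOMAGIC: the pattern is returned only if it has no *, ? or [.
my @plain = bsd_glob('Makefile', GLOB_NOMAGIC);
my @magic = bsd_glob('*.xs',     GLOB_NOMAGIC);

chdir $orig_dir or die "chdir $orig_dir: $!";

print "brace:   @sources\n";
print "mark:    @marked\n";
print "nocheck: @unmatched\n";
print "nomagic: plain=(@plain) magic=(@magic)\n";
```

Flags combine with bitwise OR, so GLOB_BRACE | GLOB_MARK would apply both behaviors in a single call.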

The bsd_glob function returns a (possibly empty) list of matching paths, which will be tainted if that matters to your program. On error, GLOB_ERROR will be true and $! ($OS_ERROR) will be set to the standard system error. GLOB_ERROR is guaranteed to be false if no error occurred, and to be either GLOB_ABEND or GLOB_NOSPACE otherwise. (GLOB_ABEND means that bsd_glob was stopped due to some error, GLOB_NOSPACE that it ran out of memory.) If bsd_glob had already found some matching paths when the error occurred, it returns the list of filenames found so far, and also sets GLOB_ERROR. Note that this implementation of bsd_glob varies from most others by not considering ENOENT and ENOTDIR as terminating error conditions. Instead, it continues processing despite those errors, unless the GLOB_ERR flag is set.
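One way to act on GLOB_ERROR is to wrap the call in a small helper. The sketch below is only an illustration: checked_glob is a hypothetical name, not something File::Glob provides, and the scratch file only.txt is made up. The helper always passes GLOB_ERR and turns a true GLOB_ERROR into an exception; the usage shown exercises only the success path.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);
use File::Glob qw(bsd_glob GLOB_ERR GLOB_ERROR GLOB_ABEND GLOB_NOSPACE);

# Hypothetical helper: glob with GLOB_ERR and die if GLOB_ERROR is set.
sub checked_glob {
    my ($pattern) = @_;
    my @paths = bsd_glob($pattern, GLOB_ERR);
    my $err   = GLOB_ERROR();
    if ($err) {
        my $why = $err == GLOB_NOSPACE() ? 'out of memory'
                : $err == GLOB_ABEND()   ? 'stopped by an error'
                :                          "unexpected code $err";
        # @paths may still hold whatever matched before the failure;
        # this sketch discards it and reports the system error instead.
        die "glob of '$pattern' failed ($why): $!\n";
    }
    return @paths;
}

# Success path: one made-up file in a scratch directory.
my $orig_dir = getcwd();
my $dir      = tempdir(CLEANUP => 1);
chdir $dir or die "chdir $dir: $!";
open my $fh, '>', 'only.txt' or die "create only.txt: $!";
close $fh;

my @found = checked_glob('*.txt');

chdir $orig_dir or die "chdir $orig_dir: $!";
print "found: @found\n";
```

A caller that prefers partial results over an exception could return @paths together with the error code instead of dying.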

If no flag argument is supplied, your system's defaults are followed, meaning that filenames differing only in case are indistinguishable from one another on VMS, OS/2, old Mac OS (but not Mac OS X), and Microsoft systems (but not when Perl was built with Cygwin). If you supply any flags at all and still want this behavior, then you must include GLOB_NOCASE in the flags. Whatever system you're on, you can change your defaults up front by importing the :case or :nocase flags.
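The rule that explicit flags replace the system defaults can be sketched as follows. The example assumes a scratch directory and a made-up file named README.TXT; both calls pass flags explicitly, so case is ignored only in the call that includes GLOB_NOCASE.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);
use File::Glob qw(bsd_glob GLOB_CSH GLOB_NOCASE);

# Scratch directory with a single uppercase filename (made-up name).
my $orig_dir = getcwd();
my $dir      = tempdir(CLEANUP => 1);
chdir $dir or die "chdir $dir: $!";
open my $fh, '>', 'README.TXT' or die "create README.TXT: $!";
close $fh;

# Explicit flags without GLOB_NOCASE: the lowercase pattern must match exactly.
my @exact = bsd_glob('*.txt', GLOB_CSH);

# Adding GLOB_NOCASE makes case differences insignificant.
my @loose = bsd_glob('*.txt', GLOB_CSH | GLOB_NOCASE);

chdir $orig_dir or die "chdir $orig_dir: $!";

print "without GLOB_NOCASE: ", scalar(@exact), " match(es)\n";
print "with GLOB_NOCASE:    @loose\n";
```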

On MS-DOSish systems, the backslash is a valid directory separator character.[4] In this case, use of backslash as a quoting character (via GLOB_QUOTE) interferes with the use of backslash as a directory separator. The best (simplest, most portable) solution is to use slashes for directory separators and backslashes for quoting. However, this does not match some users' expectations, so on these systems backslashes (under GLOB_QUOTE) quote only the glob metacharacters [, ], {, }, -, ~, and \ itself. All other backslashes are passed through unchanged, if you can manage to get them by Perl's own backslash quoting in strings. It may take as many as four backslashes to finally match one in the filesystem. This is so completely insane that even MS-DOSish users should strongly consider using slashes. If you really want to use backslashes, look into the standard File::DosGlob module, as it might be more to your liking than Unix-flavored fileglobbing.

[4] Although technically, so is a slash--at least as far as those kernels and syscalls are concerned; command shells are remarkably less enlightened.
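Backslash quoting under GLOB_QUOTE can be sketched with a filename that contains a glob metacharacter. The example assumes a scratch directory and two made-up files, a1.txt and a[1].txt, and uses single-quoted Perl strings so that each backslash reaches bsd_glob untouched. It was written with a Unix-like system in mind; since [ and ] are among the characters listed above as quotable on MS-DOSish systems, the same patterns would be expected to behave alike there, but that is an assumption.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);
use File::Glob qw(bsd_glob GLOB_QUOTE);

# Scratch directory with one plain name and one containing brackets.
my $orig_dir = getcwd();
my $dir      = tempdir(CLEANUP => 1);
chdir $dir or die "chdir $dir: $!";
for my $name ('a1.txt', 'a[1].txt') {
    open my $fh, '>', $name or die "create $name: $!";
    close $fh;
}

# Unquoted: [1] is a character class, so only a1.txt matches.
my @class = bsd_glob('a[1].txt', GLOB_QUOTE);

# Quoted: \[ and \] are literal brackets, so only a[1].txt matches.
my @literal = bsd_glob('a\[1\].txt', GLOB_QUOTE);

chdir $orig_dir or die "chdir $orig_dir: $!";

print "class:   @class\n";
print "literal: @literal\n";
```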




Copyright © 2002 O'Reilly & Associates. All rights reserved.