
13.9. Escaping Special Characters in a Regular Expression

13.9.1. Problem

You want characters such as * or + to be treated as literals, not as metacharacters, inside a regular expression. This is useful when users type in search strings that you then use inside a regular expression.

13.9.2. Solution

Use preg_quote( ) to escape Perl-compatible regular-expression metacharacters:

$pattern = preg_quote('The Education of H*Y*M*A*N K*A*P*L*A*N').':(\d+)';
if (preg_match("/$pattern/",$book_rank,$matches)) {
    print "Leo Rosten's book ranked: ".$matches[1];
}

Use quotemeta( ) to escape POSIX metacharacters:

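// The parentheses capture the rank so it is available as $matches[1].
// ereg() exists only in older PHP; it was deprecated in 5.3 and removed in 7.0.
$pattern = quotemeta('M*A*S*H').':([0-9]+)';
if (ereg($pattern,$tv_show_rank,$matches)) {
    print 'Radar, Hot Lips, and the gang ranked: '.$matches[1];
}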

13.9.3. Discussion

Here are the characters that preg_quote( ) escapes (PHP 5.3 and later also escape -, and PHP 7.3 and later also escape #):

. \ + * ? ^ $ [ ] ( ) { } < > = ! | :

Here are the characters that quotemeta( ) escapes:

. \ + * ? ^ $ [ ] ( )

These functions escape each metacharacter by placing a backslash in front of it.
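As a minimal sketch of what that escaping looks like, the following prints the quoted versions of the two titles from the Solution and then uses one of them in a pattern. The variable names and the sample input line (with its rank of 27) are made up for illustration:

```php
<?php
// Titles taken from the Solution; variable names are illustrative only.
$title = 'The Education of H*Y*M*A*N K*A*P*L*A*N';
$show  = 'M*A*S*H';

$quoted_title = preg_quote($title);
$quoted_show  = quotemeta($show);

print $quoted_title . "\n"; // The Education of H\*Y\*M\*A\*N K\*A\*P\*L\*A\*N
print $quoted_show . "\n";  // M\*A\*S\*H

// Hypothetical input line: the title followed by a colon and a rank.
$book_rank = $title . ':27';

// Each * is now a literal asterisk, so the title matches itself exactly.
if (preg_match('/' . $quoted_title . ':(\d+)/', $book_rank, $matches)) {
    print $matches[1] . "\n"; // 27
}
```

Without the call to preg_quote( ), each * would mean "zero or more of the preceding character," and the pattern would not match the title as written.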

The quotemeta( ) function doesn't escape all POSIX metacharacters. The characters {, }, and | are also valid metacharacters but aren't escaped. This is another good reason to use preg_match( ) instead of ereg( ).
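The gap can be seen with a short sketch. The input string x{3} is a made-up example, and because ereg( ) is not available in current PHP, the check below runs the quotemeta( ) output through preg_match( ) instead, on the assumption that {3} means "three repetitions" in both regex flavors:

```php
<?php
// Hypothetical user input that contains braces.
$input = 'x{3}';

$posix_quoted = quotemeta($input);   // x{3}    (braces left untouched)
$pcre_quoted  = preg_quote($input);  // x\{3\}  (braces escaped)

// With quotemeta(), {3} is still a live quantifier: the pattern
// matches three x's, not the literal text the user typed.
var_dump(preg_match('/^' . $posix_quoted . '$/', 'xxx'));  // int(1)
var_dump(preg_match('/^' . $posix_quoted . '$/', $input)); // int(0)

// With preg_quote(), the braces are literals and the input matches itself.
var_dump(preg_match('/^' . $pcre_quoted . '$/', $input));  // int(1)
```

The same applies to |: quotemeta('a|b') returns a|b unchanged, so the bar would still act as alternation.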

You can also pass preg_quote( ) an additional character to escape as a second argument. It's useful to pass your pattern delimiter (usually /) as this argument so it also gets escaped. This is important if you incorporate user input into a regular-expression pattern. The following code expects $_REQUEST['search_term'] from a web form and searches for words beginning with $_REQUEST['search_term'] in a string $s:

$search_term = preg_quote($_REQUEST['search_term'],'/');
if (preg_match("/\b$search_term/i",$s)) {
    print 'match!';
}

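Using preg_quote( ) ensures the regular expression is interpreted properly if, for example, a Magnum, P.I. fan enters t.c. as a search term. Without preg_quote( ), each . matches any character, so the pattern also matches tick, tucker, and any other word whose first letter is t and third letter is c, provided at least one more character follows the c. Passing the pattern delimiter to preg_quote( ) as well makes sure that user input with forward slashes in it, such as CP/M, is also handled correctly; an unescaped slash would end the pattern early, and preg_match( ) would try to read the rest of the input as pattern modifiers.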
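The two cases above can be checked with a small sketch. The function name word_starts_with( ) and the sample strings are hypothetical and stand in for the web-form input:

```php
<?php
// Hypothetical helper: returns true if any word in $subject begins
// with the literal text in $term (case-insensitive).
function word_starts_with($term, $subject)
{
    // Escape regex metacharacters and the / delimiter in the user's input.
    $quoted = preg_quote($term, '/');
    return preg_match("/\\b$quoted/i", $subject) === 1;
}

var_dump(word_starts_with('t.c.', 'Call T.C. about the chopper')); // bool(true)
var_dump(word_starts_with('t.c.', 'Rick and Tucker'));             // bool(false)
var_dump(word_starts_with('CP/M', 'It runs on CP/M'));             // bool(true)

// For comparison, the unquoted term treats each . as a wildcard
// and matches the "Tuck" in "Tucker".
var_dump(preg_match('/\bt.c./i', 'Rick and Tucker'));              // int(1)
```

Wrapping the quoting and matching in one function is a design choice for the example only; it keeps the delimiter passed to preg_quote( ) next to the delimiter used in the pattern, so the two cannot drift apart.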

13.9.4. See Also

Documentation on preg_quote( ) at http://www.php.net/preg-quote and quotemeta( ) at http://www.php.net/quotemeta.




Copyright © 2003 O'Reilly & Associates. All rights reserved.