Mastering Regular Expressions Full Index -- use your browser's find function to search.
\? 139
\<...\> 21,
25, 50, 131-132, 150
\<...\>,
in egrep 15
\<...\>, in
Emacs 100
\<...\>, mimicking in
Perl 341-342
\+ 139
\(...\) 135
`\+' history 87
\0 116-117
\1 136, 300, 303
\1, in
Perl 41
\A 111, 127-128
\A, in
Java 373
\A,
optimization 246
\a 114-115
\b 65, 114-115, 400
\b, backspace and word
boundary 44, 46
\b, in
Perl 286
\b\B 240
\C 328
\D 49, 119
\d 49, 119
\d, in
Perl 288
\e 79, 114-115
\E 290
\f 114-115
\f, introduced 44
\G 128-131, 212, 315-316,
362
\G, advanced
example 130
\G, in
Java 373
\G,
in .NET 402
\G, optimization 246
\kname (see named capture)
\l
290
\L...\E
290
\L...\E, inhibiting 292
\n 49, 114-115
\n, introduced 44
\n, machine-dependency 114
\N{LATIN SMALL LETTER SHARP
S} 290
\N{name}
290
\N{name}, inhibiting 292
\p{...} 119
\p{^...}
288
\p{all}
380
\p{All}
123
\p{All}, in
Perl 288
\p{Any} 123
\p{Any}, in
Perl 288
\p{Arrows} 122
\p{Assigned} 123-124
\p{Assigned}, in
Perl 288
\p{Basic_Latin} 122
\p{Box_Drawing} 122
\p{C} 120
\p{Cc} 121
\p{Cf} 121
\p{Cherokee} 120
\p{Close_Punctuation}
121
\p{Cn} 121,
123-124, 380, 401
\p{Co} 121
\p{Connector_Punctuation}
121
\p{Control}
121
\p{Currency}
122
\p{Currency_Symbol}
121
\p{Cyrillic}
120, 122
\p{Dash_Punctuation}
121
\p{Decimal_Digit_Number}
121
\p{Dingbats}
122
\p{Enclosing_Mark}
121
\p{Final_Punctuation}
121
\p{Format}
121
\p{Gujarati}
120
\p{Han}
120
\p{Hangul_Jamo}
122
\p{Hebrew} 120,
122
\p{Hiragana}
120
\p{InArrows}
122
\p{InBasic_Latin}
122
\p{InBox_Drawing}
122
\p{InCurrency}
122
\p{InCyrillic}
122
\p{InDingbats}
122
\p{InHangul_Jamo}
122
\p{InHebrew}
122
\p{Inherited}
122
\p{Initial_Punctuation}
121
\p{InKatakana}
122
\p{InTamil}
122
\p{InTibetan}
122
\p{IsCherokee}
120
\p{IsCommon}
122
\p{IsCyrillic}
120
\p{IsGujarati}
120
\p{IsHan}
120
\p{IsHebrew}
120
\p{IsHiragana}
120
\p{IsKatakana}
120
\p{IsLatin}
120
\p{IsThai}
120
\p{IsTibetan}
122
\p{Katakana}
120, 122
\p{L}
119-120, 131, 380, 390
\p{L&} 120-121,
123
\p{L&}, in
Perl 288
\p{Latin} 120
\p{Letter} 120, 288
\p{Letter_Number}
121
\p{Line_Separator}
121
\p{Ll} 121,
400
\p{Lm} 121,
400
\p{Lo} 121,
400
\p{Lowercase_Letter}
121
\p{Lt} 121,
400
\p{Lu} 121,
400
\p{M} 120,
125
\p{Mark}
120
\p{Math_Symbol}
121
\p{Mc}
121
\p{Me}
121
\p{Mn}
121
\p{Modifier_Letter}
121
\p{Modifier_Symbol}
121
\p{N} 120,
390
\p{Nd} 121, 380,
400
\p{Nl}
121
\p{No}
121
\p{Non_Spacing_Mark}
121
\p{Number}
120
\p{Open_Punctuation}
121
\p{Other}
120
\p{Other_Letter}
121
\p{Other_Number}
121
\p{Other_Punctuation}
121
\p{Other_Symbol}
121
\p{P}
120
\p{Paragraph_Separator}
121
\p{Pc} 121,
400
\p{Pd}
121
\p{Pe}
121
\p{Pf} 121,
400
\p{Pi} 121,
400
\p{Po}
121
\p{Private_Use}
121
\p{Ps}
121
\p{Punctuation}
120
\p{S}
120
\p{Sc}
121-122
\p{Separator} 120
\p{Sk} 121
\p{Sm} 121
\p{So} 121
\p{Space_Separator}
121
\p{Spacing_Combining_Mark}
121
\p{Symbol}
120
\p{Tamil}
122
\p{Thai}
120
\p{Tibetan}
122
\p{Titlecase_Letter}
121
\p{Unassigned}
121, 123
\p{Unassigned}, in
Perl 288
\p{Uppercase_Letter}
121
\p{Z} 119-120,
380, 400
\p{Zl}
121
\p{Zp}
121
\p{Zs}
121
\Q...\E
290
\Q...\E, inhibiting 292
\Q...\E, in
Java 373
\r 49, 114-115
\r, machine-dependency 114
\s 49, 119
\s, introduction 47
\s, in
Emacs 127
\s,
in Perl 288
\S 49, 56, 119
\t 49, 114-115
\t, introduced 44
\u 116, 290, 400
\U 116
\U...\E 290
\U...\E, inhibiting 292
\V 364
\v 114-115, 364
\W 49, 119
\w 49, 65, 119
\w, in
Emacs 127
\w,
many different interpretations
93
\w, in
Perl 288
\x 116, 400
\x, in
Perl 286
\X 107, 125
\z 111, 127-128, 316
\z, in
Java 373
\z,
optimization 246
\Z 111, 127-128
\Z, in
Java 373
\Z,
optimization 246
// 322
/c 129-130, 315
/e 319-321
/g 61, 130, 307, 311-312,
315, 319
/g, introduced 51
/g, with regex
object 354
/i 134
/i, introduced 47
/i, with
study 359
/m 134
/o 352-353
/o, with regex
object 354
/osmosis 293
/s 134
/x 134, 288
/x, introduced 72
/x, history 90
-Dr 363
-i as -y 86
-y old grep
86
<>
54
<>, and
$_ 79
!~ 309
$_ 79, 308, 311, 314, 318,
322, 353-354, 359
$_, in
.NET 418
$& 299-300
$&, checking
for 358
$&, mimicking 302, 357
$&, naughty 356
$&, in
.NET 418
$&, okay for
debugging 331
$&, pre-match
copy 355
$$ in .NET 418
$* 362
$ 111-112, 128
$, escaping 77
$, optimization 246
$, Perl
interpolation 289
$+ 300-301, 345
$+, example 202
$+, .NET 202
$+, in
.NET 418
$/ 35, 78
$' 300
$', checking
for 358
$',
mimicking 357
$', naughty 356
$', in
.NET 418
$',
okay for debugging 331
$', pre-match
copy 355
$` 300
$`, checking
for 358
$`,
mimicking 357
$`, naughty 356
$`, in
.NET 418
$`,
okay for debugging 331
$`, pre-match
copy 355
$0 300
$1 135-136, 300, 303
$1, introduced 41
$1, in
Java 388
$1,
in .NET 418
$1, in other
languages 136
$1, pre-match
copy 355
$ARGV 79
$HostnameRegex 76, 136, 303,
351
$HttpUrl 303,
305, 345, 351
$LevelN 330, 343
$^N 300-301, 344-346
${name}
403
${name~}
418
$NestedStuffRegex 339,
346
$^R 302,
327
$^W 297
% Perl interpolation
289
(?!) 240, 333,
335, 340-341
(?#...) 99, 134,
414
(?#...), in
Java 373
(?:...) (see non-capturing parentheses)
(...) (see parentheses)
(?i) (see: case-insensitive mode; mode modifier)
(?i:...) (see mode-modified span)
(?if then|else) (see conditional)
(?m:...) (see mode-modified span)
(?m) (see: enhanced line-anchor mode; mode modifier)
(?n) 402
.*, introduced 55
.*, mechanics of
matching 152
.*, optimization 246
.*, warning
about 56
.NET 399-432
.NET, $+ 202
.NET, flavor
overview 91
.NET,
after-match data 136
.NET, benchmarking 236
.NET, JIT
404
.NET, line
anchors 128
.NET,
literal-text mode 135
.NET, MSIL 404
.NET, object
model 411
.NET, regex approach 96-97
.NET, regex
flavor 401
.NET, search-and-replace 408,
417-418
.NET, URL parsing
example 204
.NET,
version covered 91
.NET, word
boundaries 132
=~ 308-309, 318
=~, introduced 38
? (see question mark)
?...?
308
@+ 300, 302,
314
@"..."
102
@- 300, 302,
339
@ Perl
interpolation 289
[=...=] 126
[:<:] 92
[:...:]
125-126
[.....] 126
\p{...} in java.util.regex
380
^ 111-112,
128
^, optimization 245-246
^Subject: example 94,
151-152, 154, 242, 244-245, 289
^Subject:
example, in Java 95,
393
^Subject: example, in
Perl 55
^Subject:
example, in Perl
debugger 361
^Subject: example, in
Python 97
^Subject:
example, in VB.NET
96
{min,max}
20, 140
\0
116-117
$0
300
\1 136, 300,
303
\1, in
Perl 41
$1 135-136, 300, 303
$1, introduced 41
$1, in
Java 388
$1,
in .NET 418
$1, in other
languages 136
$1, pre-match
copy 355
8859-1
encoding 29, 87, 105, 107, 121
\A 111, 127-128
\A, in
Java 373
\A,
optimization 246
@ escaping 77
\a 114-115
issues overview encoding 105
after-match variables, in
Perl 299
after-match
variables, pre-match
copy 355
Aho,
Alfred 86, 180
\p{All} 123
\p{All}, in
Perl 288
\p{all} 380
all-in-one object model 369
alternation 138
alternation, and
backtracking 231
alternation, introduced 13-14
alternation, efficiency 222, 231
alternation, greedy 174-175
alternation, hand
tweaking 260-261
alternation, order
of 175-177, 223, 260
alternation, order of, for correctness 28, 189,
197
alternation, order of,
for efficiency 224
alternation, and
parentheses 13
analogy, backtracking, bread crumbs 158-159
analogy, backtracking, stacking dishes 159
analogy, ball
rolling 261
analogy,
building a car 31
analogy, charging
batteries 179
analogy,
engines 143-147
analogy, first come, first
served 153
analogy,
gas additive 150
analogy, learning regexes, Pascal 36
analogy, learning regexes, playing rummy 33
analogy, regex as a
language 5, 27
analogy, regex as filename
patterns 4
analogy, regex-directed match (see NFA)
analogy, text-directed match (see DFA)
analogy, transmission 148-149, 228
analogy, transparencies (Perl's
local) 298
anchor (also see: word boundaries; enhanced line-anchor mode)
anchor, overview 127
anchor, caret 127
anchor, dollar 127
anchor, end-of-line optimization 246
anchor, exposing 255
anchor, line 87, 111-112, 150
anchored(...)
362
anchored
`string' 362
AND class set operations
123-124
ANSI escape sequences
79
\p{Any}
123
\p{Any}, in
Perl 288
any character (see dot)
Apache, org.apache.xerces.utils.regex
372
Apache, ORO 392-398
Apache, ORO, benchmark results 376
Apache, ORO, comparative description 374
Apache, Regexp, comparative description 375
Apache, Regexp, speed 376
appendReplacement()
388
appendTail()
389
$ARGV
79
\p{Arrows}
122
ASCII encoding 29,
105-106, 114, 121
Asian character
encoding 29
AssemblyName 429
\p{Assigned} 123-124
\p{Assigned}, in
Perl 288
asterisk (see star)
atomic grouping (also see possessive quantifiers)
atomic grouping, introduced 137-138
atomic grouping, details 170-172
atomic grouping, for efficiency 171-172, 259, 268-270
atomic grouping, essence 170-171
atomic grouping, example 198, 201, 213, 271, 330, 340-341, 346
AT&T Bell
Labs 86
auto-lookaheadification 403
automatic possessification
251
awk, after-match
data 136
awk, gensub 183
awk, history
87
awk, search-and-replace 99
awk, version
covered 91
awk, word boundaries 132
\b 65, 114-115, 400
\b, backspace and word
boundary 44, 46
\b, in
Perl 286
<B>...</B>
165-167
<B>...</B>,
unrolling 270
\b\B 240
backreferences 117, 135
backreferences, introduced with
egrep 20-22
backreferences, DFA 150, 182-183
backreferences, vs. octal
escape 406-407
backreferences, remembering
text 21
backspace (see \b)
backtracking 163-177
backtracking, introduction 157-163
backtracking, and
alternation 231
backtracking, avoiding 171-172
backtracking, computing
count 227
backtracking, counting 222, 224
backtracking, detecting
excessive 249-250
backtracking, efficiency 179-180
backtracking, essence 168-169
backtracking, exponential
match 226
backtracking, global
view 228-232
backtracking, LIFO 159
backtracking, of
lookaround 173-174
backtracking, neverending
match 226
backtracking, non-match
example 160-161
backtracking, POSIX NFA
example 229
backtracking, saved
states 159
backtracking, simple
example 160
backtracking, simple lazy
example 161
balanced
constructs 328-331, 340-341, 430
balancing regex issues 186
Balling, Derek xxii
Barwise, J. 85
base
character 107, 125
Basic
Regular Expressions 87-88
\p{Basic_Latin} 122
\b\B 240
beginOffset 396
benchmarking 232-239
benchmarking, comparative 248, 376-377
benchmarking, compile
caching 351
benchmarking, in
Java 234-236, 375-377
benchmarking, for naughty
variables 358
benchmarking, in
.NET 236, 404
benchmarking, with neverending
match 227
benchmarking, in
Perl 360
benchmarking,
pre-match copy 356
benchmarking, in
Python 237
benchmarking, in
Ruby 238
benchmarking,
in Tcl 239
Bennett, Mike xxi
Berkeley 86
Better-Late-Than-Never 234-236,
375
<B>...</B>
165-167
<B>...</B>,
unrolling 270
blocks 122, 288, 380, 400
BLTN 235-236, 375
BOL 362
\p{Box_Drawing} 122
Boyer-Moore 244, 247
bracket expressions 125
BRE 87-88
bread-crumb analogy 158-159
Bulletin of Math. Biophysics
85
bump-along, introduction 148-149
bump-along, avoiding 210
bump-along, distrusting 215-218
bump-along, optimization 255
bump-along, in overall
processing 241
byte matching 328
/c 129-130, 315
/c, strings 102
\p{C} 120
\C 328
¢ 122
C
comments, matching
272-276
C comments, unrolling 275-276
caching (also see regex objects)
caching, benchmarking 351
caching, compile 242-244
caching, in Emacs 244
caching, integrated 242
caching, in Java 393
caching, in .NET 426
caching, object-oriented 244
caching, procedural 243
caching, in Tcl 244
caching, unconditional 350
CANON_EQ (Pattern
flag) 108, 380
Capture 431
CaptureCollection
432
car analogy
83-84
caret anchor introduced
8
carriage return
109
case, title 109
case folding 290, 292
case folding, inhibiting 292
CASE_INSENSITIVE (Pattern
flag) 95, 109, 380, 383
case-insensitive mode 109
case-insensitive mode, introduced 14-15
case-insensitive mode, egrep 14-15
case-insensitive mode, /i 47
case-insensitive mode, Ruby 109
case-insensitive mode, with
study 359
cast 294-295
\p{Cc} 121
\p{Cf} 121
character, base 125
character, classes 117
character, combining 107, 125, 288
character, combining, Inherited script
122
character, vs. combining
characters 107
character, control 116
character, initial character
discrimination 244-246, 249, 251-252, 257-259,
332, 361
character, machine-dependent
codes 114
character,
multiple code points
107
character, as opposed to
byte 29
character,
separating with split
322
character, shorthands 114-115
character class, introduced 9-10
character class, vs.
alternation 13
character
class, mechanics of
matching 149
character
class, negated, must match
character 11-12
character
class, negated, and
newline 118
character
class, negated, Tcl 111
character
class, positive
assertion 118
character
class, of POSIX bracket
expression 125
character
class, range 9,
118
character class, as separate
language 10
character
equivalent 126
CharacterIterator
372
charnames pragma
290
CharSequence
372, 390
CheckNaughtiness 358
\p{Cherokee} 120
Chinese text processing 29
chr 414
chunk limit, Java
ORO 395
chunk limit,
java.util.regex
391
chunk limit, Perl 323
class, vs. dot
118
class, elimination
optimization 249
class, initial class
discrimination 244-246, 249, 251-252, 257-259,
332, 361
class, and lazy
quantifiers 167
class,
set operations 123-125,
375
class, subtraction 124
Clemens, Sam 375
Click, Cliff xxii
client VM 234, 236
clock clicks 239
\p{Close_Punctuation}
121
closures 339
\p{Cn} 121, 123-124, 380,
401
\p{Co}
121
code point, introduced 106
code point, beyond
U+FFFF 108
code
point, multiple
107
code point, unassigned in
block 122
coerce 294-295
cold
VM 235
collating
sequences 126
combining
character 107, 125, 288
combining character, Inherited
script 122
com.ibm.regex, comparative
description 372
com.ibm.regex, speed 377
commafying a number example
64-65
commafying a number example, introduced 59
commafying a number example, in
Java 393
commafying a number
example, without
lookbehind 67
COMMAND.COM 7
comments 99, 134
comments, in
Java 98
comments,
matching of C comments
272-276
comments, matching of Pascal
comments 265
comments,
in .NET regex 414
COMMENTS (Pattern
flag) 99, 218, 378, 380, 386
comments and free-spacing mode
110
Communications of the
ACM 85
compile() 383
compile, caching 242-244
compile, once
(/o) 352-353
compile, on-demand 351
compile, regex
404-405
compile() (Pattern
factory) 383
Compiled (.NET) 236, 402,
404, 414, 421-422, 429
Compilers -- Principles,
Techniques, and Tools 180
CompileToAssembly 427,
429
com.stevesoft.pat, comparative description 374
com.stevesoft.pat, speed 377
conditional 138-139
conditional, with embedded
regex 327, 335
conditional, in
Java 373
conditional,
mimicking with lookaround
139
conditional, in
.NET 403
Config
module 290, 299
conflicting
metacharacters 44-46
\p{Connector_Punctuation}
121
Constable, Robert
85
context, forcing 310
context, metacharacters 44-46
context, regex use 189
continuation
lines 178, 186-187
continuation lines, unrolling 270
contorting an expression
294-295
\p{Control}
121
control characters
116
Conway, Damian
339
cooking for HTML 68,
408
correctness vs.
efficiency 223-224
www.cpan.org 358
CR 109, 382
Cruise,
Tom 51
crummy
analogy 158-159
CSV parsing
example, java.util.regex 218,
386
CSV parsing example, .NET 429
CSV
parsing example, ORO
397
CSV parsing example, Perl 212-219
CSV
parsing example, unrolling 271
currency, \p{Currency} 122
currency, \p{Currency_Symbol}
121
currency, \p{Sc} 121
currency, Unicode
block 121-122
\p{Currency} 122
\p{Currency_Symbol}
121
currentTimeMillis()
236
\p{Cyrillic}
120, 122
\d 49,
119
\d, in
Perl 288
\D 49, 119
Darth 197
dash in
character class 9
\p{Dash_Punctuation}
121
DBIx::DWIW
258
debugcolor
363
debugging
361-363
debugging, with embedded
code 331-332
debugging, regex
objects 305-306
debugging, run-time 362
\p{Decimal_Digit_Number}
121
default regex
308
define-key
100
delegate 417-418
delimited text 196-198
delimited text, standard
formula 196, 273
delimiter, with
shell 7
delimiter,
with substitution 319
Deterministic Finite Automaton (see DFA)
Devel::FindAmpersand
358
Devel::SawAmpersand
358
DFA, introduced 145, 155
DFA, acronym spelled
out 156
DFA, backreferences 150, 182-183
DFA, boring
157
DFA, compared with
NFA 224, 227
DFA,
efficiency 179
DFA, implementation
ease 182
DFA, lazy evaluation 181
DFA, longest-leftmost
match 177-179
DFA,
testing for 146-147
DFA, in theory, same as an
NFA 180
dialytika 108
\p{Dingbats} 122
dish-stacking analogy 159
dollar for Perl variable 37
dollar anchor 127
dollar anchor, introduced 8
dollar value example 24-25, 51-52,
167-170, 175, 194-195
DOS
7
dot 118
dot, introduced 11-12
dot, vs. character
class 118
dot, mechanics of matching 149
dot, Tcl
112
.NET 399-432
.NET, $+ 202
.NET, flavor
overview 91
.NET,
after-match data 136
.NET, benchmarking 236
.NET, JIT
404
.NET, line
anchors 128
.NET,
literal-text mode 135
.NET, MSIL 404
.NET, object
model 411
.NET, regex approach 96-97
.NET, regex
flavor 401
.NET, search-and-replace 408,
417-418
.NET, URL parsing
example 204
.NET,
version covered 91
.NET, word
boundaries 132
DOTALL (Pattern
flag) 380, 382
dot-matches-all mode 110-111
doubled-word example, description 1
doubled-word example, in
egrep 22
doubled-word
example, in Emacs
100
doubled-word example, in
Java 81
doubled-word
example, in Perl 35,
77-80
double-quoted string example, allowing escaped quotes 196
double-quoted string example, egrep 24
double-quoted string example, final
regex 263
double-quoted
string example, makudonarudo 165, 169,
228-232, 264
double-quoted string example, sobering example 222-228
double-quoted string example, unrolled 262, 268
double-word finder example, description 1
double-word finder example, in
egrep 22
double-word
finder example, in Emacs
100
double-word finder example, in
Java 81
double-word finder
example, in Perl 35,
77-80
-Dr
363
dragon book 180
DWIW (DBIx)
258
dynamic regex
327-331
dynamic regex, sanitizing 337
dynamic scope 295-299
dynamic scope, vs. lexical
scope 299
/e 319-321
\e 79, 114-115
\E 290
earliest match wins 148-149
EBCDIC 29
ECMAScript (.NET) 400, 402,
406-407, 415, 421
ed 85
efficiency, and backtracking 179-180
efficiency, correctness 223-224
efficiency, Perl-specific issues 347-363
efficiency, regex objects 353-354
efficiency, unlimited lookbehind 133
egrep, flavor
overview 91
egrep, introduced 6-8
egrep, metacharacter
discussion 8-22
egrep, after-match
data 136
egrep,
backreference support
150
egrep, case-insensitive
match 15
egrep,
doubled-word solution
22
egrep, example
use 14
egrep,
flavor summary 32
egrep, history 86-87
egrep, regex
implementation 182
egrep, version
covered 91
egrep, word
boundaries 132
electric
engine analogy 143-147
Emacs, flavor
overview 91
Emacs,
after-match data 136
Emacs, control
characters 116
Emacs,
re-search-forward
100
Emacs, search 100
Emacs, strings as
regexes 100
Emacs,
syntax class 127
Emacs, version
covered 91
Emacs,
word boundaries 132
email address example 70-73,
98
email address example, in
Java 98
email address
example, in VB.NET
99
embedded code, local 336
embedded code, my 338-339
embedded code, regex
construct 327, 331-335
embedded code, sanitizing 337
embedded string check optimization
247, 257
Embodiments of
Mind 85
Empty 426
\p{Enclosing_Mark}
121
encoding, introduced 29
encoding, issues overview 105
encoding, ASCII 29, 105-106, 114, 121
encoding, Latin-1 29, 87, 105, 107, 121
encoding, UCS-2 106
encoding, UCS-4 106
encoding, UTF-16 106
encoding, UTF-8 106
end() 385
END block 358
endOffset 396
end-of-string anchor optimization
246
engine, introduced 27
engine, analogy 143-147
engine, hybrid
183, 239, 243
engine, implementation
ease 182
engine, testing type 146-147
engine, testing type, with neverending match 227
engine, type
comparison 156-157, 180-182
English module 357
English vs. regex 275
enhanced line-anchor mode
111-112
enhanced line-anchor mode, introduced 69
ERE 87-88
errata xxi
Escape 427
escape, introduced 22
escape, term
defined 27
essence,
atomic grouping
170-171
essence, greediness,
laziness, and backtracking 168-169
essence, NFA (see backtracking)
eval
319
example, atomic
grouping 198, 201, 213, 271, 330, 340-341,
346
example, commafying a
number 64-65
example,
commafying a number, introduced 59
example, commafying a number, in Java 393
example, commafying a number, without lookbehind 67
example, CSV parsing, java.util.regex 218,
386
example, CSV parsing,
.NET 429
example, CSV parsing, ORO 397
example, CSV parsing, Perl 212-219
example, CSV parsing, unrolling 271
example, dollar
value 24-25, 51-52, 167-170, 175,
194-195
example, double-quoted
string, allowing escaped
quotes 196
example,
double-quoted string, egrep 24
example, double-quoted string, final regex 263
example, double-quoted string, makudonarudo 165, 169,
228-232, 264
example, double-quoted
string, sobering
example 222-228
example, double-quoted string, unrolled 262, 268
example, double-word finder, description 1
example, double-word finder, in egrep 22
example, double-word finder, in Emacs 100
example, double-word finder, in Java 81
example, double-word finder, in Perl 35, 77-80
example, email
address 70-73, 98
example, email address, in Java 98
example, email address, in VB.NET 99
example, filename 190-192
example, five
modifiers 316
example,
floating-point number
194
example, form
letter 50-51
example,
gr[ea]y 9
example, hostname 22, 73, 76, 98-99, 136-137,
203, 260, 267, 304, 306
example, hostname, egrep 25
example, hostname, Java 209
example, hostname, plucking from text 71-73,
205-208
example, hostname,
in a URL 74-77
example, hostname, validating 203-205
example, hostname, VB.NET 204
example, HTML, conversion from text 67-77
example, HTML, cooking 68, 408
example, HTML, encoding 408
example, HTML, <HR> 194
example, HTML, link 201-203
example, HTML, optional 139
example, HTML, paired
tags 165
example,
HTML, parsing 130, 315, 321
example, HTML, tag 9, 18-19, 26, 200-201, 326,
357
example, HTML, URL 74-77, 203, 205-208,
303
example, HTML, URL-encoding 320
example, IP 5,
187-189, 267, 311, 314, 348-349
example, Jeffs 61-64
example, lookahead 61-64
example, mail
processing 53-59
example, makudonarudo 165, 169,
228-232, 264
example, pathname 190-192
example, population 59
example, possessive
quantifiers 198, 201
example, postal
code 208-212
example,
regex overloading
341-345
example, stock
pricing 51-52, 167-168
example, stock pricing, with alternation 175
example, stock pricing, with atomic grouping 170
example, stock pricing, with possessive quantifier
169
example, temperature
conversion, in .NET
419
example, temperature
conversion, in Java
389
example, temperature
conversion, in Perl
37
example, temperature
conversion, Perl
one-liner 283
example,
text-to-HTML 67-77
example, this|that 132, 138, 243,
245-246, 252, 255, 260-261
example, unrolling the loop 270-271
example, URL
74-77, 201-204, 208, 260, 303-304, 306, 320
example,
URL, egrep 25
example, URL, Java 209
example, URL, plucking 205-208
example, username 73, 76, 98
example, username, plucking from text 71-73
example, username, in
a URL 74-77
example,
variable names 24
example, ZIP
code 208-212
exception, IllegalArgumentException
383, 388
exception, IllegalStateException
385
exception, IndexOutOfBoundsException
384-385, 388
exception, IOException 81
exception, NullPointerException
396
exception, PatternSyntaxException 381,
383
Explicit
(Option) 409
ExplicitCapture (.NET) 402,
414, 421
exponential match
222-228, 330, 340
exponential match, avoiding 264-265
exponential match, discovery 226-228
exponential match, explanation 226-228
exponential match, non-determinism 264
exponential match, short-circuiting 250
exponential match, solving with atomic
grouping 268
exponential
match, solving with possessive
quantifiers 268
expose
literal text 255
expression, context 294-295
expression, contorting 294-295
Extended Regular Expressions
87-88
\f
114-115
\f, introduced 44
Fahrenheit (see temperature conversion example)
failure, atomic
grouping 171-172
failure, forcing 240, 333, 335,
340-341
FF 109
file globs 4
file-check example 2, 36
filename, example 190-192
filename, patterns
(globs) 4
filename, prepending to
line 79
\p{Final_Punctuation}
121
find()
384
FindAmpersand
358
five modifiers example
316
Flanagan, David
xxii
flavor, Perl 286-293
flavor, superficial chart, general 91
flavor, superficial chart, Perl 285, 287
flavor, superficial chart, POSIX 88
flavor, term
defined 27
flex version covered
91
floating
`string' 362
floating-point number example
194
forcing failure 240, 333,
335, 340-341
foreach vs. while vs.
if 320
form letter
example 50-51
\p{Format} 121
freeflowing regex 277-281
Friedl, Alfred 176
Friedl, brothers 33
Friedl, Fumie xxi
Friedl, Fumie, birthday 11-12
Friedl, Liz 33
Friedl, Stephen xxii
fully qualified name 295
functions related to regexes in Perl
285
\G 128-131, 212,
315-316, 362
\G, advanced
example 130
\G, in
Java 373
\G,
in .NET 402
\G, optimization 246
/g 61, 130, 307, 311-312,
315, 319
/g, introduced 51
/g, with regex
object 354
garbage
collection Java benchmarking 236
gas engine analogy 143-147
gensub 183
George, Kit xxii
GetGroupNames
421-422
GetGroupNumbers
421-422
getMatch()
397
global vs. private Perl
variables 295
globs filename
4
GNU Java packages
374
GNU awk, after-match
data 136
GNU awk,
gensub 183
GNU awk, version
covered 91
GNU awk,
word boundaries 132
GNU egrep, after-match
data 136
GNU
egrep, backreference
support 150
GNU
egrep, doubled-word
solution 22
GNU
egrep, -i
bug 21
GNU
egrep, regex
implementation 182
GNU
egrep, word
boundaries 132
GNU
GNU Emacs (see Emacs)
GNU grep, shortest-leftmost
match 183
GNU
grep, version
covered 91
GNU sed,
after-match data 136
GNU sed, version
covered 91
GNU sed,
word boundaries 132
gnu.regexp, comparative
description 374
gnu.regexp, speed 377
gnu.rex 374
Goldberger, Ray xxii
Gosling, James 89
GPOS 362
gr[ea]y example 9
greedy, introduced 151
greedy, alternation 174-175
greedy, and backtracking 162-177
greedy, deference to an overall match 153, 274
greedy, essence 159, 168-169
greedy, favors match 167-168
greedy, first come, first served 153
greedy, global vs. local 182
greedy, in Java 373
greedy, vs. lazy 169, 256-257
greedy, localizing 225-226
greedy, quantifier 139-140
greedy, too greedy 152
green
dragon 180
grep, flavor
overview 91
grep, as an
acronym 85
grep, history 86
grep, regex
flavor 86
grep,
version covered 91
grep, -y
option 86
grep in Perl 324
group(), java.util.regex 385
group(), ORO 396
Group object (.NET)
412
Group object (.NET), Capture 431
Group object (.NET), creating 423
Group object (.NET), Index 424
Group object (.NET), Length 424
Group object (.NET), Success 424
Group object (.NET), ToString 424
Group object (.NET), using 424
Group object (.NET), Value 424
GroupCollection 423,
432
groupCount()
385
grouping and capturing
20-22
GroupNameFromNumber
421-422
GroupNumberFromName
421-422
groups() ORO
397
Groups
Match object method 423
\p{Gujarati} 120
Gutierrez, David xxii
\p{Han} 120
hand tweaking, alternation 260-261
hand tweaking, caveats 253
\p{Hangul_Jamo} 122
HASH(0x80f60ac) 257
\p{Hebrew} 120, 122
hex escape 116-117
hex escape, in
Java 373
hex escape,
in Perl 286
Hietaniemi, Jarkko xxii
highlighting with ANSI escape
sequences 79
\p{Hiragana} 120
history, `\+' 87
history, AT&T Bell
Labs 86
history, awk 87
history, Berkeley 86
history, ed
trivia 86
history,
egrep 86-87
history, grep 86
history, lex 87
history, Perl
88-90, 308
history, of
regexes 85-91
history,
sed 87
history, underscore in
\w 89
history, /x 90
hostname example 22, 73, 76, 98-99,
136-137, 203, 260, 267, 304, 306
hostname example,
egrep 25
hostname example, Java 209
hostname
example, plucking from
text 71-73, 205-208
hostname
example, in a URL
74-77
hostname example, validating 203-205
hostname example, VB.NET 204
$HostnameRegex 76, 136, 303,
351
hot VM 235, 375
HTML, cooking
68, 408
HTML, matching
tag 200-201
HTML
example, conversion from
text 67-77
HTML
example, cooking 68,
408
HTML example, encoding 408
HTML
example, <HR> 194
HTML example, link 201-203
HTML
example, optional
139
HTML example, paired
tags 165
HTML example,
parsing 130, 315, 321
HTML example, tag 9, 18-19, 26, 200-201, 326,
357
HTML example, URL 74-77, 203, 205-208, 303
HTML example, URL-encoding 320
HTTP newlines 115
HTTP URL example 25, 74-77, 201-209,
260, 303-304, 306, 320
http://regex.info/ xxi, 7,
345, 372
$HttpUrl
303, 305, 345, 351
hybrid regex
engine 183, 239, 243
hyphen
in character class 9
-i as -y 86
/i 134
/i, introduced 47
/i, with
study 359
(?i) (see: case-insensitive mode; mode modifier)
IBM (Java package), comparative description 372
IBM (Java package), speed 377
identifier matching 24
if vs. while vs.
foreach 320
IgnoreCase
(.NET) 96, 99, 402, 413, 421
IgnorePatternWhitespace
(.NET) 99, 402, 413, 421
IllegalArgumentException 383,
388
IllegalStateException
385
implementation of engine
182
implicit
362
implicit anchor
optimization 246
Imports 407, 409,
428
\p{InArrows}
122
\p{InBasic_Latin}
122
\p{InBox_Drawing}
122
\p{InCurrency}
122
\p{InCyrillic}
122
Index, Group
object method 424
Index, Match object
method 423
IndexOutOfBoundsException
384-385, 388
\p{InDingbats} 122
indispensable TiVo 3
\p{InHangul_Jamo}
122
\p{InHebrew}
122
\p{Inherited}
122
initial class
discrimination 244-246, 249, 251-252, 257-259,
332, 361
\p{Initial_Punctuation}
121
\p{InKatakana}
122
\p{InTamil}
122
integrated handling
94-95
integrated handling, compile
caching 242
interpolation 288-289
interpolation, introduced 77
interpolation, caching 351
interpolation, mimicking 321
interpolation, in
PHP 103
INTERSECTION class set operations
124
interval 140
interval, introduced 20
interval, [X{0,0}] 140
\p{InTibetan} 122
IOException 81
IP example 5, 187-189, 267, 311, 314,
348-349
Iraq 11
Is vs. In 120,
122-123
Is vs. In, with java.util.regex
380
Is vs. In, in
.NET 401
Is vs.
In, in Perl
288
\p{IsCherokee}
120
\p{IsCommon}
122
\p{IsCyrillic}
120
\p{IsGujarati}
120
\p{IsHan}
120
\p{IsHebrew}
120
\p{IsHiragana}
120
\p{IsKatakana}
120
\p{IsLatin}
120
IsMatch (Regex object
method) 415
ISO-8859-1
encoding 29, 87, 105, 107, 121
\p{IsThai} 120
\p{IsTibetan} 122
Japanese, text
processing 29
“japhy” 246
Java 365-398
Java, benchmarking 234-236
Java, BLTN
235-236, 375
Java, choosing a regex
package 366
Java,
exposed mechanics 374
Java, fastest
package 377
Java,
JIT 235
Java, list of
packages 372
Java,
matching comments
272-276
Java, object
models 368-372
Java,
package flavor comparison
373
Java, “Perl5
flavors” 375
Java,
strings 102
Java, version
covered 91
Java, VM 234-236, 375
java.util.regex 95-96,
378-391
java.util.regex, after-match data 136
java.util.regex, code
example 383, 389
java.util.regex, comparative
description 372
java.util.regex, CSV
parsing 386
java.util.regex, dot
modes 111
java.util.regex, doubled-word
example 81
java.util.regex, line
anchors 128
java.util.regex, line
terminators 382
java.util.regex, match
modes 380
java.util.regex, object
model 381
java.util.regex, regex
flavor 378-381
java.util.regex, search-and-replace 387
java.util.regex, speed 377
java.util.regex, split 390
java.util.regex, URL parsing
example 209
java.util.regex, version
covered 91
java.util.regex, word
boundaries 132
Jeffs
example 61-64
JfriedlsRegexLibrary 428-429
JIT, Java
235
JIT, .NET 404
JRE 234
jregex comparative
description 374
\p{Katakana} 120,
122
keeping in sync
210-211
Keisler, H. J.
85
Kleene, Stephen
85
The Kleene
Symposium 85
Korean text
processing 29
Kunen,
K. 85
\p{L&} 120-121,
123
\p{L&}, in
Perl 288
\p{L} 119-120, 131, 380,
390
£ 122
\l 290
language, character class 10, 13
language, identifiers 24
\p{Latin} 120
Latin-1 encoding 29, 87, 105, 107,
121
lazy 166-167
lazy, essence
159, 168-169
lazy, favors
match 167-168
lazy,
vs. greedy 169,
256-257
lazy, in
Java 373
lazy, optimization 249, 256
lazy, quantifier 140
lazy evaluation 181, 355
\L...\E 290
\L...\E, inhibiting 292
lc 290
lcfirst 290
leftmost match 177-179
Length, Group object
method 424
Length, Match object
method 423
length() ORO 396
length-cognizance optimization 245,
247
\p{Letter} 120,
288
\p{Letter_Number}
121
$LevelN 330,
343
lex 86
lex, $ 111
lex, dot 110
lex, history 87
lex, and trailing
context 182
lexer building 130, 315
lexical scope 299
LF 109, 382
Li,
Yadong xxii
LIFO
backtracking 159
limit, backtracking 237
limit, recursion 249-250
line (also see string)
line, anchor optimization 246
line, vs. string 55
line
anchor 111-112
line
anchor, mechanics of
matching 150
line
anchor, variety of
implementations 87
line
feed 109
LINE
SEPARATOR 109, 121, 382
line
terminators 108-109, 111, 128, 382
line terminators, with $ and
^ 111
\p{Line_Separator}
121
link, matching 201
link, matching, Java 204, 209
list context 294, 310-311
list context, forcing 310
literal string initial string
discrimination 244-246, 249, 251-252, 257-259, 332, 361
literal text, introduced 5
literal text, exposing 255
literal text, mechanics of
matching 149
literal
text, pre-check
optimization 244-246, 249, 251-252, 257-259,
332, 361
literal-text mode
112, 134-135, 290
literal-text mode, inhibiting 292
\p{Ll} 121, 400
\p{Lm} 121, 400
\p{Lo} 121, 400
local 296, 341
local, in embedded
code 336
local, vs.
my 297
locale 126
locale, overview 87
locale, \w 119
localizing 296-297
localtime 294, 319,
351
locking in regex literal
352
“A logical calculus of the ideas immanent in nervous
activity” 85
longest
match finding 334-335
longest-leftmost match 148,
177-179
lookahead
132
lookahead, introduced 60
lookahead, auto 403
lookahead, example 61-64
lookahead, in
Java 373
lookahead,
mimic atomic grouping
174
lookahead, mimic
optimizations 258-259
lookahead, negated, <B>...</B>
167
lookahead, positive vs.
negative 66
lookaround, introduced 59
lookaround, backtracking 173-174
lookaround, in
conditional 139
lookaround, and
DFAs 182
lookaround,
doesn't consume text
60
lookaround, mimicking class set
operations 124
lookaround, mimicking word
boundaries 132
lookaround, in
Perl 288
lookbehind 132
lookbehind, in
Java 373
lookbehind,
in .NET 402
lookbehind, in
Perl 288
lookbehind,
positive vs. negative
66
lookbehind, unlimited 402
lookingAt() 385
Lord, Tom 182
\p{Lowercase_Letter}
121
LS 109, 121, 382
\p{Lt} 121, 400
\p{Lu} 121, 400
Lunde, Ken xxii, 29
\p{M} 120, 125
/m 134
m/.../ introduced
38
machine-dependent character
codes 114
MacOS 114
mail
processing example 53-59
makudonarudo example 165,
169, 228-232, 264
\p{Mark} 120
match 306-318
match, actions
95
match, context 294-295, 309
match, context, list 294, 310-311
match, context, scalar 294, 310, 312-316
match, DFA vs.
NFA 224
match, efficiency 179
match, example with
backtracking 160
match, example without
backtracking 160
match, lazy
example 161
match,
leftmost-longest 335
match, longest
334-335
match, m/.../, introduced 38
match, mechanics (also see: greedy; lazy)
match, mechanics, .* 152
match, mechanics, greedy introduced 151
match, mechanics, anchors 150
match, mechanics, capturing parentheses 149
match, mechanics, character classes and dot 149
match, mechanics, consequences 156
match, mechanics, literal text 149
match, modes
109-112
match, modes, java.util.regex
380
match, negating 309
match, neverending 222-228, 330,
340
match, neverending, avoiding 264-265
match, neverending, discovery 226-228
match, neverending, explanation 226-228
match, neverending, non-determinism 264
match, neverending, short-circuiting 250
match, neverending, solving with atomic grouping
268
match, neverending, solving with possessive quantifiers
268
match, NFA vs.
DFA 156-157, 180-182
match, position (see pos)
match, POSIX, in
Perl 335
match, shortest-leftmost 183
match, side
effects 317
match,
side effects, intertwined 43
match, side effects, Perl 40
match, speed
181
match, in a
string 27
match, tag-team 130
match, viewing
mechanics 331-332
Match Empty
426
match()
393
Match (.NET)
Success 96
Match object
(.NET) 411
Match
object (.NET), Capture 431
Match object (.NET), creating 415, 423
Match object (.NET), Groups 423
Match object (.NET), Index 423
Match object (.NET), Length 423
Match object (.NET), NextMatch 423
Match object (.NET), Result 423
Match object (.NET), Success 421
Match object (.NET), Synchronized 424
Match object (.NET), ToString 422
Match object (.NET), using 421
Match object (.NET), Value 422
Match (Regex object
method) 415
“match rejected
by optimizer” 363
match
result object model 371
match
state object model 370
MatchCollection 416
matcher() (Pattern
method) 384
Matcher
object 384
Matcher
object, reusing
387
matches, unexpected 194-195
matches, viewing
all 332
matches()
(Pattern method) 384, 390
Matches (Regex object
method) 416
MatchEvaluator
417-418
matching, delimited
text 196-198
matching,
HTML tag 200
matching, longest-leftmost 177-179
MatchObject object (.NET)
creating 416
\p{Math_Symbol} 121
Maton, William xxii, 36
MBOL 362
\p{Mc} 121
McCloskey, Mike xxii
McCulloch, Warren 85
\p{Me} 121
mechanics viewing 331-332
metacharacter, introduced 5
metacharacter, conflicting 44-46
metacharacter, differing
contexts 10
metacharacter, first-class 87, 92
metacharacter, vs.
metasequence 27
metasequence defined 27
mimic, $` 357
mimic, $' 357
mimic, $& 302, 357
mimic, atomic
grouping 174
mimic,
class set operations
124
mimic, conditional with
lookaround 139
mimic,
initial-character discrimination
optimization 258-259
mimic, named
capture 344-345
mimic,
POSIX matching 335
mimic, possessive
quantifiers 343-344
mimic, variable
interpolation 321
mimic, word
boundaries 66, 132, 341-342
minlen length
362
minus in character class
9
MSIL .NET 404
\p{Mn} 121
mode modifier 109, 133-135
mode-modified span 109, 134
modes introduced with egrep
14-15
\p{Modifier_Letter}
121
modifiers, combining 69
modifiers, example with five 316
modifiers, /g 51
modifiers, /i 47
modifiers, “locking in” 304-305
modifiers, notation 98
modifiers, /osmosis 293
modifiers, Perl core 292-293
modifiers, with regex object 304-305
\p{Modifier_Symbol}
121
Mui, Linda xxii
multi-character quotes
165-166
Multiline
(.NET) 402, 413-414, 421
MULTILINE (Pattern
flag) 81, 380, 382
multiple-byte character encoding
29
MungeRegexLiteral
342-344, 346
my, binding 339
my, in embedded
code 338-339
my, vs.
local 297
MySQL, after-match
data 136
MySQL, DBIx::DWIW 258
MySQL, version
covered 91
MySQL,
word boundaries 132
\n 49, 114-115
\n, introduced 44
\n, machine-dependency 114
\p{N} 120, 390
(?n) 402
$^N 300-301, 344-346
named capture 137
named capture, mimicking 344-345
named capture, .NET 402
named
capture, with unnamed
capture 403
naughty
variables 356
naughty
variables, okay for
debugging 331
\p{Nd} 121, 380, 400
negated class, introduced 10-11
negated class, and lazy
quantifiers 167
negated
class, Tcl 111
negative lookahead (see lookahead, negative)
negative lookbehind (see lookbehind, negative)
NEL
109, 382, 400
nervous system
85
nested constructs, .NET 430
nested
constructs, Perl
328-331, 340-341
$NestedStuffRegex 339,
346
.NET 399-432
.NET, $+ 202
.NET, flavor
overview 91
.NET,
after-match data 136
.NET, benchmarking 236
.NET, JIT
404
.NET, line
anchors 128
.NET,
literal-text mode 135
.NET, MSIL
404
.NET, object
model 411
.NET, regex approach 96-97
.NET, regex
flavor 401
.NET, search-and-replace 408,
417-418
.NET, URL parsing
example 204
.NET,
version covered 91
.NET, word
boundaries 132
neurophysiologists early regex study
85
neverending match 222-228,
330, 340
neverending match, avoiding 264-265
neverending match, discovery 226-228
neverending match, explanation 226-228
neverending match, non-determinism 264
neverending match, short-circuiting 250
neverending match, solving with atomic
grouping 268
neverending
match, solving with possessive
quantifiers 268
New
Regex 96, 99, 410, 415
newline and HTTP 115
NEXT LINE 109, 382, 400
NextMatch (Match object
method) 423
NFA, first introduced 145
NFA, introduction 153
NFA, acronym spelled
out 156
NFA, and alternation 174-175
NFA, compared with
DFA 156-157, 180-182
NFA, control
benefits 155
NFA,
efficiency 179
NFA, essence (see backtracking)
NFA, freeflowing
regex 277-281
NFA,
and greediness 162
NFA, implementation
ease 182
NFA, nondeterminism 265
NFA, nondeterminism, checkpoint 264
NFA, POSIX
efficiency 179
NFA,
testing for 146-147
NFA, theory
180
Nicholas, Ethan
xxii
\p{Nl}
121
\N{LATIN SMALL LETTER SHARP
S} 290
\N{name}
290
\N{name}, inhibiting 292
\p{No} 121
no re 'debug' 361
no_match_vars 357
nomenclature 27
non-capturing parentheses 45, 136-137,
373
non-capturing parentheses (also see parentheses)
Nondeterministic Finite Automaton (see NFA)
None (.NET) 415, 421
nonillion 226
nonregular sets 180
\p{Non_Spacing_Mark}
121
“normal” 262-266
null 116
null, with dot
118
NullPointerException
396
\p{Number}
120
/o
352-353
/o, with regex
object 354
Obfuscated Perl
Contest 320
object
model, Java
368-372
object model, .NET 410-411
Object Oriented Perl
339
object-oriented handling
95-97
object-oriented handling, compile caching 244
octal escape 115, 117
octal escape, vs.
backreference 406-407
octal
escape, in Java
373
octal escape, in
Perl 286
on-demand
recompilation 351
oneself example 332,
334
\p{Open_Punctuation}
121
operators Perl list
285
optimization
239-252
optimization, automatic
possessification 251
optimization, BLTN 235-236, 375
optimization, with
bump-along 255
optimization, end-of-string
anchor 246
optimization, excessive
backtrack 249-250
optimization, hand
tweaking 252-261
optimization, implicit line
anchor 191
optimization, initial character
discrimination 244-246, 249, 251-252, 257-259,
332, 361
optimization, JIT 235, 404
optimization, lazy
evaluation 181
optimization, lazy
quantifier 249, 256
optimization, leading [.*] 246
optimization, literal-string
concatenation 247
optimization, need
cognizance 252
optimization, needless class
elimination 249
optimization, needless
parentheses 248
optimization, pre-check of required
character 244-246, 249, 251-252, 257-259, 332,
361
optimization, simple
repetition, discussed
247-248
optimization, small
quantifier equivalence 251-252
optimization, state
suppression 250-251
optimization, string/line
anchors 149, 181
optimization, super-linear
short-circuiting 250
Option (.NET) 409
Option (.NET), whitespace 18
Options (Regex object
method) 421
OR class set operations
123-124
Oram, Andy xxii,
5
ordered alternation
175-177
ordered alternation, pitfalls 176
org.apache.oro.text.regex
392-398
org.apache.oro.text.regex, benchmark results 376
org.apache.oro.text.regex, comparative description 374
org.apache.regexp, comparative
description 375
org.apache.regexp, speed 376
org.apache.xerces.utils.regex
372
ORO 392-398
ORO, benchmark
results 376
ORO, comparative description 374
osmosis 293
/osmosis 293
\p{Other} 120
\p{Other_Letter} 121
\p{Other_Number} 121
\p{Other_Punctuation}
121
\p{Other_Symbol}
121
our 295,
336
overload pragma
342
\p{...}
119
\p{P}
120
\p{^...}
288
\p{All}
123
\p{All}, in
Perl 288
\p{all} 380
panic: top_env 332
\p{Any} 123
\p{Any}, in
Perl 288
Papen,
Jeffrey xxii
PARAGRAPH
SEPARATOR 109, 121, 382
\p{Paragraph_Separator}
121
parentheses, as
\(...\) 86
parentheses, and
alternation 13
parentheses, balanced 328-331, 340-341,
430
parentheses, balanced,
difficulty 193-194
parentheses, capturing 135-136, 300
parentheses, capturing, introduced with egrep
20-22
parentheses, capturing,
and DFAs 150, 182
parentheses, capturing, mechanics 149
parentheses, capturing, in Perl 41
parentheses, capturing
only 152
parentheses,
counting 21
parentheses, elimination
optimization 248
parentheses, grouping-only (see non-capturing parentheses)
parentheses, limiting scope 18
parentheses, named
capture 137, 344-345, 402-403
parentheses, nested 328-331, 340-341, 430
parentheses, non-capturing 45, 136-137
parentheses, non-capturing, in Java 373
parentheses, non-participating 300
parentheses, with split, Java ORO 395
parentheses, with split, .NET 403, 420
parentheses, with split, Perl 326
\p{Arrows} 122
parsing regex 404
participate in match 139
Pascal 36, 59, 182
Pascal, matching comments
of 265
\p{Assigned} 123-124
\p{Assigned}, in
Perl 288
Pat (Java
Package), comparative
description 374
Pat (Java
Package), speed
377
patch 88
pathname example 190-192
Pattern, CANON_EQ 108, 380
Pattern, CASE_INSENSITIVE 95, 109,
380, 383
Pattern, COMMENTS 99, 218, 378, 380,
386
Pattern, compile() 383
Pattern, DOTALL 380, 382
Pattern, matcher() 384
Pattern, matches() 384, 390
Pattern, MULTILINE 81, 380,
382
Pattern, UNICODE_CASE 380,
383
Pattern, UNIX_LINES 380, 382
PatternSyntaxException 381,
383
\p{Basic_Latin}
122
\p{Box_Drawing}
122
\p{Pc} 121,
400
\p{C}
120
\p{Cc}
121
\p{Cf}
121
\p{Cherokee}
120
\p{Close_Punctuation}
121
\p{Cn} 121,
123-124, 380, 401
\p{Co} 121
\p{Connector_Punctuation}
121
\p{Control}
121
PCRE, lookbehind 132
PCRE, version
covered 91
\p{Currency} 122
\p{Currency_Symbol}
121
\p{Cyrillic}
120, 122
\p{Pd}
121
\p{Dash_Punctuation}
121
\p{Decimal_Digit_Number}
121
\p{Dingbats}
122
\p{Pe}
121
PeakWebhosting.com
xxii
\p{Enclosing_Mark}
121
people, Aho,
Alfred 86, 180
people,
Balling, Derek xxii
people, Barwise,
J. 85
people, Bennett, Mike xxi
people, Clemens,
Sam 375
people, Click, Cliff xxii
people, Constable,
Robert 85
people,
Conway, Damian 339
people, Cruise,
Tom 51
people, Flanagan, David xxii
people, Friedl,
Alfred 176
people,
Friedl, brothers 33
people, Friedl,
Fumie xxi
people,
Friedl, Fumie, birthday 11-12
people, Friedl,
Liz 33
people, Friedl, Stephen xxii
people, George,
Kit xxii
people, Goldberger, Ray xxii
people, Gosling,
James 89
people, Gutierrez, David xxii
people, Hietaniemi,
Jarkko xxii
people,
Keisler, H. J. 85
people, Kleene,
Stephen 85
people,
Kunen, K. 85
people, Li,
Yadong xxii
people,
Lord, Tom 182
people, Lunde,
Ken xxii, 29
people,
Maton, William xxii,
36
people, McCloskey,
Mike xxii
people,
McCulloch, Warren 85
people, Mui,
Linda xxii
people,
Nicholas, Ethan xxii
people, Oram,
Andy xxii, 5
people,
Papen, Jeffrey xxii
people, Perl
Porters 90
people,
Pinyan, Jeff 246
people, Pitts,
Walter 85
people,
Purcell, Shawn xxii
people, Reed,
Jessamyn xxii
people,
Reinhold, Mark xxii
people, Rudkin,
Kristine xxii
people,
Savarese, Daniel xxii
people, Sethi,
Ravi 180
people, Spencer, Henry 88, 182-183,
243
people, Thompson,
Ken 85-86, 110
people,
Trapszo, Kasia xxii
people, Tubby
264
people, Ullman,
Jeffrey 180
people,
Wall, Larry 88-90, 138,
363-364
people, Wilson,
Dean xxii
people,
Woodward, Josh xxii
people, Zawodny,
Jeremy xxii, 258
Perl,
$/ 35
Perl, flavor
overview 91, 287
Perl,
introduction 37-38
Perl, context (also see match, context)
Perl, context, contorting 294
Perl, efficiency 347-363
Perl, greatest
weakness 286
Perl,
history 88-90, 308
Perl, in Java
375, 392
Perl, line
anchors 128
Perl,
modifiers 292-293
Perl, motto
348
Perl, option, -0 36
Perl, option, -c 361
Perl, option, -Dr 363
Perl, option, -e 36, 53, 361
Perl, option, -i 53
Perl, option, -M 361
Perl, option, -Mre=debug 363
Perl, option, -n 36
Perl, option, -p 53
Perl, option, -w 38, 296, 326,
361
Perl, regex
operators 285
Perl,
version covered 91
Perl, warnings
38
Perl, warnings, ($^W variable) 297
Perl, warnings, use warnings 326,
363
Perl Porters 90
Perl5Util 392, 396
perladmin 299
\p{Pf} 121, 400
\p{Final_Punctuation}
121
\p{Format}
121
\p{Gujarati}
120
\p{Han}
120
\p{Hangul_Jamo}
122
\p{Hebrew} 120,
122
\p{Hiragana}
120
PHP, after-match
data 136
PHP, line anchors 128
PHP, lookbehind 132
PHP, mode
modifiers 133
PHP,
strings 103
PHP, version
covered 91
PHP, word boundaries 132
\p{Pi} 121, 400
\p{InArrows} 122
\p{InBasic_Latin}
122
\p{InBox_Drawing}
122
\p{InCurrency}
122
\p{InCyrillic}
122
\p{InDingbats}
122
\p{InHangul_Jamo}
122
\p{InHebrew}
122
\p{Inherited}
122
\p{Initial_Punctuation}
121
\p{InKatakana}
122
\p{InTamil}
122
\p{InTibetan}
122
Pinyan, Jeff 246
\p{IsCherokee} 120
\p{IsCommon} 122
\p{IsCyrillic} 120
\p{IsGujarati} 120
\p{IsHan} 120
\p{IsHebrew} 120
\p{IsHiragana} 120
\p{IsKatakana} 120
\p{IsLatin} 120
\p{IsThai} 120
\p{IsTibetan} 122
Pitts, Walter 85
\p{Katakana} 120,
122
\p{L} 119-120,
131, 380, 390
\p{L&} 120-121,
123
\p{L&}, in
Perl 288
\p{Latin} 120
\p{Letter} 120, 288
\p{Letter_Number}
121
\p{Line_Separator}
121
\p{Ll} 121,
400
\p{Lm} 121,
400
\p{Lo} 121,
400
\p{Lowercase_Letter}
121
\p{Lt} 121,
400
\p{Lu} 121,
400
plus, as
\+ 139
plus,
introduced 18-20
plus, backtracking 162
plus, greedy
139
plus, lazy 140
plus, possessive 140
\p{M} 120, 125
\p{Mark} 120
\p{Math_Symbol} 121
\p{Mc} 121
\p{Me} 121
\p{Mn} 121
\p{Modifier_Letter}
121
\p{Modifier_Symbol}
121
\p{N} 120,
390
\p{Nd} 121, 380,
400
\p{Nl}
121
\p{No}
121
\p{Non_Spacing_Mark}
121
\p{Number}
120
\p{Po}
121
\p{Open_Punctuation}
121
population example
59
pos 128-131,
313-314, 316
pos (also see \G)
positive lookahead (see lookahead, positive)
positive lookbehind (see lookbehind, positive)
POSIX, [:...:]
125
POSIX, [.....]
126
POSIX, Basic Regular
Expressions 87-88
POSIX, bracket
expressions 125
POSIX,
character class 125
POSIX, character class and
locale 126
POSIX,
character equivalent
126
POSIX, collating
sequences 126
POSIX,
dot 118
POSIX, empty
alternatives 138
POSIX, Extended Regular
Expressions 87-88
POSIX, superficial flavor
chart 88
POSIX, in Java 374
POSIX, locale
126
POSIX, locale, overview 87
POSIX, longest-leftmost
rule 177-179, 335
POSIX
NFA, backtracking
example 229
POSIX NFA,
testing for 146-147
possessive quantifiers 140,
172-173
possessive quantifiers, automatic 251
possessive quantifiers, for
efficiency 259, 268-270
possessive quantifiers, example 198, 201
possessive quantifiers, mimicking 343-344
possessive quantifiers, optimization 250-251
postal code example 208-212
postMatch() 397
\p{Other} 120
\p{Other_Letter} 121
\p{Other_Number} 121
\p{Other_Punctuation}
121
\p{Other_Symbol}
121
£ 122
\p{P} 120
\p{Paragraph_Separator}
121
\p{Pc} 121,
400
\p{Pd}
121
\p{Pe}
121
\p{Pf} 121,
400
\p{Pi} 121,
400
\p{Po}
121
\p{Private_Use}
121
\p{Ps}
121
\p{Punctuation}
120
pragma, charnames 290
pragma, overload 342
pragma, re 361, 363
pragma, strict 295, 336,
345
pragma, warnings 326, 363
pre-check of required character
244-246, 249, 251-252, 257-259, 361
pre-check of required
character, mimic
258-259
pre-check of required character, viewing 332
preMatch() 397
pre-match copy 355
prepending filename to line
79
price rounding example
51-52, 167-168
price rounding example, with alternation 175
price rounding example, with atomic
grouping 170
price rounding
example, with possessive
quantifier 169
Principles
of Compiler Design 180
printf 40
private vs. global Perl variables
295
\p{Private_Use}
121
procedural handling
95-97
procedural handling, compile
caching 243
procmail 94
procmail, version
covered 91
Programming
Perl 283, 286, 339
promote 294-295
properties 119-121, 123-124, 288,
380
\p{S}
120
PS 109, 121, 382
\p{Ps} 121
\p{Sc} 121-122
\p{Separator} 120
\p{Sk} 121
\p{Sm} 121
\p{So} 121
\p{Space_Separator}
121
\p{Spacing_Combining_Mark}
121
\p{Symbol}
120
\p{Tamil}
122
\p{Thai}
120
\p{Tibetan}
122
\p{Titlecase_Letter}
121
publication, Bulletin of Math.
Biophysics 85
publication, Communications of the
ACM 85
publication, Compilers -- Principles,
Techniques, and Tools 180
publication, Embodiments of
Mind 85
publication, The Kleene
Symposium 85
publication, “A logical calculus of the ideas
immanent in nervous activity” 85
publication, Object Oriented
Perl 339
publication, Principles of Compiler
Design 180
publication, Programming
Perl 283, 286, 339
publication, Regular Expression Search
Algorithm 85
publication, “The Role of Finite Automata in
the Development of Modern Computing Theory”
85
\p{Unassigned}
121, 123
\p{Unassigned}, in
Perl 288
\p{Punctuation} 120
\p{Uppercase_Letter}
121
Purcell, Shawn
xxii
Python, after-match
data 136
Python, benchmarking 237
Python, line
anchors 128
Python,
mode modifiers 133
Python, regex
approach 97
Python,
strings 103-104
Python, version
covered 91
Python,
word boundaries 132
Python, \Z 111
\p{Z} 119-120, 380,
400
\p{Zl}
121
\p{Zp}
121
\p{Zs}
121
Qantas 11
\Q...\E 290
\Q...\E, inhibiting 292
\Q...\E, in
Java 373
qed 85
qed, introduced 76
quantifier (also see: plus; star; question mark; interval; lazy; greedy; possessive quantifiers)
quantifier, and backtracking 162
quantifier, factor out 255
quantifier, grouping for 18
quantifier, multiple levels 265
quantifier, optimization 247, 249
quantifier, and parentheses 18
quantifier, possessive quantifiers 140, 172-173
quantifier, possessive quantifiers, for efficiency 259, 268-270
quantifier, possessive quantifiers, mimicking 343-344
quantifier, possessive quantifiers, optimization 250-251
quantifier, possessive quantifiers, automatic 251
quantifier, question mark, as \? 139
quantifier, question mark, introduced 17-18
quantifier, question mark, backtracking 160
quantifier, question mark, greedy 139
quantifier, question mark, lazy 140
quantifier, question mark, possessive 140
quantifier, smallest preceding subexpression 29
question
mark, as \?
139
question mark, backtracking 160
question mark, greedy 139
question mark, lazy 140
question
mark, possessive
140
quoted string (see double-quoted string example)
quotes multi-character
165-166
r"..." 103
$^R 302, 327
\r 49, 114-115
\r, machine-dependency 114
re 361
re 'debug' 363
re pragma 361, 363
reality check 226-228
red dragon 180
Reed, Jessamyn xxii
Reflection 429
regex, balancing
needs 186
regex, compile 179-180, 350
regex, default
308
regex, delimiters 291-292
regex, DFA (see DFA)
regex, encapsulation (see regex objects)
regex, engine
analogy 143-147
regex,
vs. English 275
regex, frame of
mind 6
regex, freeflowing design 277-281
regex, history
85-91
regex, library 76, 207
regex, longest-leftmost
match 177-179
regex,
longest-leftmost match, shortest-leftmost 183
regex, mechanics 241-242
regex, NFA (see NFA)
regex, nomenclature 27
regex, operands 288-292
regex, overloading 291, 328
regex, overloading, inhibiting 292
regex, overloading, problems 344
regex, subexpression, defined 29
regex
literal 288-292, 307
regex
literal, inhibiting
processing 292
regex
literal, locking in
352
regex literal, parsing
of 292
regex literal,
processing 350
regex literal, regex
objects 354
Regex
(.NET), CompileToAssembly 427,
429
Regex (.NET), creating, options 413-415
Regex (.NET), Escape 427
Regex (.NET), GetGroupNames
421-422
Regex (.NET), GetGroupNumbers
421-422
Regex (.NET), GroupNameFromNumber
421-422
Regex (.NET), GroupNumberFromName
421-422
Regex (.NET), IsMatch 407, 415,
425
Regex (.NET), Match 96, 408, 410, 415,
425
Regex (.NET), Matches 416, 425
Regex (.NET), object, creating 96, 410, 413-415
Regex (.NET), object, exceptions 413
Regex (.NET), object, using 96, 415
Regex (.NET), Options 421
Regex (.NET), Replace 408-409, 417-418,
425
Regex (.NET), RightToLeft 421
Regex (.NET), Split 419-420, 425
Regex (.NET), ToString 421
Regex (.NET), Unescape 427
regex objects 303-306
regex objects, efficiency 353-354
regex objects, /g 354
regex objects, match
modes 304-305
regex
objects, /o
354
regex objects, in regex
literal 354
regex
objects, viewing
305-306
regex overloading
292
regex overloading, example 341-345
http://regex.info/ xxii, 7,
345, 358
RegexCompilationInfo
429
regex-directed matching
153
regex-directed matching, and
backreferences 303
regex-directed matching, and
greediness 162
Regex.Escape 135
RegexOptions, Compiled 236, 402, 404, 414,
421-422, 429
RegexOptions, ECMAScript 400, 402,
406-407, 415, 421
RegexOptions, ExplicitCapture 402, 414,
421
RegexOptions, IgnoreCase 96, 99, 402, 413,
421
RegexOptions, IgnorePatternWhitespace 99,
402, 413, 421
RegexOptions, Multiline 402, 413-414,
421
RegexOptions, None 415, 421
RegexOptions, RightToLeft 402, 405-406,
414, 420-421, 423-424
RegexOptions, Singleline 402, 414,
421
Regexp (Java package), comparative description 375
Regexp (Java package), speed 376
regsub 100
regular expression origin of term
85
Regular Expression Search
Algorithm 85
regular
sets 85
Reinhold,
Mark xxii
removing
whitespace 199-200
Replace (Regex object
method) 417-418
replaceAll 387
replaceFirst()
387-388
reproductive organs
5
required character
pre-check 244-246, 249, 251-252, 257-259, 332,
361
re-search-forward
100
reset()
387
Result (Match object
method) 423
RightToLeft (Regex
property) 421-422
RightToLeft
(.NET) 402, 405-406, 414, 420-421,
423-424
“The Role of Finite Automata in the Development of
Modern Computing Theory” 85
Ruby, $ and
^ 111
Ruby,
after-match data 136
Ruby, benchmarking 238
Ruby, \G 131
Ruby, line
anchors 128
Ruby,
mode modifiers 133
Ruby, version
covered 91
Ruby, word boundaries 132
Rudkin, Kristine xxii
rule, earliest match
wins 148-149
rule,
standard quantifiers are greedy
151-153
rx
182
s/.../.../
50, 318-321
\S 49,
56, 119
\p{S}
120
\s 49,
119
\s, introduction 47
\s, in
Emacs 127
\s,
in Perl 288
(?s) (see: dot-matches-all mode; mode modifier)
/s 134
Savarese, Daniel xxii
SawAmpersand 358
say what you mean 195, 274
SBOL 362
\p{Sc} 121-122
scalar context 294, 310,
312-316
scalar context, forcing 310
schaffkopf 33
scope lexical vs. dynamic 299
scripts 120-122, 288
search-and-replace, awk 99
search-and-replace, Java 387, 394
search-and-replace, .NET 408, 417-418
search-and-replace, Tcl 100
sed, after-match
data 136
sed, dot 110
sed, history
87
sed, version
covered 91
sed, word boundaries 132
\p{Separator} 120
server VM 234, 236, 375
Sethi, Ravi 180
shell 7
simple
quantifier optimization 247-248
single quotes delimiter 292,
319
Singleline
(.NET) 402, 414, 421
\p{Sk} 121
\p{Sm} 121
small quantifier equivalence
251-252
\p{So}
121
\p{Space_Separator}
121
\p{Spacing_Combining_Mark}
121
“special”
262-266
Spencer, Henry 88,
182-183, 243
split()
java.util.regex 390
split ORO 394-396
split, with capturing parentheses,
Java ORO 395
split, with capturing parentheses,
.NET 403, 420
split, with capturing parentheses,
Perl 326
split, chunk limit, Java ORO 395
split, chunk limit, java.util.regex
391
split, chunk limit, Perl 323
split, into
characters 322
split,
in Perl 321-326
split, trailing empty
items 324
split, whitespace 325
Split (Regex object
method) 419-420
standard
formula for matching delimited text 196
star, introduced 18-20
star, backtracking 162
star, greedy
139
star, lazy 140
star, possessive 140
start() 385
start-of-string anchor optimization
245-246, 255-256, 315
stclass
`list' 362
stock pricing example 51-52,
167-168
stock pricing example, with
alternation 175
stock pricing
example, with atomic
grouping 170
stock pricing
example, with possessive
quantifier 169
Strict (Option)
409
strict pragma
295, 336, 345
String, matches() 384
String, replaceAll 387
String, replaceFirst() 388
String, split() 390
string (also see line)
string, double-quoted (see double-quoted string example)
string, initial string discrimination 244-246, 249, 251-252, 257-259, 332, 361
string, vs. line 55
string, match position (see pos)
string, pos (see pos)
StringBuffer 388
strings, C#
102
strings, Emacs 100
strings, Java
102
strings, PHP 103
strings, Python 103-104
strings, as
regex 101-105, 305
strings, Tcl
104
strings, VB.NET 102
stripping whitespace 199-200
study 359-360
study, when not to
use 359
subexpression defined 29
substitute() 394
substitution, delimiter 319
substitution, s/.../.../
50, 318-321
substring initial
substring discrimination 244-246, 249, 251-252, 257-259, 332, 361
subtraction class set operations
124
Success, Group
object method 424
Success, Match object
method 421
Sun's regex package (see java.util.regex)
super-linear (see neverending match)
super-linear
short-circuiting 250
\p{Symbol} 120
Synchronized Match object
method 424
syntax
class Emacs 127
System.currentTimeMillis()
236
System.Reflection
429
System.Text.RegularExpressions
407, 409
\t 49,
114-115
\t, introduced 44
tag matching 200-201
tag-team matching 130, 315
\p{Tamil} 122
Tcl, [:<:] 92
Tcl, flavor
overview 91
Tcl, benchmarking 239
Tcl, dot
111-112
Tcl, hand-tweaking 243, 259
Tcl, line
anchors 112, 128
Tcl,
mode modifiers 133
Tcl, regex
implementation 182
Tcl, regsub 100
Tcl, search-and-replace 100
Tcl, strings
104
Tcl, version
covered 91
Tcl, word boundaries 132
temperature conversion example, in
.NET 419
temperature
conversion example, in
Java 389
temperature
conversion example, in
Perl 37
temperature
conversion example, Perl
one-liner 283
terminators (see line terminators)
testing engine
type 146-147
text-directed
matching 153
text-directed
matching, regex
appearance 162
text-to-HTML
example 67-77
\p{Thai} 120
theory of an NFA 180
There's more than one way to do
it 349
this|that
example 132, 138, 243, 245-246, 252, 255,
260-261
Thompson, Ken 85-86,
110
thread scheduling Java
benchmarking 236
\p{Tibetan} 122
tied variables 299
time() 232
time of day 26
Time::HiRes 232, 358,
360
Time.new
238
Timer()
237
title case 109
\p{Titlecase_Letter}
121
TiVo 3
tokenizer building 130, 315
toothpicks scattered 100
tortilla 126
ToString, Group object
method 424
ToString, Match object
method 422
ToString, Regex object
method 421
toString ORO 396
Traditional NFA testing for
146-147
trailing context
182
trailing context, optimizations 245-247
Trapszo, Kasia xxii
Tubby 264
typographical conventions xix
\u 116, 290, 400
\U 116
\U...\E 290
\U...\E, inhibiting 292
uc 290
U+C0B5 106
ucfirst 290
UCS-2 encoding 106
UCS-4 encoding 106
Ullman, Jeffrey 180
\p{Unassigned} 121,
123
\p{Unassigned}, in
Perl 288
unconditional
caching 350
underscore in
\w history 89
Unescape 427
Unicode, overview 106-108
Unicode, block
122
Unicode, block, Java 380
Unicode, block, .NET 400
Unicode, block, Perl 288
Unicode, categories (see Unicode, properties)
Unicode, character, combining 107, 122, 125,
288
Unicode, code point, introduced 106
Unicode, code point, beyond U+FFFF 108
Unicode, code point, multiple 107
Unicode, code point, unassigned in block 122
Unicode, combining
character 107, 122, 125, 288
Unicode, in
Java 380
Unicode,
line terminators 108-109,
111
Unicode, line terminators,
in Java 382
Unicode, loose matching (see case-insensitive mode)
Unicode, in
.NET 401
Unicode,
properties 119, 288
Unicode, properties, java.util.regex
380
Unicode, properties, list 120-121
Unicode, properties, \p{All} 123, 288
Unicode, properties, \p{Any} 123, 288
Unicode, properties, \p{Assigned} 123-124,
288
Unicode, properties, \p{Unassigned} 121, 123,
288
Unicode, script 120-122, 288
Unicode, support in
Java 373
Unicode,
Version 3.1 108, 380,
401
Unicode, Version
3.2 288
Unicode, whitespace and /x
288
UNICODE_CASE (Pattern
flag) 380, 383
UnicodeData.txt 290
unicore 290
UNIX_LINES (Pattern
flag) 380, 382
unmatch 152, 161, 163
unmatch, .* 165
unmatch, atomic
grouping 171
unrolling the
loop 261-276
unrolling the
loop, example
270-271
unrolling the loop, general
pattern 264
\p{Uppercase_Letter}
121
URL encoding 320
URL example 74-77, 201-204, 208, 260,
303-304, 306, 320
URL example, egrep 25
URL example, Java 209
URL
example, plucking
205-208
use
charnames 290
use Config 290, 299
use English 357
use overload 342
use re 361,
363
use re 'debug'
361, 363
use re
'eval' 337
use
strict 295, 336, 345
use Time::HiRes 358,
360
use
warnings 326, 363
username example 73, 76, 98
username example, plucking from
text 71-73
username
example, in a URL
74-77
using
System.Text.RegularExpressions 410
UTF-16 encoding 106
UTF-8 encoding 106
\V 364
\v 114-115, 364
Value, Group object
method 424
Value, Match object
method 422
variable names
example 24
variables,
after match, pre-match
copy 355
variables,
binding 339
variables, fully
qualified 295
variables, interpolation 344
variables, naughty 356
variables, tied 299
VB.NET (also see .NET)
VB.NET, comments 99
VB.NET, regex approach 96-97
VB.NET, strings 102
VB.NET, URL parsing example 204
verbatim
strings 102
Version 7
regex 182
Version 8
regex 182
versions covered in
this book 91
vertical
tab 109
vertical tab,
in Perl \s
288
vi after-match data
136
Vietnamese text
processing 29
virtual
machine 234-236, 375
Visual
Studio .NET 428
VM 234, 236, 375
VM, warming up
235
void context 294
VT 109
\W 49, 119
$^W 297
\w 49, 65, 119
\w, in
Emacs 127
\w,
many different interpretations
93
\w, in
Perl 288
Wall,
Larry 88-90, 138, 363-364
warming up Java VM 235
warnings 296
warnings, temporarily turning
off 297
warnings
pragma 326, 363
while vs. foreach vs.
if 320
whitespace, allowing
optional 18
whitespace, removing 199-200
wildcards filename 4
Wilson, Dean xxii
Woodward, Josh xxii
word anchor mechanics of matching
150
word boundaries
131
word boundaries, \<...\>, egrep 15
word boundaries, introduced 15
word
boundaries, in Java
373
word boundaries, many
programs 132
word
boundaries, mimicking
66, 132, 341-342
word boundaries, in
Perl 132, 288
www.cpan.org 358
www.PeakWebhosting.com
xxii
www.regex.info
358
\X 107,
125
\x 116,
400
\x, in
Perl 286
(?x) (see: comments and free-spacing mode; mode modifier)
/x 134, 288
/x, introduced 72
/x, history 90
Xerces
org.apache.xerces.utils.regex 372
-y old grep
86
¥ 122
Yahoo! xxi, 74, 130, 190, 205, 207,
258, 314
\z 111,
127-128, 316
\z, in
Java 373
\z,
optimization 246
\Z 111, 127-128
\Z, in
Java 373
\Z,
optimization 246
\p{Z} 119-120, 380,
400
Zawodny, Jeremy xxii,
258
ZIP code example
208-212
\p{Zl}
121
\p{Zp}
121
\p{Zs} 121