Use the lc and uc functions or the \L and \U string escapes.
    $big    = uc($little);          # "bo peep" -> "BO PEEP"
    $little = lc($big);             # "JOHN"    -> "john"
    $big    = "\U$little";          # "bo peep" -> "BO PEEP"
    $little = "\L$big";             # "JOHN"    -> "john"
To alter just one character, use the lcfirst and ucfirst functions or the \l and \u string escapes.
    $big    = "\u$little";          # "bo"      -> "Bo"
    $little = "\l$big";             # "BoPeep"  -> "boPeep"
The functions and string escapes look different, but both do the same thing. You can set the case of either just the first character or the whole string. You can even do both at once to force uppercase (actually, titlecase; see later explanation) on initial characters and lowercase on the rest.
    $beast   = "dromedary";
    # capitalize various parts of $beast
    $capit   = ucfirst($beast);         # Dromedary
    $capit   = "\u\L$beast";            # (same)
    $capall  = uc($beast);              # DROMEDARY
    $capall  = "\U$beast";              # (same)
    $caprest = lcfirst(uc($beast));     # dROMEDARY
    $caprest = "\l\U$beast";            # (same)
These capitalization-changing escapes are commonly used to make a string's case consistent:
    # titlecase each word's first character, lowercase the rest
    $text = "thIS is a loNG liNE";
    $text =~ s/(\w+)/\u\L$1/g;
    print $text;

    This Is A Long Line
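As a minimal sketch, the substitution above can be wrapped in a small function. The name titlecase_words is hypothetical and not part of the original recipe. The second call illustrates a limitation worth knowing: because \w+ stops at an apostrophe, the letter after it starts a new "word" and gets titlecased too.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption) wrapping the substitution
# shown above: titlecase the first character of each \w+ run and
# lowercase the rest.
sub titlecase_words {
    my ($text) = @_;
    $text =~ s/(\w+)/\u\L$1/g;
    return $text;
}

print titlecase_words("thIS is a loNG liNE"), "\n";   # This Is A Long Line
print titlecase_words("don't PANIC"), "\n";           # Don'T Panic
```

If contractions matter for your data, the pattern would need to treat the apostrophe as part of the word; that refinement is left out of this sketch.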
You can also use these for case-insensitive comparison:
    if (uc($a) eq uc($b)) { # or "\U$a" eq "\U$b"
        print "a and b are the same\n";
    }
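The comparison can also be packaged as a function, sketched here under strict. The name same_ignoring_case is an assumption made for illustration. It takes its arguments into lexical variables rather than using $a and $b, which are the special package variables used by sort.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the two strings are equal once both
# have been uppercased.
sub same_ignoring_case {
    my ($left, $right) = @_;
    return uc($left) eq uc($right);
}

print "match\n"    if     same_ignoring_case("Bo Peep", "BO PEEP");
print "no match\n" unless same_ignoring_case("Bo Peep", "Bo Beep");
```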
The randcap program, shown in Example 1-2, randomly titlecases about 20 percent of the letters of its input and lowercases the rest. This lets you converse with 14-year-old WaREz d00Dz.
    #!/usr/bin/perl -p
    # randcap: filter to randomly capitalize 20% of the letters
    # call to srand( ) is unnecessary as of v5.4
    BEGIN { srand(time( ) ^ ($$ + ($$<<15))) }
    sub randcase { rand(100) < 20 ? "\u$_[0]" : "\l$_[0]" }
    s/(\w)/randcase($1)/ge;

    % randcap < genesis | head -9
    boOk 01 genesis

    001:001 in the BEginning goD created the heaven and tHe earTh.

    001:002 and the earth wAS without ForM, aND void; AnD darkneSS was
            upon The Face of the dEEp. and the spIrit of GOd movEd upOn
            tHe face of the Waters.

    001:003 and god Said, let there be ligHt: and therE wAs LigHt.
In languages whose writing systems distinguish between uppercase and titlecase, the ucfirst( ) function (and \u, its string escape alias) converts to titlecase. For example, Hungarian has a "dz" digraph. In uppercase, it's written as "DZ", in titlecase as "Dz", and in lowercase as "dz". Unicode consequently has three different characters defined for these three situations:
    Code point  Written  Meaning
    01F1        DZ       LATIN CAPITAL LETTER DZ
    01F2        Dz       LATIN CAPITAL LETTER D WITH SMALL LETTER Z
    01F3        dz       LATIN SMALL LETTER DZ
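A short sketch of how the three code points in the table relate under Perl's case functions. It prints code points instead of the characters themselves, so the output does not depend on how your terminal is set up. The variable names are illustrative only.

```perl
use strict;
use warnings;

# Start from the lowercase digraph, U+01F3 (LATIN SMALL LETTER DZ).
my $lower = "\x{01F3}";

my $title = ucfirst($lower);   # titlecase mapping, same as "\u$lower"
my $upper = uc($lower);        # uppercase mapping, same as "\U$lower"

printf "lower U+%04X\n", ord $lower;   # lower U+01F3
printf "title U+%04X\n", ord $title;   # title U+01F2
printf "upper U+%04X\n", ord $upper;   # upper U+01F1
```

For most letters the titlecase and uppercase mappings are identical, which is why ucfirst usually appears to "uppercase" the first character.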
It is tempting but ill-advised to just use tr[a-z][A-Z] or the like to convert case. This is a mistake because it omits all characters with diacritical markings—such as diaereses, cedillas, and accent marks—which are used in dozens of languages, including English. However, correctly handling case mappings on data with diacritical markings can be far trickier than it seems. There is no simple answer, although if everything is in Unicode, it's not all that bad, because Perl's case-mapping functions do work perfectly fine on Unicode data. See the section on The Universal Character Code in the Introduction to this chapter for more information.
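To make the difference concrete, here is a small sketch comparing tr[a-z][A-Z] with uc on a word containing two letters with carons. The word is written with \x{...} escapes so the source stays plain ASCII, and the results are printed as code points. The sample word and variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# A word with carons on its first and third letters:
# U+010D (c with caron) and U+0161 (s with caron).
my $word = "\x{10D}e\x{161}tina";

(my $ascii_only = $word) =~ tr[a-z][A-Z];   # touches only a through z
my $mapped      = uc($word);                # uses Unicode case mappings

# Show each result as a list of code points.
for my $pair ([tr => $ascii_only], [uc => $mapped]) {
    my ($label, $string) = @$pair;
    print "$label: ",
          join(" ", map { sprintf("U+%04X", ord $_) } split //, $string),
          "\n";
}
```

The tr version leaves U+010D and U+0161 untouched, while uc maps them to their capital forms, U+010C and U+0160.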
The uc, lc, ucfirst, and lcfirst functions in perlfunc(1) and Chapter 29 of Programming Perl; \L, \U, \l, and \u string escapes in the "Quote and Quote-like Operators" section of perlop(1) and Chapter 5 of Programming Perl
Copyright © 2003 O'Reilly & Associates. All rights reserved.