As the data structures become more complex, it helps to have higher-level constructs deal with common tasks such as selection and transformation. In this regard, Perl's grep and map operators are worth mastering.
Let's review the functionality of grep and map for a moment, without reference to references.
The grep operator takes a list of values and a "testing expression." For each item in the list of values, the item is placed temporarily into the $_ variable, and the testing expression is evaluated (in a scalar context). If the expression results in a true value (defined in the normal Perl sense of truth), the item is considered selected. In a list context, the grep operator returns a list of all such selected items. In a scalar context, the operator returns the number of selected items.[21]
[21]The value in $_ is local to the operation. If there's an existing $_ value, the local value temporarily shadows the global value while the grep executes.
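To make the two contexts and the footnote concrete, here is a minimal sketch. The word list and the variable names are made up for illustration and are not part of the original examples.

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen only for this sketch.
my @words = qw(apple fig banana kiwi cherry);

# Give the global $_ a value so we can see that grep leaves it alone.
$_ = "outer value";

# List context: grep returns the selected items themselves.
my @long_words = grep length($_) > 4, @words;

# Scalar context: the same grep returns only the number of selected items.
my $long_count = grep length($_) > 4, @words;

print "selected: @long_words\n";   # selected: apple banana cherry
print "count:    $long_count\n";   # count:    3
print "\$_ is:    $_\n";           # $_ is:    outer value
```

The only difference between the two grep calls is what sits on the left of the assignment: an array gives list context, and a scalar gives scalar context.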
The syntax comes in two forms: the expression form and the block form. The expression form is often easier to type:
    my @results = grep $expression, @input_list;
    my $count   = grep $expression, @input_list;
Here, $expression is a scalar expression that should refer to $_ (explicitly or implicitly). For example, find all the numbers greater than 10:
    my @input_numbers = (1, 2, 4, 8, 16, 32, 64);
    my @bigger_than_10 = grep $_ > 10, @input_numbers;
The result is just 16, 32, and 64. This uses an explicit reference to $_. Here's an example that refers to $_ implicitly, because a pattern match with no =~ operator tests $_:
my @end_in_4 = grep /4$/, @input_numbers;
And now you find just 4 and 64.
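Because the testing expression is an ordinary scalar expression, the two tests above can also be combined or negated. The following is a small sketch along those lines; the result variable names are invented for illustration.

```perl
use strict;
use warnings;

my @input_numbers = (1, 2, 4, 8, 16, 32, 64);

# Both tests must hold: greater than 10 AND ending in 4.
# The && binds more tightly than the comma, so no parentheses are needed.
my @big_and_end_in_4 = grep $_ > 10 && /4$/, @input_numbers;

# Negating the pattern match selects everything that does NOT end in 4.
my @not_end_in_4 = grep !/4$/, @input_numbers;

print "@big_and_end_in_4\n";   # 64
print "@not_end_in_4\n";       # 1 2 8 16 32
```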
If the testing expression is complex, you can hide it in a subroutine:
    my @odd_digit_sum = grep digit_sum_is_odd($_), @input_numbers;

    sub digit_sum_is_odd {
        my $input = shift;
        my @digits = split //, $input;  # Assume no nondigit characters
        my $sum;
        $sum += $_ for @digits;
        return $sum % 2;
    }
Now you get back the list of 1, 16, and 32. These numbers have a digit sum with a remainder of "1" in the last line of the subroutine, which counts as true.
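To check that result by hand, it can help to print each digit sum next to its number. This sketch uses a hypothetical helper, digit_sum, that returns the sum itself rather than its remainder; it is a variation for illustration, not the subroutine from the text.

```perl
use strict;
use warnings;

my @input_numbers = (1, 2, 4, 8, 16, 32, 64);

# Hypothetical helper: returns the digit sum, not the odd/even test.
sub digit_sum {
    my $input = shift;
    my $sum = 0;
    $sum += $_ for split //, $input;   # Assume no nondigit characters
    return $sum;
}

# Show why each number is selected or skipped.
for my $number (@input_numbers) {
    my $sum = digit_sum($number);
    printf "%2d has digit sum %2d (%s)\n",
        $number, $sum, $sum % 2 ? "odd, selected" : "even, skipped";
}

# The remainder (1 or 0) serves directly as the true/false test.
my @odd_digit_sum = grep digit_sum($_) % 2, @input_numbers;
print "@odd_digit_sum\n";   # 1 16 32
```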
However, rather than define an explicit subroutine that is used for only a single test, you can put the body of the subroutine directly inline in the grep operator, using the block form:[22]
    my @results = grep {
        block;
        of;
        code;
    } @input_list;

    my $count = grep {
        block;
        of;
        code;
    } @input_list;
[22]In the block form of grep, there's no comma between the block and the input list. In the earlier (expression) form of grep, there must be a comma between the expression and the list.
Just like the expression form, each element of the input list is placed temporarily into $_. Next, the entire block of code is evaluated. The last expression evaluated in the block (evaluated in a scalar context) is used like the testing expression in the expression form. Because it's a full block, you can introduce variables that are local to the block. Let's rewrite that last example to use the block form:
    my @odd_digit_sum = grep {
        my $input = $_;
        my @digits = split //, $input;  # Assume no nondigit characters
        my $sum;
        $sum += $_ for @digits;
        $sum % 2;
    } @input_numbers;
Note the two changes: your input value comes in via $_ rather than an argument list, and the keyword return was removed. In fact, it would have been erroneous to retain the return because you're no longer in a separate subroutine: just a block of code.[23]
[23]The return would have exited the subroutine that contains this entire section of code. And yes, some of us have been bitten by that mistake in real, live coding on the first draft.
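The mistake described in the footnote can be reproduced in a few lines. In this sketch, odd_numbers_buggy and odd_numbers_fixed are hypothetical names invented for the demonstration, and the test is simplified to "is the number odd."

```perl
use strict;
use warnings;

# BUG: the return is not local to the grep block. It leaves
# odd_numbers_buggy itself as soon as the first item is tested.
sub odd_numbers_buggy {
    my @numbers = @_;
    my @odd = grep { return $_ % 2; } @numbers;
    return @odd;   # never reached when @numbers is not empty
}

# Fixed: the last expression evaluated in the block is the test.
sub odd_numbers_fixed {
    my @numbers = @_;
    my @odd = grep { $_ % 2 } @numbers;
    return @odd;
}

my $buggy_result = odd_numbers_buggy(2, 3, 5);   # 2 % 2, from the first item only
my @fixed_result = odd_numbers_fixed(2, 3, 5);

print "buggy: $buggy_result\n";   # buggy: 0
print "fixed: @fixed_result\n";   # fixed: 3 5
```

The buggy version never builds a list at all: the caller receives the result of testing the first item, and the remaining items are never examined.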
Of course, you can optimize a few things out of that routine:
    my @odd_digit_sum = grep {
        my $sum;
        $sum += $_ for split //;
        $sum % 2;
    } @input_numbers;
Feel free to crank up the explicitness if it helps you and your coworkers understand and maintain the code. That's the main thing that matters.
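As one possible illustration of cranking up the explicitness, the sketch below sets a deliberately verbose version beside the compact one and confirms that they select the same items. The variable names in the verbose version are invented here and are only one of many reasonable choices.

```perl
use strict;
use warnings;

my @input_numbers = (1, 2, 4, 8, 16, 32, 64);

# Compact form: split with no arguments works on $_.
my @compact = grep {
    my $sum;
    $sum += $_ for split //;
    $sum % 2;
} @input_numbers;

# Explicit form: every intermediate value gets a name.
my @explicit = grep {
    my $number = $_;
    my @digits = split //, $number;   # Assume no nondigit characters
    my $digit_sum = 0;
    foreach my $digit (@digits) {
        $digit_sum += $digit;
    }
    my $is_odd = $digit_sum % 2 == 1;
    $is_odd;                          # last expression is the test
} @input_numbers;

print "compact:  @compact\n";    # compact:  1 16 32
print "explicit: @explicit\n";   # explicit: 1 16 32
```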
Copyright © 2003 O'Reilly & Associates. All rights reserved.