my @sorted =
  map $_->[0],
  sort { $a->[1] <=> $b->[1] }
  map [$_, -s $_],
  glob "/bin/*";
Using the -s operator to determine the file's size is an expensive operation; by caching its value you can save some time. How much? Let's see in the next exercise's answer.
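One rough way to see where the savings come from, before timing anything, is to count how often the key function runs. The sketch below is only an illustration: the size_of subroutine and the sizes in %size_of are made up and stand in for -s, so that the result does not depend on what happens to be in /bin.

```perl
use strict;
use warnings;

# Hypothetical stand-in for -s: looks up a made-up size and counts its calls.
my %size_of = (ls => 140, cat => 50, sh => 900, cp => 150, mv => 145);
my $calls = 0;
sub size_of { $calls++; return $size_of{ $_[0] } }

my @names = sort keys %size_of;   # fixed starting order

# Ordinary sort: the key is recomputed for both operands of every comparison.
$calls = 0;
my @ordinary = sort { size_of($a) <=> size_of($b) } @names;
my $ordinary_calls = $calls;

# Schwartzian Transform: the key is computed once per item and cached.
$calls = 0;
my @sorted =
  map  { $_->[0] }
  sort { $a->[1] <=> $b->[1] }
  map  { [ $_, size_of($_) ] }
  @names;
my $transform_calls = $calls;

print "ordinary: $ordinary_calls calls, transform: $transform_calls calls\n";
print "@sorted\n";
```

The transform calls size_of exactly once per name. The ordinary sort calls it twice per comparison, so its count depends on how many comparisons the sort happens to make and grows faster than the list does.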
use Benchmark qw(timethese);

timethese( -2, {
  Ordinary => q{
    my @results = sort { -s $a <=> -s $b } glob "/bin/*";
  },
  Schwartzian => q{
    my @sorted =
      map $_->[0],
      sort { $a->[1] <=> $b->[1] }
      map [$_, -s $_],
      glob "/bin/*";
  },
});
On the 33-element /bin on my laptop, I (Randal) was seeing 260 iterations per second of the Ordinary implementation and roughly 500 per second of the Schwartzian implementation, so writing the complex code saved about half of the execution time. On a 74-element /etc, the Schwartzian Transform was nearly three times as fast. In general, the more items you sort and the more expensive the computed function, the better you can expect the Schwartzian Transform to perform. That doesn't even count the burden on the monkey (er, I mean the operating system).
my @dictionary_sorted =
  map $_->[0],
  sort { $a->[1] cmp $b->[1] }
  map {
    my $string = $_;
    $string =~ tr/A-Z/a-z/;
    $string =~ tr/a-z//cd;
    [ $_, $string ];
  } @input_list;
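Inside the second map, which executes first, make a copy of $_. (If you don't, you'll mangle the data: within map, $_ is an alias for each element of @input_list, so applying tr to it directly would change the original strings.)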
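To make the mangling concrete, here is a small sketch comparing the two approaches. The sample strings and the variable names @careless, @keys, and @safe_keys are invented for the illustration.

```perl
use strict;
use warnings;

my @input_list = ("Hello, World!", "apple pie");

# Without a copy: tr works on $_, which aliases the array elements,
# so the source array itself is rewritten.
my @careless = @input_list;
my @keys = map { tr/A-Z/a-z/; tr/a-z//cd; $_ } @careless;

# With a copy: only $string is changed, and @input_list is left alone.
my @safe_keys = map {
  my $string = $_;
  $string =~ tr/A-Z/a-z/;
  $string =~ tr/a-z//cd;
  $string;
} @input_list;

print "careless source is now: @careless\n";
print "input list is still:    @input_list\n";
```

Both versions compute the same sort keys, but only the second leaves the original strings available for the final map to return.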
sub data_for_path {
  my $path = shift;
  if (-f $path or -l $path) {
    return undef;
  }
  if (-d $path) {
    my %directory;
    opendir PATH, $path or die "Cannot opendir $path: $!";
    my @names = readdir PATH;
    closedir PATH;
    for my $name (@names) {
      next if $name eq "." or $name eq "..";
      $directory{$name} = data_for_path("$path/$name");
    }
    return \%directory;
  }
  warn "$path is neither a file nor a directory\n";
  return undef;
}

sub dump_data_for_path {
  my $path = shift;
  my $data = shift;
  my $prefix = shift || "";
  print "$prefix$path";
  if (not defined $data) { # plain file
    print "\n";
    return;
  }
  my %directory = %$data;
  if (%directory) {
    print ", with contents of:\n";
    for (sort keys %directory) {
      dump_data_for_path($_, $directory{$_}, "$prefix  ");
    }
  } else {
    print ", an empty directory\n";
  }
}

dump_data_for_path(".", data_for_path("."));
By adding a third (prefix) parameter to the dumping subroutine, you can ask it to indent its output. By default, the prefix is empty, of course.
When the subroutine calls itself, it adds two spaces to the end of the prefix. Why the end and not the beginning? Because the default prefix is composed only of spaces, either end would work. But by adding the spaces at the end, you can call the subroutine like this:
dump_data_for_path(".", data_for_path("."), "> ");
This invocation quotes the entire output by prefixing each line with the given string. You can (in some hypothetical future version of this program) use such quoting to denote NFS-mounted directories, or other special items.
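The effect of the growing prefix is easier to see on a small, fixed tree. The following sketch makes some assumptions: the $tree data is invented, and dump_lines is a hypothetical variant of dump_data_for_path that returns its lines instead of printing them, so the two calls can be compared side by side.

```perl
use strict;
use warnings;

# Hypothetical in-memory tree in the shape data_for_path returns:
# undef marks a plain file, a hash reference marks a directory.
my $tree = {
  'README' => undef,
  'lib'    => { 'Foo.pm' => undef },
  'tmp'    => {},
};

# Like dump_data_for_path, but returns the lines instead of printing them.
sub dump_lines {
  my ($path, $data, $prefix) = @_;
  $prefix = "" unless defined $prefix;
  return "$prefix$path\n" unless defined $data;   # plain file
  return "$prefix$path, an empty directory\n" unless %$data;
  my @lines = ("$prefix$path, with contents of:\n");
  for my $name (sort keys %$data) {
    # Two spaces go on the END of the prefix, so a leading "> " survives.
    push @lines, dump_lines($name, $data->{$name}, "$prefix  ");
  }
  return @lines;
}

my @plain  = dump_lines(".", $tree);
my @quoted = dump_lines(".", $tree, "> ");
print @plain, "\n", @quoted;
```

Because each level appends its spaces after whatever prefix it was given, every line of the second listing still begins with "> ", followed by the usual indentation.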
Copyright © 2003 O'Reilly & Associates. All rights reserved.