Second Edition – Mastering Perl

The Mastering Perl Module Index

Mastering Perl Second Edition has a module index, which is something I first tried in Programming Perl 4th Edition. Here’s just the index from the last review I just turned in (which means the book should be available in about a month). Continue reading “The Mastering Perl Module Index”

New to “Perl Debuggers”

I renamed this chapter to “Perl Debuggers” from “Debugging Perl”; it’s not really about the process of debugging but the tools to do it. Most of it has stayed the same, although I had to remove the unmaintained Devel::ebug::HTTP that uses old Catalyst code. That’s one of the reasons I don’t like tying such modules to frameworks. The people who like to make the frameworks like to reinvent them, leaving all of the legacy code behind.

You can read this chapter in O’Reilly Atlas.

New to “Profiling”

Devel::NYTProf, Adam Kaplan’s original creation and Tim Bunce’s ongoing gift to the Perl community, is the new star of my “Profiling” chapter. However, the static book format doesn’t do much to show it off.

Other than that, I went through all of the code and output with v5.18 to refresh it. Not much changed other than the times getting much faster almost a decade later.

You can read this chapter in O’Reilly Atlas.

New to “Advanced Regular Expressions”

This chapter has lots of new material and much of the old text is gone. I wasn’t looking forward to working on this chapter because I knew it was going to be mostly new text.

My goal of Mastering Perl has always been to present material not already well covered in some other book. This chapter did that when I first wrote it, but I then moved some of it into other books.

The stuff about regular expression references moved into Intermediate Perl, so I assume in Mastering Perl that readers already know that stuff. There went a third of the chapter.

I had a medium-sized section on YAPE::Regex::Explain, a module that turned a pattern into an english description of the pattern. That module hasn’t kept up with post-v5.10 features and has been abandoned. That’s gone.

Curiously, neither Learning Perl nor Intermediate Perl had covered non-capturing parentheses. I fixed that, so that section is gone from Mastering Perl.

But, this left room for much more exciting and advanced things.

Randal Schwartz meditated on a wickedly tight minimal JSON parser that he wrote for a particular client situation. He used advanced features including regex grammars, code execution with (?{...}), and data structure bootstrapping with $^R. I wanted to explain that regex but I needed to talk about all of the features in it. I think I did a pretty good job in the normal camelid trilogy style: I start with a simple program, find an edge case that causes a problem, modify, and repeat. I used the example of matching nested quotes to cover, one feature at a time:

(?PARNO) to refer to earlier capture groups as independent patterns
Recursion with (?PARNO), which allows matching balanced text
(?(DEFINE)...) to create and name sub patterns for use later
(?{...}) to watch what’s happening in a regex
$^N inside (?{...}) to get the text of the previous capture buffer
bootstrapping a data structure with $^R inside (?{...})

Once I cover all of those, I can show off Randal’s regex. Instead of explaining it, though, I leave that to the reader. By the time I get there, they should understand all of the features he uses.

That stuff was a bit of a bear to figure out. The docs aren’t great and there are very few examples out there. I even typed out a long StackOverflow question about it that led me to answering my own question.

Besides that, I pull out \K, another v5.10 feature, to fix my broken money commifying example. I’m not actually the person who fixed it though; I lifted that regex from Michael Carmen’s answer to my StackOverflow question about it.

After all that, I end the chapter with an expanded section on regular expression debugging, including Damian Conway’s Regexp::Debugger. That’s not very exciting in a static book, so I’ll have to make some screencasts about it here.

You can read this chapter in O’Reilly Atlas.

New to “Error Handling”

The “Error Handling” chapter is really two chapters: one on dealing with error and warning messages that Perl gives you, and creating those messages on your own for people using your code.

When I first wrote this chapter, Perl’s “exception” handling was poor. We had the basics that we still use, but we hadn’t figured out all of the special cases. Try::Tiny and TryCatch didn’t show up until 2009, about four years later. Not only did I need to cover those important developments, but the background behind them.

autodie also replaced Fatal in core.

In the first edition, I covered Carp in the “Debugging” chapter. I don’t know what I was thinking then, but I’ve moved that to this chapter, leaving a big hole in that chapter; I’ll have to figure out what to do about that, but later.

I also added my ReturnValue discussion. We treat errors and return values as different things; even worse,, we treat them as unstructured data so we need to know special things about every function to know what it’s return value means. Is undef a valid result or an error? What about 0 or the empty string? With a return value object, we can ask directly. I don’t expect anyone to use my module, but the idea of this module.

I think I also avoided the easy trap for a chapter like this: I didn’t make it just a survey of modules. We know from Perl history that best practices and favorite modules have short lifetimes compared to book publishing rates. I hope I gave people enough to think about so they could evaluate future.

You can read this chapter in O’Reilly Atlas.

New to “The Magic of Tied Variables”

The tie mechanism isn’t something people reach for often, but that’s one of the reasons I included a chapter about it in Mastering Perl. No one else is writing about it either, and in some cases it an be quite useful.

I wrote several modules as examples for this chapter, but I never uploaded them to CPAN. Now I’m putting them on CPAN so they are easy to get if you want to play with them yourself. Not only that, I didn’t like the style that I used back then, so I made some cosmetic changes and light refactorings. They include:

You can read this chapter in O’Reilly Atlas.

New to “Modules as Programs”

I keep wanting to name this chapter “Modulinos”, but that doesn’t mean much to someone who doesn’t have that idea yet, so I leave it as is.

For awhile, I’ve wanted to reduce the modulino idea to a single statement, perhaps a use statement:

use Modulino;

I think I’ve got most of the way there with Modulino::Base, which I’ve included in my update to the chapter. It was an interesting journey though. I didn’t want to inherit from it because it’s not really that sort of relationship. I could add it as a role or trait, but I didn’t want to deal with complication surrounding that. I settled on a require because the code needs to run after the compile phase when all of the methods are already compiled. That was true of the code that was already controlling that behavior:

__PACKAGE__->run(@ARGV) unless caller;

Moving all of the code into another file (and using subroutines) messed with caller a bit. Now I had to look back a couple of levels to see what was going on. That seems simple now that I’ve figured it out, but on the way I loaded Modulino::Base in many ways and looked at caller going two levels back to convince myself I was doing the right thing. As with most coding problems, the other weird things I was trying to do had clouded the problems and simplifying the module solved it.

The danger with any code written specifically for a book is that it hasn’t been used widely and its problems haven’t appeared. I know they are there. Not only that, once the book is done, the author tends to stop supporting the code. I certainly plan that because I think the idea is simple enough that no one should use a module for it. That is, no one should create an external dependency to get something so simple.

I’ve also expanded the modulino idea. Previously, it was about running as a program or loading as a module. However, we can do other things based on any condition. The module can run as a test script. Modulino::Base looks for methods that start with _test_ and runs them as subtests if CPANTEST is set to a true value. I stole that idea from Python, although it doesn’t seem to be as popular as it was when I learned that language.

Beyond that, I experimented with having a module file be a template for its own documentation. Using some switch to say I wanted to read the docs, it ran Pod::Perldoc on itself, making modifications to the Pod along the way, such as automatically inserting the right package name. I like that idea and I think there’s more behind it, but it’s not something I want to delay the book for.

You can check out this chapter in O’Reilly Atlas.

Updated scriptdist

I’m amazed how much has changed in ten years.

I have a program to create a Perl module-like distribution around a standalone program. I used it back in the day when I wasn’t creating modulinos from the start, and in 2004 I wrote about it for The Perl Journal (which is now Dr. Dobbs Journal) as Automating Distributions with scriptdist.

I don’t use this much anymore, but it’s probably still useful for people stuck with many standalone legacy programs, so I updated it to version 0.22. In previous versions, which I created ten years ago, I was still using CVS. I’ve changed the program to automatically init and populate a git repo.

New to “Symbol Tables”

“Symbol Tables” has always been a tough chapter because it’s the part of the book where I have the least experience, I think. I’m not strong on Perl internals and don’t have to think about symbol tables so much; my terminology is a bit soft and pudgey here.

For the most part, the chapter is the same without major changes. I use the word “stash” much more, flesh out some examples, and show the output of more programs.

I made a mistake in the previous edition. I had a subroutine that would check for that a package variable had been used somewhere in the program. I can do that with something like:

print "Used!\n" if *foo{ARRAY};

The typeglob will return a reference to the data for that variable, and references are always true values. If @foo hasn’t been used (so, no data yet, even if it’s the empty list), this returns undef. It returns true even if I’ve undef-ed the array; I’ve still used the array already in that case.

However, for scalars, it always returns a reference. It even returns a reference in the case that the scalar of that name has not ever been used or seen. The means *foo{SCALAR} is true for all names all the time. perlref says:

*foo{THING} returns undef if that particular THING hasn’t been used yet, except in the case of scalars. *foo{SCALAR} returns a reference to an anonymous scalar if $foo hasn’t been used yet. This might change in a future release.

Besides that, I made adjustments for imprecisions in my previous explanations and terms. I could use a good set of tech reviewers on it still.

At the end of the chapter, after showing all of the complicated stuff, I show Package::Stash, which mostly makes it easier to do the really hard stuff.

Check out this chapter in O’Reilly Atlas.

New to “Benchmarking”

Computers are much faster eight years later so my benchmark sample results are much faster. I’m regenerating those results with v5.18. I’ve tweaked many of the programs to be a bit more useful; in the previous edition I hardcoded values, such as the number of iterations. Now I can specify those on the command line.

Several of the benchmarks dealt with listing files with glob. In the first edition, I used a couple of existing directories. This time around, I spent too much time looking for suitable directories with the right numbers of files. After awhile, I gave up on that and had the programs create temporary directories with the number of files I specify:

use File::Temp qw(tempdir);
my $dir = tempdir( CLEANUP => 1 );

chdir( $dir ) or die "Could not change to $dir: $!";
foreach ( 1 .. $files ) {
    open my($fh), '>', "$0.$_.tmp" or die "Could not create a file: $!";
    print { $fh } time();
    }

After I covered Benchmark, I also mentioned Steffen Müller’s Dumbbench (see my earlier post, Playing with Dumbbench).

I updated the section on perlbench, Gisle Aas’s tool to compare benchmarks across different interpreters. Since he hasn’t updated that since I wrote the first edition, I took all of the previous releases and imported them into git so I could fix RT 73642, which recognizes the new way that perl reports its version. That’s now in my perlbench GitHub project.

Most of the other stuff is the same, with minor updates.

Check out this chapter in O’Reilly Atlas.