reimagining pod (or: breaking all my pod libraries)

After reading perlpodspec a few times and trying to reconcile all my hopes for my Pod-munging tools with the pretty restrictive rules of the spec, I have come to imagine Pod as a nested set of layers of specification, almost like the skin on an onion. Go figure.

The most basic layer of Pod that I care about turns out to be very, very basic. It’s all that Pod::Eventual is probably ever going to worry about, and right now it consists of four kinds of events.

Nonpod events are all the stuff in your document that isn’t Pod. In most cases, this is your Perl. Nonpod comes in big hunks; each hunk is everything from the last =cut (or the start of the file) up to the next Pod event.

Command events are paragraphs (remember, Pod is paragraph-oriented, not line-oriented) that start with a command like =head1 or =over or =method.

Blank events are the blank lines between paragraphs.

Text events are everything else: the Pod paragraphs that have non-whitespace in them but that don’t start with a command.

I am currently considering adding a fifth kind, cut events, to represent the =cut command, whose syntax differs slightly from that of the other commands.
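As a rough illustration of how little this layer has to know, here is a minimal sketch of an event reader. It is written in Python purely for illustration and is not Pod::Eventual's actual code or API; the function name read_events and the (kind, content) tuple shape are assumptions. It also assumes that any line starting with "=" plus a letter opens Pod, and it treats =cut as a one-line command that drops back to nonpod.

```python
import re

# Assumption: a line beginning with "=" and a letter is a Pod command.
IS_COMMAND = re.compile(r"=[a-zA-Z]")
IS_CUT = re.compile(r"=cut(\s|$)")


def read_events(source):
    """Return a list of (kind, content) tuples.

    kind is one of "nonpod", "command", "blank", or "text".
    """
    events = []
    lines = source.splitlines()
    in_pod = False
    i = 0
    while i < len(lines):
        start = i
        if not in_pod:
            # A nonpod hunk runs until the next line that looks like a command.
            while i < len(lines) and not IS_COMMAND.match(lines[i]):
                i += 1
            if i > start:
                events.append(("nonpod", "\n".join(lines[start:i])))
            in_pod = True
        elif lines[i].strip() == "":
            # Consecutive blank lines collapse into a single blank event.
            while i < len(lines) and lines[i].strip() == "":
                i += 1
            events.append(("blank", ""))
        elif IS_CUT.match(lines[i]):
            # =cut is handled as a single line that ends the Pod block.
            events.append(("command", lines[i]))
            in_pod = False
            i += 1
        else:
            # A paragraph is a run of non-blank lines.
            while i < len(lines) and lines[i].strip() != "":
                i += 1
            para = "\n".join(lines[start:i])
            kind = "command" if IS_COMMAND.match(para) else "text"
            events.append((kind, para))
    return events


if __name__ == "__main__":
    doc = "use strict;\n\n=head1 NAME\n\nFoo - a thing\n\n=cut\n\n1;\n"
    for kind, content in read_events(doc):
        print(kind, repr(content))
```

Note that the reader never looks at which command it has found, apart from =cut. Everything else is left to later layers.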

So, this totally ignores the notion of verbatim paragraphs (text paragraphs where the content starts with whitespace) or =begin/=end regions or anything else like that. All of that is left to whatever set of commands and additional semantics gets layered on top. I’m thinking of these, right now, as dialects. I call the one used by Perl 5 “Pod5.”

Pod::Elemental represents events as node objects, which can then be reformed, nested, and sanity checked by libraries that correspond to dialects. So, given the following stream of events:

nonpod
=head1
blank
text
blank
text
blank
=begin foo
text
blank
text
blank
=end foo
blank
=cut
nonpod

…the Pod5 nester or dialect would produce the following document:

nonpod
=head1
text
text
region foo
  data
=cut
nonpod

Another nester could probably take that document and nest the text under the preceding header, transforming it into:

nonpod
=head1
  text
  text
region foo
  data
=cut
nonpod
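That second transformation is simple enough to sketch. The following is a hypothetical illustration in Python, not Pod::Elemental code: the Node class and the function name nest_text_under_commands are invented for the example, and it assumes that =cut never adopts text.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical node shape: a kind, some content, and child nodes.
    kind: str
    content: str = ""
    children: list = field(default_factory=list)


def nest_text_under_commands(nodes):
    """Move each text node under the command node that precedes it.

    Only a flat, top-level list is handled; regions are left untouched.
    The command nodes passed in are modified in place.
    """
    out = []
    for node in nodes:
        prev = out[-1] if out else None
        if (
            node.kind == "text"
            and prev is not None
            and prev.kind == "command"
            and prev.content != "cut"
        ):
            # out[-1] stays the command, so later text lands there as well.
            prev.children.append(node)
        else:
            out.append(node)
    return out


if __name__ == "__main__":
    doc = [
        Node("nonpod"),
        Node("command", "head1"),
        Node("text", "first"),
        Node("text", "second"),
        Node("region", "foo", [Node("data", "raw")]),
        Node("command", "cut"),
        Node("nonpod"),
    ]
    for node in nest_text_under_commands(doc):
        print(node.kind, node.content, [c.content for c in node.children])
```

Text that follows a region or a nonpod hunk stays at the top level here, which is one of the policy choices a cleverer nester would have to make.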

Part of my earlier frustration with this problem was with questions of how to nest things. Pod5 only defines the notion of commands being responsible for more than their own paragraph in three places: (1) all non-cut commands switch the parsing mode from nonpod to Pod; (2) =begin encloses all Pod content until a matching =end; and (3) =over encloses all Pod content until a matching =back.
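Rules (2) and (3) amount to a stack of open containers. Here is a minimal sketch of that idea in Python; it is an illustration under assumptions, not the real Pod5 nester. The Node class, the name nest_pod5, and the "region"/"list"/"data" kinds are all hypothetical. Events are assumed to be (kind, content) tuples with command content such as "=begin foo". Following the example document above, blank events are dropped and the text inside a region whose target has no leading colon is merged into one data node.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical node shape: a kind, some content, and child nodes.
    kind: str
    content: str = ""
    children: list = field(default_factory=list)


def nest_pod5(events):
    """Nest a flat event list using only =begin/=end and =over/=back."""
    root = Node("document")
    stack = [root]  # innermost open container is stack[-1]
    for kind, content in events:
        top = stack[-1]
        if kind == "blank":
            continue  # blanks carry no structure at this layer
        if kind == "command":
            parts = content[1:].split(None, 1)  # drop "=", split name/arg
            name = parts[0]
            arg = parts[1].strip() if len(parts) > 1 else ""
            if name in ("begin", "over"):
                node = Node("region" if name == "begin" else "list", arg)
                top.children.append(node)
                stack.append(node)
                continue
            if name == "end":
                if top.kind != "region" or top.content != arg:
                    raise ValueError("=end %s without matching =begin" % arg)
                stack.pop()
                continue
            if name == "back":
                if top.kind != "list":
                    raise ValueError("=back without matching =over")
                stack.pop()
                continue
        if kind == "text" and top.kind == "region" and not top.content.startswith(":"):
            # Non-colon regions hold data, so merge their text into one node.
            if top.children and top.children[-1].kind == "data":
                top.children[-1].content += "\n\n" + content
            else:
                top.children.append(Node("data", content))
            continue
        top.children.append(Node(kind, content))
    if len(stack) != 1:
        raise ValueError("unclosed =begin or =over")
    return root.children
```

Headers never open a container in this sketch, so a question like "does an =over fit inside a head1?" simply never comes up at this layer.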

The original Pod::Elemental::Nester tried to do other clever things as well, like associating text paragraphs with preceding headers and headers with preceding higher-ranked headers. That led to unclear questions like “does an =over list fit inside a head1? What about inside a head4?” Worse: “how do I know whether to nest a =begin :Foo region inside a header?”

Well, I can punt. Pod5’s nester only needs to handle the begin and over commands. Other nesters can be clever in their own special way. I hope to write a nester-helper (parameterized role, anyone?) that will accept a description of a hierarchy to enforce.

Right now, I think it won’t matter. In practical terms, applying Pod5’s nester will deal with everything apart from new commands like =method and =attr that I use with Dist::Zilla’s PodPurler plugin. Since those are dealt with by en masse extraction and reinsertion, their place in the hierarchy isn’t very interesting. I suppose the simplest strategy is to apply Pod5, then apply “nest all text under immediately preceding command,” then extract things like methods and attributes, then finish nesting, then reinject the Pod generated from the extracted methods and attributes.

I’m still feeling pretty good about the way things are progressing, but I’m definitely starting to remember the weird design issues I slammed into last time. It’s fun being forced to work through them, though. I’m looking forward to seeing how the end product will work.

Written on May 29, 2009