querylet

I work from home on Mondays. I find it’s a good way to avoid having people barge in on me while I’m trying to get in the zone. At work, my office is, I think, one of the nicest in the building, as far as atmosphere goes. I have a nice lamp, a good arrangement of furniture, and good music. People stop in to ask me questions and give me information I need, and I really do enjoy the fact that my office seems to make them feel at ease.

That said, I can’t get hard work done when people are wandering in, and people don’t always understand that a closed door might mean “go away.” So, I work from home on Mondays.

Despite that, I got an urgent call last week: one of our engineers needed to run a query, and wanted me to run it for him. This annoyed me, because not only was it my day off, but running a query is so simple! “Look,” I said, “just take half an hour to learn SQL and you’ll be able to take care of this stuff yourself.”

Well, to his credit he did, and then said, “How do I use this now?” He could have used Excel or Access, but they both ruin your SQL formatting and encourage you to use the dreadful MS query design utilities. I didn’t want to point them out, so I wrote a really simple script to run an SQL query from a text file and produce CSV output for use in Excel.

I cannot explain how exciting this was to the engineer. He seemed to think it was going to change his ability to do his job! It probably will, too: if you need to analyze data to do your job, but have never had the ability to query the data in any powerful fashion, a simple tool like an SQL query must seem amazing.

It seemed to me like it might be worth my time to give that tool a little more power. “What if you could makes the queries more abstract,” I asked, “and then give it parameters when you run it, like date ranges and lists of reactors to query?” He approved of the idea, and I thought I’d make it a three part text file: SQL query, names to use when prompting for bind values, and raw Perl to run before output, just in case.

Somewhere, though – and I really can’t recall where, now – I had an incredible fit of hubris. I came to believe that the simplest and most rational thing to do would be to write a source filter that would convert a simple syntax for querying and data munging into real Perl, enabling the engineers to write powerful reports with only a rudimentary knowledge of Perl syntax.

This became Querylet.

I’ll give an example, using a querylet that I’m using quite a bit, now.

	database: dbi:SQLite2:dbname=cpants.db
	
	query:
	SELECT dist, extracts_nicely, has_version, has_readme, has_manifest,
	  has_meta_yml, has_buildtool, no_symlinks, has_tests, proper_libs,
	  no_pod_errors, is_prereq
	FROM kwalitee
	WHERE dist IN (SELECT dist FROM dist WHERE author = '[% cpanid %]')
	ORDER BY dist

This should be pretty straightforward. It’s the beginning of the querylet, and it sets up the DBI connection and creates a query template. Right now, it uses TT2 for the query, but I’ll be changing that in the near future to use more traditional bind values – I think.

Then, I tell it how to fill in the template:

	munge query:
	cpanid => uc($ARGV[0] || 'rjbs')

So, unless I say otherwise at the command line, use my cpanid.

This is simple enough already: it queries CPANTS for all my modules, and shows me their kwalitee markers. I don’t want to see markers that I’ve never failed to get, though, so I do a bit of munging:

	delete columns where:
	not(grep { ($_||'0') ne '1' } @values)

Now columns that have nothing but 1’s will be dropped.

I also don’t want to see perfect distributions:

	delete rows where:
	not(grep { ($_||0) == 0 }
	  @$row{grep { $_!~ /dist|is_prereq/ } keys %$row})

That is, delete any row that’s never got a zero in the useful kwalitee marker columns. (I’m not worried about is_prereq, since I’ve got no prereq modules (yet).)

Now, I can run this program and get a nice CSV file of my needed improvements.

That’s not so great for the command-line, though, so I might want to use an output plugin. I can just add two lines to my querylet:

	use Querylet::Output::Text;
	output format: text

and now I get a nicely formatted plaintext table, courtesy of Text::Table.

These queries are fantastic, and I can just save my utility SQL scripts for future use. Of course, back at work it’s not the best thing ever, since even bright physicists sometimes get daunted by the command interpreter. Fortunately, I can do a little rewriting for them, too. I remove the old “munge query” line and add:

	use Querylet::CGI;
	input type: cgi
	input: cpanid
	munge query:
	cpanid => $input->{cpanid}
	

Now, the querylet will run as a CGI application. If the “cpanid” CGI input can’t be found, it will present a web form asking for it. If it can – and any other inputs we might ask for can, too – it will run the query and produce an HTML table output.

There’s a lot of polish to be done, especially as regards using TT2 to produce nicer output. I need to finalize the query munging and parameterization. Even so, though, this is clearly going to make my life a lot easier.

If you’d like to see it in action, I’ve got the above querylet in place, for now. You can see the source or a demo.

Written on September 20, 2004