Pyblosxom Hack Number 1

Published: April 11, 2008
Tags: python pyblosxom

Here's my first "pyBlosxom hack". It's not really a hack on the pyBlosxom system itself, it's more of a "usage hack", but I think it's a relatively neat one.

Back when this blog was statically rendered, I used to write the entries in Markdown, and they stayed that way on the file system. The statically rendered pages were in proper HTML, however, because I used the Markdown parser for PyBlosxom. This worked just fine for static rendering, but when I went dynamic I immediately realised a huge problem with this setup. For some reason the Markdown parser is unbelievably slow. It took literally whole minutes for pyBlosxom to render the latest 10 entries, which is obviously completely unacceptable.

I found this quite odd at first, because I write my articles in Markdown too, and use Markdown in Python to translate them to HTML. I had always assumed that PyBlosxom used the same Markdown translation code - after all, why would someone code a Python Markdown library if there was already one out there? But it turns out that this is exactly what has happened. The PyBlosxom renderer uses completely different - and obviously much less efficient - code from Markdown in Python.

The obvious solution to this problem would have been to wrap Markdown in Python up in whatever interface pyBlosxom uses for parsers, but I've solved it by doing something quite different which gives me a fairly powerful interface to using pyBlosxom.

I've written a python script called makeentry which does the following:

Starts up vi, my editor of choice, editing a temporary file in /tmp. I use this editing session to write an entry in Markdown. Note that I write just the entry, without the metadata that pyBlosxom would usually want at the start, like a title or tags.
Upon the vi process terminating after I finish writing the entry, it starts up aspell to spell check that file.
After spell checking the file, it (quickly!) translates the Markdown to HTML using Markdown in Python.
I then get prompted to enter a title and list of tags.
The title, tags and translated HTML entry are then all concatenated in the expected way into a file in my pyBlosxom entry directory (the filename is automatically generated from the title by converting to lowercase and replacing spaces with underscores).
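The steps above can be sketched roughly as follows, in Python 3. This is a minimal sketch rather than the actual makeentry script. The function names, the `.txt` extension and the `#tags` metadata line are assumptions for illustration, and the Markdown conversion assumes the third-party `markdown` package is installed.

```python
import os
import subprocess
import tempfile


def title_to_filename(title, extension=".txt"):
    """Lowercase the title and replace spaces with underscores."""
    return title.strip().lower().replace(" ", "_") + extension


def build_entry(title, tags, html_body):
    """Join title, tags and HTML body into one entry.

    Assumed layout: title on the first line, then a '#tags' metadata
    line, then the body. Adjust this to whatever your setup expects.
    """
    return f"{title.strip()}\n#tags {','.join(tags)}\n{html_body.strip()}\n"


def make_entry(entry_dir):
    """Edit, spell check, convert and file one entry; return its path."""
    import markdown  # third-party package, imported only when needed

    fd, path = tempfile.mkstemp(suffix=".md")
    os.close(fd)
    try:
        subprocess.call(["vi", path])               # write the entry
        subprocess.call(["aspell", "check", path])  # interactive spell check
        with open(path, encoding="utf-8") as f:
            html_body = markdown.markdown(f.read())
    finally:
        os.remove(path)

    title = input("Title: ")
    tags = input("Tags (space separated): ").split()
    target = os.path.join(entry_dir, title_to_filename(title))
    with open(target, "w", encoding="utf-8") as f:
        f.write(build_entry(title, tags, html_body))
    return target
```

Calling `make_entry("/path/to/entries")` would run the whole session. Any extra pre-processing step would slot in between the spell check and the final write.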

This way I still get to write in Markdown, but with the following benefits over wrapping Markdown in Python up with pyBlosxom's parser interface:

I get to do spell checking (indeed, arbitrary pre-processing) before publishing my entry.
pyBlosxom reads the entry off the disk already in HTML, so no time at all is spent on translation when a page is requested (which is faster than even the fastest possible Markdown translator).

I quite like this usage paradigm. I'm hoping that sometime not too far off I get the chance to add another level of pre-processing: Pygments is a code colouring system (written in Python, of course), which translates code in just about any modern programming language into HTML with appropriate span tags to perform syntactic code colouration. I'd really like it if I could have my makeentry script search the HTML entry for code tags nested in pre tags (using HTMLParser from the Python standard library) and automatically replace the contents with colourised code using Pygments. This would be pretty cool and shouldn't be too hard. Keep an eye out for it in the nearish future.
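Here is a rough sketch of what that extra pre-processing step might look like, using `html.parser.HTMLParser` (the Python 3 home of HTMLParser). It is an untested-in-production illustration, not the finished feature. The class and function names are made up. The fixed `language="python"` default is an assumption, since the entry itself carries no language hint. The sketch also assumes there are no tags nested inside the code element, which holds for typical Markdown output.

```python
from html import escape
from html.parser import HTMLParser


class CodeBlockRewriter(HTMLParser):
    """Copy HTML through, replacing only the text inside <pre><code>."""

    def __init__(self, colourise):
        super().__init__(convert_charrefs=True)
        self.colourise = colourise  # callable: raw code text -> HTML markup
        self.out = []               # pieces of the rewritten document
        self.in_pre = False
        self.code = None            # list of chunks while inside <pre><code>

    def handle_starttag(self, tag, attrs):
        self.out.append(self.get_starttag_text())
        if tag == "pre":
            self.in_pre = True
        elif tag == "code" and self.in_pre:
            self.code = []

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br /> are copied through as written.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "code" and self.code is not None:
            self.out.append(self.colourise("".join(self.code)))
            self.code = None
        elif tag == "pre":
            self.in_pre = False
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.code is not None:
            self.code.append(data)  # entities already decoded to raw code
        else:
            self.out.append(escape(data, quote=False))


def pygments_colourise(code, language="python"):
    """Colourise code with Pygments (third-party), emitting bare spans."""
    from pygments import highlight
    from pygments.formatters import HtmlFormatter
    from pygments.lexers import get_lexer_by_name

    return highlight(code, get_lexer_by_name(language), HtmlFormatter(nowrap=True))


def colourise_code_blocks(html, colourise=pygments_colourise):
    """Return html with each <pre><code> body replaced by colourised markup."""
    parser = CodeBlockRewriter(colourise)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Passing the colouriser in as a function keeps the HTML walking separate from Pygments, so the rewriting can be tried out with a stub. Comments and doctype declarations are dropped by this sketch. The span classes Pygments emits also need a matching stylesheet, which `HtmlFormatter().get_style_defs()` can generate.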
