Now with feeds!

Published: March 02, 2008
Tags: software feedformatter

Travels

I'm back from my honeymoon! Some of you may have noticed the nifty new travel maps that are up on my homepage. I expect these will change fairly slowly over time, due to the costs of international travel. I'll try to get a working photo gallery of honeymoon shots up soon.

Web framework progress

Some of you may have noticed that there are now links from my homepage to valid RSS 1.0, RSS 2.0 and Atom 1.0 feeds for articles published on this site. These are generated by a Python module I wrote specifically for the task, which I have released on my software page as the Universal Feed Formatter (in reference to the well known, used and loved Universal Feed Parser). I was actually surprised I had to write my own module to achieve this. There is a lot of Python code on the internet for parsing various feed formats, but surprisingly little for producing the feeds themselves. I certainly couldn't find anything on the net that could take a single dictionary structure and produce files in various formats like feedformatter can. Hopefully someone else can take advantage of this convenience.
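
To give a rough idea of what "one dictionary in, several formats out" means, here is a minimal sketch using only the Python standard library. This is not feedformatter's actual API: the dictionary keys, the function names to_rss2 and to_atom, and the example.org URLs are all made up for illustration. The output is also deliberately bare (the Atom side leaves out elements such as author that a fully valid feed would need), so treat it as a sketch of the idea rather than a working feed generator.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical feed description: one plain dictionary drives every output format.
feed = {
    "title": "Example site",
    "link": "http://example.org/",
    "description": "Articles published on this site",
    "updated": "2008-03-02T00:00:00Z",
    "items": [
        {
            "title": "Now with feeds!",
            "link": "http://example.org/feeds.html",
            "summary": "The site now publishes RSS and Atom feeds.",
            "updated": "2008-03-02T00:00:00Z",
        },
    ],
}


def to_rss2(feed):
    """Render the dictionary as a bare-bones RSS 2.0 document (a string)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    for key in ("title", "link", "description"):
        ET.SubElement(channel, key).text = feed[key]
    for entry in feed["items"]:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "link").text = entry["link"]
        ET.SubElement(item, "description").text = entry["summary"]
    return ET.tostring(rss, encoding="unicode")


def to_atom(feed):
    """Render the same dictionary as a bare-bones Atom 1.0 document (a string)."""
    root = ET.Element("feed", xmlns=ATOM_NS)
    ET.SubElement(root, "title").text = feed["title"]
    ET.SubElement(root, "link", href=feed["link"])
    # Assumption: the site URL doubles as the feed's unique id.
    ET.SubElement(root, "id").text = feed["link"]
    ET.SubElement(root, "updated").text = feed["updated"]
    for entry in feed["items"]:
        node = ET.SubElement(root, "entry")
        ET.SubElement(node, "title").text = entry["title"]
        ET.SubElement(node, "link", href=entry["link"])
        ET.SubElement(node, "id").text = entry["link"]
        ET.SubElement(node, "updated").text = entry["updated"]
        ET.SubElement(node, "summary").text = entry["summary"]
    return ET.tostring(root, encoding="unicode")
```

Calling to_rss2(feed) and to_atom(feed) on the same dictionary gives two different XML documents, so supporting another format only means writing one more function rather than restructuring the data.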

feedformatter is now integrated with the simple web framework that I mentioned in my last entry. You'll also notice that I have a working (though imperfect) sitemap up as well, again generated by the framework. With these things done, I think I've now accomplished all of my original goals for this project. The code is by no means clean or reliable, so I won't be releasing it at the moment, but it works and can be progressively polished over time. I will probably do this before I begin work on implementing some sort of commenting system for my articles.

I have been thinking, vaguely, about extending the framework to include blogging, and replacing pyblosxom with it. My reason for this is not really a direct dissatisfaction with pyblosxom. It's the fact that a lot of the plugins that people write for pyblosxom do not work well (or at all!) with pyblosxom's static rendering mode (which is the only mode I will use because I refuse to dynamically render static content each time it is viewed). This deficiency is the reason that there is no pagination on this blog (yet). Eventually this will become a problem, at which point I'll either need to hack someone else's pyblosxom plugin or switch to a new blogging platform - that new platform may as well be an extension of my own framework, because that will mean one less set of templates I need to maintain to match the rest of my site.

Web log analysis

Several months ago I installed the /www/webalizer package from pkgsrc on my web server - it's a web log analyser that I run from cron every hour. It compiles basic statistics on hits to my website (most popular pages, most popular entry and exit pages, viewer country statistics based on GeoIP, etc.) and then produces HTML reports. I kept a half-hearted eye on these statistics for the first few days, but then mostly forgot about them. I revisited my stats pages earlier this week, and was pleased to see how much traffic I was apparently getting.

Intrigued, I decided to step my analyses up a bit by configuring my web server to log user agents and referring URLs in addition to the basic information already logged. Now able to see user agents, it's become clear that most of the traffic I thought I was getting was not actually from people but rather search engine crawlers. Oops. I've changed my webalizer settings now to ignore these hits, but it will be a while before I can collect meaningful statistics on the genuine human traffic.
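
For the curious, the same kind of filtering can be sketched in a few lines of Python. This is not how webalizer does it and not what I actually run; it is a hypothetical script that assumes the server writes the common "combined" log format (the one that includes referrer and user agent), and the list of crawler markers is just a handful of illustrative substrings, nowhere near complete.

```python
import re
from collections import Counter

# Combined log format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Illustrative substrings only; real crawlers use many more names than this.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def is_crawler(agent):
    """Guess whether a user agent string belongs to a search engine crawler."""
    agent = agent.lower()
    return any(marker in agent for marker in BOT_MARKERS)


def human_page_counts(lines):
    """Count hits per requested path, skipping unparseable lines and crawlers."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # not in the expected format
        if is_crawler(match.group("agent")):
            continue  # ignore search engine traffic
        parts = match.group("request").split()
        if len(parts) >= 2:
            counts[parts[1]] += 1  # parts is [method, path, protocol]
    return counts


# Made-up sample lines using documentation-only IP addresses.
sample = [
    '192.0.2.1 - - [02/Mar/2008:10:00:00 +0000] "GET /netbsd.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (X11; U; NetBSD i386) Firefox/2.0"',
    '192.0.2.2 - - [02/Mar/2008:10:00:05 +0000] "GET /netbsd.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
]
```

Running human_page_counts(sample) counts only the first line, since the second identifies itself as Googlebot. Substring matching like this is crude (it trusts whatever the client claims to be), but it is enough to show how much the numbers shrink once crawlers are removed.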

The most interesting things the log analysis reveals at this point are:

My NetBSD survival guide is the most popular page on the site. In fact, with some googling I was even able to discover that the URL for that page was given out in an OpenBSD IRC channel earlier this year! The survival guide was actually in fairly poor shape all this time, so I've put some effort into expanding and polishing it lately, given the important role it seems to play for my site. I still have a bit more to write, though, so watch that page over the next week or two for some activity.
More than one person has wound up at this blog page searching for information on Itojun's cause of death toward the end of last year. I did a lot of searching trying to find this out myself, and have come to the conclusion that there is not currently, and probably is not likely to ever be, a definite answer to this on the web. The only real leads I've found so far are a claim on the OpenBSD news site undeadly.org that it was a car accident and a claim on Slashdot that it was suicide - neither of these is substantiated by any kind of hard evidence. It seems clear that Itojun's family and close friends wish the cause to remain private, and I think the best thing would be for his well-wishers to respect that.
