Better logging
saturday, february 23rd, 2008 5:22pm
I'm entranced with a new practice: logging to a database instead of a file.
Long ago I got into a habit of logging to files as a way of monitoring the workings of my programs. For shell scripts I piped the standard output to be appended to a file, and then just sort of stuck with that model as I learned other languages.
Though java and python each have a robust logging library built into the language, I didn't use those, instead focusing my language-learning on features that more directly enabled me to tackle whatever the task at hand. The result is that over time my shell and php and python and java programs ended up with log files that grew ever larger, requiring occasional manual paring.
Given an interest in best-practices, I've begun learning about and experimenting with built-in loggers when available, but on a current project have met my logging needs via a self-rolled approach that offers real benefits.
Problem -- atomized logging
Our easyBorrow project consists of a lightweight php web interface that quickly dumps the incoming request into a database queue, where a python controller takes over, calling a series of independent web tunnelers & other web services. The whole system consists of around a dozen independent web-services of varying degrees of complexity, each with a nicely scoped focus. Most of them also write to a separate log file, which in a way makes sense, but given that the majority of these web-services serve a single goal -- to move the user's request-processing along, the atomized nature of the logs can end up being a hassle.
If something goes wrong with a request, a 'history' table does given an indication of where to start tracking down the issue -- but then I may have to look into as many as half-a-dozen separate log files to see what exactly happened. This is one of those situations where problems don't arise often enough to tackle improving the existing architecture, but just enough to make the existing one annoying at times.
Problem -- data not exposed
Keeping this background in mind, I want to note another issue that happens maybe once every three months that had a role in this new architecture with which I'm so taken. Some two years ago I implemented an automated export of requests from our iii ILS for items held in an offsite location. Those requests get exported, then parsed, and then moved to a location where a different vendor's inventory-control software takes over and presents the workers at the offsite facility a list of items that need to be retrieved.
Occasionally, very occasionally, requests don't show up for the offsite staff and I'm asked if I can confirm that the requests actually got exported and parsed and handed off to the inventory-control software. So I look in my documentation to see where the server application log files are located -- grab them and let the folk know that yes, in fact, my part of the flow worked. When this happened last month, a co-worker noted that it would be terrific if they could view the information that I'm looking up so I wouldn't have to be bothered. Unfortunately, given the existing model, that would require folk having passwords to unix servers and isn't workable. But I've ruminated upon this, and given my current evangelism of APIs and exposing data, I've thought that if I had to do that logging over, I'd expose it via a web interface.
Solution
Now I'm working on a new project, or rather tackling one that's been on the back-burner far too long: exporting newly-cataloged item information from our closed and unfriendly iii ILS into a database where we can present users with useful new-item info and feeds. Like more and more projects these days, this one has many pieces, each of which, had I done this a year-plus ago, would have logged its inner workings to a separate file.
Now though, I'm logging the export script info, the posting script info, and the parsing script info to a single database table. And because one of the scripts lives on a server that doesn't have a library setup to interface with mysql, I'm 'writing' to the db by POSTing that script's log-entry info to a url which then saves it to the db. The log-table consists of (in addition to an unseen auto-incrementing id) a datestamp, an identifier, and the log message. The 'identifier' in this case is a simple number that allows me to group the entries from different sources together in the log. When I eventually apply this beauteous system back to easyBorrow, the identifier will be the request-number the system assigns early on in the process. The function/method in each separate script that writes to the log also takes a detail-level parameter, allowing me to specify a high-level of logging detail in development code, and a low-level in the ongoing in-production code.
This system is sweet. It means that I have only to look in one place to monitor the flow of all three scripts. So if the export cron job fires off at 2am, and the POST cron job fires off at 3am, and the parser cron job fires off at 4am, I can see the whole flow in one view.
Though all developers can write to a database in their sleep, since I'm writing to a django-managed table, it is and feels even easier. For those who haven't yet drunk framework kool-aid:
log_entry = Log()
log_entry.identifier = 'the_identifier'
log_entry.message = 'the_message'
log_entry.save()
Wrapping a function around this allows my log entries to look like:
updateLog( detail='low_detail', identifier='the_identifier', message='the_message' )
But wait, in true Ronco spirit, it gets better... Since I'm writing to a django-managed table, I automatically, without writing extra code, have a complete, useful, sortable and searchable web-interface -- with built-in authentication -- which means that not only can I view the flow of processing, I can easily allow anyone else to view the flow of processing by supplying a url.
The final sprinkle of luscious magic is that django makes it very easy to overwrite the built in save method of its objects. So I've added a bit of logic to the Log object's save method to delete entries older than X days (a configurable number I've put in a settings file). There's a bit of a hack in this solution. The absolute simplest code to write in this save method is just to query for all log-entries older than X days and delete them, which is what I've initially done. But this is unnecessarily expensive database access for every single log-entry, though mitigated by the fact that for this project, the scripts run only once a day and in production, log lightly. A better approach would be to have a separate job run once a day or week and perform the deletions, and I may implement that, though I've been mentally toying with an oddly enjoyable interim hack: to have the save method come up a random number such that it would have, say, a 1 in a 100 chance of running the delete code. Bottom line, though, is that auto-deletion is taken care of right up front.
Put all of these improvements together, and the new system offers more useful, more accessible, and better-sized logging.
Comment architecture not yet implemented -- for now email comments to birkin.diana@gmail.com; selected comments will be displayed here.