Lars Wirzenius: Random hacks, 2005
Contents
- December 18: deb-from-bzr
- August 10: Xchat whiteout
- July 01: mini-fortune
- May 11: Encrypted laptop disk, Web log software update
- March 28: Gave away SoundConverter
- March 27: RSS generation and ampersands
- February 17: SoundConverter - new developer?
- January 21: Mail processing
Sunday, December 18, 2005
Random hacks: deb-from-bzr
I wrote a little script to help me maintain packages using bzr (bazaar-ng). It builds source and binary packages and tests them with lintian, linda, and piuparts.
Wednesday, August 10, 2005
Random hacks: Xchat whiteout
I have long been reluctant to use /ignore
on IRC. I do not have a moral problem with ignoring unpleasant
people, but I do have a practical problem: it can be confusing
to see only responses to the ignored person and it is not
always clear from context that they are responses. To reduce
this confusion, it would be better if what the ignored person
writes is shown with the same foreground and background
color. I'm told irssi can do this already, but I like xchat
and I don't want to switch, so I wrote a whiteout plugin. I've
only just started to use it, so it may well be buggy.
Friday, July 01, 2005
Random hacks: mini-fortune
I'm slowly collecting a fortune cookie file of my own.
It is quite small, and I don't feel like setting up an index
file with strfile, so I wrote a trivial mini-fortune file
instead. It turned out to be quite simple:
    #!/usr/bin/python

    import fileinput, random

    def process_cookie(cookie, chosen, cookies):
        cookies += 1
        if random.randint(1, cookies) == 1:
            chosen = cookie
        return chosen, cookies

    chosen = ""
    cookies = 0
    cookie = ""
    for line in fileinput.input():
        if line == "%%\n" or (chosen and fileinput.isfirstline()):
            chosen, cookies = process_cookie(cookie, chosen, cookies)
            cookie = ""
        else:
            cookie += line
    if cookie:
        chosen, cookies = process_cookie(cookie, chosen, cookies)
    print chosen,
The algorithm is explained here: Perl Cookbook: Picking a Random Line from a File. It reads through the entire file, but since all the machines I expect to run this on are relatively fast, even a few hundred kilobytes of fortune cookies (and that's a lot of them) isn't significantly slow.
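For readers who want to experiment with the selection step on its own, here is a minimal Python 3 sketch of the same single-pass idea (the n-th cookie replaces the current choice with probability 1/n). It is not the script above: the function name pick_random_cookie, the separator and rng parameters, and the choice to skip empty cookies are all assumptions made for illustration.

```python
import random


def pick_random_cookie(lines, separator="%%\n", rng=random):
    """Return one cookie chosen uniformly at random in a single pass.

    `lines` is any iterable of text lines. `separator` and `rng` are
    illustrative parameters, not part of the original script.
    """
    chosen = ""
    count = 0
    current = []

    def consider(cookie):
        nonlocal chosen, count
        count += 1
        # The n-th cookie replaces the current choice with probability 1/n,
        # which leaves every cookie equally likely once the input ends.
        if rng.randint(1, count) == 1:
            chosen = cookie

    for line in lines:
        if line == separator:
            if current:  # assumption: empty cookies are skipped
                consider("".join(current))
            current = []
        else:
            current.append(line)
    if current:
        consider("".join(current))
    return chosen
```

Passing an open file object as lines works, since files iterate line by line; passing a seeded random.Random instance as rng makes the choice repeatable for testing.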
Wednesday, May 11, 2005
Random hacks: Encrypted laptop disk
I re-installed esme, my laptop, again. This was my third
installation of Debian on it since I bought it just before
Christmas. The reason for the re-installation this time
was that I wanted an encrypted hard disk. Luckily, the
cryptsetup and kernel packages in Debian made this really,
really easy. Yay for their respective maintainers.
Here's a brief summary of what I did: first, I wiped out
everything on the hard disk, and installed Debian from
scratch. I made three partitions: a half-gigabyte one for
/boot, a gigabyte one for swap space, and the rest for root.
I then installed a small Debian into the swap partition,
using the sarge RC3 netinst CD. Then I switched sources.list
to use unstable, and installed cryptsetup and a 2.6.10 kernel
(since certain parts of my laptop hardware require that), and
rebooted.
After this, I just followed the instructions in the
CryptoRoot.HowTo file in the cryptsetup
documentation directory to make the third partition
encrypted and move over the installation to that. Finally,
I converted the swap partition into an encrypted
swap space (following instructions in CryptoSwap.HowTo).
So far everything works fine. The system does not even feel any slower than it was: as long as there is no heavy disk I/O, everything is snappy. Any heavy disk I/O kills interactivity, but that happened beforehand as well. I don't do much heavy disk stuff on my laptop anyway.
End result: hopefully my laptop is a tad more secure in case it gets stolen (when turned off or with the screensaver running at least). I do have yet another password to remember, though, which is annoying, but worth it.
Random hacks: Web log software update
A few days ago I made some changes to my web log software. I don't think anyone noticed, which is as it should be. I admit that it is easy to make mistakes when writing software that generates RSS files, and that this is why so many people accidentally flood web log aggregators such as Planet Debian when they update. It is, however, annoying and avoidable.
One of the things I built into my software a couple of years ago is a limit on how many entries it puts into the RSS feed. Since feeds are supposed to be polled fairly often, having at most, say, four days' worth of entries seems reasonable. Thus, if the software makes a mistake, there won't be all that many entries flooded anyway. Damage control is good.
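A limit like this takes only a few lines. The following is a rough Python sketch of the idea, not the code from my web log software: the function name entries_for_feed, the (published, title) tuple layout, and the max_age_days parameter are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def entries_for_feed(entries, now, max_age_days=4):
    """Return only the entries young enough for the feed, newest first.

    `entries` is assumed to be a list of (published, title) tuples,
    where `published` is a datetime.
    """
    cutoff = now - timedelta(days=max_age_days)
    recent = [entry for entry in entries if entry[0] >= cutoff]
    return sorted(recent, key=lambda entry: entry[0], reverse=True)


if __name__ == "__main__":
    log = [
        (datetime(2005, 3, 28), "Gave away SoundConverter"),
        (datetime(2005, 5, 8), "A recent entry"),
        (datetime(2005, 5, 10), "A newer entry"),
    ]
    for published, title in entries_for_feed(log, now=datetime(2005, 5, 11, 12, 0)):
        print(published.date(), title)
```

With a cutoff like this, even a bug that touches every old entry can flood an aggregator with only a few days' worth of posts.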
I also used my web log software as the first guinea pig for testing Bazaar, one of the implementations of the arch version control system. I may put it up for others to access one day, after I figure out the details for that. It's still pretty specific to my needs, though. I don't have any intention of starting to compete in this area.
Monday, March 28, 2005
Random hacks: Gave away SoundConverter
Some time ago I decided that I didn't have the time and will to hack on my SoundConverter application. It did what I wanted it to do, but people wanted MP3 encoding support, and maybe some other things as well. Gautier Portet offered to take over the program, and he has now released two new versions. See http://soundconverter.berlios.de/ for the new versions.
I haven't managed to run version 0.7 on Debian yet, but version 0.6 works, and has MP3 encoding. After I do get it running on Debian again, I should probably offer to package it.
Sunday, March 27, 2005
Random hacks: RSS generation and ampersands
Jose Carlos Garcia Sogo and Mike Beattie write about ampersands
in Jose's log breaking Planet Debian and its aggregated RSS
feed. I didn't find the problematic RSS snippet anymore, but
here's what I think the situation is: RSS (at least version 2,
which I use, and I think the others as well) requires the HTML
content to be entity escaped. In other words, if you want an
ampersand in the final output, the HTML to create it must be
&amp;, and this must then be encoded in the RSS file as
&amp;amp;.
You have to do the same escaping for the less-than and greater-than characters as well.
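The two levels of escaping can be demonstrated with Python's standard html.escape function. This is only an illustrative sketch, not the code from my web log scripts; the helper names html_text and rss_description are made up for the example.

```python
from html import escape


def html_text(text):
    """First level: turn plain text into HTML-safe text."""
    return escape(text, quote=False)


def rss_description(html):
    """Second level: escape finished HTML for embedding in an RSS element."""
    return escape(html, quote=False)


if __name__ == "__main__":
    fragment = "<p>" + html_text("Q&A") + "</p>"
    print(fragment)                   # the HTML fragment
    print(rss_description(fragment))  # the same fragment as RSS content
```

The first call produces the HTML fragment, in which the ampersand appears as &amp;amp; when seen from the RSS side after the second call; the angle brackets of the tags are escaped in that second step as well.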
I struggled with this a couple of years ago when I wrote my own web log scripts. I wrote them because I wanted to have the web log pages integrated with the rest of my pages, and because I am a NIHolic, but in case they are of use for anyone, I put a tarball up. Note that they are likely to not work for you directly, but they might be helpful in looking at how RSS is generated.
For debugging RSS feeds, I found feedvalidator.org extremely useful. It doesn't respond to me now, but I hope that is temporary. It is a validator for RSS and Atom feeds, and validation is most helpful when you are unsure if your stuff is correct or not.
Thursday, February 17, 2005
Random hacks: SoundConverter - new developer?
My SoundConverter program has not seen any development for some time now. Particularly MP3 encoding support would be nice, plus various tweaks to the user interface to make it follow the GNOME Human Interface Guidelines better. I'm sure there are other things as well that could be improved. I don't seem to find interest in it, however, since the program works quite well enough for me and I am quite busy enough with other projects.
Would someone like to take it over? It is a simple thing, written in Python, using PyGTK, libglade, and GStreamer. Mail me, if you're interested.
Friday, January 21, 2005
Random hacks: Mail processing
I've now been using crm114 instead of bogofilter for a few weeks. On the whole, it seems to me that crm114 filters about as well as bogofilter, possibly a bit better, except that it has not yet learned that certain kinds of mails are not spam. Specifically, Debian bug reports and mails about DebConf5 are often classified as spam, though not always.
On the whole, then, I'm happy with crm114. It promises to not break frequently due to database on-disk format changes, the way bogofilter does. This was my main reason for switching.
While on the subject of mail processing, I thought I'd
describe what I do. All my private (non-work) mail to
various addresses is redirected to one place, since it is
easier for me to deal with just one inbox rather than
several. At this one place, my mail server, I run
procmail. Over the years, my .procmailrc has variously been
really complicated and really simple. I prefer simple, since things
break less that way. The first procmail rule I have is one
that makes a copy of all incoming mails in an archive
folder. This is important: as long as this rule works, all
incoming mail can be retrieved from the archive folder even
if later processing breaks.
    :0c
    backups/mail/backup-`date +%Y-%m-%d`.maildir/
The way the rule is written, the archive folder is per-day. The next rule filters the mail through crm114:
    :0fw: .msgid.lock
    | /usr/share/crm114/mailfilter.crm -u $HOME/.crm114/
This rule makes procmail use the output of crm114 for the remaining rules. Spam is then put in a separate folder, which I occasionally check to correct any mistakes crm114 makes.
    :0
    * ^X-CRM114-Status: SPAM.*
    spam.maildir/
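The spam rule only looks at the header that the filter added to the message. For anyone who wants to do the same check outside procmail, here is a hedged Python sketch using the standard email module; the function name is_spam is an assumption, and the comparison is case-insensitive to mirror procmail's default matching.

```python
from email import message_from_string


def is_spam(raw_message):
    """Return True if the X-CRM114-Status header starts with SPAM.

    A message without the header is treated as not spam, just as the
    procmail rule would simply not match it.
    """
    message = message_from_string(raw_message)
    status = message.get("X-CRM114-Status", "")
    return status.strip().upper().startswith("SPAM")
```

A script reading one message on standard input could call is_spam(sys.stdin.read()) and choose the destination folder from the result.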
crm114, as packaged for Debian, was not quite as nice to
set up for use as bogofilter is. I ended up creating a
directory ~/.crm114 and making symlinks there to
/usr/share/doc/crm114/examples/crmfilter. mailfilter.cf
needed to be copied so I could change it, of course.
Sometimes crm114 makes mistakes. When it thinks a valid
mail is spam, the mail is put into spam.maildir, which I
read with mutt. To teach crm114 I move the mail to
not-spam.maildir and run a script that does the teaching.
I have a macro in mutt to do the moving easily:
macro index S "<save-message>~/not-spam.maildir/\ny"
The crm114 re-education script:
    #!/bin/bash

    find $HOME/not-spam.maildir/{cur,new} -type f |
    while read mail
    do
        /usr/share/crm114/mailfilter.crm -u $HOME/.crm114 --learnnonspam \
            < "$mail" > /dev/null
        mv "$mail" $HOME/Maildir/new
    done
When crm114 misclassifies a spam as valid mail, it gets downloaded to my laptop by Evolution. In Evolution, I move the spam to a folder (called "is too spam" in Finnish), which a script then copies to the server and runs through crm114. Similarly for
    #!/bin/sh -e

    base=$HOME/.evolution/mail/local/Inbox.sbd

    doit()
    {
        if [ -s "$base/$1" ]
        then
            echo "$1..."
            formail < "$base/$1" \
                -I X-CRM114-Status -I X-CRM114-Action -I X-CRM114-Version \
                -s ssh pieni.net /usr/share/crm114/mailfilter.crm \
                -u /home/liw/.crm114/ "$2" > /dev/null
        fi
    }

    doit "Onpas roskaa" --learnspam
On the whole, this setup works pretty well. It is not quite as smooth as using a filter well integrated with Evolution would be (and there is one), but I need the filtering done on my mail server, since the filter also protects the mailing lists I run.