Lars Wirzenius: November, 2003

Tuesday, November 18, 2003

Quote: Yannis's Law

Yannis's Law:

Programmer productivity doubles every 6 years.

He even has the numbers to prove it. Even if he is being provocative, he might be onto something.


Oliotalo: Remote debugging

The wonderful world of remote embedded development: I'm debugging a piece of code, written in Hedgehog Lisp, which works fine on the test box at the office and crashes on the production box somewhere in the middle of a forest. I have no access to the debugging logs the box writes to a serial port. I do have, however, access to the logs of the web server from where the box fetches its byte code. Also, I can (most of the time) make the box send me an SMS message.

I therefore try to pinpoint the exact place where the code crashes in the production box by watching differences between timestamps for when the byte code is fetched. If the difference is less than two minutes, the program crashed. If it is ten minutes, and I get an SMS, then it worked.
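The timestamp comparison can be sketched roughly like this. It is only an illustration: the two-minute and ten-minute thresholds come from the description above, while the function name, the sample times, and the "unclear" label for gaps in between are assumptions. Parsing the web server log into timestamps is left out.

```python
from datetime import datetime, timedelta

CRASH_GAP = timedelta(minutes=2)    # a shorter gap means the box restarted early
NORMAL_GAP = timedelta(minutes=10)  # gap seen when a run completes

def classify_gaps(fetch_times):
    """Label the interval after each byte code fetch (hypothetical helper)."""
    times = sorted(fetch_times)
    verdicts = []
    for earlier, later in zip(times, times[1:]):
        gap = later - earlier
        if gap < CRASH_GAP:
            verdicts.append((earlier, gap, "crashed"))
        elif gap >= NORMAL_GAP:
            verdicts.append((earlier, gap, "ok if the SMS arrived"))
        else:
            # The post does not say what gaps between 2 and 10 minutes mean.
            verdicts.append((earlier, gap, "unclear"))
    return verdicts

# Made-up fetch times, not real log data.
fetches = [
    datetime(2003, 11, 18, 10, 0, 0),
    datetime(2003, 11, 18, 10, 1, 30),
    datetime(2003, 11, 18, 10, 11, 30),
]
for start, gap, verdict in classify_gaps(fetches):
    print(start.isoformat(), gap, verdict)
```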

Monday, November 17, 2003

Quote: Commenting code

I was reading an old thread on commenting code in comp.programming. The funniest example was this:

i++; /* TODO: make this pre-increment - more efficient */

That comment is wrong in so many ways it is impossible not to laugh at it. The funniest stupid comment I have experienced, however, is this:

/* The following line has been commented out */

Presumably someone, somewhere collects such comments on a webpage.

Sunday, November 16, 2003

Oliotalo: Long day

Spent over 24 hours at the office, starting on Thursday afternoon. One long hacking session, mostly, but towards the end it was mostly waiting for the customer to get the box installed and start testing. They had some trouble with it, so in the end they didn't get to test much. More to happen on Monday, if I don't feel too weak from the flu.

This was the first all-night hacking session since the bankruptcy of Wapit two and a half years ago. It felt pretty good, while it lasted. Combined with a flu it was rather heavier than I thought, and it's taken Friday and Saturday to recover. Not a good investment in general, I guess, though it was worth it this time.


Lodju: 1.99.7

Made a new release, 1.99.7. It should now be feature complete with regard to a 2.0 release, and hopefully there aren't any showstopper bugs, either. I'll make the actual 2.0 release in a week or so. This should give me time to update the web site, e.g., with new screenshots and stuff.


Debian: convmv

Sponsored an upload of convmv, a tool for changing the character set or encoding of filenames (not the contents of files). I'm going to need it once I start converting to UTF-8. Also fixed a bug in publib (upstream and in Debian), specifically its base64 functions. This triggered a "failure to build from source" bug, which I also fixed. All in all, relatively speaking a very productive weekend as far as Debian is concerned, though in absolute terms it's pretty much nothing.
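The core idea of re-encoding a filename, as opposed to file contents, can be shown in a few lines. This is a minimal sketch of the concept in Python, not how convmv itself works; the function name and the sample filename are made up for illustration.

```python
def convert_name(raw_name: bytes, from_enc: str, to_enc: str) -> bytes:
    """Re-encode the bytes of a filename from one character encoding to another."""
    return raw_name.decode(from_enc).encode(to_enc)

# A Latin-1 encoded name with two a-umlauts becomes its UTF-8 equivalent.
old = b"p\xe4iv\xe4.txt"
new = convert_name(old, "latin-1", "utf-8")
print(old, "->", new)
```

An actual rename would then pass both byte strings to something like os.rename, after checking that the new name does not already exist.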

The convmv program reminded me that I need to start looking for a modern, Unicode-capable editor in earnest.

Wednesday, November 12, 2003

Random thought: Headhunters

The economy must be getting better. I've had more headhunter calls this month than in all of last year.


Rant: Embedded Linux

The more time I spend in the embedded area of computing, the better I understand the strong craving many people have for Linux as an embedded operating system. Analysts seem to think it is because the marketplace wants a standardized solution or because Linux is cheaper than the alternatives. I say it is because traditional embedded operating systems are written by hardware people who just barely understand programming and utterly despise programmers. No-one in their right mind wants to touch such operating systems. They're difficult to program for, and they do little to help you.

Even though Linux itself can occasionally be aggravating, at least the application programmer does not have to fight daily with code written by hardware engineers. Given that hardware people and software people are natural enemies, perhaps it is not surprising that the hardware people just love to put in nice surprises for the application programmer. Things like, say, magic and undocumented limits on how much memory you can use for static variables.

I've spent yesterday and today finding out that if a particular four kilobyte array is allocated statically, it will overlap with the C runtime stack. Allocating it dynamically makes things work. Blah. (Of course, to make things even more interesting, it might be that I've merely hidden the problem. We'll see.)


Quote: Dijkstra

Edsger W. Dijkstra is always a good source of quotations. Today, I found How do we tell truths that might hurt?, which he seems to have written in 1975. My favorite bullet point:

Simplicity is prerequisite for reliability.

I run into this all the time in my job. Whenever we do something complicated on the embedded side, things break. Whenever the box manufacturer has done something complicated, things break.

Monday, November 10, 2003

Review: To say nothing of the dog by Connie Willis

To say nothing of the dog by Connie Willis was recommended to me about as strongly as any book, ever, and by several people. This may have been the reason why I failed to get really enthusiastic about it.

The story is actually quite fun, and there is both suspense and romance, both of which I'm addicted to. It is well written and, given the implicit assumption that time travel is possible, feels quite realistic. It's just that after all the hype I was expecting something that would blow my mind, and it didn't. It's definitely worth reading, but I'm not joining the fan club.

I've got to stop listening to people's opinions about books I haven't read, I guess. I already do it for movies.


Random thought: Discussions

I seem to be less and less able to discuss deep or important things intelligently with most people as I get older. My patience is now very short with people who say stupid things and argue by twisting my words into something they can attack, and that's how discussions with most people end up. I prefer the kinds of discussions where there is a desire to understand the other party, even if you don't agree, and no twisting or putting words in each other's mouths. A good discussion is not a battle, but it could be brains having a romance.

Thursday, November 6, 2003

Rant: Systematic API design

Imagine this: you're designing the system software for an embedded computer. Your hardware has various interfaces, such as analog and digital inputs, serial ports, various buses popular in the embedded world, and communication devices such as a GSM modem. Your task is to add support to the library for each of the subsystems. How do you design your application programming interfaces?

What you could do is invent a new style of interface for each subsystem. For example, for the serial ports, you might want the application programmer to write a handler, which the interrupt routine for the serial port will call when the input buffer gets full. The TCP/IP subsystem could send an inter-task message to the main application task, which then fetches the actual input string from a buffer via a system call. The I2C bus is simple, so it gets a library function that just sends a message, or queries a device, as an atomic operation. And so on.

Just for fun, you could make the GPS unit be accessible via a serial port, but make it be just slightly different from all the other serial ports. A perfect way would be to not use the buffer the application programmer has defined, and instead pass a pointer to the actual buffer to the handler function called by the interrupt routine.

Of course, to make it easier to use the GPS unit, you will provide a nice module to deal with the unit and parse its output into a format that is easier to handle than the string encodings. To make things more interesting, however, you could put the module outside the library, so that it has to be compiled manually by the application programmer. You'll make the module send the GPS location as an inter-task message. To make things absolutely hilarious, you will make it wrong to manually free the memory allocated for this inter-task message, even though all other messages must be so freed.

Additional comedy can be generated by naming your C functions and types with semi-random prefixes. Say, occasionally capitalize the prefix you use, and occasionally change the prefix into a suffix.

If you document all your interfaces, everything should be fine, yes? In an abstract way, yes, everything should be fine. In practice, your system is a nightmare to write applications for. When nothing in the system is orthogonal, every little detail must be checked from the documentation, header files, examples, or other places. Not just the first time, either, but every time. Un-orthogonal interfaces are hard to remember, as well as being hard to learn. They make development slower, and cause more bugs. Evil things. Evil.

All idiocies in this rant are completely fictional, of course. No-one would write such a system in real life. Of course not. That would be stupid. Yes, indeed it would, and so they didn't. Please?

Wednesday, November 5, 2003

Review: ANSI Common Lisp by Paul Graham

I got ANSI Common Lisp by Paul Graham as a birthday present a few weeks ago. I've been reading it on the way to work, and when eating alone in restaurants, and so on. I'm not sure I'll ever write anything in Common Lisp: at least on the surface it doesn't seem like my kind of language, being big and full of nifty but ugly tricks and stuff. However, the book is very nice, and it broadened my mind about programming languages in general, which is always a good thing. Also it gave me some ideas and inspirations for the further development of Hedgehog Lisp.

If I were studying Common Lisp, I would love the book. It is clear, logical, occasionally funny, and has good examples. It doesn't just explain what features the language has, but gives advice on how to use them properly.


Review: Computer networks, 3rd edition by Andrew S. Tanenbaum

I bought Andrew S. Tanenbaum's Computer networks, 3rd edition used, from a friend, a couple of years ago. Since then, I've had it in my bathroom and read it when attending to the final steps of digestion. As usual for Tanenbaum, the book is very well written, both by being thorough and correct, and also by being fun to read. The edition I have is perhaps a bit old by now, but on the other hand, not all that much has changed in the world of computer networks since 1996. Computers and networks have become faster, and the Internet has become bigger. Abuse is more common. Wireless networking is more common. More stuff happens on the net, but nothing really revolutionary, on the networking level, has happened. That's all right - the interesting stuff happens on the application level, which is mostly outside the scope of this book.

The book is perhaps a bit on the heavy side for a general introduction to computer networking. It is, after all, a university level textbook. For anyone already into the technical side of computers it should be simple, and even if the book is somewhat thick, it is easy enough to read, and should be a fairly quick read, if you need to learn about networking in a hurry. (It's not so quick if you read a page or two at a time, while sitting in the toilet.)


Review: Nokia 3510i

I bought a new phone yesterday. My Nokia 7110 has never been a particularly good phone, but in the past months it has been getting worse: the lid only works part of the time, and even at its best it causes disturbances during a call. Signs of age; it is, after all, four years old. So I went and bought a Nokia 3510i. It's cheap, and it seems to fill all of my requirements for a phone from October 8.

It's not perfect. My biggest gripe is that except in extremely good light, the screen is unreadable unless backlit. If I want to, say, check the time of day, I first have to press a key or unlock the keypad to make the screen backlit. Of course, when the screen is backlit, it is pretty good.

The other big gripe is that it requires more keystrokes to start writing a new text message than the old phone did. I'll live with that. Also, the phone has lots of useless stuff in it, which I wish it didn't have, but that doesn't seem to have caused any bad effects yet. The keyboard is somewhat rubbery and insensitive, but I'll live with that as well.

On the other hand, it has the perfect feature that no-one calls me anymore. Ever since I moved my SIM card to the new phone, I've only received one call. This is nice.

Unless I find something really badly wrong with this phone, I'll hopefully be able to use it for the next three or four years at least. That would be nice. I hate having to replace my phone all the time.

Tuesday, November 4, 2003

Random hacks: Kirjavaliot fortune cookies

Niksu maintains Kirjavaliot, a list of humorous one-liner summaries of literary works. (In Finnish, sorry.) I wrote a small XSLT script and Makefile to convert the raw XML data file to a data file for fortune. Not the world's most useful thing, but fun.
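For readers who have not seen one, a fortune data file is just text with the cookies separated by lines containing a single percent sign. The conversion was done with XSLT; the sketch below does the same kind of thing in Python instead, purely as an illustration. The XML element names (entry, author, title, summary) are assumptions, since the real Kirjavaliot file format is not shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical input structure; the real data file may look different.
SAMPLE_XML = """\
<kirjavaliot>
  <entry>
    <author>Henri Charriere</author>
    <title>Papillon</title>
    <summary>Vanki karkaa. Vanki karkaa. Vanki karkaa. Vanki karkaa.</summary>
  </entry>
</kirjavaliot>
"""

def xml_to_fortune(xml_text):
    """Turn each <entry> into one cookie, with '%' lines between cookies."""
    root = ET.fromstring(xml_text)
    cookies = []
    for entry in root.iter("entry"):
        author = entry.findtext("author", default="").strip()
        title = entry.findtext("title", default="").strip()
        summary = entry.findtext("summary", default="").strip()
        cookies.append(f"{author}, {title}: {summary}\n")
    return "%\n".join(cookies)

print(xml_to_fortune(SAMPLE_XML), end="")
```

The classic fortune program also wants an index file next to the text file, which the strfile tool generates.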

Henri Charriere, Papillon: Vanki karkaa. Vanki karkaa. Vanki karkaa. Vanki karkaa. (_1_)

See my programs page for the code.


Photography: Short introduction to photography

My friend Janka got a camera as her birthday present. She's very happy about this, and this inspired me to write a short introduction to photography so as to not keep flooding her on IRC with opinions on photography. Perhaps it is of some use for someone, though of course the net is full of photography tutorials. At least mine tries to be very short.

Monday, November 3, 2003

Enemies of Carlotta: List name in subject and adding Reply-To

Someone has sent me a patch to prepend the name of a list to the subject of a message, and to add a Reply-To header to point back at the list. Both are unconditional.

If I were to add such features, the minimum requirement would be that they are optional and turned off by default. However, I am opposed to both features, and won't be adding them at all.

Prepending stuff to the subject has the drawback that it makes less of the subject visible in a list of messages. For example, mutt, a popular mail program for Unix-like systems, has room for about 40 characters of the subject. Using, say, 10 or 15 characters for the name of the list leaves very little room for the actual subject.

The usual reason people want the prepending feature is to help put list mails automatically in appropriate folders, or to make it easier to see at a glance which mails are list mails and which are private ones. Unfortunately, it is bad for both. Suppose you write mail to a list, and the list does the prepend thing and the subject becomes "[cat-lovers] Hair styles". When people respond to your mail, the subject becomes "Re: [cat-lovers] Hair styles". This is what the subject is, regardless of whether the replies are via the list or directly to you. This is why subject tags are not good for sorting, automatic or visual.

A better way is for the mailing list software to add a special header to identify the list: List-ID: cat-lovers@liw.iki.fi. There is even an RFC for this: RFC 2919. And it's not just any old RFC, it is one in the standards track. Enemies of Carlotta naturally supports it already.
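The difference between the two approaches can be seen in a small sketch using Python's standard email module. The two sample messages and the helper names are made up for illustration, and the header value follows the example above rather than any particular list software's output.

```python
from email import message_from_string

# The same reply, once delivered via the list and once sent privately.
VIA_LIST = """\
From: arthur@example.com
To: cat-lovers@liw.iki.fi
Subject: Re: [cat-lovers] Hair styles
List-ID: cat-lovers@liw.iki.fi

Public reply.
"""

PRIVATE = """\
From: arthur@example.com
To: you@example.com
Subject: Re: [cat-lovers] Hair styles

Private reply.
"""

def has_subject_tag(raw_message, tag):
    """Sort by subject tag: matches anything that quotes the tag."""
    return tag in message_from_string(raw_message).get("Subject", "")

def is_list_mail(raw_message, list_id):
    """Sort by list header: matches only mail the list actually delivered."""
    # Header name lookup in the email module is case-insensitive.
    return list_id in message_from_string(raw_message).get("List-Id", "")

for name, raw in (("via list", VIA_LIST), ("private", PRIVATE)):
    print(name,
          has_subject_tag(raw, "[cat-lovers]"),
          is_list_mail(raw, "cat-lovers@liw.iki.fi"))
```

The subject test says yes to both messages, while the header test only says yes to the one that really came through the list.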

The Reply-To problem is thornier. The real problem is that mail software has no way to know which addresses are lists, and which are not, and whether non-list addresses are subscribed to the lists or not. On Usenet, things are easier: newsgroup names are in the Newsgroups header, and the e-mail address in the From header. On Usenet, if you want to send a public reply, the software knows how to send one only to the newsgroup. Vice versa: if the reply is meant to be private, the group does not get a copy.

Suppose a mail arrives via a mailing list. It might have the following address headers:

From: arthur@example.com
To: list@example.com

Typical mail user software has two different reply functions: reply to author, and reply to group. The former sends a reply to the address in the From header, the latter to every address it can find. If you want to make a private reply, the former works well. For a public reply, the latter is what people usually use, and it results in the following:

From: you@example.com
To: arthur@example.com, list@example.com

If Arthur is subscribed to the mailing list, he gets two copies, which is wasteful and can be annoying. It is especially annoying if the copies arrive at different times, which can easily happen.

To fix this, many people like to propose that the list add a Reply-To header, which causes only the addresses in that header to be used for replies. Unfortunately, it has problems. The addresses in the Reply-To header should be used when constructing a reply to the message. The headers might look like this:

From: arthur@example.com
To: list@example.com
Reply-To: list@example.com

If you reply to such a mail, the headers will look like this:

From: you@example.com
To: list@example.com

For a public reply, these headers are perfect, but for a private reply they are disastrous. This is the biggest problem with a list-specified Reply-To: it makes it very easy to mistakenly send a private reply to the list, which can be extremely embarrassing. Especially on a list that has a public web archive.
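The two reply functions, and what a list-added Reply-To does to them, can be sketched like this. It is a simplified model under stated assumptions, not the behaviour of any particular mail program: the function names are invented, and real clients differ in how they combine Reply-To with the other headers.

```python
from email import message_from_string
from email.utils import getaddresses

PLAIN = message_from_string(
    "From: arthur@example.com\nTo: list@example.com\n\nHello.\n")
MUNGED = message_from_string(
    "From: arthur@example.com\nTo: list@example.com\n"
    "Reply-To: list@example.com\n\nHello.\n")

def reply_to_author(msg):
    """Plain reply: use Reply-To if the message has one, else From."""
    source = msg.get_all("Reply-To") or msg.get_all("From", [])
    return [addr for _, addr in getaddresses(source)]

def reply_to_all(msg):
    """Group reply: every address found in From, To and Cc, without repeats."""
    fields = (msg.get_all("From", []) + msg.get_all("To", [])
              + msg.get_all("Cc", []))
    recipients = []
    for _, addr in getaddresses(fields):
        if addr not in recipients:
            recipients.append(addr)
    return recipients

print(reply_to_all(PLAIN))      # Arthur and the list: he may get two copies
print(reply_to_author(PLAIN))   # only Arthur: a private reply stays private
print(reply_to_author(MUNGED))  # only the list: the "private" reply is public
```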

There are other, smaller problems, such as what should happen if the message already had a Reply-To header when the list software got it? Imagine, for example, that you send a mail from your work address, but want private replies to be sent to your home address. Reply-To is the header to use for this. If the list overrides the existing header, then replies will go to the wrong place. If, on the other hand, the list does not override, then for this message replies will work differently than for all other messages on the list: where a simple reply will go to the list for other messages, for this message they go somewhere else. This can be quite confusing.

There is no good solution to this mess. The best interim solution, not only in my opinion, is to not add a Reply-To automatically, and ask people to remove extra addresses from the recipients when making public replies. This is a bit of extra work for everyone, but has the least likelihood of causing really embarrassing mistakes.

In the long run, the best solution is to make mail software aware of which addresses are list addresses and which are not. For example, Evolution, which I use, has a third reply function: reply to list. It automatically deduces from the headers (using List-ID and other such headers) which address is the list address, and then sends the reply only there. It still doesn't work perfectly, for example in the presence of messages cross-posted to several lists, but it works pretty well.
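A reply-to-list function might deduce the list address roughly as follows. This is a guess at the general approach, not a description of what Evolution actually does: it assumes the list software adds a List-Post header with a mailto URL, as described in RFC 2369, and the function name and sample messages are hypothetical.

```python
import re
from email import message_from_string

def list_address(msg):
    """Return the posting address from List-Post, or None if there is none."""
    post = msg.get("List-Post", "")
    match = re.search(r"<mailto:([^>?]+)", post)
    return match.group(1) if match else None

via_list = message_from_string(
    "From: arthur@example.com\nTo: list@example.com\n"
    "List-Post: <mailto:list@example.com>\n\nHello.\n")
private = message_from_string(
    "From: arthur@example.com\nTo: you@example.com\n\nHello.\n")

print(list_address(via_list))  # address to use for a reply to the list
print(list_address(private))   # None: offer only the ordinary reply functions
```

A message cross-posted to several lists still carries the headers of only the list that delivered this particular copy, which is one reason the approach is not perfect.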