Lars Wirzenius: Enemies of Carlotta, 2003

Monday, November 3, 2003

Enemies of Carlotta: List name in subject and adding Reply-To

Someone has sent me a patch to prepend the name of a list to the subject of a message, and to add a Reply-To header to point back at the list. Both are unconditional.

If I were to add such features, the minimum requirement would be that they are optional and turned off by default. However, I am opposed to both features, and won't be adding them at all.

Prepending stuff to the subject has the drawback that it makes less of the subject visible in a list of messages. For example, mutt, a popular mail program for Unix-like systems, has room for about 40 characters of the subject. Using, say, 10 or 15 characters for the name of the list leaves very little room for the actual subject.

The usual reason people want the prepending feature is to help put list mail automatically into appropriate folders, or to make it easier to see at a glance which mails are list mails and which are private ones. Unfortunately, it is bad at both. Suppose you write mail to a list, and the list does the prepend thing and the subject becomes "[cat-lovers] Hair styles". When people respond to your mail, the subject becomes "Re: [cat-lovers] Hair styles". This is what the subject is, regardless of whether the replies are via the list or directly to you. This is why subject tags are not good for sorting, automatic or visual.

A better way is for the mailing list software to add a special header to identify the list: List-ID: cat-lovers@liw.iki.fi. There is even an RFC for this: RFC 2919. And it's not just any old RFC, it is one in the standards track. Enemies of Carlotta naturally supports it already.
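To illustrate the difference, here is a minimal Python sketch. It is not part of EoC; the function names and the two sample messages are made up for the example. It classifies a message first by subject tag and then by List-ID. The private reply keeps the tag in its subject, but it never passed through the list, so it carries no List-ID header.

```python
from email import message_from_string

def is_list_mail_by_subject(msg, tag="[cat-lovers]"):
    """Guess list membership from a subject tag (unreliable)."""
    return tag in msg.get("Subject", "")

def is_list_mail_by_list_id(msg, list_id="cat-lovers@liw.iki.fi"):
    """Check the List-ID header that the list software adds."""
    return list_id in msg.get("List-ID", "")

# A reply that was delivered by the list.
via_list = message_from_string(
    "From: arthur@example.com\n"
    "Subject: Re: [cat-lovers] Hair styles\n"
    "List-ID: cat-lovers@liw.iki.fi\n"
    "\n"
    "Public reply, delivered by the list.\n"
)

# A reply to the same message that was sent directly, bypassing the list.
private = message_from_string(
    "From: arthur@example.com\n"
    "Subject: Re: [cat-lovers] Hair styles\n"
    "\n"
    "Private reply, sent directly.\n"
)
```

The subject test says "list mail" for both messages; only the List-ID test tells them apart.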

The Reply-To problem is thornier. The real problem is that mail software has no way to know which addresses are lists, and which are not, and whether non-list addresses are subscribed to the lists or not. On Usenet, things are easier: newsgroup names are in the Newsgroups header, and the e-mail address in the From header. On Usenet, if you want to send a public reply, the software knows how to send one only to the newsgroup. Vice versa: if the reply is meant to be private, the group does not get a copy.

Suppose a mail arrives via a mailing list. It might have the following address headers:

From: arthur@example.com
To: list@example.com

Typical mail software has two different reply functions: reply to author, and reply to group. The former sends a reply to the address in the From header, the latter to every address it can find. If you want to make a private reply, the former works well. For a public reply, the latter is what people usually use, and it results in the following:

From: you@example.com
To: arthur@example.com, list@example.com

If Arthur is subscribed to the mailing list, he gets two copies, which is wasteful and can be annoying. It is especially annoying if the copies arrive at different times, which can easily happen.

To fix this, many people like to propose that the list add a Reply-To header, which causes only the addresses in that header to be used for replies. Unfortunately, this has problems of its own. The addresses in the Reply-To header should be used when constructing a reply to the message. The headers might look like this:

From: arthur@example.com
To: list@example.com
Reply-To: list@example.com

If you reply to such a mail, the headers will look like this:

From: you@example.com
To: list@example.com

For a public reply, these headers are perfect, but for a private reply they are disastrous. This is the biggest problem with a list-specified Reply-To: it makes it very easy to mistakenly send a private reply to the list, which can be extremely embarrassing, especially on a list that has a public web archive.
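The two reply functions can be sketched in Python as follows. This is a simplification for illustration only: real mail programs differ in the details, and the function names are invented here. "Reply to author" uses Reply-To when it is present and From otherwise; "reply to group" adds every To and Cc address as well.

```python
from email import message_from_string
from email.utils import getaddresses

def reply_to_author(msg):
    """Recipients for 'reply to author': Reply-To if present, else From."""
    headers = msg.get_all("Reply-To") or msg.get_all("From", [])
    return [addr for _, addr in getaddresses(headers)]

def reply_to_group(msg, me):
    """Recipients for 'reply to group': the author address(es) plus
    everything in To and Cc, without duplicates and without yourself."""
    headers = msg.get_all("Reply-To") or msg.get_all("From", [])
    headers = headers + msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = []
    for _, addr in getaddresses(headers):
        if addr != me and addr not in recipients:
            recipients.append(addr)
    return recipients

# The message as the list delivers it without a Reply-To header.
plain = message_from_string(
    "From: arthur@example.com\n"
    "To: list@example.com\n"
    "\n"
    "Hello.\n"
)

# The same message after the list has added a Reply-To header.
munged = message_from_string(
    "From: arthur@example.com\n"
    "To: list@example.com\n"
    "Reply-To: list@example.com\n"
    "\n"
    "Hello.\n"
)
```

With these samples, a group reply to `plain` goes to both Arthur and the list (the duplicate-copy case), while an author reply to `munged` goes only to the list, which is exactly the private-reply accident.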

There are other, smaller problems, such as what should happen if the message already had a Reply-To header when the list software got it? Imagine, for example, that you send a mail from your work address, but want private replies to be sent to your home address. Reply-To is the header to use for this. If the list overrides the existing header, then replies will go to the wrong place. If, on the other hand, the list does not override, then for this message replies will work differently than for all other messages on the list: where a simple reply goes to the list for other messages, for this message it goes somewhere else. This can be quite confusing.

There is no good solution to this mess. The best interim solution, not only in my opinion, is to not add a Reply-To automatically, and ask people to remove extra addresses from the recipients when making public replies. This is a bit of extra work for everyone, but has the least likelihood of causing really embarrassing mistakes.

In the long run, the best solution is to make mail software aware of which addresses are list addresses and which are not. For example, Evolution, which I use, has a third reply function: reply to list. It automatically deduces from the headers (using List-ID and other such headers) which address is the list address, and then sends the reply only there. It still doesn't work perfectly, for example in the presence of messages cross-posted to several lists, but it works pretty well.
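A "reply to list" function might look roughly like the Python sketch below. This is not Evolution's actual logic, which I have not inspected. It assumes that the list software adds a List-Post header of the form defined in RFC 2369 (one of the "other such headers"), and the function name is invented for the example.

```python
import re
from email import message_from_string

def list_post_address(msg):
    """Return the posting address from a List-Post header such as
    '<mailto:list@example.com>', or None if there is no usable one."""
    value = msg.get("List-Post", "")
    match = re.search(r"<mailto:([^>?]+)", value)
    return match.group(1) if match else None

# A message delivered by a list that adds List-ID and List-Post.
from_list = message_from_string(
    "From: arthur@example.com\n"
    "To: list@example.com\n"
    "List-ID: list@example.com\n"
    "List-Post: <mailto:list@example.com>\n"
    "\n"
    "Hello.\n"
)

# A message that did not come through any list.
direct = message_from_string(
    "From: arthur@example.com\n"
    "To: you@example.com\n"
    "\n"
    "Hello.\n"
)
```

A mail program could offer "reply to list" only when `list_post_address` returns an address, and send the reply to that address alone. The sketch reads a single header, so it does nothing clever for messages cross-posted to several lists.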

Friday, October 24, 2003

Enemies of Carlotta: Fame and fortune

The Debian developers were discussing Nethack default key bindings and then suddenly turned around and started to talk about mailing list software. My head spins when threads switch topics that fast. This time it's spinning faster than usual, since they mentioned Enemies of Carlotta. Fame! Fortune!

Sunday, September 7, 2003

Enemies of Carlotta: 1.0.3

Made a new release, 1.0.3 of EoC. A most embarrassing, though non-fatal bug fixed.

Wednesday, June 25, 2003

Enemies of Carlotta: Protecting lists from spam

I've been thinking about protecting mailing lists from spam. EoC needs this. Already the few lists I run on liw.iki.fi are spam targets. I have come to the following decisions:

  1. The goal is to optimize for usefulness and usability, not to minimize spam.
  2. Depending on the type of the list, different kinds of anti-spam measures may be considered. For example, manual moderation may work for announcement lists with no time critical content, but will really hamper a high-volume discussion list.
  3. Automatic spam recognition is never perfect. Relying on automation will let in some spam and prevent some valid messages from getting to the list. This is usually not acceptable.
  4. The list owner, or moderator, should not be burdened too much, otherwise they will become stressed and quit. Sharing responsibilities between several people helps, but it also brings some additional work in the form of team coordination.
  5. No one solution will work for every list. EoC should be flexible and adaptable to differing circumstances.
  6. Spammers adapt to circumvent spam filtering. EoC should not implement spam recognition itself, but interface with external tools.
  7. Simplistic measures may work, but usually not well and will bring new problems.

The conclusion is that EoC should be run from procmail, and procmail should use whatever existing tool the list administrators prefer to recognize spam automatically. Suspected spam can then be either forced to be moderated or in extreme cases deleted.

I use bogofilter for my personal spam filtering. It works pretty well, but requires constant education: whenever it misclassifies valid mail as spam, or spam as valid mail, it needs to be taught. Since bogofilter is most efficient when it learns automatically based on its own guesses, all errors need to be fixed, otherwise a mistake will be reinforced in the future. This would require constant monitoring by the list moderator, which is not a good idea.

Another popular tool is SpamAssassin, which uses a complicated set of rules instead of automatic learning. This should be easier for a mailing list. It requires updating SpamAssassin every once in a while, to keep up with the changing characteristics of spam, but that is much less of a chore than teaching bogofilter daily.

My plan at the moment, therefore, is to set up SpamAssassin to protect my lists on liw.iki.fi and see how well it works. When SpamAssassin recognizes a mail as spam, my procmailrc will force the message to be moderated (EoC should already have sufficient functionality for this). This should provide the least interference for discussion lists. It requires the list owner (me) to deal with every spam manually, but I can live with that for now. Once I have some experience with this setup, I can see whether there is a spamicity level where messages to the list (but not the owner) can be deleted automatically.
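The decision itself is simple enough to sketch in Python. This is only an illustration of the plan, not my procmailrc and not an EoC interface. It assumes SpamAssassin has already processed the message and added its usual X-Spam-Flag and X-Spam-Status headers; the function names and the `delete_above` threshold are hypothetical.

```python
import re
from email import message_from_string

def spam_score(msg):
    """Extract the numeric score from X-Spam-Status, or None if missing.
    Accepts both 'score=' and the older 'hits=' spelling."""
    status = msg.get("X-Spam-Status", "")
    match = re.search(r"\b(?:score|hits)=(-?\d+(?:\.\d+)?)", status)
    return float(match.group(1)) if match else None

def disposition(msg, delete_above=None):
    """Return 'post', 'moderate' or 'delete' for an incoming list message.
    Without a delete_above threshold, suspected spam is always moderated."""
    flagged = msg.get("X-Spam-Flag", "").strip().upper() == "YES"
    if not flagged:
        return "post"
    score = spam_score(msg)
    if delete_above is not None and score is not None and score > delete_above:
        return "delete"
    return "moderate"

# Sample messages with headers in the style SpamAssassin adds.
ham = message_from_string(
    "From: arthur@example.com\n"
    "X-Spam-Status: No, score=0.3 required=5.0\n"
    "\n"
    "A question about cats.\n"
)
spam = message_from_string(
    "From: spammer@example.com\n"
    "X-Spam-Flag: YES\n"
    "X-Spam-Status: Yes, score=12.4 required=5.0\n"
    "\n"
    "Buy now.\n"
)
```

Leaving `delete_above` unset corresponds to the initial setup, where every suspected spam goes to the moderator. Setting it later corresponds to the spamicity level above which messages could be deleted automatically.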

Friday, June 20, 2003

Enemies of Carlotta: Documentation improvements

Fixed some manual page bugs that had been reported ages ago. I've let EoC development grind to a halt, due to extreme stress at work. Not good. The stress, I mean. It's OK for a free software hobby project to have an occasional silent period.

I'll need to improve the documentation for getting EoC and Postfix to work together, since it seems to be confusing at the moment. After that, I'll make a new release and start thinking about and discussing spam prevention on mailing lists.

EoC doesn't seem to generate any serious bug reports. Of course, that might be because no-one else uses it for real. It would be nice to get some real feedback from real users of the software occasionally.