Saturday, March 31, 2007

Critical but very boring techie stuff


This week has been filled with very boring but critical technical stuff: working with Peter on the roll-out for the new email upgrade and getting a version control system working. Plus of course all the normal stuff of going through accounts with David. Plus of course working with a programmer from Egypt who is over here for a month working on a project. Plus of course all the normal maintenance of the system. Plus communicating with our partners about what is going on.

We never get closer than about three months away from the top of our to-do list and mostly we are way further behind than that. One of our partners today suggested that we should stop doing anything else and just focus on our to-do list till it's cleared.

This is a great idea in theory, but in practice would be a disaster. The reason is this: although Peter and I can do all this boring techie stuff, it doesn't fulfil us and we see it as boring techie stuff too. So, if we just focussed on that and never did anything interesting we would get so bored we would give up. So we have to try to balance boring with some interesting to keep us motivated. Which means the boring keeps piling up. And since we lost John last Autumn the to-do list is increasing rather than diminishing.

So what is this boring techie stuff? Email is still an essential tool for our people and for communicating with people who want to know more about Jesus. But... and here's the but... spam is killing it off. As I mentioned in a previous post, up to 96% of all email is now spam. So, because we handle about 50,000 real emails a month, we also have to deal with up to one-and-a-quarter million spam emails per month. Yes, you did read correctly: that is about 1,250,000 junk emails to deal with! Per month. Actually it is not quite that number, because the 'real email' count includes a few junk emails that slip through the net.

The aim is to get rid of most of them before the end user sees them. I get approximately 1,300 emails per month, so if we didn't have a spam filter running then, on the spam percentages above, I could get more than 16,000 emails per month. Say it took me 5 seconds per email to open it, check whether it was spam and then delete it: that would take me about 80,000 seconds, or 22 hours, of clicking, reading and deleting. I would get almost nothing done at all! As the filtering works fairly well, I can spend less than 30 minutes per week clicking, reading and deleting spam. [As an aside, our top user of the email system received over 6,700 emails this month - most users receive low hundreds of emails per month.]
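For anyone who wants to check the sums above, here is a small Python sketch. The two input figures come straight from the paragraph; the function name is just something made up for illustration.

```python
# Figures quoted in the post
UNFILTERED_PER_MONTH = 16_000   # emails per month without a spam filter
SECONDS_PER_EMAIL = 5           # time to open, check and delete one email


def triage_hours(emails: int, seconds_each: float) -> float:
    """Hours spent opening, checking and deleting `emails` messages."""
    return emails * seconds_each / 3600


total_seconds = UNFILTERED_PER_MONTH * SECONDS_PER_EMAIL
hours = triage_hours(UNFILTERED_PER_MONTH, SECONDS_PER_EMAIL)
print(f"{total_seconds:,} seconds = {hours:.1f} hours per month")
# prints: 80,000 seconds = 22.2 hours per month
```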

Any improvement we can make gives me and the 300 other people more time to do something useful. The effective cost is horrendous. 300 people all wasting time. Between us we are probably losing the equivalent of about 50 man hours per week in clicking, reading and deleting spam. If someone were paid the UK minimum wage of £5.50 an hour, this would be costing over £14,000 per year. Alongside this is the cost of Peter, Alex and me implementing tools to try to reduce this to as low a level as possible. Spam is costing the world economy millions of millions of pounds per year. It's a total waste.
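The cost figure can be checked the same way. This is only a rough sketch: the hours, the wage and the head count are the figures from the paragraph above, while the 52 working weeks is my own simplifying assumption (no allowance for holidays).

```python
# Figures quoted in the post
HOURS_LOST_PER_WEEK = 50    # estimated total across all users
HOURLY_WAGE_GBP = 5.50      # minimum-wage figure used in the post
USERS = 300

# Assumption for illustration: every week of the year counts
WEEKS_PER_YEAR = 52


def annual_cost(hours_per_week: float, hourly_wage: float,
                weeks: int = WEEKS_PER_YEAR) -> float:
    """Yearly cost in pounds of the time lost to spam."""
    return hours_per_week * hourly_wage * weeks


cost = annual_cost(HOURS_LOST_PER_WEEK, HOURLY_WAGE_GBP)
minutes_per_user = HOURS_LOST_PER_WEEK * 60 / USERS

print(f"GBP {cost:,.0f} per year")
print(f"{minutes_per_user:.0f} minutes per user per week")
```

On these numbers the total comes to 14,300 pounds a year, which is only about ten minutes per person per week. That is how a small individual nuisance adds up to a large bill.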

Anyhow... the upgrade we have implemented is called 'greylisting'. It's a clever idea based on the fact that almost every spammer is running a special program to deliver all their junk around the world as fast as possible and is not running a proper MTA. MTA stands for 'Mail Transfer Agent' and it's the program that runs on a server to pass email between users. MTAs are designed so that if they cannot deliver on the first attempt they keep trying for a few days... but spammer programs don't do this.

So what greylisting does is this: when someone new tries to deliver to us, our MTA responds with 'We have a temporary problem, please try back in a few minutes' and we log the message as an attempted delivery. A spammer program goes away and doesn't come back, because its only aim is to deliver as many as possible as fast as possible... a few failures don't matter. An MTA, on the other hand, does try back in a few minutes, and when it does so we match it with the previous attempt and this time our MTA says 'OK, we'll take it this time'. We also log the MTA as valid, so that the next time it tries with another email from [hopefully] a valid user we will accept it immediately. [There is also a special website all about greylisting if you are interested].
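For the programmers reading, the greylisting logic described above can be sketched in a few lines of Python. This is a toy illustration, not the code running on our servers. The class name, the five-minute delay and the idea of matching attempts on sending server, sender and recipient are all assumptions made for the example; 451 is the standard SMTP reply code for a temporary failure.

```python
import time


class Greylist:
    """Toy in-memory greylist (a real MTA would keep this in a database)."""

    def __init__(self, min_delay=300, clock=time.time):
        self.min_delay = min_delay   # assumed: seconds a new sender must wait
        self.clock = clock           # injectable so the logic can be tested
        self.pending = {}            # (ip, sender, recipient) -> first-seen time
        self.known_mtas = set()      # servers that have retried properly

    def check(self, client_ip, sender, recipient):
        """Return the SMTP-style reply for one delivery attempt."""
        if client_ip in self.known_mtas:
            return "250 OK"          # seen before: accept immediately

        key = (client_ip, sender, recipient)
        now = self.clock()
        first_seen = self.pending.get(key)

        if first_seen is None:
            # First attempt: log it and ask the sender to come back later.
            self.pending[key] = now
            return "451 Temporary problem, please try again in a few minutes"

        if now - first_seen < self.min_delay:
            # Retried too quickly: keep deferring.
            return "451 Temporary problem, please try again in a few minutes"

        # A proper retry after the delay: accept and remember this server.
        del self.pending[key]
        self.known_mtas.add(client_ip)
        return "250 OK"


# Small demonstration with a fake clock (addresses are made up).
fake_now = [0]
greylist = Greylist(min_delay=300, clock=lambda: fake_now[0])
print(greylist.check("192.0.2.10", "alice@example.org", "me@example.com"))
fake_now[0] = 600   # ten minutes later the real MTA retries
print(greylist.check("192.0.2.10", "alice@example.org", "me@example.com"))
```

A spammer program never makes the second call, so its message is never accepted. A real MTA waits, retries and gets through, and its later emails are accepted straight away.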

The greylisting system isn't perfect, but it does kill most spam, and the only impact is to slightly delay email from valid users the very first time they try to send it. Alongside this there were two other upgrades called SPF and DomainKeys. Read the links if you want to know more. They make greylisting look positively simple to understand.

The second techie thing I was involved with is setting up a version control system. Version control systems allow us to have a repository for all the programs we or others write and all the configuration files we use. Every time we make a change to a file or program we update the version control system. It stores the changes, and because it knows what we changed each time, we can rewind what we have done and other members of staff can see what changed between different versions. It all sounds clever and is something we should have implemented a year or two ago, but we have been too overrun with work to do so.

The one we will be using is called Subversion, or SVN for short. Subversion is a pun on a tree of versions... The internet is full of horrid puns, which I suppose is the techie way of staying sane. For instance there is a robot program to use with Subversion called CIA. Why CIA? Because CIA monitors and informs on subversion.

That same partner [from paragraph 2] and I wondered about spam... we've never seen anything remotely helpful in any spam message so cannot understand the mentality of people who send them. But then my wife was explaining the other night about the support forums for Google AdSense, where people were saying they had bought a site and put Google AdSense on it and couldn't understand why they were not making any money [hmmm... maybe because you have no content on the site?]. Now we know why people send spam: it's these same people who are clicking on the spam and expecting miraculous growth in certain parts of their anatomy. Pity it's not growth in their brain cells.

One problem is that although there have been laws in existence for the last 5 years to protect us from spam [EC Directive 2002/58/EC], almost nobody manages to sue the spammer. This month only the second company in the UK was successfully sued for damages resulting from sending spam. And the amounts in both cases were pitifully small: the first award was £300 and the second £1,350. What we need is a significant number of spammers behind bars for many years to deter people from trying it on.

OK, so light relief time: I found this wonderful spam cartoon on http://www.ChristianLinksExchange.com

