
A highly unscientific signal-to-noise experiment in spam analysis.
I’ve heard stories of people getting innumerable amounts of spam mail. I’ve heard the horror stories of Hotmail accounts filling up faster than a person can refresh the page. I’ve never quite understood the validity of these statements though. I wanted to explore how bad it really was from my perspective, as I had never done before. The results were not pretty.
To truly understand how bad spam is, I decided to stop doing what most normal people do – deleting it. I saved all of my spam email for an entire week. I wanted to see just how bad it really was.
But first, a little background on the popularity of my email address.
My email, “jeff @ creatimation.net”, has been around for well over 2 years now. Back in the day, I didn’t know better (ah, youth) so I just posted it all over the place. Needless to say, this account is now spammed to death. (That’s why I have no issue posting it here.) I get every kind of email you can imagine. The best so far was one selling me a bigger penis, bigger breasts, better self-confidence, help to quit smoking, and the secrets to better sex all in a single MP3! Sweet!
However, I have used various other addresses over time for this, that, and the other thing. And one of the nice features of owning a domain is that you can set up a “catch-all” email system. With a catch-all, any email that is sent to the domain is forwarded to one (or several) accounts. This way I can just say, “Yes, Mr. President, my email is ‘Mr.Jeff.Sir @ creatimation.net’…” and the message will still get to me. Very handy.
But this means that all those spammers looking for “contacts@” or “info@” or “jack-asses-who-love-to-spam-me@” will still get their emails to me. What a bummer!
My other address, which I won’t post, has not been harvested yet. I’ve kept better tabs on it so far, and it’s doing all right.
Down to the results.
| Email Account | Real Emails | Spam Emails | Ratio of Real Email |
| jeff at cre/atimation.nt | 22 | 819 | 2% |
| jefzorsemail at the/code[]pro.cm | 20 | 158 | 13% |
| Totals | 42 | 977 | 4% |
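For anyone who wants to check my math, here is a minimal Python sketch of how those percentages can be computed. The account labels and function names are made up for illustration, and I'm assuming the "Ratio of Real Email" column is real messages divided by spam messages, which roughly matches the figures in the table. The share of all mail that is spam is a slightly different calculation: spam divided by everything received.

```python
# Weekly tallies copied from the table above: (real, spam) per account.
# The account labels are placeholders, not the actual addresses.
counts = {
    "account_1": (22, 819),
    "account_2": (20, 158),
}


def real_to_spam(real, spam):
    """Real messages as a fraction of spam messages (assumed table column)."""
    return real / spam


def spam_share(real, spam):
    """Spam as a fraction of all mail received."""
    return spam / (real + spam)


total_real = sum(real for real, _ in counts.values())
total_spam = sum(spam for _, spam in counts.values())

for name, (real, spam) in counts.items():
    print(f"{name}: real-to-spam {real_to_spam(real, spam):.1%}")

print(f"totals: real-to-spam {real_to_spam(total_real, total_spam):.1%}")
print(f"totals: spam share {spam_share(total_real, total_spam):.1%}")
```

The two numbers add up to roughly 100% only because spam outnumbers real mail so heavily; they are not strictly complements of each other.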
96% of the email I receive is spam
Astounding. I never thought it was that bad really – I just deleted it when I checked my mail. But that spam really does build up – and it’s not just annoying either. It’s a real serious problem. When I receive spam, it’s using my bandwidth. If a text message takes up, oh say, 3k, that’s not so bad. For my 42 real emails, that’s about 126k of transmission – totally fine.
At 3k a message, spam would be about 2.8MB of transmission. That’s not such a bad figure, except that spam is filled with viruses, HTML, images, and a plethora of other assorted crap. So let’s change the message size estimate to 70k (and that’s conservative).
70k * 977 = roughly 66.8MB of crap, uh, I mean “Unsolicited Email”
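The back-of-the-envelope math above can be sketched in a few lines of Python. The 3k and 70k message sizes are my guesses from the text, not measurements, and the helper name is made up; I'm also assuming 1MB = 1024k.

```python
KB_PER_MB = 1024  # assumption: binary megabytes


def total_mb(messages, kb_per_message):
    """Estimated transfer in MB for a number of messages of a given size."""
    return messages * kb_per_message / KB_PER_MB


real_emails = 42
spam_emails = 977

# Real mail at a guessed 3k per plain-text message.
print(f"real mail at 3k:  {real_emails * 3}k")

# Spam at the same 3k, then at the guessed 70k per message.
print(f"spam at 3k:       {total_mb(spam_emails, 3):.1f}MB")
print(f"spam at 70k:      {total_mb(spam_emails, 70):.1f}MB")
```

Swap in your own weekly counts and size guesses to see what spam is costing you.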
Not only is that wasting my server bandwidth, but if anyone had to sort through all of this email over a dialup connection they would simply give up email. Which brings me to my next point.
Why do I even bother to continue using email?
Seriously, with a signal-to-noise ratio of 4%, why should I even continue using it?
Quite simple: as far as I know, no better system exists yet. Email is the de facto standard of internet communication. “What’s your email” is almost as popular a phrase as “Let’s trade numbers.” The only medium that is nearly as popular is IM systems, led by AOL Instant Messenger – or simply: AIM. However, since there is no “offline” feature in AIM (despite it being in ICQ which is now owned by AOL), it couldn’t be used in the same way – nor would attachments work as well.
Finally, the 4% of email I receive is actually very important and useful.
Here’s what I’m doing about it
I am not sitting still on this matter either. People have lots of complicated methods for filtering spam. Address filters, word catches, fake sender detection, etc etc. All of these things work on the server’s side preventing you from ever seeing the spam at all.
Those types of solutions are all well and good – but I don’t like them. Paranoid as I am, I worry that those filters are going to catch something that was a real email. Dang it.
Instead, I download all my email – right into my trash can. That’s right, all my email goes straight into the trash, then I have a filter in my email client that automatically brings the email into my inbox, if – and only if – you are in my address book. Of course, if I get email from someone new I run the risk of losing it, however it’s much easier to find this missed email in my trash can than in the abyss of the server side filtered email.
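My actual filter lives in my email client's settings, but the logic is simple enough to sketch in Python. This is only an illustration of the idea, not what the client really runs: the address book entries, folder names, and function name are all invented for the example.

```python
from email.utils import parseaddr

# Hypothetical address book; in practice this comes from the email client.
ADDRESS_BOOK = {"alice@example.com", "bob@example.org"}


def choose_folder(from_header, address_book=ADDRESS_BOOK):
    """Default every message to Trash; rescue it only for known senders."""
    # parseaddr splits 'Alice <alice@example.com>' into (name, address).
    _, address = parseaddr(from_header)
    if address.lower() in address_book:
        return "Inbox"
    return "Trash"


print(choose_folder("Alice <alice@example.com>"))
print(choose_folder("Cheap Pills <deals@spammer.example>"))
```

The point of the design is that nothing is ever deleted by the filter. A message from someone new lands in the trash, where I can still go looking for it.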
Oh yeah, these results are skewed…
…in favor of spam.
Yeah, I said that right. See, I explained earlier how I can have mail sent to any made up account on my server and I still get it. Well for “accounts” that get a lot of spam, I just filter all the email from that account out at the server side. I know, I said I was against server side filtering – but for the email address “junk aht creatimation.net” I can live without getting the email. Anywho, these results are skewed in the favor of spam because the numbers would be much, much higher on the spam side if it weren’t for this practice. So I’m going to give this a real try and see what happens to the numbers once I take these filters out of place. We’ll be able to get a real feel for the damage spam is doing.
Check back in a week for the horrifying conclusion to “When Spam Attacks!”