More about Battling Spam

analysis

Sep 21, 2007


There’s a lively discussion going on in response to my Possible Solutions for Spam posting, which was a follow-up to my essay on how I’m currently battling spam. Most of the discussion is in the comments to Possible Solutions for Spam, so please have a look there. Some of it is in email, however, and I now have permission to post some of that here.

Before I do that, let me note that the Heisenberg Principle is at work. I observed my spam situation, and now it has changed: spammers are targeting my personal domain, mheller.com, at an even higher level. It looks deliberate.

Maybe they are trying a DDoS (distributed denial of service) attack, in retaliation for the fact that I’m talking publicly about putting them out of business. They are [expletive deleted] morons. It’s barely a minor annoyance. I only mention it because there has been just enough of an uptick for me to notice.

On to the positive developments. Several commercial anti-spam vendors have offered to help, and several people have offered home-grown recipes. Most of the solutions offered won’t work for me given my current situation: I get mail from 8 distinct domains, and don’t have full control of the servers for any of them. I could change that for mheller.com, and run my own mail server and pre-filter, but I’d prefer not to do that.

One vendor has offered a hosted solution that might help me: Data Infocom Ltd has a hosted email pre-filtering service that operates at the SMTP protocol level. Their anti-spam product is called SpamJadoo; in addition to the hosted service, they also offer an installable product. If and when we get this set up, I’ll let you know how it works.

SpamJadoo’s approach sounds a lot like Ronald’s. Here’s some additional information that Ronald sent in email, explaining how the system works:

A SYN packet arrives for port 25 on the system. It will be redirected to a firewall chain. The first one, in my case, is MAILLIST, which includes 3 chains.

MAILWHITE // All source IP addresses in here will be allowed through, and the system gets out of the way. No further checks. Normally I also have these guys in my SpamAssassin whitelist, so no CPU time is wasted on those.

MAILGREY // The source IP was not in MAILWHITE and therefore needs checking. If the source IP passes the checks, an entry will be made here and all further packets can just pass by without the system doing anything, except at close of connection, in which case the entry will be removed and the DB will be updated accordingly. SpamAssassin has the final word.

MAILDROP // The system starts to monitor network behavior and runs checks in the background to see if this is a mail server trying to connect. The default is to DROP packets. For setup and testing, just change this to ACCEPT and you can monitor how the system works without interrupting anything.
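
To make the three chains concrete, here is a minimal Python sketch of the decision they describe. It is not Ronald's code, which has not been published; the class and method names are hypothetical, and real packet filtering would happen in the firewall rather than in Python.

```python
from dataclasses import dataclass, field


@dataclass
class MailFirewall:
    """Toy model of the three chains. Chain names follow the email;
    the class, its methods, and the data structures are assumptions."""
    whitelist: set = field(default_factory=set)  # MAILWHITE: always allowed
    greylist: set = field(default_factory=set)   # MAILGREY: passed the checks
    monitor_only: bool = False  # True mimics switching MAILDROP to ACCEPT

    def on_syn(self, src_ip: str) -> str:
        """Return the verdict for a packet arriving on port 25."""
        if src_ip in self.whitelist:
            return "ACCEPT"  # no further checks
        if src_ip in self.greylist:
            return "ACCEPT"  # already checked, connection still open
        # MAILDROP: checks run in the background; packets are dropped meanwhile.
        return "ACCEPT" if self.monitor_only else "DROP"

    def on_checks_passed(self, src_ip: str) -> None:
        """Background checks succeeded: add a MAILGREY entry."""
        self.greylist.add(src_ip)

    def on_close(self, src_ip: str) -> None:
        """Connection closed: remove the MAILGREY entry."""
        self.greylist.discard(src_ip)
```

A source that is dropped at first is accepted on a later attempt once its checks have passed, which matches the retry behavior of a real mail server.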

Now to the checks. By the way, most of this is configurable, but there are some defaults that work for me.

First, bots by their very nature want to send lots of spam. So they will send a lot of SYN packets in a very short time, maybe repeat that, and then give up. Mail servers, on the other hand, have a very distinct network behavior: they try more slowly, wait, try again, wait a little longer … The time and number of packets are fully configurable, in case you communicate with some weird mail servers that you don’t want to put in your whitelist.
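
One way to picture the burst check is a sliding window over SYN arrival times. The following Python sketch is an illustration only: the function name and the thresholds (5 packets in 10 seconds) are assumptions, since the email says the values are configurable but does not give the defaults.

```python
def looks_like_bot(syn_times, max_syns=5, window=10.0):
    """Return True if more than max_syns SYN packets fall inside any
    window of `window` seconds. Thresholds are hypothetical defaults."""
    times = sorted(syn_times)
    start = 0
    for end, t in enumerate(times):
        # Shrink the window from the left until it spans at most `window` seconds.
        while t - times[start] > window:
            start += 1
        if end - start + 1 > max_syns:
            return True
    return False
```

A source that sends seven SYNs within a second trips the check, while one that retries after a minute, then after a few more minutes, does not.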

Check whether the IP is in an allowed country. I block almost every country except the US. Even if people travel, they will send mail through their company’s relay. Configurable to one’s heart’s content.

Check against a blacklist: DNS names, regex DNS names, IP addresses, CIDR ranges. Whether to block or not is configurable. I use it for open relays and to cut down on checks, normally for ISP client DNS names.
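
The four kinds of blacklist entry can be sketched with Python’s standard `ipaddress` and `re` modules. This is a hedged illustration, not the system described in the email: the `Blacklist` class and the sample entries are hypothetical, and the addresses come from the ranges reserved for documentation.

```python
import ipaddress
import re


class Blacklist:
    """Hypothetical blacklist holding exact DNS names, regex patterns
    for DNS names, single IP addresses, and CIDR ranges."""

    def __init__(self, names=(), name_patterns=(), addresses=(), cidr_ranges=()):
        self.names = {n.lower() for n in names}
        self.patterns = [re.compile(p, re.IGNORECASE) for p in name_patterns]
        self.addresses = {ipaddress.ip_address(a) for a in addresses}
        self.networks = [ipaddress.ip_network(c) for c in cidr_ranges]

    def is_blocked(self, ip, dns_name=None):
        addr = ipaddress.ip_address(ip)
        if addr in self.addresses:
            return True
        if any(addr in net for net in self.networks):
            return True
        if dns_name:
            name = dns_name.lower().rstrip(".")
            if name in self.names:
                return True
            if any(p.search(name) for p in self.patterns):
                return True
        return False


# Example entries (all made up for illustration).
bl = Blacklist(
    names=["relay.example.net"],
    name_patterns=[r"^(dsl|dhcp|pool)-\d+"],  # typical ISP client host names
    addresses=["192.0.2.1"],
    cidr_ranges=["203.0.113.0/24"],
)
```

Checking the cheap address and range entries first means the regex patterns only run when nothing else has matched.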

Check against open connections in the Greylist. There is a configurable number of open connections allowed from servers I don’t expect mail from. So you have to stay in line and wait until it’s your turn. This also kills some stupid DoS attempts, if they got that far.
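
A connection cap like this can be sketched as a simple counter. The email does not say whether the limit applies per source or to the Greylist as a whole; the sketch below assumes a per-source limit, and the class name and default of 2 are hypothetical.

```python
from collections import Counter


class GreylistConnectionLimiter:
    """Caps concurrent connections per source IP (an assumed
    interpretation; the limit value is a made-up default)."""

    def __init__(self, max_open=2):
        self.max_open = max_open
        self.open_by_ip = Counter()

    def try_open(self, ip):
        """Return True and count the connection if the source is under its cap."""
        if self.open_by_ip[ip] >= self.max_open:
            return False  # stay in line and wait your turn
        self.open_by_ip[ip] += 1
        return True

    def close(self, ip):
        """Release one slot when a connection from this source ends."""
        if self.open_by_ip[ip] > 0:
            self.open_by_ip[ip] -= 1
```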

If nothing sets the block flag, the IP is put into the Greylist and the system waits for the end of the connection.

I keep a rolling history (3 weeks, configurable) of all network connection attempts and the flags set. And since all of this is in a PostgreSQL DB, you can run reports to your heart’s content. That is also helpful if there are any complaints. Even if somebody has an infected machine, human email is normally sent to the company/ISP relay server, which passes the tests. Only if the bot tries to send you mail directly will it fail.
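
A rolling history of this kind comes down to timestamped rows, a periodic prune, and ordinary SQL reports. The sketch below uses Python’s built-in `sqlite3` as a stand-in for the PostgreSQL database mentioned in the email, so that it runs without a server; the table layout, column names, and flag values are all assumptions.

```python
import sqlite3

RETENTION_DAYS = 21  # "rolling 3 weeks, configurable"


def open_history(path=":memory:"):
    """Create the (hypothetical) attempts table if it does not exist."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS attempts ("
        "ts REAL NOT NULL, src_ip TEXT NOT NULL, flag TEXT NOT NULL)"
    )
    return db


def record_attempt(db, ts, src_ip, flag):
    """Log one connection attempt with the flag that was set for it."""
    db.execute("INSERT INTO attempts VALUES (?, ?, ?)", (ts, src_ip, flag))


def prune(db, now):
    """Drop rows older than the retention period."""
    cutoff = now - RETENTION_DAYS * 86400
    db.execute("DELETE FROM attempts WHERE ts < ?", (cutoff,))


def report_by_flag(db):
    """Example report: number of attempts per flag."""
    return dict(db.execute("SELECT flag, COUNT(*) FROM attempts GROUP BY flag"))
```

Timestamps are passed in as seconds since the epoch rather than read from the clock, which keeps the pruning logic easy to test.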

BTW, I use the raw ARIN, APNIC, RIPE … IP delegation numbers to see where an IP comes from. If an IP has a DNS name from a blocked country but comes from a US IP block, for example, it is also blocked in my case.
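
The regional registries publish their delegations as pipe-delimited text, one record per line in the form `registry|cc|type|start|value|date|status`, where for IPv4 records `value` is the number of addresses in the block. A minimal Python sketch of a country lookup built on such lines follows. The function names are hypothetical, and treating a two-letter top-level domain as a country code is a crude assumption made only to illustrate the DNS-name rule.

```python
import bisect
import ipaddress


def parse_delegations(lines):
    """Return a sorted list of (start, end_exclusive, country) for IPv4 records."""
    ranges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("|")
        # Skip header and summary lines, non-IPv4 records, and records with no country.
        if len(fields) < 7 or fields[2] != "ipv4" or fields[1] in ("", "*"):
            continue
        start = int(ipaddress.IPv4Address(fields[3]))
        ranges.append((start, start + int(fields[4]), fields[1].upper()))
    ranges.sort()
    return ranges


def country_of(ranges, ip):
    """Look up the country code for an IPv4 address, or None if unknown."""
    n = int(ipaddress.IPv4Address(ip))
    i = bisect.bisect_right(ranges, (n, float("inf"), "")) - 1
    if i >= 0 and ranges[i][0] <= n < ranges[i][1]:
        return ranges[i][2]
    return None


def is_allowed(ranges, ip, dns_name=None, allowed=frozenset({"US"})):
    """Block if the IP block or the DNS name points at a country not allowed."""
    if country_of(ranges, ip) not in allowed:
        return False
    if dns_name:
        tld = dns_name.rstrip(".").rsplit(".", 1)[-1].upper()
        if len(tld) == 2 and tld not in allowed:  # assumed: 2-letter TLD = country
            return False
    return True
```

With this rule, an address in a US block whose reverse DNS name ends in a foreign country code is still rejected, as described above.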

The system also blocks stealth monitoring; spammers really like to know if your system is up and running. SYN/RST and other games they like to play.

I run it on Fedora 6, but it should compile and run on anything since the 2.4 kernel. I tried not to write too close to the kernel.

I used to limit the bandwidth for the Greylist, but that was a waste of system resources in my case. But if you get a lot of big mails from unexpected sources and want to keep your bandwidth open for “good” customers, that’s another trick one can use.

Most likely I have forgotten some other possible tricks.

It’s possible that Ronald will release his source code; he has been looking it over, adding a GPL 2 license, and cleaning it up. He definitely doesn’t want to support it, so we shall have to see what he decides.

Software Development

by Martin Heller

Contributing Writer


Martin Heller is a contributing writer at InfoWorld. Formerly a web and Windows programming consultant, he developed databases, software, and websites from his office in Andover, Massachusetts, from 1986 to 2010. From 2010 to August of 2012, Martin was vice president of technology and education at Alpha Software. From March 2013 to January 2014, he was chairman of Tubifi, maker of a cloud-based video editor, having previously served as CEO.

Martin is the author or co-author of nearly a dozen PC software packages and half a dozen Web applications. He is also the author of several books on Windows programming. As a consultant, Martin has worked with companies of all sizes to design, develop, improve, and/or debug Windows, web, and database applications, and has performed strategic business consulting for high-tech corporations ranging from tiny to Fortune 100 and from local to multinational.

Martin’s specialties include programming languages C++, Python, C#, JavaScript, and SQL, and databases PostgreSQL, MySQL, Microsoft SQL Server, Oracle Database, Google Cloud Spanner, CockroachDB, MongoDB, Cassandra, and Couchbase. He writes about software development, data management, analytics, AI, and machine learning, contributing technology analyses, explainers, how-to articles, and hands-on reviews of software development tools, data platforms, AI models, machine learning libraries, and much more.
