I think that to defeat the kind of spam attack you mention, what matters is not the training method but the features you train on. I use Fidelis Assis's OSBF-Lua, which is a very successful filter: it keeps winning spam-filtering contests. It uses Bayesian training, but I believe three principles are the real reasons for its success:
It trains not on single words but on sparse bigrams: a pair of words separated by 0 to 4 "don't care" words. Spammers have to get their message in somewhere, and sparse bigrams are very good at sniffing it out. It even finds image spam!
It gives extra training to message headers, because those are difficult for spammers to disguise. Example: a message that originates on your network and never passes through a host outside the network is probably not spam.
If the spam filter has low confidence in its classification, it asks a human for input. (In practice, it adds a header field saying “please train me on this message”; the human can ignore the request.) This means that as spammers develop new techniques, your filter evolves to match.
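To make the first principle concrete, here is a rough Python sketch of sparse bigram extraction. It is not OSBF-Lua's actual code (which is written in Lua and C); the function name `sparse_bigrams`, the whitespace tokenization, and the choice to keep the gap size as part of each feature are all assumptions made for illustration.

```python
def sparse_bigrams(tokens, max_gap=4):
    """Yield (first, gap, second) for every pair of words that is
    separated by 0..max_gap skipped ("don't care") words.

    Illustrative only: keeping the gap in the feature is an assumption
    of this sketch, not a statement about OSBF-Lua's internals.
    """
    for i, first in enumerate(tokens):
        for gap in range(max_gap + 1):
            j = i + 1 + gap          # index of the second word of the pair
            if j >= len(tokens):
                break                # ran off the end of the message
            yield (first, gap, tokens[j])


if __name__ == "__main__":
    words = "buy cheap pills now".split()
    for feature in sparse_bigrams(words):
        print(feature)
```

Each feature produced this way would then be counted by the Bayesian trainer in place of a single word, so padding a message with filler words changes only some of the pairs rather than hiding the message entirely.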
This combination of methods is extremely effective.
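The third principle, asking for training when confidence is low, might look something like the sketch below. The header names `X-Filter-Class` and `X-Filter-Train`, the probability scale, and the `margin` parameter are hypothetical; I am not reproducing the header fields or thresholds OSBF-Lua really uses.

```python
from email.message import EmailMessage


def tag_message(msg, spam_probability, margin=0.2):
    """Label a message and, if the score is close to the decision
    boundary, add a header asking the user to train the filter.

    Assumes spam_probability is in [0, 1] with 0.5 as the boundary;
    header names and margin are made up for this sketch.
    """
    msg["X-Filter-Class"] = "spam" if spam_probability >= 0.5 else "ham"
    if abs(spam_probability - 0.5) < margin:
        # Low confidence: request human input; the reader may ignore it.
        msg["X-Filter-Train"] = "please train me on this message"
    return msg


if __name__ == "__main__":
    m = EmailMessage()
    m["Subject"] = "Quarterly report"
    m.set_content("See attached.")
    print(tag_message(m, spam_probability=0.55))
```

A mail client rule can then collect every message carrying the training header into one folder, so the user only has to look at the cases the filter was unsure about.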
Disclaimer: I have worked with Fidelis on refactoring some of the software so that it can be used for other purposes, such as sorting regular mail into groups or, perhaps some day, detecting spam in blog comments and other places.
Norman Ramsey