You can use a string distance metric, such as edit distance. For example, the edit distance between vi.agra and viagra is 1.
You then treat the given word as a match for the spam word if the edit distance between them is below a certain threshold, for example, 2.
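Unlike stripping characters with a regex such as /[^a-zA-Z0-9-\s]/, this approach also catches inserted letters: for example, viZagra is within distance 1 of viagra.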
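
The threshold check described above can be sketched as follows. This is a minimal illustration, assuming Levenshtein distance is the edit distance meant here. The names `edit_distance` and `is_spam_word` and the `spam_words` list are hypothetical, chosen only for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions needed to turn a into b."""
    # prev[j] holds the distance between the first i-1 chars of a
    # and the first j chars of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca -> cb (or keep)
            ))
        prev = curr
    return prev[-1]


def is_spam_word(word: str, spam_words: list[str], threshold: int = 2) -> bool:
    """Hypothetical helper: True if word is within (strictly below)
    the threshold distance of any known spam word."""
    return any(edit_distance(word, spam) < threshold for spam in spam_words)


print(is_spam_word("vi.agra", ["viagra"]))  # True
print(is_spam_word("viZagra", ["viagra"]))  # True
print(is_spam_word("hello", ["viagra"]))    # False
```

A low threshold keeps false positives down for short words. With longer spam words you might scale the threshold with word length, but that is a tuning choice rather than part of the approach described here.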