Unsatisfactory detection of spam postings

Hello

As I do two or three times a week, I have just deleted a message after marking it as spam. It was one of the usual football jersey (Trikot) advertisements. Around two years ago it was a wave of fitted-kitchen sales pages.

Because I have now done that for (literally) the thousandth time, I fear that the "mark as spam (and delete)" buttons do nothing except delete the message (I have not actually dug into the code).

I would expect such a message to be detected as spam automatically when it is submitted, because of its similarity to known spam messages, rather than a moderator/admin having to mark it as spam every single annoying time.

Stop Forum Spam, Akismet and Bad Behavior are optional external methods. No one is obliged to create accounts and/or keys for these services. The IP and bad-word filters are optional internal methods, but they are accessible only to admins, not to moderators. So maintaining these lists is itself a hassle, typically for one person. Does any of you maintain a bad-word and/or IP filter list?

Is there anything we can do (not only for this forum)? Anything that's simple to use? Anything that we can put into more hands than only the admins'?

Tschö, Auge

--
Trenne niemals Müll, denn er hat nur eine Silbe!

Hej,

I would expect such a message to be detected as spam automatically when it is submitted, because of its similarity to known spam messages, rather than a moderator/admin having to mark it as spam every single annoying time.

If services like Stop Forum Spam, Akismet or Bad Behavior don't classify the message as spam, it will not be marked as spam. The option "report (and delete) spam" sends the message to one of the services (and deletes the message). These spam messages serve as a kind of training data, but we have no influence on the decision about new postings, and thus spam is not marked automatically.

No one is obliged to create accounts and/or keys for these services.

To use Akismet, you need a key; Stop Forum Spam limits the number of requests; ...

Does any of you maintain a bad-word and/or IP filter list?

Yes.

Is there anything we can do (not only for this forum)? Anything that's simple to use? Anything that we can put into more hands than only the admins'?

Enable the list for moderators, too. ;-)

/Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

Hello

I would expect such a message to be detected as spam automatically when it is submitted, because of its similarity to known spam messages, rather than a moderator/admin having to mark it as spam every single annoying time.

The option "report (and delete) spam" sends the message to one of the services (and deletes the message). These spam messages serve as a kind of training data, but we have no influence on the decision about new postings, and thus spam is not marked automatically.

I know that, but I don't understand why it's not classified as spam. For more than a year we have encountered an average of two to six spam messages of this kind per week. I mark every single one of these postings as spam when I come across them. I assume that you and Alfie also mark these postings as spam, and I assume this forum is not the only site where these spam messages occur.

If these messages have been reported as spam to Akismet or SFS time after time (which I assume), it is incomprehensible to me why they are not detected automatically.

No one is obliged to create accounts and/or keys for these services.

To use Akismet, you need a key; Stop Forum Spam limits the number of requests; ...

Do you know what the situation is with Bad Behavior? It seems to be a locally running solution.

Does any of you maintain a bad-word and/or IP filter list?

Yes.

Do you have the impression that it improves the spam prevention rate?

Is there anything we can do (not only for this forum)? Anything that's simple to use? Anything that we can put into more hands than only the admins'?

Enable the list for moderators, too.

That's a good idea in principle. But it means opening (parts of) the admin area to the moderators, which is currently not the case.

Tschö, Auge

--
Trenne niemals Müll, denn er hat nur eine Silbe!

Hello,

why it's not classified as spam.

This is part of the external algorithm, so I cannot answer this question.

Do you know what the situation is with Bad Behavior?

No.

Do you have the impression that it improves the spam prevention rate?

No, because postings that contain words flagged as bad words are blocked, not marked as spam.

That's a good idea in principle. But it means opening (parts of) the admin area to the moderators, which is currently not the case.

...and maybe we will have more problems than before with false positives.

/Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

Hello

why it's not classified as spam.

This is part of the external algorithm, so I cannot answer this question.

I know, I know. It was only a rhetorical question. :-)

Do you know what the situation is with Bad Behavior?

No.

Do you have the impression that it improves the spam prevention rate?

No, because postings that contain words flagged as bad words are blocked, not marked as spam.

Yes. Without a counter you'll never know whether the bad-word and IP filters work.

That's a good idea in principle. But it means opening (parts of) the admin area to the moderators, which is currently not the case.

...and maybe we will have more problems than before with false positives.

Well, it's always possible to abuse the power given to a moderator, or simply to make mistakes. It's equally possible to overtune an automatic mechanism. In both cases false positives will occur. That is nothing we or anyone else could prevent.

If I were the operator of a big forum with many users and many postings per day or week, I would want to split the work between me and a trustworthy moderator or, in a really big forum, a team of moderators. To do that I would have to give away parts of my authority. The question is how far to go.

As a moderator here in this forum I have gathered experience, and my conclusion is that as a moderator I cannot access all the functions I think I need. I think it would be useful to give moderators access to the blocklists (reading and writing, but not deleting entries). That would need a new structure with one row per bad word/IP and (IMHO) an additional column for the user name of the user who added the entry and (maybe) a datetime field.
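To make the proposed structure concrete, here is a minimal sketch of such a table, one row per entry with the inserting user and a timestamp. It uses Python with SQLite purely for illustration; the table name `blocklist`, the column names and the helper functions are assumptions and not part of the forum software. Note that it deliberately offers no delete function, matching the "read and write, but not delete" idea.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS blocklist (
    id       INTEGER PRIMARY KEY AUTOINCREMENT,
    kind     TEXT NOT NULL CHECK (kind IN ('word', 'ip')),
    value    TEXT NOT NULL,
    added_by TEXT NOT NULL,                           -- user name of the inserting user
    added_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP, -- when the entry was added
    UNIQUE (kind, value)                              -- one row per bad word / IP
);
"""


def open_blocklist(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_entry(conn, kind, value, added_by):
    """Insert one entry; return False if it is a duplicate or the kind is invalid."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO blocklist (kind, value, added_by) VALUES (?, ?, ?)",
                (kind, value.strip().lower(), added_by),
            )
    except sqlite3.IntegrityError:
        return False
    return True


def list_entries(conn, kind):
    """Return (value, added_by, added_at) tuples for one kind, oldest first."""
    rows = conn.execute(
        "SELECT value, added_by, added_at FROM blocklist WHERE kind = ? ORDER BY id",
        (kind,),
    )
    return rows.fetchall()
```

With one row per entry, it becomes possible to see who added which word or IP and when, which helps when tracking down a false positive.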

Additionally, I am inclined to add a mark-as-spam button/link to the options for admins and moderators in the sidebar. It is annoying to have to open an obvious spam posting (loading its images etc.) in order to report it as spam and remove it.

Tschö, Auge

--
Trenne niemals Müll, denn er hat nur eine Silbe!

Hi,

I think it would be useful to give moderators access to the blocklists (reading and writing, but not deleting entries). That would need a new structure with one row per bad word/IP and (IMHO) an additional column for the user name of the user who added the entry and (maybe) a datetime field.

Maybe it will reduce spam, but I believe we should focus on something like honeypots to reduce spam. A blacklist isn't an effective solution, e.g. ANALysis, AnsPORN, ...
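The two examples above can be reproduced with a small sketch. It assumes, hypothetically, that the bad-word filter does plain case-insensitive substring matching; whether the forum's filter really works that way is not confirmed here. The function names and the word list are illustrative only.

```python
import re

# Example entries derived from the two words mentioned above.
BAD_WORDS = ["anal", "porn"]


def substring_hits(text, words=BAD_WORDS):
    """Naive filter: a bad word matches anywhere inside the text."""
    lowered = text.lower()
    return [word for word in words if word in lowered]


def whole_word_hits(text, words=BAD_WORDS):
    """Stricter filter: a bad word must match a complete token."""
    tokens = set(re.findall(r"\w+", text.lower()))
    return [word for word in words if word in tokens]
```

For "A statistical analysis" or the German word "Ansporn", the substring variant reports a hit while the whole-word variant does not. Whole-word matching only reduces this kind of false positive; it does nothing against spammers who obfuscate their wording.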

/Micha

P.S. Do you still review my changes?

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

Hello

I think it would be useful to give moderators access to the blocklists (reading and writing, but not deleting entries). That would need a new structure with one row per bad word/IP and (IMHO) an additional column for the user name of the user who added the entry and (maybe) a datetime field.

Maybe it will reduce spam, but I believe we should focus on something like honeypots to reduce spam. A blacklist isn't an effective solution, e.g. ANALysis, AnsPORN, ...

You are right, but (no "yes" without a "but") inserting a honeypot into the registration process doesn't prevent spamming by unregistered visitors.

[edit]I misread your words in the issue. You pointed not only to the registration form but also to the posting form. Sorry.[/edit]
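For readers unfamiliar with the honeypot idea: the form contains an extra field that is hidden from human visitors (for example via CSS), so only automated scripts tend to fill it in. A minimal sketch of the server-side check, usable for both the registration and the posting form, could look like this. The field name `website_url` and the function name are hypothetical, and the submitted form is assumed to arrive as a simple dictionary.

```python
# Hypothetical name of the hidden form field; humans never see or fill it.
HONEYPOT_FIELD = "website_url"


def is_probably_bot(form):
    """Return True if the hidden honeypot field contains any non-blank text."""
    value = form.get(HONEYPOT_FIELD, "")
    return bool(value.strip())
```

A submission with a filled honeypot field can then be rejected or held for moderation before any external service is queried.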

I thought about a Bayes filter based on this article about it (in German), but the article has lost several links, including the one to the script sources. :-(
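Since the script from that article is not available, the following is only a generic sketch of how a naive Bayes filter could learn from postings that moderators mark as spam. It is not the article's code; the class name, the labels "spam" and "ham", and the simple word tokenizer are assumptions made for illustration.

```python
import math
import re
from collections import Counter


class NaiveBayesSpamFilter:
    """Word-based naive Bayes classifier with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        return re.findall(r"\w+", text.lower())

    def train(self, text, label):
        """Learn from one posting; label must be 'spam' or 'ham'."""
        self.word_counts[label].update(self.tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Return the estimated probability (0..1) that text is spam."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        if total_messages == 0 or not vocabulary:
            return 0.5  # nothing learned yet, so no opinion

        tokens = self.tokenize(text)
        log_scores = {}
        for label in ("spam", "ham"):
            # Smoothed class prior.
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in tokens:
                count = self.word_counts[label][word]
                # Smoothed likelihood of this word given the class.
                score += math.log((count + 1) / (total_words + len(vocabulary)))
            log_scores[label] = score

        # Turn the two log scores into a probability; clamp to avoid overflow.
        diff = max(-700.0, min(700.0, log_scores["ham"] - log_scores["spam"]))
        return 1.0 / (1.0 + math.exp(diff))
```

Each click on "mark as spam" would call `train(text, "spam")`, and approved postings could be fed in as "ham". New postings whose `spam_probability` exceeds a chosen threshold could be held for moderation rather than deleted outright, which limits the damage of false positives.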

I opened a thread in the SelfHTML forum about this issue.

P.S. Do you still review my changes?

Yes, I have (in the meantime). I requested only one further change (adding ENGINE=InnoDB to the queries that create the new tables).

Tschö, Auge

--
Trenne niemals Müll, denn er hat nur eine Silbe!
