Bots or humans? (General)

by Auge, Sunday, November 04, 2018, 17:28 (572 days ago) @ Alfie


@Alfie: First of all, thank you for your (IMHO) clear and understandable description of the situation.

Now @all:

Are you talking about registrations (without activation) or “true” spammers? The former ones should be deleted within 24 hours anyway.

There is a second "or". Besides the registrations that never get activated [1] and the registrations of "real" or "true" spammers, who activate the account within seconds or minutes and start posting right away [2], there is a third group: registrations that get activated but then stay silent for weeks or months without any further login.

A forum operator cannot distinguish a silent reader (a lurker) of the forum's content from a spammer who hits the forum a significant amount of time after registering. In the latter case the attack often comes as several to hundreds of entries within a few minutes [3], and only then does one know that the account belongs to a spammer.
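To make the three groups a bit more tangible, here is a minimal sketch in PHP. The function name, the return labels and the 600-second window are my own assumptions for illustration and not part of my little forum. Note that the third label deliberately cannot tell a lurker from a spammer who strikes later, which is exactly the problem described above.

```php
<?php
// Hypothetical helper: names, labels and the threshold are assumptions for illustration.

/**
 * Sorts a registration into one of the three groups.
 *
 * @param int|null $activatedAt Unix time of the activation, null if never activated
 * @param int|null $firstPostAt Unix time of the first posting, null if there is none
 * @param int      $quickWindow Seconds after activation that count as "right away" (assumed)
 */
function classifyRegistration(?int $activatedAt, ?int $firstPostAt, int $quickWindow = 600): string
{
    if ($activatedAt === null) {
        // Group [1]: the account was never activated.
        return 'never_activated';
    }
    if ($firstPostAt !== null && $firstPostAt - $activatedAt <= $quickWindow) {
        // Group [2]: activated and posting within seconds or minutes.
        return 'immediate_poster';
    }
    // Third group: activated, but no posting inside the window.
    // This may be a lurker or a spammer who strikes later.
    return 'silent';
}
```

For example, classifyRegistration(1000, null) yields 'silent', and so does an account whose first posting arrives a day after the activation.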

The captchas available are fairly useless …

Agree. I had an “improved” version of the math-captcha (where the numbers to add are given as words) for ages. Useless.

Alfie discussed this with Milo and me in a GitHub pull request before. We came to the same conclusion (I hope I'm speaking for all of us here): captchas are useless and/or inaccessible in several cases.

Lesson learned: It takes bots just a fraction of a millisecond to break the captcha.

According to my server logs it takes humans (yep, and the true spammers as well) about one minute to register. I’m considering throwing a nice

exit(header("HTTP/1.0 403 Forbidden"));

if the registration takes less than one second – and switch off all other filters

That seems to be an adequate replacement for the checks that do not work, but I see one problem with relying on it alone. Forum operators without "our knowledge" will ask why the captchas were removed and will demand their reimplementation and further measures. While there is no joy in explaining the uselessness of captchas again and again, I would not rely on timing checks alone but …

I have now (query a recent local copy of StopForumSpam’s banned IPs, query SFS and BotScout remotely).

… find it useful to back it up with further procedures, such as checks against local copies of lists of banned IPs and e-mail addresses from providers like Stop Forum Spam. A problem here may be hosting providers that forbid script-based requests to other, foreign servers.
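A rough sketch of how the timing check and a lookup in a local list could work together, written in PHP. All function names, the one-second default and the list format (one IP or e-mail address per entry) are assumptions for illustration; this is not the existing code of the forum script. The lookup runs against a local copy only, so it would also work on hosts that forbid requests to foreign servers.

```php
<?php
// Sketch only: function names, threshold and list format are assumptions.

/**
 * True if the form came back faster than a human could plausibly fill it in.
 * $renderedAt is the time the form was delivered (e.g. remembered in the session).
 */
function submittedTooFast(float $renderedAt, float $submittedAt, float $minSeconds = 1.0): bool
{
    return ($submittedAt - $renderedAt) < $minSeconds;
}

/**
 * Turns a list of banned IPs and e-mail addresses into a lookup table.
 * Entries are trimmed and lowercased so that the comparison is case-insensitive.
 */
function buildBlocklist(array $entries): array
{
    $lookup = [];
    foreach ($entries as $entry) {
        $key = strtolower(trim((string) $entry));
        if ($key !== '') {
            $lookup[$key] = true;
        }
    }
    return $lookup;
}

/**
 * Rejects a registration if it was submitted too fast
 * or if the IP or the e-mail address is on the local list.
 */
function shouldRejectRegistration(
    float $renderedAt,
    float $submittedAt,
    string $ip,
    string $email,
    array $blocklist
): bool {
    return submittedTooFast($renderedAt, $submittedAt)
        || isset($blocklist[strtolower(trim($ip))])
        || isset($blocklist[strtolower(trim($email))]);
}
```

The entries could be read from a local file with file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) and passed to buildBlocklist(). If shouldRejectRegistration() returns true, the caller could answer with http_response_code(403) and exit.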

Would be great if reCAPTCHA was an option, as I find it to be one of the most effective ways to limit spam accounts these days.

What about accessibility and Google’s questionable data protection?

There is by design no data protection within a Google service, because data is the currency we pay for Google's services. And with a reCAPTCHA on a forum page it is not the visitor's decision to hand over her or his personal data; it is our decision as developers and forum operators to force our visitors either to hand over the data in order to use our sites and forums or not to use them at all.

That's IMHO a no-go for anyone subject to EU jurisdiction and the GDPR [4]. We developers are subject to EU jurisdiction because we live in the EU, and IMHO we should respect the GDPR, which as a welcome side effect also makes the data of users living outside the EU a bit safer.

And as a third point, not mentioned by Alfie: there may be other comparable services, and there may be legal barriers to Google's services in several countries. Implementing the service of A, which might not be accessible in every case, is crap in itself, and contributing to A's monopoly while keeping service providers B and C from participating is not a good decision at all.

To come back to Alfie's argument: none of the captchas (Google's or others) is known to be accessible to everyone who uses the internet. I strongly encourage us to take accessibility into account in further development. reCAPTCHA (or whatever else) would IMHO be a step in the opposite direction.

Bye, Auge

[1]: Normally these accounts can't be activated because the e-mail address does not exist.
[2]: In this forum the spammers often start with two entries and frequently never come back afterwards.
[3]: Alfie and I were able to observe such an attack, with around 300 entries in 30 minutes, a few years ago.
[4]: A data controller may not refuse service to users who decline consent to processing that is not strictly necessary in order to use the service. (Article 7(4)) (from Wikipedia: GDPR, section "Lawful basis for processing")
