
Facebook crawler (Technics)

by Alfie ⌂, Vienna, Austria, Wednesday, June 12, 2024, 07:51 (88 days ago)

Dear all,

Two days ago my forum was flooded with requests from the Facebook crawler. Likely someone linked to the forum on Facebook (I can’t check because I don’t have a FB account and have no intention of getting one). At first I only saw a bizarrely high number of online ‘users’; in the end my server gave up (likely due to too many database connections) and responded with an HTTP 500 (Internal Server Error).

To check whether you are affected, search your server’s access.log for the user-agent string facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php). My daily logs grew from ≈10 MB to more than 150 MB. A Google search showed that I was not alone…
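If you would rather count than grep, here is a minimal sketch in Python. It assumes the usual Apache/nginx "combined" log format with a timestamp like [10/Jun/2024:13:55:36 +0200]; the function name count_crawler_hits and the log path in the usage comment are my own placeholders, so adjust them to your setup.

```python
import re
from collections import Counter

# Substring that identifies the crawler in the User-Agent field of a log line.
CRAWLER_UA = "facebookexternalhit"

# Captures the day from a combined-log timestamp, e.g. [10/Jun/2024:13:55:36 +0200]
DATE_RE = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}):")


def count_crawler_hits(lines, needle=CRAWLER_UA):
    """Return a Counter mapping day (dd/Mon/yyyy) to the number of crawler requests."""
    hits = Counter()
    needle = needle.lower()
    for line in lines:
        if needle not in line.lower():
            continue  # request from some other client
        match = DATE_RE.search(line)
        day = match.group(1) if match else "unknown"
        hits[day] += 1
    return hits


# Usage (the path is an assumption, not taken from my server):
#
#   with open("/var/log/apache2/access.log", errors="replace") as fh:
#       for day, n in count_crawler_hits(fh).items():
#           print(day, n)
```

A jump from a handful of hits per day to many thousands is the pattern to look for.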
The crawler is aggressive and doesn’t give a shit about the robots.txt. Therefore,

User-agent: FacebookBot
Disallow: /

does not help.

In the end I used the workaround suggested on Stack Overflow.
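For readers who want the general idea of throttling by user agent, here is a rough sketch in Python. It is not the Stack Overflow answer and not what runs on my server. The class name CrawlerThrottle, the 2-second interval and the choice of status 429 are all assumptions for illustration. Whether the crawler actually backs off is not guaranteed; the point is that a rejected request never reaches the database.

```python
import math
import time

# Substring that identifies the crawler in the User-Agent header.
CRAWLER_UA = "facebookexternalhit"


class CrawlerThrottle:
    """Allow at most one crawler request per min_interval seconds (hypothetical helper)."""

    def __init__(self, min_interval=2.0, clock=time.monotonic):
        self.min_interval = min_interval  # seconds between allowed crawler hits (assumed value)
        self.clock = clock                # injectable so the logic can be tested
        self.last_allowed = None          # time of the last crawler request that was let through

    def check(self, user_agent):
        """Return (status, headers): 200 to serve the page, 429 to reject cheaply."""
        if CRAWLER_UA not in (user_agent or "").lower():
            return 200, {}  # ordinary visitors are never throttled
        now = self.clock()
        if self.last_allowed is not None and now - self.last_allowed < self.min_interval:
            # Too soon after the previous crawler hit: answer before touching the database.
            return 429, {"Retry-After": str(math.ceil(self.min_interval))}
        self.last_allowed = now
        return 200, {}
```

The check belongs at the very start of request handling, before any database connection is opened. Note that this sketch keeps its state in memory, so it only works within a single process; a setup with several workers would need a shared store (a file or a cache).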

Alfie (Helmut Schütz)
BEBA-Forum (v1.8β)
