Do we have to have a robot.txt - Google (Technics)

by Auge, Monday, June 22, 2009, 23:10 (5414 days ago) @ Göran B

Hello

The database will not be indexed. Any robot will only index the content of your site. This can include the forum (and the pages of its threads) too.


I don't think that's true. All postings in our forum have been indexed by Google and can be found with a Google search.

Just to make it clear:

The Googlebot (like other bots) is an HTTP client like any browser. It finds a website via a link and follows all accessible links (with all their parameters) on the pages of the site, unless a robots.txt forbids it[1] for some or all directories, or a page carries a meta element whose name attribute is "robots" and whose content attribute holds one of the values [index|follow|noindex|nofollow|all]. Of these, noindex forbids indexing of the page itself and nofollow forbids following its links to other pages.
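As a rough illustration of the robots.txt part, here is a minimal sketch of how a well-behaved bot could check a URL against the rules before requesting it. It uses Python's standard urllib.robotparser module. The robots.txt content, the paths and the example.org URLs are made up for this example and are not taken from any real forum installation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the disallowed paths are assumptions for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forum/admin/
Disallow: /forum/search
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A thread page is not covered by any Disallow rule, the admin area is.
allowed_thread = is_allowed(ROBOTS_TXT, "Googlebot", "http://example.org/forum/index.php?id=42")
allowed_admin = is_allowed(ROBOTS_TXT, "Googlebot", "http://example.org/forum/admin/index.php")
```

Note that this check happens on the robot's side. The server does not enforce anything, which is why footnote [1] applies.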

So the robot follows the links in a page and reaches other pages that way. The robot does not read the database itself, only the pages which contain the values from the database. In the case of mlf, a robot will (if it's not forbidden) find and index the web pages with the postings, but not the postings in the database.
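To make that concrete, here is a small sketch of what a robot sees when it reads one page: only the delivered HTML, from which it collects the links and the directives of the meta element 'robots'. The class name PageScanner, the function scan and the sample page are hypothetical and only meant as an illustration, not as the way Googlebot actually works.

```python
from html.parser import HTMLParser


class PageScanner(HTMLParser):
    """Collects link targets and meta-robots directives from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            # The content attribute is a comma-separated list such as "noindex, nofollow".
            content = attrs.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def scan(html: str):
    """Return (may_index, links_to_follow) for a polite robot."""
    scanner = PageScanner()
    scanner.feed(html)
    may_index = "noindex" not in scanner.directives
    links_to_follow = [] if "nofollow" in scanner.directives else scanner.links
    return may_index, links_to_follow


# Hypothetical page: may be indexed, but its links should not be followed.
PAGE = """<html><head><meta name="robots" content="index, nofollow"></head>
<body><a href="index.php?id=1">Posting 1</a></body></html>"""

may_index, links_to_follow = scan(PAGE)
```

The database never appears in this picture. The robot only works with the HTML that the forum script generates from it.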

That is actually causing us a problem. We have a database with 155,000 entries, and when the Googlebot starts to search through the database, the forum can't be used by other users for about two hours. This happens every day.

I don't know how to prevent the daily access by search robots. Maybe the Google Webmaster Tools (in the case of Googlebot) offer some way to control it? Other search engines should offer comparable tools.

[1] Attention: a robot may follow the instructions of the robots.txt or the meta element 'robots', but it is not obliged to!

Tschö, Auge

