Still having a problem with time outs... (General)

by BrianC, Wednesday, April 18, 2007, 17:13 (6217 days ago)

Dear Developers,

We've been using My Little Forum on a very busy site now for about ten months (4276 Postings in 819 Threads and an average of about 250 unique visitors a day). I reported on your earlier forum that late in 2006 we had a security problem and the accusation was made that it was caused by MLF. We temporarily shut down MLF and used PHPBB for a few weeks but the general consensus of our members was that this was far inferior to MLF from a users' point of view. In the end our service provider kicked us off the server we were originally using (which also hosted a lot of other commercial websites) and they set us up on our own dedicated server basically so that we could go back to using MLF.

My own diagnosis is that the security breach probably did not occur through MLF, and we certainly have not had a repeat of that particular problem. (For those who were following that discussion: whoever got into the server rewrote the code of every PHP and HTML file, adding a couple of lines at the end of each file which caused visitors' browsers to surreptitiously visit other sites without their knowledge. It seemed to be some con job to increase the traffic to certain designated sites. They didn't seem to be stealing information from the server, just using the visitors to pages on the server for nefarious purposes. It was only inconvenient to users in that the "back" buttons on their browsers no longer worked properly but started taking them to these sites they had no idea they had visited.)

The foregoing is a long way of saying that we think your work is absolutely brilliant, Alex. There is very high praise for MLF from our users — some of whom have a great deal of experience on all sorts of different forums on the net.

The ONLY drawback we have not yet been able to solve is this ... and it is a real bugger when it occurs. It seems to be an intermittent problem and does not occur all the time.

Many of the posters to our forum write lengthy posts. We have some chit-chat, but there are often fairly lengthy commentaries that may take some time (like half an hour or an hour) to compose. Sometimes, after working on a post for an hour, one finally reaches the end, hits the "OK - submit" button, and the next thing one finds is that all the work is lost and one is back at the log-in screen. We all try to remember to do a Ctrl-A/Ctrl-C before pressing the submit button, but invariably people forget that from time to time, and that's usually when the forum has this little spack attack and logs you out, destroying all your good work at the same time. (The other alternative, of course, is writing the post in a word-processing program and then just pasting it into the message field before pressing submit. Again, some of us try to remember to do that, but it doesn't always happen when one just wants to add a quick comment that then gets extended into a long reply.)

Does anyone have any thoughts as to what the problem might be here?

Also out of interest I made a number of modifications to the BB-code function to enable users to include coloured text and to blockquote paragraphs, use a smaller font size for parts of the text, and a couple of other things.

You're most welcome to see our forum in operation — it's mainly to do with spirituality and theology — at www.catholica.com.au/forum.

If anyone among the developers has thoughts on the time out problem I would be deeply appreciative.

One other problem I am working on is that Google has a bit of trouble indexing the material on our forum. Google doesn't seem to like pages stored in a database for some reason. Has anyone else been doing work on this problem of having one's material indexed by Google and other search engines?

Cheers, BrianC

locked
9056 views

Still having a problem with time outs...

by done, Wednesday, April 18, 2007, 18:34 (6217 days ago) @ BrianC

> One other problem I am working on is that Google has a bit of trouble indexing the material on our forum. Google doesn't seem to like pages stored in a database for some reason. Has anyone else been doing work on this problem of having one's material indexed by Google and other search engines?

http://en.wikipedia.org/wiki/Rewrite_engine
http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html
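To make those links concrete, here is a minimal mod_rewrite sketch. It assumes Apache with mod_rewrite enabled and `AllowOverride` permitting rewrite rules in `.htaccess`; the `/thread/...` URL pattern is a hypothetical example, not something MLF ships with.

```apache
# Hypothetical .htaccess fragment: serve a crawler-friendly URL such as
# /thread/675 from the real entry script, keeping the query string hidden.
RewriteEngine On
RewriteRule ^thread/([0-9]+)$ board_entry.php?id=$1 [L,QSA]
```

With a rule like this the forum's templates can link to `/thread/675`, while `board_entry.php` still receives `id=675` exactly as before.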

locked
8600 views

Still having a problem with time outs...

by Alex ⌂, Wednesday, April 18, 2007, 19:40 (6217 days ago) @ done

> > One other problem I am working on is that Google has a bit of trouble indexing the material on our forum. Google doesn't seem to like pages stored in a database for some reason. Has anyone else been doing work on this problem of having one's material indexed by Google and other search engines?
>
> http://en.wikipedia.org/wiki/Rewrite_engine
> http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html

Hm, as far as I know Google doesn't care about the URL. This search, for example, finds postings in the forum.

Alex

locked
8566 views

search engines

by BrianC, Thursday, April 19, 2007, 07:16 (6216 days ago) @ Alex
edited by Alfie, Thursday, April 19, 2007, 10:55

Thanks, Alex and done,

I've just tried some tests on our own forum with Google search, and the results today are much better than a few months ago, when I made a number of Google-recommended changes that are now causing our database to be indexed much more deeply. It's still a long way from comprehensive, though, so I'll read this further information you've provided, done, with much interest. We've got about 21 pages of posts, each one about 40 threads deep. Confining the Google searches to short phrases that are likely to be quite particular to our discussion forum, Google searches the first 4 pages or so exceedingly well. The performance falls off for older posts and for search terms drawn from deeper down a thread.

One of the tricks I learned from Google was placing a file named "robots.txt" in the forum folder, listing the files that should be excluded when the robot crawls the database. In our case the list of exclusions is:

User-agent: *
Disallow: /img/
Disallow: /lang/
Disallow: /contact.php
Disallow: /email.php
Disallow: /register.php
Disallow: /user.php
Disallow: /admin.php
Disallow: /login.php
Disallow: /posting.php
Disallow: /board.php
Disallow: /board_entry.php
Disallow: /mix.php
Disallow: /mix_entry.php
Disallow: /inc.php
Disallow: /timedifference.php
Disallow: /db_settings.php
Disallow: /delete_cookie.php
Disallow: /info.php
Disallow: /more_smilies.php
Disallow: /upload.php
Disallow: /search.php
Disallow: /rss.php

I had to do a bit of tinkering with that list over a period of a month or so, but it seems to have led to significantly better, albeit still not perfect, indexing of our forum database by Google.

Cheers, Brian

locked
9451 views

search engines

by Alex ⌂, Thursday, April 19, 2007, 08:37 (6216 days ago) @ BrianC
edited by Alfie, Thursday, April 19, 2007, 10:55

> One of the tricks I learned from Google was placing a file named "robots.txt" in the forum folder, listing the files that should be excluded when the robot crawls the database. In our case the list of exclusions is:

That's interesting! What Google doesn't like might be the "double content" caused by the different view and order possibilities (different URLs with the same content). In the current version I reduced the query strings for page numbers, categories and thread order in the URLs as much as possible, but maybe a robots="noindex" on all pages except the index page and the opened messages might also be beneficial.
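A sketch of what such a tag could look like, assuming it would be emitted in the `<head>` of the pages that shouldn't be indexed (the attribute values are the standard robots meta conventions, not anything MLF-specific):

```html
<!-- emitted on every page except the index and opened messages:
     compliant robots skip this page but still follow its links -->
<meta name="robots" content="noindex, follow">
```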

Alex

locked
8344 views

search engines

by Alfie ⌂, Vienna, Austria, Thursday, April 19, 2007, 10:54 (6216 days ago) @ Alex
edited by Alfie, Thursday, April 19, 2007, 11:00

@Brian: You 'recycled' my robots.txt ;-)

> What Google doesn't like might be the "double content" caused by the different views and order possibilities (different URLs with the same content).

Yes, that's exactly the problem. Details are given in the English forum, and in more depth in the German one.

> In the current version I reduced as much as possible the query strings for page numbers, categories and the order of threads in the URLs ...

That's nice (long URIs often get 'broken' when copied from a browser's address line and sent via e-mail), but as far as search engines are concerned, a 'maximum depth' of 4 parameters (and the like) is simply a myth.
Since only one view will 'exist' in the new version, I would not expect double content to be an issue any more.

> ... but maybe a robots="noindex" in all pages except for the index page and the opened messages might also be beneficial.

Oops, please make that an option only (if at all)!
Just imagine how the crawler would work:
- The index page contains only thread headings and links (it gets rated as a 'link collection', which crawlers don't like, because they want to find keywords for their database)
- In order to crawl the individual posts they have to follow these links, which may take a while (Google is a daily guest in my forum for about 6 hours)

The variant with robots.txt + modified scripts (rel="nofollow" on the corresponding links) is working perfectly.
Since robots.txt and rel="nofollow" are both non-binding conventions, it's always a good idea to use both of them.
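As an illustration of the second half of that combination, a hypothetical link from the templates (the exact markup here is a sketch, not copied from the modified scripts):

```html
<!-- a contact link that would otherwise create a duplicate-content URL;
     rel="nofollow" asks crawlers not to follow it, and robots.txt
     blocks contact.php as a second line of defence -->
<a href="contact.php?id=675&amp;order=last_answer" rel="nofollow">Contact</a>
```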

I'm using my forum in mix view and changed the default order of posts to last-answer. This way not only new posts, but also replies to older ones, are displayed on the entry page (which is crawled first).
forum.php, line 41

  if (empty($order)) $order="last_answer";

instead of

  if (empty($order)) $order="time";

It's also a good idea to change the subject line according to the content of the post. Everybody should suggest such a procedure in his/her forum's FAQ. If you want to get better search results, admins and mods should edit subject lines as well, although this may not be an option for pretty large installations. ;-)

--
Cheers,
Alfie (Helmut Schütz)
BEBA-Forum (v1.8β)

locked
8453 views

search engines

by Alex ⌂, Thursday, April 19, 2007, 11:42 (6216 days ago) @ Alfie

> Since only one view will 'exist' in the new version

We'll see. The point is: I don't have a need for the "board view", and at the moment I don't feel like doing it. ;-)

> > ... but maybe a robots="noindex" in all pages except for the index page and the opened messages might also be beneficial.
>
> Oops, please as an option only (if at all)!

What I meant was to allow indexing of the index page and the entries. Everything else (the complete-thread view (double content), the login and register pages, etc.) shouldn't be important to Google, or am I wrong?

Alex

locked
8382 views

search engines

by Alfie ⌂, Vienna, Austria, Thursday, April 19, 2007, 19:30 (6216 days ago) @ Alex

Hi Alex!

> > Since only one view will 'exist' in the new version
>
> We'll see. The point is: I don't have a need for the "board view" and at the moment I don't feel like doing it. ;-)

I will not miss it either :clap:

> > > ... but maybe a robots="noindex" in all pages except for the index page and the opened messages might also be beneficial.
> >
> > Oops, please as an option only (if at all)!
>
> What I meant was to allow indexing the index page and the entries. Everything else (complete thread (double entry), login, register page etc.) shouldn't be important to Google, or am I wrong?

Double content is only an issue if multiple views are possible. Even though I have added rel="nofollow" to many links, Google still gives them a quick look.
Just an example from my server logs:
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) *g
/forum/board_entry.php?id=675 (<- this is fine)
/forum/contact.php?id=675&page=0&category=all&order=last_answer&view=mix (<- this is pointless and counts as double content)

P.S.: Pasting the :clap: emoticon is not working (maybe the same error as in versions up to 1.7.1)
P.P.S.: What is the maximum word length?

--
Cheers,
Alfie (Helmut Schütz)
BEBA-Forum (v1.8β)

locked
8661 views

Still having a problem with time outs...

by Alex ⌂, Wednesday, April 18, 2007, 20:02 (6217 days ago) @ BrianC

> Does anyone have any thoughts as to what the problem might be here?

The script uses PHP's built-in session mechanism, so I think this is caused by the server settings (especially session.gc_maxlifetime). There are ways to override the PHP settings from the script (or via an .htaccess file), but in the case of the session settings this is a bit problematic.
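For illustration, a hedged .htaccess sketch of such an override. This assumes Apache running PHP as mod_php with AllowOverride permitting php_value directives; the four-hour value and the save path are made-up examples. On shared hosts a separate session.save_path is usually the important part, since another site's garbage collection can otherwise delete the session files early regardless of the lifetime set here.

```apache
# keep session files for 4 hours instead of the common 24-minute default
php_value session.gc_maxlifetime 14400
# store this site's sessions in their own directory (path is an example),
# so other vhosts' gc runs don't expire them
php_value session.save_path /home/example/forum-sessions
```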

> Also out of interest I made a number of modifications to the BB-code function to enable users to include coloured text and to blockquote paragraphs, use a smaller font size for parts of the text, and a couple of other things.

The BB Code function is one of the changes in the current version (see this test posting).

Alex

locked
8439 views

Still having a problem with time outs...

by BrianC, Thursday, April 19, 2007, 07:20 (6216 days ago) @ Alex

Thanks also for this information on the timeouts, Alex. I'll read up on it with much interest and report back. I'll also be interested in the material on BB code. The changes I made were pretty basic, and I suspect you'll have a much neater solution than what I came up with.

Cheers and thanks, BrianC

locked
8528 views
