Profile My Little Forum (General)

by danielb987, Monday, March 20, 2017, 12:29 (2565 days ago)

Another thread diskusses how to improve the speed of the forum. The first question to ask in order to solve that is how long time each sql query takes. This process is called profiling.

The way this works is to replace each call to mysqli_query with a call to profile_mysqli_query. You then use the parts of the forum you want to test and after that use the script profile_result.php to see the result.

To use the script, create the profile tables using the sql queries in profile_mlf.sql.

Then add the statement "include('profile_mysqli.php');" to the top of the index.php file.

And then replace all occurencies of mysqli_query with profile_mysqli_query.

I have a complete installation of My Little Forum version 2.4.2 with profiling. Se below.

The function profile_mysqli_query is in the file profile_mysqli.php.txt. Please note that you have to change the variable $forum_folder in the top of the file to the location on the server of the forum.

To show the result, run the script profile_result.php.

"Show profile data per file and line" shows a summary of all sql queries of each row in a file. A particular row may have several different queries, for example different id, but all queries on the same row are added up.

"Show profile data per query" shows every single query. So even if a row has several identical sql queries, each of them are shown individual.

"Clear profile data" clears the profile tables. No question is asked!

The script is provided as freeware. Feel free to change it as you like. This is a quick hack and may have bugs. Use at own risk.

Important note!!!
This script is slow and must not be run on a public forum! The profiler adds extra sql queries to each sql query, with the result of a lot of overhead. But it is nessesary to find which sql queries that are causing the problem.

Best regards,
Daniel

Profile My Little Forum

by danielb987, Monday, March 20, 2017, 12:30 (2565 days ago) @ danielb987

Complete installation of My Little Forum version 2.4.2 with profiling:
mjtrafik.se/profile_my_little_forum/profile_mlf.zip

Profile My Little Forum (reformatted)

by danielb987, Monday, March 20, 2017, 12:55 (2565 days ago) @ danielb987
edited by Auge, Tuesday, March 21, 2017, 13:42

[edit]Proloque: I reformatted the SQL-queries for improvoed readability.

Auge
[/edit]

My forum has 192 000 entries. If I simply reloads the main page of the forum, one single query takes 33 ms, while the rest takes less than 1 ms. However, if I go to an entry, then clear the profile data, and then go to the main page of the forum by clicking on the head line, I get a number of VERY slow SQL queries.

includes/index.inc.php, line 187. Time 1 062 milliseconds!

SELECT
  id,
  pid,
  tid,
  name,
  user_name,
  ft.user_id,
  UNIX_TIMESTAMP(ft.TIME) AS TIME,
  UNIX_TIMESTAMP(ft.TIME + INTERVAL 0 MINUTE) AS TIMESTAMP,
  UNIX_TIMESTAMP(last_reply) AS last_reply,
  subject,
  category,
  rst.user_id AS req_user 
FROM mlf2_entries AS ft
  LEFT JOIN mlf2_userdata ON mlf2_userdata.user_id=ft.user_id
  LEFT JOIN mlf2_read_entries AS rst ON rst.posting_id = ft.id
    AND rst.user_id = 0
  WHERE spam=0
    AND category IN (0, 4, 5, 6, 7, 8, 9, 25, 14, 23, 19, 22, 24, 10, 21, 11, 12, 13, 15, 16, 26, 18, 20, 27)
  ORDER BY ft.TIME DESC
  LIMIT 40

includes/main.inc.php, line 336. Time 581 milliseconds!

SELECT
  COUNT(*)
FROM mlf2_entries
  WHERE pid = 0
    AND spam = 0
    AND category IN (0, 4, 5, 6, 7, 8, 9, 25, 14, 23, 19, 22, 24, 10, 21, 11, 12, 13, 15, 16, 26, 18, 20, 27)

includes/index.inc.php, line 69. Time 578 milliseconds!

SELECT
  id,
  tid
FROM mlf2_entries
  WHERE pid = 0
    AND spam = 0
    AND category IN (0, 4, 5, 6, 7, 8, 9, 25, 14, 23, 19, 22, 24, 10, 21, 11, 12, 13, 15, 16, 26, 18, 20, 27)
  ORDER BY sticky DESC,
    TIME DESC
  LIMIT 0, 30

includes/main.inc.php, line 339. 528 milliseconds!

SELECT
  COUNT(*)
FROM mlf2_entries
  WHERE category IN (0, 4, 5, 6, 7, 8, 9, 25, 14, 23, 19, 22, 24, 10, 21, 11, 12, 13, 15, 16, 26, 18, 20, 27)
    AND spam = 0

includes/main.inc.php, line 344. 517 milliseconds!

SELECT
  COUNT(*)
FROM mlf2_entries
  WHERE spam = 1

Best regards,
Daniel

Profile My Little Forum - some suggestions

by danielb987, Monday, March 20, 2017, 14:03 (2565 days ago) @ danielb987

Some suggestions to improve response time

includes/index.inc.php, line 187. Time 1 062 milliseconds!

This query has several left joins. Since we are only interested in the latest 40 entries in the table mlf2_entries, I suggest that you first do a sql query from mlf2_entries. Then you use a for loop in php to do the rest of the queries. In my database, mlf2_entries has 193 000 rows, mlf2_read_entries has 5 000 rows and mlf2_userdata has 2 000 rows.

Doing left join can be very expensive.

----------------------------

includes/main.inc.php, line 344. Time 517 milliseconds!

SELECT COUNT(*) FROM mlf2_entries WHERE spam = 1

The query only seems useful for admins and moderators. The query is always run, even if I'm not logged in. I suggest that it is only run for admins and moderators.

----------------------------

includes/main.inc.php, line 336. Time 581 milliseconds!
includes/main.inc.php, line 339. Time 528 milliseconds!

These are two sql queries that counts the rows in mlf2_entries. Each has it's own where-clause, but they only returns a single number, the count of rows.

From my point of view, I don't see why these must be exactly correct. The number of posts and the number of threads on the bottom of the page is fun statistics, but it is in no way important. Who cares if the number of entries are 320 000 or 350 000? From my point of view, the statistics like the number of entries and number of topics is something that can be updated once per day and doesn't need to be more accurate than that, at least not for a big forum

For a small forum, there may be a different thing. A web master of a forum with 1000 entries and 100 topics may want to have the number more accurate, but for big forums, I don't see the point of having to update it more than once a day.

Avatar

Profile My Little Forum - some suggestions

by Micha ⌂, Monday, March 20, 2017, 17:36 (2565 days ago) @ danielb987

Hi,

thank you for your detailed analysis.

SELECT COUNT(*) FROM mlf2_entries WHERE spam = 1
includes/main.inc.php, line 336. Time 581 milliseconds!
includes/main.inc.php, line 339. Time 528 milliseconds!

I think, the spam-column (and the user_id) should be an indexed column. Can you add this index to your database and run the analysis again?

ALTER TABLE `mlf2_entries` ADD INDEX(`spam`);
ALTER TABLE `mlf2_entries` ADD INDEX(`user_id`);

The number of posts and the number of threads

... can be static (i.e. updated by a SQL trigger)


/Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

Profile My Little Forum - some suggestions

by danielb987, Monday, March 20, 2017, 18:27 (2565 days ago) @ Micha

I think, the spam-column (and the user_id) should be an indexed column. Can you add this index to your database and run the analysis again?

ALTER TABLE `mlf2_entries` ADD INDEX(`spam`);
ALTER TABLE `mlf2_entries` ADD INDEX(`user_id`);

-

That was a huge improvement of two of the queries!

-

includes/main.inc.php, line 344:

SELECT COUNT(*) FROM mlf2_entries WHERE spam = 1

This query went from 500 millisecond to 0,5 milliseconds!

----

includes/main.inc.php, line 336:

SELECT COUNT(*) FROM mlf2_entries WHERE pid = 0 AND spam = 0

This query went from 600 milliseconds to 150 milliseconds!

----

Before the improvement, the top five queries took about 3,2 seconds in total. After the improvement, the top five queries took about 2,2 seconds in total.

Best regards,
Daniel

Avatar

Profile My Little Forum - some suggestions

by Micha ⌂, Monday, March 20, 2017, 22:12 (2565 days ago) @ danielb987

Hi,

That was a huge improvement of two of the queries!

Okay. Thank you! I believe, we should check all columns in WHERE-conditions and/or the column-conditions in JOIN statements (e.g. <foo> ON <bar>). These columns must have an INDEX.

/Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

Avatar

Profile My Little Forum - some suggestions

by Auge ⌂, Tuesday, March 21, 2017, 08:03 (2564 days ago) @ Micha

Hello

Okay. Thank you! I believe, we should check all columns in WHERE-conditions and/or the column-conditions in JOIN statements (e.g. <foo> ON <bar>). These columns must have an INDEX.

I don't know, if it is useful, to set every column, that is/could be membered in a JOIN, as an index. To manage the indexes takes also executing time. But the often in JOINs used
columns should be indexed. The first column, that came in my mind while reading this thread is the column category of the entries table.

Tschö, Auge

--
Trenne niemals Müll, denn er hat nur eine Silbe!

Avatar

Profile My Little Forum - some suggestions

by Micha ⌂, Tuesday, March 21, 2017, 08:30 (2564 days ago) @ Auge

Hi,

To manage the indexes takes also executing time.

Of course, but it seems to be small. The column user_id was not indexed but, I believe, it is an important and often used column for joining - why this column is not set as FOREIGN KEY?

The first column, that came in my mind while reading this thread is the column category of the entries table.

This column is set as key e.g.

KEY category (category)

. Isn't it similar to an INDEX?

/Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

Avatar

Profile My Little Forum - some suggestions

by Auge ⌂, Tuesday, March 21, 2017, 13:14 (2564 days ago) @ Micha

Hello

To manage the indexes takes also executing time.

Of course, but it seems to be small.

I didn't want to argue against the use of INDEX. I only wants to advice against the creation of indexes for every single column, that is in use for JOINs.

The column user_id was not indexed but, I believe, it is an important and often used column for joining

Yes, you are right without a doubt.

- why this column is not set as FOREIGN KEY?

At the time, when Alex defined the table structure of mlf (mlf1, the structure for mlf2 based on that), he and also no one else of us was aware of FOREIGN KEYS to tie the tables to each other and other the abilities of a MySQL-server (that evolved a lot since then). So this feature was never implemented in mlf.

The first column, that came in my mind while reading this thread is the column category of the entries table.


This column is set as key e.g.

KEY category (category)

. Isn't it similar to an INDEX?

Yes, it is and yes, it is. :-)

Tschö, Auge

--
Trenne niemals Müll, denn er hat nur eine Silbe!

Avatar

Profile My Little Forum - some suggestions

by Micha ⌂, Tuesday, March 21, 2017, 17:41 (2564 days ago) @ Auge

Hi,

I only wants to advice against the creation of indexes for every single column, that is in use for JOINs.

I think, we should do it. Indexing all columns that are used for constrains or that are used to combine two (or more) tables because we join the tables (or restrict the result) by these columns, so, they are more important than other ones. Lets take a look to the slow SQL statement:

SELECT
  id,
  pid,
  tid,
  name,
  user_name,
  ft.user_id,
  UNIX_TIMESTAMP(ft.TIME) AS TIME,
  UNIX_TIMESTAMP(ft.TIME + INTERVAL 0 MINUTE) AS TIMESTAMP,
  UNIX_TIMESTAMP(last_reply) AS last_reply,
  subject,
  category,
  rst.user_id AS req_user 
FROM mlf2_entries AS ft
  LEFT JOIN mlf2_userdata ON mlf2_userdata.user_id=ft.user_id
  LEFT JOIN mlf2_read_entries AS rst ON rst.posting_id = ft.id
    AND rst.user_id = 0
  WHERE spam=0
    AND category IN (0, 4, 5, 6, 7, 8, 9, 25, 14, 23, 19, 22, 24, 10, 21, 11, 12, 13, 15, 16, 26, 18, 20, 27)
  ORDER BY ft.TIME DESC
  LIMIT 40

The following columns are important to join or restrict the output:

  • mlf2_userdata.user_id
  • mlf2_entries.id
  • mlf2_entries.category
  • mlf2_entries.user_id
  • mlf2_entries.spam
  • mlf2_entries.time
  • mlf2_read_entries.posting_id
  • mlf2_read_entries.user_id


Following this line of reasoning, we can optimise this query by:

mlf2_userdata.user_id is primary key, thus, nothing to do
mlf2_entries.id is primary key, thus, nothing to do
mlf2_entries.category is indexed, thus, nothing to do

mlf2_entries.user_id is not indexed:

ALTER TABLE `mlf2_entries` ADD INDEX(`user_id`);

(as already suggested)

mlf2_entries.spam is not indexed:

ALTER TABLE `mlf2_entries` ADD INDEX(`spam`);

(as already suggested)

mlf2_entries.time is not indexed:

ALTER TABLE `mlf2_entries` ADD INDEX(`time`);

mlf2_read_entries.posting_id and mlf2_read_entries.user_id are joined primary keys. They have to indexed separately, see links above.

ALTER TABLE `mlf2_read_entries` ADD INDEX(`user_id`);
ALTER TABLE `mlf2_read_entries` ADD INDEX(`posting_id`);

That's my suggestion. I cannot check the benefit, because I don't have a forum that has so many threads/postings.

/Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

Profile My Little Forum - some suggestions

by danielb987, Tuesday, March 21, 2017, 20:30 (2564 days ago) @ Micha

The test I currently do is that I go to an entry, then clear the profile data, then go to the main page and then show the result here. The reason for me to do this test is that there are many people reading the forum but only a few posting to the forum. At 20:30 local time in Sweden, my forum has 60 visitors but there have only been 36 posts in the last 24 hours. So it is more important that the forum is fast to read than it is fast to post.

mlf2_entries.user_id is not indexed:

ALTER TABLE `mlf2_entries` ADD INDEX(`user_id`);

(as already suggested)

This I have already done in my database.

mlf2_entries.spam is not indexed:

ALTER TABLE `mlf2_entries` ADD INDEX(`spam`);

(as already suggested)

This I have already done in my database.

------

mlf2_entries.time is not indexed:

ALTER TABLE `mlf2_entries` ADD INDEX(`time`);

By adding this index alone, together with the two indexes above which I have added previously, the query in includes/index.inc.php at line 187 went from 1 000 milliseconds to 22 milliseconds! A very huge improvement!

----

mlf2_read_entries.posting_id and mlf2_read_entries.user_id are joined primary keys. They have to indexed separately, see links above.

ALTER TABLE `mlf2_read_entries` ADD INDEX(`user_id`);
ALTER TABLE `mlf2_read_entries` ADD INDEX(`posting_id`);

I cannot see any benefit from these two indexes at this point, but my table mlf2_read_entries is small. I installed a new MLF forum and imported the data from an old MLF forum, so this table was clean after the update.

When the table starts to grow much bigger, the indexes will probably be much more important than now.

----------

After these new indexes, the status for me is as follows:

includes/index.inc.php, line 69: about 600 milliseconds

includes/main.inc.php, line 339: about 600 milliseconds

includes/main.inc.php, line 336: about 120 milliseconds

includes/index.inc.php, line 187: about 22 milliseconds
(this was the query that before took over 1 000 milliseconds)

Best regards,
Daniel

Avatar

Profile My Little Forum - some suggestions

by Auge ⌂, Tuesday, March 21, 2017, 21:22 (2564 days ago) @ danielb987

Hello

mlf2_entries.time is not indexed:

ALTER TABLE `mlf2_entries` ADD INDEX(`time`);


By adding this index alone, together with the two indexes above which I have added previously, the query in includes/index.inc.php at line 187 went from 1 000 milliseconds to 22 milliseconds! A very huge improvement!

Wow

mlf2_read_entries.posting_id and mlf2_read_entries.user_id are joined primary keys. They have to indexed separately, see links above.

ALTER TABLE `mlf2_read_entries` ADD INDEX(`user_id`);
ALTER TABLE `mlf2_read_entries` ADD INDEX(`posting_id`);


I cannot see any benefit from these two indexes at this point, but my table mlf2_read_entries is small. I installed a new MLF forum and imported the data from an old MLF forum, so this table was clean after the update.

When the table starts to grow much bigger, the indexes will probably be much more important than now.

That's what I and apparently Milo expect. In a 2.4.x forum with many active registered users, the table will grow and it will grow fast. Because of the SELECTs for the forum entries, which JOINs this table in most cases and on both columns, it is good to have an INDEX on both fields separately.

SELECTFROM mlf2_entries AS ft
  LEFT JOIN mlf2_userdata ON mlf2_userdata.user_id=ft.user_id
  LEFT JOIN mlf2_read_entries AS rst ON rst.posting_id = ft.id
    AND rst.user_id = 4

After these new indexes, the status for me is as follows:

includes/index.inc.php, line 69: about 600 milliseconds

includes/main.inc.php, line 339: about 600 milliseconds

includes/main.inc.php, line 336: about 120 milliseconds

includes/index.inc.php, line 187: about 22 milliseconds
(this was the query that before took over 1 000 milliseconds)

That's much better, even the two queries with more than 600ms execution time are still slow.

Thank you for your collaboration.

Tschö, Auge

--
Trenne niemals Müll, denn er hat nur eine Silbe!

Profile My Little Forum - some suggestions

by danielb987, Wednesday, March 22, 2017, 04:58 (2563 days ago) @ Auge

mlf2_read_entries.posting_id and mlf2_read_entries.user_id are joined primary keys. They have to indexed separately, see links above.

ALTER TABLE `mlf2_read_entries` ADD INDEX(`user_id`);
ALTER TABLE `mlf2_read_entries` ADD INDEX(`posting_id`);


I cannot see any benefit from these two indexes at this point, but my table mlf2_read_entries is small. I installed a new MLF forum and imported the data from an old MLF forum, so this table was clean after the update.

When the table starts to grow much bigger, the indexes will probably be much more important than now.


That's what I and apparently Milo expect. In a 2.4.x forum with many active registered users, the table will grow and it will grow fast. Because of the SELECTs for the forum entries, which JOINs this table in most cases and on both columns, it is good to have an INDEX on both fields separately.

I wanted to see what would happen with more data in the table mlf2_read_entries so I wrote a simple php script that reads mlf2_userdata and mlf2_entries and then add valid but random data to the mlf2_read_entries table. Both user_id, entry_id and time is random. Before, the mlf2_read_entries table had 4 800 rows, now it has 190 000 rows.

I have also removed the two suggested indexes so I now only have (user_id, posting_id) as primary key in the table and no other indexes in this table.

The strange thing is that the sql query in includes/index.inc.php at line 187 still only takes about 22 milliseconds. I thought the reason is that the query checks for user_id = 0, and there is no user with user_id = 0 in the table.

----

So in order to check my idea, I logged in to the forum using my ordinary user with user_id 7843 and then tried again. But no difference. I still only have about 22 milliseconds on this query. There are 200 rows in the table with user_id = 7843.

----

The query I am talking about here is the following query:

SELECT
  id,
  pid,
  tid,
  name,
  user_name,
  ft.user_id,
  UNIX_TIMESTAMP(ft.TIME) AS TIME,
  UNIX_TIMESTAMP(ft.TIME + INTERVAL 0 MINUTE) AS TIMESTAMP,
  UNIX_TIMESTAMP(last_reply) AS last_reply,
  subject,
  category,
  rst.user_id AS req_user
FROM mlf2_entries AS ft
  LEFT JOIN mlf2_userdata ON mlf2_userdata.user_id=ft.user_id
  LEFT JOIN mlf2_read_entries AS rst ON rst.posting_id = ft.id
    AND rst.user_id = 7843
  WHERE spam=0
    AND category IN (0, 4, 5, 6, 7, 8, 9, 25, 14, 23, 19, 22, 24, 10, 21, 11, 12, 13, 15, 16, 26, 18, 20, 27)
  ORDER BY ft.TIME
  DESC LIMIT 40

Best regards,
Daniel

Avatar

Profile My Little Forum - some suggestions

by Micha ⌂, Wednesday, March 22, 2017, 20:58 (2563 days ago) @ danielb987

Hi,

I thought the reason is that the query checks for user_id = 0, and there is no user with user_id = 0 in the table.

This user_id is reserved for unregistered users. Depending on your forum settings, this kind of user-type does/doesn't exist.

/Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

Avatar

Profile My Little Forum - some suggestions

by Micha ⌂, Wednesday, March 22, 2017, 08:49 (2563 days ago) @ danielb987

Hi,

So it is more important that the forum is fast to read than it is fast to post.

Yes, your are right. The indexing-process will decelerate the INSERT/UPDATE/DELETE statements but, I believe, it is a quite good compromise.

By adding this index alone, together with the two indexes above which I have added previously, the query in includes/index.inc.php at line 187 went from 1 000 milliseconds to 22 milliseconds! A very huge improvement!

Wow that's more than I expected. Fine.

----------

After these new indexes, the status for me is as follows:

includes/index.inc.php, line 69: about 600 milliseconds

includes/main.inc.php, line 339: about 600 milliseconds

includes/main.inc.php, line 336: about 120 milliseconds

includes/index.inc.php, line 187: about 22 milliseconds
(this was the query that before took over 1 000 milliseconds)

We will take a closer look to this queries and give some suggestions for indexing.

regards
Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

Avatar

Profile My Little Forum - some suggestions

by Micha ⌂, Wednesday, March 22, 2017, 20:50 (2563 days ago) @ danielb987

Hi Daniel,

thank you for supporting us! I received your mail. I will answer you as soon as I have a reply from Auge.

After these new indexes, the status for me is as follows:

includes/index.inc.php, line 69: about 600 milliseconds

Okay, this query is related to the mlf2_entries


mlf2_entries.pid is already indexed --> nothing to optimize.
mlf2_entries.spam is already indexed --> nothing to optimize.
mlf2_entries.sticky is already indexed --> nothing to optimize.

mlf2_entries.time is not indexed:

ALTER TABLE `mlf2_entries` ADD INDEX(`time`);


mlf2_entries.last_reply is not indexed:

ALTER TABLE `mlf2_entries` ADD INDEX(`last_reply`);

includes/main.inc.php, line 339: about 600 milliseconds
includes/main.inc.php, line 336: about 120 milliseconds

Are you sure? mlf2_entries.pid, mlf2_entries.category as well as mlf2_entries.spam are indexed.

includes/index.inc.php, line 187: about 22 milliseconds
(this was the query that before took over 1 000 milliseconds)

Okay, I think, this one is optimized. ;-)

Kind regards,
Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

Profile My Little Forum - some suggestions

by danielb987, Wednesday, March 22, 2017, 21:51 (2563 days ago) @ Micha

Hi Micha,

mlf2_entries.time is not indexed:

ALTER TABLE `mlf2_entries` ADD INDEX(`time`);


mlf2_entries.last_reply is not indexed:

ALTER TABLE `mlf2_entries` ADD INDEX(`last_reply`);

The computer that I use for these tests is busy with another task at the moment but I will test it tomorrow.

includes/main.inc.php, line 339: about 600 milliseconds
includes/main.inc.php, line 336: about 120 milliseconds


Are you sure? mlf2_entries.pid, mlf2_entries.category as well as mlf2_entries.spam are indexed.

Yes. Both the queries at line 336 and 339 do a COUNT(*).

This page says that COUNT(*) is slow on InnoDB tables:
http://stackoverflow.com/questions/1332624/speeding-up-row-counting-in-mysql

Two quotes from that page:
InnoDB can be very slow to perform count(*) type queries, cause it is designed to allow for multiple concurrent views of the same data. So at any point in time, its not enough to go to the index to get the count.

InnoDB does not keep an internal count of rows in a table. (In practice, this would be somewhat complicated due to multi-versioning.) To process a SELECT COUNT(*) FROM t statement, InnoDB must scan an index of the table, which takes some time if the index is not entirely in the buffer pool.

----

The difference between the query at row 336 and the query at row 339 is that the query at row 336 returns COUNT = 28 000 and the query at line 339 returns COUNT = 192 000. This explains why the query at line 336 is five times faster.

Best regards,
Daniel

Profile My Little Forum - some suggestions

by danielb987, Tuesday, March 21, 2017, 20:36 (2564 days ago) @ Micha
edited by Micha, Wednesday, March 22, 2017, 08:17

That's my suggestion. I cannot check the benefit, because I don't have a forum that has so many threads/postings.

Micha: Could you please send me an email? I have address <REMOVED BY MILO TO AVOID SPAM>

Best regards,
Daniel

Avatar

Profile My Little Forum - some suggestions

by Micha ⌂, Wednesday, March 22, 2017, 08:20 (2563 days ago) @ danielb987

Hi,

it is good to hear that the indexing of the columns improve the performance.

I removed your mail-address in your posting to avoid spam. ;-)

/Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

Profile My Little Forum - some suggestions

by danielb987, Wednesday, March 22, 2017, 08:32 (2563 days ago) @ Micha

I removed your mail-address in your posting to avoid spam. ;-)

Thanks!

/ Daniel

Avatar

Profile My Little Forum - some suggestions

by Auge ⌂, Tuesday, March 21, 2017, 21:08 (2564 days ago) @ Micha

Hello

That's my suggestion.

After reading I think, your suggestion is reasonable. I altered the table structure of my testing installation.

I cannot check the benefit, because I don't have a forum that has so many threads/postings.

"Not so many posts." pfft.

Your forums with a few hundred threads and a few thousand entries is so much more, than I can offer (13 entries in 7 threads). ;-)

A further field, which is often used in WHERE-clauses is the field user_type from the table mlf_userdata. Maybe the column user_name could be indexed too (user lists, search in user area …).

ALTER TABLE `mlf2_userdata` ADD INDEX(`user_type`);
ALTER TABLE `mlf2_userdata` ADD INDEX(`user_name`);

Any comments?

Tschö, Auge

--
Trenne niemals Müll, denn er hat nur eine Silbe!

Avatar

Profile My Little Forum - some suggestions

by Micha ⌂, Wednesday, March 22, 2017, 08:22 (2563 days ago) @ Auge

Hi,

Any comments?

sounds logical. +1

/Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

Profile My Little Forum - some suggestions

by danielb987, Wednesday, March 22, 2017, 08:35 (2563 days ago) @ Auge

ALTER TABLE `mlf2_userdata` ADD INDEX(`user_type`);
ALTER TABLE `mlf2_userdata` ADD INDEX(`user_name`);

Should I test this? If so, how? Which operations on the forum, like viewing a thread or going to the main page, would use these indexes?

/ Daniel

Avatar

Profile My Little Forum - some suggestions

by Auge ⌂, Wednesday, March 22, 2017, 09:17 (2563 days ago) @ danielb987
edited by Auge, Friday, March 31, 2017, 14:34

Hello

ALTER TABLE `mlf2_userdata` ADD INDEX(`user_type`);
ALTER TABLE `mlf2_userdata` ADD INDEX(`user_name`);


Should I test this?

It would be nice to test this.

Which operations on the forum, like viewing a thread or going to the main page, would use these indexes?

As I wrote:

A further field, which is often used in WHERE-clauses is the field user_type from the table mlf_userdata. Maybe the column user_name could be indexed too (user lists, search in user area …).

- The check for access to restricted categories depends on the user type.
- The user list can be ordered by the user type and by the user name.
- The search for a user name depends (how surprisingly ;-)) on the user name.

With exception of the first example these are more or less rare cases. But these colums are unique (user_name) respectively a quasi index (user_type) that is used here and there in WHERE-clauses or JOINs.

Tschö, Auge

--
Trenne niemals Müll, denn er hat nur eine Silbe!

Avatar

Profile My Little Forum - some suggestions

by Auge ⌂, Tuesday, March 21, 2017, 22:21 (2564 days ago) @ Micha

Hello

I begun the work on the queries for the installation and update scripts. Please do not merge this until the last commits are added (not yet done).

Tschö, Auge

--
Trenne niemals Müll, denn er hat nur eine Silbe!

Avatar

Profile IN() statement in queries

by Micha ⌂, Saturday, March 25, 2017, 11:17 (2560 days ago) @ danielb987
edited by Auge, Monday, March 27, 2017, 06:50

Hi,

the profiling tool works fine but I have to modify the profile_mysqli.php to use it on a windows system (WAMPP).

$file = str_replace($forum_folder, "", $file);
$file = str_replace("\\", "/", $file);

Okay, I take a look to that queries and it seems, that all (or most of them) queries contains an IN() statement.

category IN (0, 4, 5, 6, 7, 8, 9, 25, 14, 23, 19, 22, 24, 10, 21, 11, 12, 13, 15, 16, 26, 18, 20, 27)
 

I made some enquiries how to improve the IN() statement. Someones said, the IN() query is faster than an OR but slower than an INNER JOIN statement.

A query like:

SELECT id, tid FROM mlf2_entries 
WHERE 
pid = 0 AND 
spam = 0 AND 
category IN (0, 4, 5, 6, 7, 8, 9, 25, 14, 23, 19, 22, 24, 10, 21, 11, 12, 13, 15, 16, 26, 18, 20, 27) 
ORDER BY sticky DESC, TIME DESC LIMIT 0, 40

can rewritten to

SELECT `id`, `tid` FROM `mlf2_entries` 
INNER JOIN (
SELECT 0 AS `cat_id` 
UNION ALL SELECT 4
UNION ALL SELECT 5 
UNION ALL SELECT 6
UNION ALL SELECT 7
UNION ALL SELECT 8
UNION ALL SELECT 9 
UNION ALL SELECT 25
UNION ALL SELECT 14
UNION ALL SELECT 23 
UNION ALL SELECT 19 
UNION ALL SELECT 22
UNION ALL SELECT 24 
UNION ALL SELECT 10 
UNION ALL SELECT 21
UNION ALL SELECT 11 
UNION ALL SELECT 12
UNION ALL SELECT 13
UNION ALL SELECT 15
UNION ALL SELECT 16
UNION ALL SELECT 26
UNION ALL SELECT 18
UNION ALL SELECT 20
UNION ALL SELECT 27 ) AS `tmp_table` 
ON `category` = `cat_id` 
WHERE 
`pid` = 0 AND 
`spam` = 0 
ORDER BY sticky DESC, TIME DESC LIMIT 0, 40

and corresponding PHP-code index.inc.php line ~69ff

 $sqlQuery = "SELECT `id`, `tid` FROM `".$db_settings['forum_table']."` INNER JOIN (SELECT ".$category_ids[0]." AS `cat_id`";
 for ($c=1; $c<count($category_ids); $c++) {
  $sqlQuery .= " UNION ALL SELECT ".$category_ids[$c];
 }
 $sqlQuery .= ") AS `tmp_table` ON `category` = `cat_id` WHERE `pid` = 0 AND `spam` = 0 ORDER BY sticky DESC, time DESC LIMIT 0, 40";
 $result=profile_mysqli_query($connid, $sqlQuery);
 $result=profile_mysqli_query($connid, "SELECT id, tid FROM ".$db_settings['forum_table']." WHERE pid = 0".$display_spam_query_and." AND category IN (".$category_ids_query.") ORDER BY sticky DESC, ".$db_thread_order." ".$descasc." LIMIT ".$ul.", ".$settings['threads_per_page']) or raise_error('database_error',mysqli_error($connid));

I test it and it seems(!) that the INNER JOIN is a little bit faster that the query that uses an IN() condition. Can someone confirm this behaviour?


Furthermore, I have take a look to the get_category_ids($categories) function. I think, we can improve it by

/**
 * returns all available catgory ids
 *
 * @return array
 */
function get_category_ids($categories) {
 if(!$categories)
  return false;
 
 return array_keys($categories);
}


We can also improve the get_categories() function for forums without restricted categories because, if the IN() statement contains ALL available categories, we don't need to restrict the query. One suggestion:

/**
 * returns all available categories
 *
 * @return array
 */
function get_categories($compact = false) {
 global $settings, $connid, $db_settings;
 $count_result = profile_mysqli_query($connid, "SELECT COUNT(*) FROM ".$db_settings['category_table']);
 list($category_count) = mysqli_fetch_row($count_result);
 mysqli_free_result($count_result);
 
 if($category_count) {
  if (empty($_SESSION[$settings['session_prefix'].'user_id']))
   $result = profile_mysqli_query($connid, "SELECT id, category FROM ".$db_settings['category_table']." WHERE accession = 0 ORDER BY order_id ASC");
  elseif (isset($_SESSION[$settings['session_prefix'].'user_id']) && isset($_SESSION[$settings['session_prefix'].'user_type']) && $_SESSION[$settings['session_prefix'].'user_type'] == 0)
   $result = profile_mysqli_query($connid, "SELECT id, category FROM ".$db_settings['category_table']." WHERE accession IN (0, 1) ORDER BY order_id ASC");
  elseif (isset($_SESSION[$settings['session_prefix'].'user_id']) && isset($_SESSION[$settings['session_prefix'].'user_type']) && ($_SESSION[$settings['session_prefix'].'user_type'] == 1 || $_SESSION[$settings['session_prefix'].'user_type'] == 2))
   $result = profile_mysqli_query($connid, "SELECT id, category FROM ".$db_settings['category_table']." WHERE accession IN (0, 1, 2) ORDER BY order_id ASC");
 
  if(!$result) 
   raise_error('database_error',mysqli_error($connid));
 
  if ($compact && mysqli_num_rows($result) == $category_count)
   return false;
 
  $categories[0]='';
  while ($line = mysqli_fetch_array($result)) {
   $categories[$line['id']] = htmlspecialchars($line['category']);
  }
  mysqli_free_result($result);
  return $categories;
 }
 return false;
}

If $compact == true and all categorises are selected, the function returns false. In index.inc.php line ~56 this value is evaluated. I only found a single call to this function in main.inc.php line 121 and set it to $categories = get_categories(true);. The improvement was significant.

Duo to the fact that IN() is faster than OR, I replaced the OR statements by an equivalent IN, e.g. accession IN (0, 1, 2). The columns order_id and accession should be indexed too.


regards
Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

Profile IN() statement in queries

by danielb987, Sunday, March 26, 2017, 06:09 (2559 days ago) @ Micha

the profiling tool works fine but I have to modify the profile_mysqli.php to use it on a windows system (WAMPP).

Good to hear. The script was quick and dirty so it's not suprising that you had to change it for Windows.

I test it and it seems(!) that the INNER JOIN is a little bit faster that the query that uses an IN() condition. Can someone confirm this behaviour?

I'm not a database expert so I don't know.

Regards,
Daniel

Avatar

Profile IN() statement in queries

by Micha ⌂, Sunday, March 26, 2017, 09:54 (2559 days ago) @ danielb987

Hi,

Good to hear.

It's very helpful.

I'm not a database expert so I don't know.

I change the index.inc.php near line 74. The original code-line reads:

$result=profile_mysqli_query($connid, "SELECT id, tid FROM ".$db_settings['forum_table']." WHERE pid = 0".$display_spam_query_and." AND category IN (".$category_ids_query.") ORDER BY sticky DESC, ".$db_thread_order." ".$descasc." LIMIT ".$ul.", ".$settings['threads_per_page']) or raise_error('database_error',mysqli_error($connid));

I add in front of this code the following code (including the line above):

 $sqlQuery = "SELECT `id`, `tid` FROM `".$db_settings['forum_table']."` INNER JOIN (SELECT ".$category_ids[0]." AS `cat_id`";
 for ($c=1; $c<count($category_ids); $c++) {
  $sqlQuery .= " UNION ALL SELECT ".$category_ids[$c];
 }
 $sqlQuery .= ") AS `tmp_table` ON `category` = `cat_id` WHERE `pid` = 0 AND `spam` = 0 ORDER BY sticky DESC, time DESC LIMIT 0, 40";
 $result=profile_mysqli_query($connid, $sqlQuery);
 // Original-Code to compare both ones
 $result=profile_mysqli_query($connid, "SELECT id, tid FROM ".$db_settings['forum_table']." WHERE pid = 0".$display_spam_query_and." AND category IN (".$category_ids_query.") ORDER BY sticky DESC, ".$db_thread_order." ".$descasc." LIMIT ".$ul.", ".$settings['threads_per_page']) or raise_error('database_error',mysqli_error($connid));

In your profiling script/tool, both queries are now listed, and the first one seems to be a litte bit faster.

regards
Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

Avatar

Profile My Little Forum

by Micha ⌂, Monday, April 03, 2017, 18:04 (2551 days ago) @ danielb987

Hi,

this forum (http://mylittleforum.net/forum/) runs fast (today). Was something changed?

/Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

Avatar

Profile My Little Forum

by Auge ⌂, Tuesday, April 04, 2017, 08:27 (2550 days ago) @ Micha

Hi,

Back from the campaign?

this forum (http://mylittleforum.net/forum/) runs fast (today). Was something changed?

The HTML-source states, that this forum is still on version 2.3.5 (2017-04-04 10:26 CEST). So it has (or had) "only" a good time.

Tschö, Auge

--
Trenne niemals Müll, denn er hat nur eine Silbe!

Profile My Little Forum

by danielb987, Tuesday, April 04, 2017, 10:13 (2550 days ago) @ Auge

The HTML-source states, that this forum is still on version 2.3.5 (2017-04-04 10:26 CEST). So it has (or had) "only" a good time.

I have noticed that several forums that is using My Little Forum has very old versions of the software. With very old, I mean 5 - 10 years since last update of the software.

A simple solution would be to have a text file with the version number, like "2.4.2" at the web server:
http://mylittleforum.net/version.txt

And when a forum administrator log in to the forum, the forum checks the compares the version of the forum with the file http://mylittleforum.net/version.txt and displays that for the administrator.

It's not very often an administrator needs to go to the admin page of the forum, and then they do that, the administrators need to manually check the current version of the forum. This obviously doesn't work very well then not even My Little Forums own forum is up to date.

I understand that it requires more administration of the developers to keep this file up to date, but it would be very useful for administrators.

I did a google search on "powered by my little forum" with the quotes and found several forums with old versions, where many was older than 2.3.5 and therefore not compatible with PHP 7.

Best regards,
Daniel

Avatar

Profile My Little Forum

by Auge ⌂, Tuesday, April 04, 2017, 11:25 (2550 days ago) @ danielb987

Hello

I have noticed that several forums that is using My Little Forum has very old versions of the software. With very old, I mean 5 - 10 years since last update of the software.

I also found a bunch of active forums with very old versions, down to beta versions of 2.0.

A simple solution would be to have a text file with the version number, like "2.4.2" at the web server:
http://mylittleforum.net/version.txt

What's the benefit of it? One has to request this file to check it's content. The forum owners of the outdated installations never (or at least seldom) checked the project page for updates since the installation of their forum. I don't expect them to do this in the future with a new file/URL.

And when a forum administrator log in to the forum, the forum checks the compares the version of the forum with the file http://mylittleforum.net/version.txt and displays that for the administrator.

We have such a check since version 2.4. An admin of a more or less up-to-date version will be informed about the new version or, if that failes, has a direct link for the check in the menu page of the admin area. But, to be honest, wether our nor your solution reaches the admins of outdated versions.

They will at any time hit the trap of an incompatible PHP version. Either they will actualise their installation, in this case they have then a version with actualisation notices, or they change the forum software to another product or they disable their forums at this point.

I reflected about contacting all admins of forums I'm not convinced if it is a good idea to urge them.

It's not very often an administrator needs to go to the admin page of the forum, and then they do that, the administrators need to manually check the current version of the forum.

The check for the latest version is part of the "daily actions" that in most cases take action during the night (if the forum is visited at this time). Only if the three methods (CURL library, file_get_contents(), fsockopen()) failes, the admin has to execute the check manually.

This obviously doesn't work very well then not even My Little Forums own forum is up to date.

This case is different from the others. Alex is the original author of the software and the domain owner of mylittleforum.net. He is the only one, who is able to upgrade the forum software on this domain. And at the moment he is not reachable (for whatever reason). Even when Milo and me force the development at the moment, we have no administrative access to the project domain.

That said …

I understand that it requires more administration of the developers to keep this file up to date, but it would be very useful for administrators.

… I have to confirm that it is useful without any doubt. But (IMHO) on the other hand the idea of the file on the webspace is useless in this form because of the restrictions for assess, the additional administration effort and it is not necessary in the future because of the solution we have for versions from 2.4.0 on.

Tschö, Auge

--
Trenne niemals Müll, denn er hat nur eine Silbe!

Profile My Little Forum

by danielb987, Tuesday, April 04, 2017, 14:52 (2550 days ago) @ Auge

And when a forum administrator log in to the forum, the forum checks the compares the version of the forum with the file http://mylittleforum.net/version.txt and displays that for the administrator.


We have such a check since version 2.4. An admin of an actual version will be informed about the new version or, if that failes, has a direct link for the check in the menu page of the admin area. But, to be honest, wether our nor your solution reaches the admins of outdated versions.

I was not aware of that. When I updated my forum, I couldn't update it so I had to install a new forum and then import the data. In my forum, the table "mlf2_temp_infos" is empty and therefore the admin menu page shows the text "Actual available release: Check the releases list for the actual available version". This gave me the impression that the forum doesn't check the latest version of MLF version on the server.

I did a test and found that when the next forum comes out, the table "mlf2_temp_infos" will be updated and then I will see the latest version in the admin page.

The function "checkUpdate()" in the file includes/functions.inc.php at line 2670 checks if the latest version is newer than the installed version and if not, returns false. I think this test is unnecessary since it is only run once a day and until a new version is released, the forum administrators get the impression that they have to check version manually.

I suggest that you always update the table "mlf2_temp_infos" with the latest version. A quick SQL query run once per day doesn't slow down the web site.

A simple solution would be to have a text file with the version number, like "2.4.2" at the web server:
http://mylittleforum.net/version.txt


What's the benefit of it? One has to request this file to check it's content. The forum owners of the outdated installations never (or at least seldom) checked the project page for updates since the installation of their forum. I don't expect them to do this in the future with a new file/URL.

I didn't know that you already had implemented that functionality. See above.

This obviously doesn't work very well then not even My Little Forums own forum is up to date.


This case is different from the others. Alex is the original author of the software and the domain owner of mylittleforum.net. He is the only one, who can actualise the forum software on this domain. And at the moment he is not reachable (for whatever reason). Even when Milo and me force the development at the moment, we have no administrative access to the project domain.

I'm sorry, I didn't know that. I apologize.

Best regards,
Daniel

Avatar

Profile My Little Forum

by Auge ⌂, Wednesday, April 05, 2017, 06:45 (2549 days ago) @ danielb987

Hello

We have … a check [for the available version] since version 2.4.


I was not aware of that. … In my forum, the table "mlf2_temp_infos" is empty and therefore the admin menu page shows the text "Actual available release: Check the releases list for the actual available version".

Ah, your table is empty. Mine is also empty, but I know why it's empty in my forum. It's because my hoster blocks every attempt to contact external servers with a PHP script. Thatswhy I wrote about the possibility of "the failing of 'the three methods'".

It would be interesting, why the check fails in your forum.

This gave me the impression that the forum doesn't check the latest version of MLF version on the server.

That's reasonable.

I did a test and found that when the next forum comes out, the table "mlf2_temp_infos" will be updated and then I will see the latest version in the admin page.

Oh, yes you are right. The check inside the function checkUpate writes only to the database, when the data of the installed and as new reported versions are different. That's not very plausible for the user.

I suggest that you always update the table "mlf2_temp_infos" with the latest version. A quick SQL query run once per day doesn't slow down the web site.

Yes. Better would be (IMHO) not only to compare the installed with the reported version numbers, but also to compare the stored, yesterday reported and the newly reported number and only white to the database, when …

- … these two numbers are different or
- … there is no reported value or
- … if the newly reported number is different from the number of the installed version.

Besides from that, the query should and does (?) run once per day. Please report the value of the setting next_daily_actions and the time, when you read it from the enhanced settings (date, time, timezone).

This obviously doesn't work very well then not even My Little Forums own forum is up to date.


Alex … is the only one, who can actualise the forum software on this domain. And at the moment he is not reachable …


I'm sorry, I didn't know that. I apologize.

There is no reason to apologise at all.

Tschö, Auge

--
Trenne niemals Müll, denn er hat nur eine Silbe!

Profile My Little Forum

by danielb987, Wednesday, April 05, 2017, 23:33 (2549 days ago) @ Auge

I suggest that you always update the table "mlf2_temp_infos" with the latest version. A quick SQL query run once per day doesn't slow down the web site.


Yes. Better would be (IMHO) not only to compare the installed with the reported version numbers, but also to compare the stored, yesterday reported and the newly reported number and only white to the database, when …

- … these two numbers are different or
- … there is no reported value or
- … if the newly reported number is different from the number of the installed version.

I agree.

Besides from that, the query should and does (?) run once per day. Please report the value of the setting next_daily_actions and the time, when you read it from the enhanced settings (date, time, timezone).

It does. I manually edited the database and set the field "next_daily_actions" in the table "mlf2_settings" to zero and forced a run of "checkUpdate()".

Best regards
Daniel

Avatar

Profile My Little Forum

by Micha ⌂, Tuesday, April 04, 2017, 20:36 (2550 days ago) @ Auge

Hi,

Back from the campaign?

Yes. ;-)

/Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

RSS Feed of thread