
I need help optimizing a query on a large table. When I test it manually it runs fast, but in my slow query log it is logged as taking 10+ seconds for some reason.

SELECT  q.id, q.village_id, q.to_player_id, q.to_village_id,
        q.proc_type, TIMESTAMPDIFF(SECOND, NOW(), q.end_date) AS remainingTimeInSeconds
    FROM p_queue q
   WHERE TIMESTAMPDIFF(SECOND, NOW(), (q.end_date - INTERVAL (q.execution_time * (q.threads - 1)) SECOND)) <= 0
ORDER BY q.end_date ASC

I expect it to output the rows whose time has ended, meaning the time left must be 0 or less, in ASC order.

It orders by the end time itself, because when we have many attacks we must arrange them according to which attack is supposed to arrive first, then process them one by one.

  • I've removed the PHP and PDO tags because there's no PHP code referred to in the question. If you think PHP is relevant to the question, please include the relevant code. Commented Oct 17, 2023 at 0:56
  • In the SELECT clause, TIMESTAMPDIFF(SECOND, NOW(), q.end_date), which calculates the remaining seconds, is straightforward and easy to understand. However, the search condition TIMESTAMPDIFF(SECOND, NOW(), (q.end_date - INTERVAL (q.execution_time*(q.threads-1)) SECOND)) <= 0 is quite confusing without elaboration for those of us who are not sport enthusiasts. And when you say you expect to output the results whose time has ended, meaning the time left must be 0 or less, couldn't it be done in a simpler manner than the one in the WHERE clause? Commented Oct 17, 2023 at 2:02
  • And if you can calculate the remaining seconds using the simple TIMESTAMPDIFF(SECOND, NOW(), q.end_date), why can't you use q.end_date <= NOW() as your search condition? Without clarification, it's all guesswork. Please elaborate. Commented Oct 17, 2023 at 2:14
  • Thanks for the reply, I have edited the post with an explanation. Commented Oct 17, 2023 at 2:53
  • <calculation involving one or more columns> <= 0 has to be executed for every single row and can't be short-cut with an index. That means the whole table is loaded into memory every single time, whereas <column> <= <calculation> could use a range seek to eliminate the irrelevant rows. In your case the calculation will always use a column from the table, and so will always scan the whole table. You should create a calculated column with the timestamp at which the row 'expires' (q.end_date - INTERVAL (q.execution_time*(q.threads-1)) SECOND), index it, then use WHERE x <= NOW(). Commented Oct 18, 2023 at 0:47
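
To make the comment above concrete, here is a minimal sketch comparing the two kinds of predicate with EXPLAIN. It assumes the p_queue table from the answers below and an index on end_date; the exact plan depends on your data and MySQL version:

-- Non-sargable: the expression wraps the columns, so the optimizer cannot use an index range
EXPLAIN
SELECT q.id
  FROM p_queue q
 WHERE TIMESTAMPDIFF(SECOND, NOW(), (q.end_date - INTERVAL (q.execution_time * (q.threads - 1)) SECOND)) <= 0;
-- typically shows type: ALL (full table scan)

-- Sargable: the bare column is compared to a value known at execution time
EXPLAIN
SELECT q.id
  FROM p_queue q
 WHERE q.end_date <= NOW();
-- with INDEX(end_date) in place, this can show type: range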

2 Answers


You need to make the search criteria sargable.

That is, do a calculation on end_date, execution_time and threads, store the result in the table, and index it.

Then the optimiser can quickly determine which rows are relevant, without scanning the whole table every time.

The following automatically generates an expiry column and keeps it up to date if the row is modified.

Then that column is indexed and used in a much simpler WHERE clause.

ALTER TABLE
  p_queue
ADD COLUMN
  expiry TIMESTAMP GENERATED ALWAYS
    AS (end_date - INTERVAL (execution_time * (threads - 1)) SECOND)
  STORED
;

ALTER TABLE
  p_queue
ADD INDEX
  (expiry)
;

SELECT
  q.id, q.player_id, q.village_id, q.to_player_id, q.to_village_id,
  q.proc_type, q.building_id, q.proc_params, q.threads,
  q.execution_time, TIMESTAMPDIFF(SECOND, NOW(), q.end_date) AS remainingTimeInSeconds
FROM
  p_queue q
WHERE
  q.expiry <= NOW()
ORDER BY
  q.expiry ASC
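
A possible alternative, if you are on MySQL 8.0.13 or later, is a functional index over the identical expression instead of a stored generated column. This is only a sketch against the assumed p_queue schema (the index name idx_expiry is illustrative), and the optimizer will only consider it when the WHERE clause repeats the indexed expression exactly:

-- Functional index (MySQL 8.0.13+); the extra parentheses around the expression are required
ALTER TABLE p_queue
  ADD INDEX idx_expiry ((end_date - INTERVAL (execution_time * (threads - 1)) SECOND));

-- The filter must use the exact same expression for the index to be usable
SELECT q.id, TIMESTAMPDIFF(SECOND, NOW(), q.end_date) AS remainingTimeInSeconds
  FROM p_queue q
 WHERE (q.end_date - INTERVAL (q.execution_time * (q.threads - 1)) SECOND) <= NOW();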

4 Comments

If there is no way to optimize the original search condition to be sargable, then I suppose a generated column with an index is the best approach. Thank you.
@blabla_bingo Is there a known maximum execution time and number of threads? If so, you can do WHERE <your calculation> AND end_date <= NOW() + INTERVAL (<MAX_EXECUTION_TIME> * (<MAX_THREADS> - 1)) SECOND. The latter half is sargable and could use an index on end_date to reduce the search space. Depending on your data and the max values, that may eliminate enough rows to make the non-sargable search "less painful".
Thank you MatBailie, I will test it and tell you.
Thank you very much, it works and it is a lot faster... but with one big issue: the ordering is different from the original query. It shows as a datetime, whereas the original output was in seconds (I think that is called a UNIX timestamp).
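
The ordering concern in the last comment can be addressed without giving up the index: keep the sargable filter on the generated expiry column and sort on end_date as the original query did. This is a sketch under the same assumed schema, not something confirmed in the thread; the filter reduces the result to the expired rows, so sorting those few rows by end_date is cheap:

SELECT
  q.id, q.player_id, q.village_id, q.to_player_id, q.to_village_id,
  q.proc_type, q.building_id, q.proc_params, q.threads,
  q.execution_time,
  TIMESTAMPDIFF(SECOND, NOW(), q.end_date) AS remainingTimeInSeconds  -- still reported in seconds
FROM
  p_queue q
WHERE
  q.expiry <= NOW()        -- sargable filter on the indexed generated column
ORDER BY
  q.end_date ASC           -- original ordering on the raw end time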

TIMESTAMPDIFF(SECOND, NOW(), q.end_date) <= 0 can be optimized by changing it to q.end_date <= NOW(), but the longer expression cannot be optimized.

See sargable (and see @MatBailie's comment above).

The table would need INDEX(end_date).

For the more complex test, this might help: INDEX(end_date, execution_time, threads)
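
A sketch of how such an index could pay off, following @MatBailie's earlier comment about bounding the range: if there is a known upper limit on execution_time and threads, a sargable predicate on end_date can pre-filter the rows before the exact (non-sargable) expression is evaluated. The limits used below (600 seconds and 4 threads) are purely illustrative, not values from the question:

-- Illustrative upper bounds only: execution_time <= 600 seconds, threads <= 4
SELECT q.id
  FROM p_queue q
 WHERE q.end_date <= NOW() + INTERVAL (600 * (4 - 1)) SECOND   -- sargable: can use INDEX(end_date, ...)
   AND TIMESTAMPDIFF(SECOND, NOW(),
         (q.end_date - INTERVAL (q.execution_time * (q.threads - 1)) SECOND)) <= 0  -- exact check on the survivors
 ORDER BY q.end_date ASC;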

Comments
