We have a table of about 5M records and are currently having performance issues due to poor indexing. One of the issues is a report that uses the following query:

select o.name,
       s.name,
       eb.uuid,
       eb.series,
       eb.number,
       eb.link          as LINK,
       eb.latest_status AS ESTADO,
       eb.cancel_ticket AS TICKET,
       eb.created_at,
       eb.updated_at
from electronic_bills eb 
         left join stores s on eb.venue_uuid = s.uuid
         left join organizations o on eb.organization_uuid = o.uuid
where eb.created_at >= @DATE_CHECK
  AND eb.latest_status = 'canceled'
  AND eb.cancel_ticket IS NULL
  AND eb.deleted_at IS NULL
  AND s.uses_system = 1
order by eb.updated_at desc;

It is a simple report to check whether tickets have been canceled on the provider side.

Unless we specify FORCE INDEX (idx_electronic_bills_created_status_cancel_deleted), that query uses the index idx_electronic_latest_status_cancel_ticket.
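For reference, the forced variant looks roughly like the sketch below. The select list is shortened for readability, and the @DATE_CHECK value is an assumption taken from the range shown in the forced plan's output; the hint goes right after the table alias.

```sql
-- Assumed value, copied from the range in the forced plan's EXPLAIN output
SET @DATE_CHECK = '2025-02-01 05:00:00';

EXPLAIN ANALYZE
select o.name,
       s.name,
       eb.uuid,
       eb.created_at,
       eb.updated_at
from electronic_bills eb
         -- index hint: placed after the table name and alias
         FORCE INDEX (idx_electronic_bills_created_status_cancel_deleted)
         left join stores s on eb.venue_uuid = s.uuid
         left join organizations o on eb.organization_uuid = o.uuid
where eb.created_at >= @DATE_CHECK
  AND eb.latest_status = 'canceled'
  AND eb.cancel_ticket IS NULL
  AND eb.deleted_at IS NULL
  AND s.uses_system = 1
order by eb.updated_at desc;
```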

Definitions as follows:

  KEY `idx_electronic_latest_status_cancel_ticket` (`latest_status`,`cancel_ticket`),
  KEY `idx_electronic_bills_created_status_cancel_deleted` (`created_at`,`latest_status`,`cancel_ticket`,`deleted_at`)

An EXPLAIN ANALYZE for each gives the following output (default index choice first, then the forced index):

-> Filter: ((eb.created_at >= <cache>((@DATE_CHECK))) and (eb.deleted_at is null))  (cost=49911.55 rows=1218) (actual time=44.558..481.221 rows=97 loops=1)
    -> Index lookup on eb using idx_electronic_latest_status_cancel_ticket (latest_status='canceled', cancel_ticket=NULL), with index condition: (eb.cancel_ticket is null)  (cost=49911.55 rows=90168) (actual time=0.202..447.274 rows=44847 loops=1)



-> Index range scan on eb using idx_electronic_bills_created_status_cancel_deleted over ('2025-02-01 05:00:00' <= created_at AND 'canceled' <= latest_status AND NULL <= cancel_ticket AND NULL <= deleted_at), with index condition: ((eb.created_at >= <cache>((@DATE_CHECK))) and (eb.latest_status = 'canceled') and (eb.cancel_ticket is null) and (eb.deleted_at is null))  (cost=1.32 rows=1) (actual time=27.784..268.523 rows=97 loops=1)

So, I'm trying to understand how the optimizer works in this case and why it is picking a less efficient index.

I would also appreciate any references that would help me understand this better.

Thanks!

We know we can force the index using FORCE INDEX, but that makes us wonder whether we are doing anything wrong.
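In case it helps anyone answering, here is a sketch of how the optimizer's reasoning could be captured in a MySQL 8.0 session. These are standard MySQL statements, but their output is not included here, and the shortened query is only a stand-in for the full report query.

```sql
-- Per-index cardinality estimates the optimizer works from
SHOW INDEX FROM electronic_bills;

-- Record the optimizer's decisions for the next statement in this session
SET optimizer_trace = 'enabled=on';

-- Stand-in for the report query (same filters on eb, no hint)
select eb.uuid
from electronic_bills eb
where eb.created_at >= @DATE_CHECK
  AND eb.latest_status = 'canceled'
  AND eb.cancel_ticket IS NULL
  AND eb.deleted_at IS NULL;

-- The trace is a JSON document listing the candidate indexes and their costs
SELECT trace FROM information_schema.OPTIMIZER_TRACE;

SET optimizer_trace = 'enabled=off';
```

If the estimates from SHOW INDEX look stale, running ANALYZE TABLE electronic_bills refreshes the statistics, although we cannot say whether that alone would change the chosen plan.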

  • How MySQL chooses indexes can be inscrutable. It's often dependent on the actual contents of the table. Commented Mar 7 at 18:48
  • What would be a correct approach in this case? I'm looking for some guidance on this topic too. Commented Mar 8 at 5:33
