SQLite queryslow when using index

Question

I have a table indexed on a text column, and I want all my queries to return results ordered by name without any performance hit. Table has around 1 million rows if it matters.

Table -

CREATE TABLE table (Name text)

Index -

CREATE INDEX "NameIndex" ON "Files" (
    "Name" COLLATE nocase   ASC
);

Query 1 -

select * from table where Name like "%a%"

Query plan, as expected a full scan -

SCAN TABLE table

Time -

Result: 179202 rows returned in 53ms

Query 2, now using order by to read from index -

select * from table where Name like "%a%" order by Name collate nocase

Query plan, scan using index -

SCAN TABLE table USING INDEX NameIndex

Time -

Result: 179202 rows returned in 672ms

Used DB Browser for SQLite to get the information above, with default Pragmas.

I'd assume scanning the index would be as performant as scanning the table, is it not the case or am I doing something wrong?

Another interesting thing I noticed, that may be relevant -

Query 3 -

select * from table where Name like "a%"

Result: 23026 rows returned in 9ms

Query 4 -

select * from table where name like "a%" order by name collate nocase

Result: 23026 rows returned in 101ms

And both has them same query plan -

SEARCH TABLE table USING INDEX NameIndex (Name>? AND Name<?)

Is this expected? I'd assume the performance be the same if the plan was the same.

Thanks!

EDIT - The reason the query is slower was because I used select * and not select name, causing SQLite to go between the table and the index.

The solution was to use clustered index, thanks @Tomalak for helping me find it - create table mytable (a text, b text, primary key (a,b)) without rowid The table will be ordered by default using a + b combination, meaning that full scan queries will be much faster (now 90ms).

Tomalak · Accepted Answer · 2022-01-21 20:01:51Z

2

A LIKE pattern that starts with % can never use an index. It will always result in a full table scan (or index scan, if the query can be covered by the index itself).

It's logical when you think about it. Indexes are not magic. They are sorted lists of values, exactly like a keyword index in a book, and that means they are only only quick for looking up a word if you know how the given word starts. If you're searching for the middle part of a word, you would have to look at every index entry in a book as well.

Conclusion from the ensuing discussion in the comments:

The best course of action to get a table that always sorts by a non-unique column without a performance penalty is to create it without ROWID, and turn it into a clustering index over a the column in question plus a second column that makes the combination unique:

CREATE TABLE MyTable (
    Name   TEXT COLLATE NOCASE,
    Id     INTEGER,
    Other  TEXT,
    Stuff  INTEGER,
    PRIMARY KEY(Name, Id)  -- this will sort the whole table by Name
) WITHOUT ROWID;

This will result in a performance penalty for INSERT/UPDATE/DELETE operations, but in exchange sorting will be free since the table is already ordered.

edited Jan 21, 2022 at 20:01

answered Jan 21, 2022 at 18:15

Tomalak

340k68 gold badges547 silver badges635 bronze badges

Sign up to request clarification or add additional context in comments.

11 Comments

Tomalak Over a year ago

@AdirHudayfi How does the performance change when you change the ordered queries so they can be fulfilled from the index? e.g. select Name from table where Name like "%a%" order by Name collate nocase

Tomalak Over a year ago

@AdirHudayfi No, you did not. You mentioned times for select * queries, that's not what I was talking about.

Tomalak Over a year ago

@AdirHudayfi That would be my conclusion. (That's why covering indexes are great, when you can make them, and SELECT *-queries are generally bad because they optimize poorly)

Tomalak Over a year ago

@AdirHudayfi An index always contains a copy of the column data (deduped, but still) so indexes that contain all columns will increase the database size and slow down INSERT/UPDATE/DELETE operations. If you want your table to always sort by one particular column, you can make a clustered index, i.e. you ensure that the base table itself is always ordered by a particular column. For that you create the table WITHOUT ROWID, and set up a primary key with your defined order. This will slow down INSERTs somewhat, but in exchange sorting will be free.

Tomalak Over a year ago

@AdirHudayfi You can make a two-part primary key that consists of Name and a unique number (or any other existing property that makes the combination unique), and then cluster by this key sorting by Name first.

|

Collectives™ on Stack Overflow

SQLite queryslow when using index

1 Answer 1

11 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

11 Comments

Your Answer

Sign up or log in

Post as a guest

Related