
I have a table:

create table big_table (
    id serial primary key,
    -- other columns here
    vote int
);

This table is very big (approximately 70 million rows), and I need to run queries like:

SELECT * FROM big_table
ORDER BY vote [ASC|DESC], id [ASC|DESC]
OFFSET x LIMIT n  -- I need this for pagination

As you may know, when x is a large number, queries like this are very slow.

For performance optimization I added indexes:

create index vote_order_asc on big_table (vote asc, id asc);

and

create index vote_order_desc on big_table (vote desc, id desc);

EXPLAIN shows that the SELECT query above uses these indexes, but it is still very slow with a large offset.

What can I do to optimize queries with OFFSET on big tables? Does PostgreSQL 9.5 (or a newer version) offer anything for this? I've searched but haven't found anything.

  • Why not keyset pagination? blog.jooq.org/2013/10/26/… Commented Dec 5, 2015 at 23:22
  • @NeilMcGuigan That method has some disadvantages (for example, only previous/next page navigation), but it seems I'm forced to choose it anyway. Commented Dec 6, 2015 at 9:00

2 Answers


A large OFFSET is always going to be slow. Postgres has to order all rows and count the visible ones up to your offset. To skip all previous rows directly you could add an indexed row_number to the table (or create a MATERIALIZED VIEW including said row_number) and work with WHERE row_number > x instead of OFFSET x.

However, this approach is only sensible for read-only (or mostly read-only) data. Implementing the same for table data that can change concurrently is more challenging: you need to start by defining the desired behavior exactly.
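The precomputed-row-number idea can be sketched as follows. This is a minimal illustration using Python's bundled sqlite3 with an invented six-row table (requires SQLite >= 3.25 for window functions); in PostgreSQL you would typically put the numbered query in a MATERIALIZED VIEW instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE big_table (id INTEGER PRIMARY KEY, vote INT);
    INSERT INTO big_table (id, vote) VALUES
        (1, 5), (2, 3), (3, 5), (4, 1), (5, 3), (6, 4);

    -- Snapshot of the sort order with a precomputed row number.
    -- Only valid for read-only data: concurrent writes invalidate it.
    CREATE TABLE big_table_numbered AS
    SELECT ROW_NUMBER() OVER (ORDER BY vote, id) AS rn, id, vote
    FROM   big_table;
    CREATE UNIQUE INDEX idx_rn ON big_table_numbered (rn);
""")

def page(offset, limit):
    # WHERE rn > x is an index range scan that jumps straight to the
    # start of the page; OFFSET x would have to count x rows first.
    return conn.execute(
        "SELECT id, vote FROM big_table_numbered"
        " WHERE rn > ? ORDER BY rn LIMIT ?",
        (offset, limit),
    ).fetchall()

print(page(0, 3))  # first page: [(4, 1), (2, 3), (5, 3)]
print(page(3, 3))  # rows 4..6 without scanning rows 1..3
```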

I suggest a different approach:

Row value comparison for "keyset pagination"

For ascending order:

SELECT *
FROM   big_table
WHERE  (vote, id) > (vote_x, id_x)  -- ROW values
ORDER  BY vote, id                  -- match ROW value comparison
LIMIT  n;

where vote_x and id_x come from the last row of the previous page (for both DESC and ASC), or from the first row if navigating backwards.

(vote, id) needs to be UNIQUE to get deterministic results.

Comparing row values is supported by the index you already have. It is a standard SQL feature, though not every RDBMS supports it.

CREATE INDEX vote_order_asc ON big_table (vote, id);

Or for descending order:

SELECT *
FROM   big_table
WHERE  (vote, id) < (vote_x, id_x)  -- ROW values
ORDER  BY vote DESC, id DESC
LIMIT  n;

This can use the same index.
I suggest you declare your columns NOT NULL, or acquaint yourself with the NULLS FIRST | LAST construct.
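Put together, the keyset queries above can be exercised end to end. Here is a minimal runnable sketch using Python's bundled sqlite3 (SQLite also supports SQL row values, as PostgreSQL does), with invented sample data; in a real application vote_x and id_x would come from the last row the client saw.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE big_table (id INTEGER PRIMARY KEY, vote INT);
    CREATE INDEX vote_order_asc ON big_table (vote, id);
    INSERT INTO big_table (id, vote) VALUES
        (1, 5), (2, 3), (3, 5), (4, 1), (5, 3), (6, 4);
""")

def next_page(last_vote, last_id, n):
    # (vote, id) > (vote_x, id_x): one index range scan, no OFFSET.
    return conn.execute(
        "SELECT id, vote FROM big_table"
        " WHERE (vote, id) > (?, ?)"
        " ORDER BY vote, id LIMIT ?",
        (last_vote, last_id, n),
    ).fetchall()

page1 = next_page(-1, -1, 2)  # seed below any real (vote, id)
last_id, last_vote = page1[-1][0], page1[-1][1]
page2 = next_page(last_vote, last_id, 2)
print(page1)  # [(4, 1), (2, 3)]
print(page2)  # [(5, 3), (6, 4)]
```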

Note two things in particular:

ROW values

The ROW values in the WHERE clause cannot be replaced with separated member fields.
WHERE (vote, id) > (vote_x, id_x) cannot be replaced with:

WHERE  vote >= vote_x
AND    id   > id_x

That would rule out all rows with id <= id_x, while we only want to exclude those for the same vote, not for higher votes. A correct equivalent would be:

WHERE (vote = vote_x AND id > id_x) OR vote > vote_x

... which doesn't play along with indexes as nicely, and gets increasingly complicated for more columns.

It would be simple for a single column, obviously. That's the special case mentioned at the outset.
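This pitfall can be checked mechanically. A small sqlite3 sketch with invented data, comparing the row-value predicate against the naive AND rewrite and the correct OR expansion:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE big_table (id INTEGER PRIMARY KEY, vote INT);
    INSERT INTO big_table (id, vote) VALUES
        (1, 5), (2, 3), (3, 5), (4, 1), (5, 3), (6, 4);
""")

def run(where):
    # Same keyset (vote_x = 3, id_x = 5) for all three predicates.
    sql = f"SELECT id, vote FROM big_table WHERE {where} ORDER BY vote, id"
    return conn.execute(sql, {"vote_x": 3, "id_x": 5}).fetchall()

row_value = run("(vote, id) > (:vote_x, :id_x)")
naive     = run("vote >= :vote_x AND id > :id_x")
expanded  = run("(vote = :vote_x AND id > :id_x) OR vote > :vote_x")

print(row_value)  # [(6, 4), (1, 5), (3, 5)]
print(expanded == row_value)  # True: the OR form is equivalent
# False: the naive form wrongly drops (1, 5) and (3, 5), whose id <= id_x
# even though their vote is larger.
print(naive == row_value)
```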

Aligned direction

The technique does not work for mixed / opposing directions in ORDER BY like:

ORDER  BY vote ASC, id DESC

You may be able to invert the value in a multicolumn expression index on:

(vote, (id * -1))

And use the same expression in ORDER BY:

ORDER  BY vote ASC, (id * -1) ASC
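The invert-one-column trick can also be sketched with sqlite3 (which supports expression indexes, like PostgreSQL) on invented data: ORDER BY vote ASC, id DESC becomes ORDER BY vote ASC, (id * -1) ASC, so row-value keyset pagination works again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE big_table (id INTEGER PRIMARY KEY, vote INT);
    -- Expression index matching the rewritten sort order.
    CREATE INDEX vote_neg_id ON big_table (vote, (id * -1));
    INSERT INTO big_table (id, vote) VALUES
        (1, 5), (2, 3), (3, 5), (4, 1), (5, 3), (6, 4);
""")

def next_page(last_vote, last_neg_id, n):
    # Both directions are now ASC, so a row value comparison works.
    return conn.execute(
        "SELECT id, vote FROM big_table"
        " WHERE (vote, id * -1) > (?, ?)"
        " ORDER BY vote, (id * -1) LIMIT ?",
        (last_vote, last_neg_id, n),
    ).fetchall()

page1 = next_page(-1, -999999, 2)  # seed below any real key
page2 = next_page(page1[-1][1], page1[-1][0] * -1, 2)
print(page1)  # [(4, 1), (5, 3)] -- vote ASC, id DESC within equal votes
print(page2)  # [(2, 3), (6, 4)]
```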

Related: a blog post by Markus Winand on keyset pagination.


3 Comments

How about jumping to a random page? @Erwin Brandstetter
@Dolphin: In a read-only setting, rather add an indexed row number. For read-mostly, a materialized view with a row number. Under concurrent write load, "pages" are a moving target. You might "jump" to a page number with OFFSET, but jumping back might send you to a different page ...
Interesting approach. So it seems the primary id could be used in the filtering as well(?). Initially the parameter value would be 0 (or negative), and then when fetching the next page, simply pass the largest primary id value from the current page(?)

Have you tried partitioning the table?

Ease of management, improved scalability and availability, and a reduction in blocking are common reasons to partition tables. Improving query performance is not a reason to employ partitioning, though it can be a beneficial side-effect in some cases. In terms of performance, it is important to ensure that your implementation plan includes a review of query performance. Confirm that your indexes continue to appropriately support your queries after the table is partitioned, and verify that queries using the clustered and nonclustered indexes benefit from partition elimination where applicable.

http://sqlperformance.com/2013/09/sql-indexes/partitioning-benefits

2 Comments

That is for SQL Server, not for Postgres.
PostgreSQL supports basic table partitioning: postgresql.org/docs/current/interactive/ddl-partitioning.html
