
In databases such as PostgreSQL, exact counts like select count(*) from table where condition perform a full table scan.

PostgreSQL will need to scan either the entire table or the entirety of an index that includes all rows in the table.

What theoretical data structures or databases (analytics databases?) make this fast? I understand that this is not possible using a conventional index, since indexes only allow for identifying individual relevant rows quickly, whereas here we are computing an aggregate function and are not interested in any particular row. (More generally, this raises the question of whether data structures that play the role of indexes exist for aggregate functions.)

  • Your assumption is not entirely correct, and your own copy/paste already confirms this: "or the entirety of an index". However, it's not as effective as other implementations. We use triggers to maintain statistics, which works excellently and is very fast. Commented Jul 22 at 22:19
  • It's very easy for simple B+tree indexes. Postgres has complicated data structures to handle multi-version concurrency control, however, which make it not so easy for them to determine how many rows you can currently see in a way that would be consistent with your transaction's isolation level. Commented Jul 28 at 20:06
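
The trigger approach mentioned in the first comment can be sketched roughly as follows. This is only an illustration, not the commenter's actual setup: it uses Python's built-in sqlite3 module instead of PostgreSQL so that it runs without a server, and the table names (orders, order_counts), the status column, and the helper functions are all hypothetical. The idea is that every write also adjusts a small per-condition counter, so a count becomes a single-row lookup instead of a scan.

```python
import sqlite3

# Hypothetical schema: "orders" is the big table, "order_counts" holds one
# pre-aggregated row per status value and is kept in sync by triggers.
SCHEMA = """
CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL);
CREATE TABLE order_counts (status TEXT PRIMARY KEY, n INTEGER NOT NULL);

CREATE TRIGGER orders_ins AFTER INSERT ON orders BEGIN
    INSERT OR IGNORE INTO order_counts (status, n) VALUES (NEW.status, 0);
    UPDATE order_counts SET n = n + 1 WHERE status = NEW.status;
END;

CREATE TRIGGER orders_del AFTER DELETE ON orders BEGIN
    UPDATE order_counts SET n = n - 1 WHERE status = OLD.status;
END;

CREATE TRIGGER orders_upd AFTER UPDATE OF status ON orders BEGIN
    UPDATE order_counts SET n = n - 1 WHERE status = OLD.status;
    INSERT OR IGNORE INTO order_counts (status, n) VALUES (NEW.status, 0);
    UPDATE order_counts SET n = n + 1 WHERE status = NEW.status;
END;
"""


def open_db():
    """Create an in-memory database with the tables and triggers above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def fast_count(conn, status):
    """Read the maintained counter: a primary-key lookup, no table scan."""
    row = conn.execute(
        "SELECT n FROM order_counts WHERE status = ?", (status,)
    ).fetchone()
    return row[0] if row else 0


def exact_count(conn, status):
    """Reference implementation: the scan-based count from the question."""
    return conn.execute(
        "SELECT count(*) FROM orders WHERE status = ?", (status,)
    ).fetchone()[0]
```

The trade-off is that every insert, update, and delete pays for the extra counter write, and under heavy concurrent writes the counter rows can become a contention point. The counter also only answers the conditions chosen in advance (here, equality on status), not arbitrary where clauses.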
