152

Which way of counting the number of rows should be faster in MySQL?

This:

SELECT COUNT(*) FROM ... WHERE ...

Or, the alternative:

SELECT 1 FROM ... WHERE ...

// and then count the results on the client side with a built-in function, e.g. mysql_num_rows() in PHP

One would think that the first method should be faster, as this is clearly database territory and the database engine should be faster than anything else at determining things like this internally.

6
  • 2
    Oh, I found a similar question (stackoverflow.com/questions/1855226/…). But then, I use SELECT 1 and not SELECT *. Is there a difference? Commented Feb 20, 2011 at 22:00
I don't know, but it is conceivable that these two approaches are identical -- the MySQL query optimizer may do the same thing for each. That said, the former is less ambiguous than the latter. Why don't you write some benchmarks and test it out? Commented Feb 20, 2011 at 22:06
  • Uhm, let's assume I'm trying to enhance SO's search engine visibility by asking a similar question in different words ;) Commented Feb 20, 2011 at 22:37
  • 1
    The difference is the amount of data sent over to the PHP side. The more columns you have, the slower SELECT * gets relative to SELECT 1, because all columns are retrieved instead of just the number 1. When you run mysql_query(), for instance, the entire result set is sent to PHP from MySQL, regardless of what you do with that data. Commented Feb 26, 2013 at 8:28
  • Asking a question like this is a great way to gain insight or new ideas, but ultimately if you actually have a specific scenario where you want more speed, you will have to run tests to see what is the fastest. Commented Dec 17, 2015 at 20:29

13 Answers

147

COUNT(*) can use column indexes (typically the primary key's index), so it will give the best result. Moreover, MySQL with the MyISAM engine actually stores the row count, so it doesn't count all rows each time you ask for the total.

Using PHP to count rows is not very smart, because you have to send the data from MySQL to PHP. Why do that when you can achieve the same thing on the MySQL side?

If the COUNT(*) is slow, you should run EXPLAIN on the query, and check if indexes are really used, and where they should be added.
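
For example, a minimal sketch of that check (the orders table and status column are made up for illustration, not taken from the question):

EXPLAIN SELECT COUNT(*) FROM orders WHERE status = 'shipped';

-- If the plan reports a full table scan (type: ALL), an index on the filtered
-- column usually lets MySQL satisfy the count from the index alone:
ALTER TABLE orders ADD INDEX idx_status (status);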


The following is not the fastest approach, but there is a case where COUNT(*) doesn't really fit: when you start grouping results, COUNT counts the rows per group rather than the total number of matching rows.

The solution is SQL_CALC_FOUND_ROWS. This is usually used when you are selecting rows but still need to know the total row count (for example, for paging). When you select data rows, just append the SQL_CALC_FOUND_ROWS keyword after SELECT:

SELECT SQL_CALC_FOUND_ROWS [needed fields or *] FROM table LIMIT 20 OFFSET 0;

After you have selected the needed rows, you can get the count with this single query:

SELECT FOUND_ROWS();

FOUND_ROWS() has to be called immediately after the data selecting query.
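
As a hedged illustration of the paging-plus-grouping case mentioned above (the orders, status, and customer_id names are made up for the example):

SELECT SQL_CALC_FOUND_ROWS customer_id, COUNT(*) AS orders_per_customer
FROM orders
WHERE status = 'shipped'
GROUP BY customer_id
LIMIT 20 OFFSET 0;

-- Total number of matching groups, ignoring the LIMIT:
SELECT FOUND_ROWS();

Note that SQL_CALC_FOUND_ROWS and FOUND_ROWS() are deprecated as of MySQL 8.0.17; the documented replacement is to run the paged query with LIMIT and a separate COUNT(*) query without it.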


In conclusion, everything actually comes down to how many entries you have and what is in the WHERE clause. You should really pay attention to how indexes are being used when there are lots of rows (tens of thousands, millions, and up).


17 Comments

Correction: MyISAM stores row count. Other storage engines like InnoDB do not store row counts and will count all rows each time.
Do you know which will be fastest when you simply want to find out whether there is a row: SELECT 1 FROM ... LIMIT 1 or SELECT COUNT(*) FROM ...?
It's probably useful to note that if you need the data anyway and only want a count for pagination etc., it is more efficient to get the data and then count the rows in your program.
It's irrelevant whether the engine stores row counts. The question clearly states there's a WHERE clause.
FOUND_ROWS() and SQL_CALC_FOUND_ROWS are deprecated as of MySQL 8.0.17 and will be removed in a future version. dev.mysql.com/doc/refman/5.0/en/…
99

After speaking with my teammates, Ricardo told us that the faster way is:

show table status like '<TABLE NAME>' \G

But you have to remember that the result may not be exact.

You can use it from command line too:

$ mysqlshow --status <DATABASE> <TABLE NAME>

More information: http://dev.mysql.com/doc/refman/5.7/en/show-table-status.html

And you can find a complete discussion at mysqlperformanceblog

3 Comments

For InnoDB, this is an approximation.
This is great to know when you need a rough idea of the number of rows in very large tables, where count(*) can literally take hours!
Using SHOW TABLE STATUS (or the equivalent SELECT in information_schema) is fast, but it does not handle a WHERE clause. It is precise for MyISAM, but imprecise (sometimes off by a factor of 2) for InnoDB.
36

This query (which is similar to what bayuah posted) shows a nice summary of the row counts of all tables inside a database (it is a simplified version of the stored procedure by Ivan Cachicatari, which I highly recommend).

SELECT TABLE_NAME AS 'Table Name', TABLE_ROWS AS 'Rows' FROM information_schema.TABLES WHERE TABLES.TABLE_SCHEMA = 'YOURDBNAME' AND TABLES.TABLE_TYPE = 'BASE TABLE';

Example:

+-----------------+---------+
| Table Name      | Rows    |
+-----------------+---------+
| some_table      |   10278 |
| other_table     |     995 |

2 Comments

It gives me a result, but the numbers from count(1) and this query are different: this method always gives a lower number than the count query. Any thoughts?
Just a note to the readers: this method is extremely fast, but it is only applicable when you can work with an approximate number of rows, since the value stored in information_schema is not the same as the one returned by SELECT count(*) FROM when InnoDB is used. If you need an exact value, keep in mind that this method is exact only for MyISAM tables; with InnoDB the number of rows is a rough approximation.
13

I've always understood that the below will give me the fastest response times.

SELECT COUNT(1) FROM ... WHERE ...

3 Comments

show table status like '<TABLE NAME>' This will be much faster.
@deep - but not relevant if you have a WHERE clause. And, for InnoDB, it is only an estimate.
On a table with 1mil+ rows, count(*) and count(1) take the same time.
9

Try this:

SELECT
    table_rows AS 'Rows Count'
FROM
    information_schema.tables
WHERE
    table_name = 'Table_Name'
AND
    table_schema = 'Database_Name';

5 Comments

@lepe I'm sorry. I meant that it would be really nice if whoever downvoted gave some explanation for why they did, so everybody could learn something from it.
This will give you an approximate answer quickly. If you need an exact answer, you need to perform select count(*) from table_name or something else. dba.stackexchange.com/questions/151769/…
@Programster Thank you. It's better than leave me in the dark for almost a year.
@bayuah I'm not sure what you meant by your last comment. I can only assume you think I am the one who down-voted your answer, which I'm not.
@Programster No, I'm sorry, I didn't mean that. I meant thank you for your explanation, so I can guess what the downvoter may have been thinking.
8

If you need to get the count of the entire result set, you can take the following approach:

SELECT SQL_CALC_FOUND_ROWS * FROM table_name LIMIT 5;
SELECT FOUND_ROWS();

This isn't normally faster than using COUNT, although one might think the opposite, because the calculation is done internally and the data isn't sent back to the client, which suggests a performance improvement.

Running these two queries is good for getting totals for pagination, but not particularly for use with WHERE clauses.

3 Comments

Interesting. Does that work across the most common database systems? MySQL, Postgres, SQLite...?
This is actually often not faster than using COUNT(*) at all. See stackoverflow.com/questions/186588/…
You should be VERY careful when using this function. Its reckless use once brought our entire production environment to a grinding halt. It is VERY resource intensive, so use with care.
7

I did some benchmarks to compare the execution time of COUNT(*) vs COUNT(id) (id is the primary key of the table - indexed).

Number of trials: 10 * 1000 queries

Result: COUNT(*) is 7% faster

[Graph of the benchmark results]

My advice is to use: SELECT COUNT(*) FROM table

1 Comment

FYI, there is also a common way to count with COUNT(1); it would be interesting to see some benchmarks for that too...
6

EXPLAIN SELECT id FROM ... did the trick for me, and I could see the number of rows in the rows column of the result.

Comments

5

Perhaps you may want to consider doing SELECT MAX(id) - MIN(id) + 1. This will only work if your ids are sequential and rows are never deleted. It is, however, very fast.
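
For instance, a sketch assuming an AUTO_INCREMENT primary key named id on a hypothetical orders table:

SELECT MAX(id) - MIN(id) + 1 AS approx_row_count FROM orders;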

1 Comment

Be careful: servers sometimes use an auto-increment step greater than 1 (for backup reasons), so this solution is good, but you should check your DB configuration first.
4

This query got the fastest results for me.

SELECT SQL_CALC_FOUND_ROWS 1 FROM `orders`;
SELECT FOUND_ROWS();

In my benchmark test: 0.448s


This query takes 4.835s

SELECT SQL_CALC_FOUND_ROWS * FROM `orders`;
SELECT FOUND_ROWS();


COUNT(*) takes 25.675s

SELECT count(*) FROM `orders`;


Comments

2

If you don't need a super-exact count, you can set a lower transaction isolation level for the current session. Do it like this:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT count(*) FROM the_table WHERE ...;
COMMIT; /* close the transaction */

It is also good to have an index that matches the WHERE condition.

It really speeds up counting for big InnoDB tables. I checked it on a table with ~700M rows under heavy load, and it works: it reduced the query time from ~451 seconds to ~2 seconds.

I took the idea from this answer: https://stackoverflow.com/a/918092/1743367

2 Comments

As READ UNCOMMITTED suggests, your counts may not be exactly precise and you'll see dirty-read effects. For example, if you are running COUNT(*) .. WHERE is_marked_for_deletion = FALSE and another transaction is simultaneously trying to mark lots of rows for deletion, your count will see some of those markings even if that other transaction ends up being rolled back. So the suggested approach is great if you just want rough counts and nobody is likely to be changing MANY records simultaneously.
If you're using AWS Aurora and making a query to a read-only replica, this method won't work. "These DB instances ignore any SET TRANSACTION ISOLATION LEVEL statements and continue using the REPEATABLE READ isolation level." (source: docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/… )
0

I handled tables for the German Government with sometimes 60 million records.

And we needed to know the total number of rows many times.

So we database programmers decided that in every table, record one would always be the record in which the total number of records is stored. We updated this number on every INSERT or DELETE.

We tried all other ways. This is by far the fastest way.
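
A hedged sketch of the general idea, using a separate counter table and triggers rather than a reserved first record (all table, trigger, and column names here are illustrative, not the scheme actually described above):

CREATE TABLE row_counts (
    table_name VARCHAR(64) PRIMARY KEY,
    total      BIGINT NOT NULL
);

-- Keep the counter in sync on every INSERT and DELETE:
CREATE TRIGGER big_table_after_insert AFTER INSERT ON big_table
FOR EACH ROW UPDATE row_counts SET total = total + 1 WHERE table_name = 'big_table';

CREATE TRIGGER big_table_after_delete AFTER DELETE ON big_table
FOR EACH ROW UPDATE row_counts SET total = total - 1 WHERE table_name = 'big_table';

-- Reading the total is then a single-row lookup:
SELECT total FROM row_counts WHERE table_name = 'big_table';

This trades a little extra work on every write for a constant-time read of the total; under heavy concurrent writes the single counter row can become a hot spot.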

1 Comment

And what are the details of how you updated that row? This also implies a flawed table design, where every row has to carry a wasted int along for the ride.
0

A COUNT(*) statement with a WHERE condition on the primary key returned the row count much faster for me, avoiding a full table scan.

SELECT COUNT(*) FROM ... WHERE <PRIMARY_KEY> IS NOT NULL;

This was much faster for me than

SELECT COUNT(*) FROM ...

Comments
