Thread: COUNT and Performance ...
This patch adds a note to the documentation describing why min() and
max() are slow when applied to an entire table, and suggesting the
simple workaround most experienced Pg users eventually learn about
(SELECT xyz ... ORDER BY xyz LIMIT 1).

Any suggestions on improving the wording of this section would be
welcome.

Cheers,

------

ORDER BY and LIMIT work pretty fast (no seq scan).
In special cases there can be another way to avoid seq scans:

action=# select tuple_count from pgstattuple('t_text');
 tuple_count
-------------
       14203
(1 row)

action=# BEGIN;
BEGIN
action=# insert into t_text (suchid) VALUES ('100000');
INSERT 578606 1
action=# select tuple_count from pgstattuple('t_text');
 tuple_count
-------------
       14204
(1 row)

action=# ROLLBACK;
ROLLBACK
action=# select tuple_count from pgstattuple('t_text');
 tuple_count
-------------
       14203
(1 row)

If people want to count ALL rows of a table, the contrib stuff is
pretty useful. It seems to be transaction safe.

The performance boost is great (PostgreSQL 7.3, RedHat, 166 MHz):

root@actionscouts:~# time psql action -c "select tuple_count from pgstattuple('t_text');"
 tuple_count
-------------
       14203
(1 row)

real    0m0.266s
user    0m0.030s
sys     0m0.020s

root@actionscouts:~# time psql action -c "select count(*) from t_text"
 count
-------
 14203
(1 row)

real    0m0.701s
user    0m0.040s
sys     0m0.010s

I think this could be a good workaround for huge counts (maybe millions
of records) with no WHERE clause and no joins.

Hans

<http://kernel.cybertec.at>
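For illustration, the workaround the doc patch describes looks roughly
like this; the table and column names are hypothetical, and an index on
the column is assumed:

  -- Assumed setup: a table with an index on xyz.
  CREATE TABLE t (xyz integer);
  CREATE INDEX t_xyz_idx ON t (xyz);

  -- Slow: the aggregate is computed by scanning every row.
  SELECT min(xyz) FROM t;

  -- Workaround: ORDER BY plus LIMIT can walk the index and stop after
  -- a single tuple.
  SELECT xyz FROM t ORDER BY xyz LIMIT 1;

  -- The max() equivalent; if xyz can be NULL, filter the NULLs out,
  -- since they sort first in descending order:
  SELECT xyz FROM t WHERE xyz IS NOT NULL ORDER BY xyz DESC LIMIT 1;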
On Sun, 2003-02-02 at 03:55, Hans-Jürgen Schönig wrote:
> If people want to count ALL rows of a table, the contrib stuff is
> pretty useful. It seems to be transaction safe.

Interesting -- I didn't know about the contrib stuff. I'll update the
docs patch.

Cheers,

Neil

--
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
Neil Conway wrote:
> On Sun, 2003-02-02 at 03:55, Hans-Jürgen Schönig wrote:
>> If people want to count ALL rows of a table, the contrib stuff is
>> pretty useful. It seems to be transaction safe.
>
> Interesting -- I didn't know about the contrib stuff. I'll update the
> docs patch.

Personally, I think a system table containing information about all
tables in the database would be best. The contrib stuff provides a lot
of useful information which would be perfect for a system view (dead
tuples, number of records, ...). I guess many people would consider
that to be useful.

This would not fix the COUNT problem, but it would provide the most
essential information people need: the number of records in the table.

Maybe this is worth discussing.

Regards,

Hans

--
*Cybertec Geschwinde u Schoenig*
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/1/913 68 09; +43/664/233 90 75
www.postgresql.at <http://www.postgresql.at>, cluster.postgresql.at
<http://cluster.postgresql.at>, www.cybertec.at <http://www.cybertec.at>,
kernel.cybertec.at <http://kernel.cybertec.at>
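A rough stand-in for such a view already exists in the system catalogs:
the planner's own row estimate. This is only a sketch of the idea --
reltuples is maintained by VACUUM and ANALYZE, so it can be stale, but
it returns instantly:

  -- Approximate row count from the planner's statistics; cheap but
  -- only as fresh as the last VACUUM/ANALYZE on the table.
  SELECT relname, reltuples
  FROM pg_class
  WHERE relname = 't_text';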
Hans-Jürgen Schönig <postgres@cybertec.at> writes:
> In special cases there can be another way to avoid seq scans:
> [ use pgstattuple() ]

But pgstattuple does do a sequential scan of the table. You avoid a lot
of the executor's tuple-pushing and plan-node-traversing machinery that
way, but the I/O requirement is going to be exactly the same.

> If people want to count ALL rows of a table, the contrib stuff is
> pretty useful. It seems to be transaction safe.

Not entirely. pgstattuple uses HeapTupleSatisfiesNow(), which means you
get a count of tuples that are committed good in terms of the effects
of transactions committed up to the instant each tuple is examined.
This is in general different from what count(*) would tell you, because
it ignores snapshotting. It'd be quite unrepeatable too, in the face of
active concurrent changes --- it's very possible for pgstattuple to
count a single row twice or not at all, if it's being concurrently
updated and the other transaction commits between the times pgstattuple
sees the old and new versions of the row.

> The performance boost is great (PostgreSQL 7.3, RedHat, 166 MHz):

I think your test case is small enough that the whole table is resident
in memory, so this measurement only accounts for CPU time per tuple and
not any I/O. Given the small size of pgstattuple's per-tuple loop, the
speed differential is not too surprising --- but it won't scale up to
larger tables.

Sometime it would be interesting to profile count(*) on large tables
and see exactly where the CPU time goes. It might be possible to shave
off some of the executor overhead ...

			regards, tom lane
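A sketch of the non-snapshot behaviour described above; the session
layout and row counts are hypothetical, and session A must run at the
serializable isolation level so that count(*) keeps a fixed snapshot:

  -- Session A
  BEGIN;
  SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
  SELECT count(*) FROM t_text;                    -- say 14203; snapshot taken

  -- Session B (autocommit)
  INSERT INTO t_text (suchid) VALUES ('100001');  -- commits immediately

  -- Session A again
  SELECT count(*) FROM t_text;                    -- still 14203: snapshot view
  SELECT tuple_count FROM pgstattuple('t_text');  -- 14204: sees the new commit
  COMMIT;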
On Sun, 2003-02-02 at 13:04, Tom Lane wrote:
> I think your test case is small enough that the whole table is resident
> in memory, so this measurement only accounts for CPU time per tuple and
> not any I/O. Given the small size of pgstattuple's per-tuple loop, the
> speed differential is not too surprising --- but it won't scale up to
> larger tables.

Good observation. When the entire table is in cache, pgstattuple is
about 4 times faster than count(*) on my machine. When the table is too
large to fit into cache, the performance difference drops to 8% in
favour of pgstattuple:

nconway=# select count(*) from big_table;
  count
---------
 8388612
(1 row)

Time: 26769.99 ms

nconway=# SELECT tuple_count FROM pgstattuple('big_table');
 tuple_count
-------------
     8388612
(1 row)

Time: 24658.87 ms

Cheers,

Neil

--
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
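For anyone wanting to reproduce this kind of comparison, one simple way
to build a table in the multi-million-row range is repeated
self-insertion; the table and column here are hypothetical:

  -- Each INSERT ... SELECT doubles the row count; 23 doublings of a
  -- single row yields 2^23 = 8388608 rows.
  CREATE TABLE big_table (id integer);
  INSERT INTO big_table VALUES (1);
  INSERT INTO big_table SELECT * FROM big_table;  -- repeat ~23 times
  SELECT count(*) FROM big_table;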
Tom Lane wrote:
> But pgstattuple does do a sequential scan of the table. You avoid a
> lot of the executor's tuple-pushing and plan-node-traversing machinery
> that way, but the I/O requirement is going to be exactly the same.

I ran it several times so that I could be sure everything was in the
cache. I thought it did some sort of "stat" on tables. Too bad :(.

>> If people want to count ALL rows of a table, the contrib stuff is
>> pretty useful. It seems to be transaction safe.
>
> Not entirely. pgstattuple uses HeapTupleSatisfiesNow(), which means
> you get a count of tuples that are committed good in terms of the
> effects of transactions committed up to the instant each tuple is
> examined. This is in general different from what count(*) would tell
> you, because it ignores snapshotting. It'd be quite unrepeatable too,
> in the face of active concurrent changes --- it's very possible for
> pgstattuple to count a single row twice or not at all, if it's being
> concurrently updated and the other transaction commits between the
> times pgstattuple sees the old and new versions of the row.

Interesting. I tried it with concurrent sessions and transactions, and
the results seemed to be right (I could not see the records inserted by
open transactions). Too bad :(. It would have been a nice workaround.

>> The performance boost is great (PostgreSQL 7.3, RedHat, 166 MHz):
>
> I think your test case is small enough that the whole table is
> resident in memory, so this measurement only accounts for CPU time per
> tuple and not any I/O. Given the small size of pgstattuple's per-tuple
> loop, the speed differential is not too surprising --- but it won't
> scale up to larger tables.
>
> Sometime it would be interesting to profile count(*) on large tables
> and see exactly where the CPU time goes. It might be possible to shave
> off some of the executor overhead ...

I have tried it with the largest table on my testing system. Reducing
the overhead would be great :).

Thanks a lot,

Hans

--
*Cybertec Geschwinde u Schoenig*
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/1/913 68 09; +43/664/233 90 75
www.postgresql.at <http://www.postgresql.at>, cluster.postgresql.at
<http://cluster.postgresql.at>, www.cybertec.at <http://www.cybertec.at>,
kernel.cybertec.at <http://kernel.cybertec.at>
For a more accurate view of the time used, use the \timing switch in
psql. That leaves out the overhead of forking and loading psql,
connecting to the database, and such things.

I think it would be even nicer if PostgreSQL automatically chose to
replace count(*)-with-no-WHERE with something similar.

Regards,

Arjen

Hans-Jürgen Schönig wrote:
> This patch adds a note to the documentation describing why min() and
> max() are slow when applied to an entire table, and suggesting the
> simple workaround most experienced Pg users eventually learn about
> (SELECT xyz ... ORDER BY xyz LIMIT 1).
> [...]
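For example, with the timing figure here being illustrative rather
than measured:

action=# \timing
Timing is on.
action=# select count(*) from t_text;
 count
-------
 14203
(1 row)

Time: 430.12 ms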
I didn't think pgstattuple had proper visibility checks.

---------------------------------------------------------------------------

Hans-Jürgen Schönig wrote:
> This patch adds a note to the documentation describing why min() and
> max() are slow when applied to an entire table, and suggesting the
> simple workaround most experienced Pg users eventually learn about
> (SELECT xyz ... ORDER BY xyz LIMIT 1).
> [...]

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > I didn't think pgstattuple had proper visibility checks. It doesn't, see followup discussion. regards, tom lane