Thread: 8.1 count(*) distinct: IndexScan/SeqScan
Hi,

PostgreSQL 8.1 fresh install on a freshly installed OpenBSD 3.8 box.

postgres=# CREATE DATABASE test;
CREATE DATABASE
postgres=# create table test (id serial, val integer);
NOTICE:  CREATE TABLE will create implicit sequence "test_id_seq" for serial column "test.id"
CREATE TABLE
postgres=# create unique index testid on test (id);
CREATE INDEX
postgres=# create index testval on test (val);
CREATE INDEX
postgres=# insert into test (val) values (round(random()*1024*1024*1024));
INSERT 0 1
[...] insert many random values
postgres=# vaccum full verbose analyze;
postgres=# select count(1) from test;
  count
---------
 2097152
(1 row)

postgres=# explain select count(*) from (select distinct on (val) * from test) as foo;
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Aggregate  (cost=66328.72..66328.73 rows=1 width=0)
   ->  Unique  (cost=0.00..40114.32 rows=2097152 width=8)
         ->  Index Scan using testval on test  (cost=0.00..34871.44 rows=2097152 width=8)
(3 rows)

postgres=# set enable_indexscan=off;
postgres=# explain analyze select count(*) from (select distinct on (val) * from test) as foo;
                                                            QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=280438.64..280438.65 rows=1 width=0) (actual time=39604.107..39604.108 rows=1 loops=1)
   ->  Unique  (cost=243738.48..254224.24 rows=2097152 width=8) (actual time=30281.004..37746.488 rows=2095104 loops=1)
         ->  Sort  (cost=243738.48..248981.36 rows=2097152 width=8) (actual time=30280.999..33744.197 rows=2097152 loops=1)
               Sort Key: test.val
               ->  Seq Scan on test  (cost=0.00..23537.52 rows=2097152 width=8) (actual time=11.550..3262.433 rows=2097152 loops=1)
 Total runtime: 39624.094 ms
(6 rows)

postgres=# set enable_indexscan=on;
postgres=# explain analyze select count(*) from (select distinct on (val) * from test where val<10000000) as foo;
                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=4739.58..4739.59 rows=1 width=0) (actual time=4686.472..4686.473 rows=1 loops=1)
   ->  Unique  (cost=4380.56..4483.14 rows=20515 width=8) (actual time=4609.046..4669.289 rows=19237 loops=1)
         ->  Sort  (cost=4380.56..4431.85 rows=20515 width=8) (actual time=4609.041..4627.976 rows=19255 loops=1)
               Sort Key: test.val
               ->  Bitmap Heap Scan on test  (cost=88.80..2911.24 rows=20515 width=8) (actual time=130.954..4559.244 rows=19255 loops=1)
                     Recheck Cond: (val < 10000000)
                     ->  Bitmap Index Scan on testval  (cost=0.00..88.80 rows=20515 width=0) (actual time=120.041..120.041 rows=19255 loops=1)
                           Index Cond: (val < 10000000)
 Total runtime: 4690.513 ms
(9 rows)

postgres=# explain select count(*) from (select distinct on (val) * from test where val<100000000) as foo;
                                       QUERY PLAN
-----------------------------------------------------------------------------------------
 Aggregate  (cost=16350.20..16350.21 rows=1 width=0)
   ->  Unique  (cost=0.00..13748.23 rows=208158 width=8)
         ->  Index Scan using testval on test  (cost=0.00..13227.83 rows=208158 width=8)
               Index Cond: (val < 100000000)
(4 rows)

postgres=# set enable_indexscan=off;
postgres=# explain analyze select count(*) from (select distinct on (val) * from test where val<100000000) as foo;
                                                            QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=28081.27..28081.28 rows=1 width=0) (actual time=6444.650..6444.651 rows=1 loops=1)
   ->  Unique  (cost=24438.50..25479.29 rows=208158 width=8) (actual time=5669.118..6277.206 rows=194142 loops=1)
         ->  Sort  (cost=24438.50..24958.89 rows=208158 width=8) (actual time=5669.112..5852.351 rows=194342 loops=1)
               Sort Key: test.val
               ->  Bitmap Heap Scan on test  (cost=882.55..6050.53 rows=208158 width=8) (actual time=1341.114..4989.840 rows=194342 loops=1)
                     Recheck Cond: (val < 100000000)
                     ->  Bitmap Index Scan on testval  (cost=0.00..882.55 rows=208158 width=0) (actual time=1339.707..1339.707 rows=194342 loops=1)
                           Index Cond: (val < 100000000)
 Total runtime: 6487.114 ms
(9 rows)

postgres=# explain analyze select count(*) from (select distinct on (val) * from test where val<750000000) as foo;
                                                            QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=204576.53..204576.54 rows=1 width=0) (actual time=35718.935..35718.936 rows=1 loops=1)
   ->  Unique  (cost=178717.28..186105.64 rows=1477671 width=8) (actual time=29465.856..34459.640 rows=1462348 loops=1)
         ->  Sort  (cost=178717.28..182411.46 rows=1477671 width=8) (actual time=29465.853..31658.056 rows=1463793 loops=1)
               Sort Key: test.val
               ->  Bitmap Heap Scan on test  (cost=6256.85..27293.73 rows=1477671 width=8) (actual time=8316.676..11561.018 rows=1463793 loops=1)
                     Recheck Cond: (val < 750000000)
                     ->  Bitmap Index Scan on testval  (cost=0.00..6256.85 rows=1477671 width=0) (actual time=8305.963..8305.963 rows=1463793 loops=1)
                           Index Cond: (val < 750000000)
 Total runtime: 35736.167 ms
(9 rows)

postgres=# explain analyze select count(*) from (select distinct on (val) * from test where val<800000000) as foo;
                                                         QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=217582.20..217582.21 rows=1 width=0) (actual time=28718.331..28718.332 rows=1 loops=1)
   ->  Unique  (cost=190140.72..197981.14 rows=1568084 width=8) (actual time=22175.170..27380.343 rows=1559648 loops=1)
         ->  Sort  (cost=190140.72..194060.93 rows=1568084 width=8) (actual time=22175.165..24451.892 rows=1561181 loops=1)
               Sort Key: test.val
               ->  Seq Scan on test  (cost=0.00..28780.40 rows=1568084 width=8) (actual time=13.130..3358.923 rows=1561181 loops=1)
                     Filter: (val < 800000000)
 Total runtime: 28735.264 ms
(7 rows)

I did not post any result for the indexscan plan, because it takes too
much time.

Why the stupid indexscan plan on the whole table ?

Cordialement,
Jean-Gérard Pailloncy
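As an aside, the `count(*) from (select distinct on (val) ...)` form asks the same question as `count(distinct val)` when only the number of distinct values is wanted. A quick cross-check of that equivalence, sketched in Python with SQLite as a stand-in (illustrative only — this is not the PostgreSQL setup above, and SQLite has no `DISTINCT ON`, so plain `DISTINCT val` is used):

```python
import random
import sqlite3

# Build a small table shaped like the one in the test above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, val INTEGER)")
random.seed(42)
conn.executemany(
    "INSERT INTO test (val) VALUES (?)",
    [(random.randrange(1024 * 1024 * 1024),) for _ in range(4096)],
)

# count(*) over a DISTINCT subquery...
(a,) = conn.execute(
    "SELECT count(*) FROM (SELECT DISTINCT val FROM test) AS foo"
).fetchone()
# ...equals count(DISTINCT val) directly.
(b,) = conn.execute("SELECT count(DISTINCT val) FROM test").fetchone()
assert a == b
print(a == b)  # True
```

Whether the direct form plans better on 8.1 is a separate question, but it gives the planner one less subquery layer to look through.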
Pailloncy Jean-Gerard <jg@rilk.com> writes:
> Why the stupid indexscan plan on the whole table ?

Pray tell, what are you using for the planner cost parameters?  The only
way I can come close to duplicating your numbers is by setting
random_page_cost to somewhere around 0.01 ...

			regards, tom lane
Pailloncy Jean-Gerard <jg@rilk.com> wrote ..

[snip]

THIS MAY SEEM SILLY but vacuum is misspelled below, and presumably there
was never any ANALYZE done.

> postgres=# vaccum full verbose analyze;
> THIS MAY SEEM SILLY but vacuum is misspelled below, and presumably
> there was never any ANALYZE done.
>
>> postgres=# vaccum full verbose analyze;

I did run the "vacUUm full verbose analyze;" — I just copied and pasted
the wrong line.

Cordialement,
Jean-Gérard Pailloncy
> Pailloncy Jean-Gerard <jg@rilk.com> writes:
>> Why the stupid indexscan plan on the whole table ?
>
> Pray tell, what are you using for the planner cost parameters?
> The only way I can come close to duplicating your numbers is
> by setting random_page_cost to somewhere around 0.01 ...

I did not change the costs.

> grep cost postgresql.conf
# note: increasing max_connections costs ~400 bytes of shared memory per
# note: increasing max_prepared_transactions costs ~600 bytes of shared memory
#vacuum_cost_delay = 0			# 0-1000 milliseconds
#vacuum_cost_page_hit = 1		# 0-10000 credits
#vacuum_cost_page_miss = 10		# 0-10000 credits
#vacuum_cost_page_dirty = 20		# 0-10000 credits
#vacuum_cost_limit = 200		# 0-10000 credits
#random_page_cost = 4			# units are one sequential page fetch cost
#cpu_tuple_cost = 0.01			# (same)
#cpu_index_tuple_cost = 0.001		# (same)
#cpu_operator_cost = 0.0025		# (same)
#autovacuum_vacuum_cost_delay = -1	# default vacuum cost delay for vacuum_cost_delay
#autovacuum_vacuum_cost_limit = -1	# default vacuum cost limit for vacuum_cost_limit

Cordialement,
Jean-Gérard Pailloncy
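Tom's point can be made concrete with a toy model of the tradeoff these parameters control (this is not PostgreSQL's actual cost model, which is far more elaborate — just a sketch using the parameter names and defaults from the postgresql.conf excerpt above):

```python
# Toy model: a sequential scan pays 1.0 per page by definition; an index
# scan over the whole table pays random_page_cost per heap page it touches,
# in the worst case roughly one page per tuple fetched.

def seqscan_cost(pages, tuples, cpu_tuple_cost=0.01):
    return pages * 1.0 + tuples * cpu_tuple_cost

def indexscan_cost(heap_pages_fetched, tuples, random_page_cost=4.0,
                   cpu_tuple_cost=0.01, cpu_index_tuple_cost=0.001):
    return (heap_pages_fetched * random_page_cost
            + tuples * (cpu_tuple_cost + cpu_index_tuple_cost))

# Figures taken from the Seq Scan node in the original report.
pages, tuples = 23537, 2097152

# With the default random_page_cost = 4, the whole-table index scan is
# wildly more expensive than one sequential pass...
assert indexscan_cost(tuples, tuples) > seqscan_cost(pages, tuples)

# ...and only an absurd random_page_cost near 0.01 makes it win, which is
# exactly the observation in the reply above.
assert indexscan_cost(tuples, tuples, random_page_cost=0.01) \
        < seqscan_cost(pages, tuples)
print("index scan only wins with random_page_cost ~ 0.01")
```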
I redo the test, with a freshly installed data directory. Same result.

Note: this is the full log; I only removed mistakes such as typing "sl"
for "ls".

Jean-Gérard Pailloncy

Last login: Thu Nov 24 12:52:32 2005 from 192.168.0.1
OpenBSD 3.8 (WDT) #2: Tue Nov  8 00:52:38 CET 2005

Welcome to OpenBSD: The proactively secure Unix-like operating system.

Please use the sendbug(1) utility to report bugs in the system.
Before reporting a bug, please try to reproduce it with the latest
version of the code.  With bug reports, please try to ensure that
enough information to reproduce the problem is enclosed, and if a
known fix for it exists, include that as well.

Terminal type? [xterm-color]
# cd /mnt2/pg/install/bin/
# mkdir /mnt2/pg/data
# chown -R _pgsql:_pgsql /mnt2/pg/data
# su _pgsql
$ ls
clusterdb       droplang        pg_config       pg_resetxlog    reindexdb
createdb        dropuser        pg_controldata  pg_restore      vacuumdb
createlang      ecpg            pg_ctl          postgres
createuser      initdb          pg_dump         postmaster
dropdb          ipcclean        pg_dumpall      psql
$ ./initdb -D /mnt2/pg/data
The files belonging to this database system will be owned by user "_pgsql".
This user must also own the server process.

The database cluster will be initialized with locale C.

fixing permissions on existing directory /mnt2/pg/data ... ok
creating directory /mnt2/pg/data/global ... ok
creating directory /mnt2/pg/data/pg_xlog ... ok
creating directory /mnt2/pg/data/pg_xlog/archive_status ... ok
creating directory /mnt2/pg/data/pg_clog ... ok
creating directory /mnt2/pg/data/pg_subtrans ... ok
creating directory /mnt2/pg/data/pg_twophase ... ok
creating directory /mnt2/pg/data/pg_multixact/members ... ok
creating directory /mnt2/pg/data/pg_multixact/offsets ... ok
creating directory /mnt2/pg/data/base ... ok
creating directory /mnt2/pg/data/base/1 ... ok
creating directory /mnt2/pg/data/pg_tblspc ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 1000
creating configuration files ... ok
creating template1 database in /mnt2/pg/data/base/1 ... ok
initializing pg_authid ... ok
enabling unlimited row size for system tables ... ok
initializing dependencies ... ok
creating system views ... ok
loading pg_description ... ok
creating conversions ... ok
setting privileges on built-in objects ... ok
creating information schema ... ok
vacuuming database template1 ... ok
copying template1 to template0 ... ok
copying template1 to postgres ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the -A option the
next time you run initdb.

Success. You can now start the database server using:

    ./postmaster -D /mnt2/pg/data
or
    ./pg_ctl -D /mnt2/pg/data -l logfile start

$ ./pg_ctl -D /mnt2/pg/data -l /mnt2/pg/data/logfile start
postmaster starting
$ ./psql postgres
Welcome to psql 8.1.0, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with psql commands
       \g or terminate with semicolon to execute query
       \q to quit

postgres=# create table test (id serial, val integer);
NOTICE:  CREATE TABLE will create implicit sequence "test_id_seq" for serial column "test.id"
CREATE TABLE
postgres=# create unique index testid on test (id);
CREATE INDEX
postgres=# create index testval on test (val);
CREATE INDEX
postgres=# insert into test (val) values (round(random()*1024*1024*1024));
INSERT 0 1
postgres=# vacuum full analyze;
VACUUM
postgres=# select count(1) from test;
 count
-------
     1
(1 row)

postgres=# explain select count(*) from (select distinct on (val) * from test) as foo;
                              QUERY PLAN
----------------------------------------------------------------------
 Aggregate  (cost=1.04..1.05 rows=1 width=0)
   ->  Unique  (cost=1.02..1.03 rows=1 width=8)
         ->  Sort  (cost=1.02..1.02 rows=1 width=8)
               Sort Key: test.val
               ->  Seq Scan on test  (cost=0.00..1.01 rows=1 width=8)
(5 rows)

postgres=# insert into test (val) select round(random()*1024*1024*1024) from test;
INSERT 0 1
postgres=# insert into test (val) select round(random()*1024*1024*1024) from test;
INSERT 0 2
postgres=# insert into test (val) select round(random()*1024*1024*1024) from test;
INSERT 0 4
postgres=# insert into test (val) select round(random()*1024*1024*1024) from test;
INSERT 0 8
postgres=# insert into test (val) select round(random()*1024*1024*1024) from test;
INSERT 0 16
postgres=# insert into test (val) select round(random()*1024*1024*1024) from test;
INSERT 0 32
postgres=# vacuum full analyze;
VACUUM
postgres=# explain select count(*) from (select distinct on (val) * from test) as foo;
                              QUERY PLAN
-----------------------------------------------------------------------
 Aggregate  (cost=4.68..4.69 rows=1 width=0)
   ->  Unique  (cost=3.56..3.88 rows=64 width=8)
         ->  Sort  (cost=3.56..3.72 rows=64 width=8)
               Sort Key: test.val
               ->  Seq Scan on test  (cost=0.00..1.64 rows=64 width=8)
(5 rows)

postgres=# insert into test (val) select round(random()*1024*1024*1024) from test;
INSERT 0 64
postgres=# insert into test (val) select round(random()*1024*1024*1024) from test;
INSERT 0 128
postgres=# insert into test (val) select round(random()*1024*1024*1024) from test;
INSERT 0 256
postgres=# insert into test (val) select round(random()*1024*1024*1024) from test;
INSERT 0 512
postgres=# vacuum full analyze;
VACUUM
postgres=# explain select count(*) from (select distinct on (val) * from test) as foo;
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Aggregate  (cost=55.63..55.64 rows=1 width=0)
   ->  Unique  (cost=0.00..42.82 rows=1024 width=8)
         ->  Index Scan using testval on test  (cost=0.00..40.26 rows=1024 width=8)
(3 rows)

postgres=# select count(1) from test;
 count
-------
  1024
(1 row)

postgres=# set enable_indexscan=off;
SET
postgres=# explain select count(*) from (select distinct on (val) * from test) as foo;
                               QUERY PLAN
--------------------------------------------------------------------------
 Aggregate  (cost=85.36..85.37 rows=1 width=0)
   ->  Unique  (cost=67.44..72.56 rows=1024 width=8)
         ->  Sort  (cost=67.44..70.00 rows=1024 width=8)
               Sort Key: test.val
               ->  Seq Scan on test  (cost=0.00..16.24 rows=1024 width=8)
(5 rows)

postgres=# set enable_indexscan=on;
SET
postgres=# insert into test (val) select round(random()*1024*1024*1024) from test;
INSERT 0 1024
postgres=# vacuum full analyze;
VACUUM
postgres=# explain select count(*) from (select distinct on (val) * from test) as foo;
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Aggregate  (cost=105.25..105.26 rows=1 width=0)
   ->  Unique  (cost=0.00..79.65 rows=2048 width=8)
         ->  Index Scan using testval on test  (cost=0.00..74.53 rows=2048 width=8)
(3 rows)

postgres=#
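The log above grows the table by repeatedly re-inserting it into itself, doubling the row count each time. A trivial check of the arithmetic — starting from one row, reaching the 2,097,152 rows of the original report takes 21 doublings, and the log stops after only 11:

```python
# Count the INSERT ... SELECT doublings needed to go from 1 row to the
# 2,097,152 rows of the original report.
rows, doublings = 1, 0
while rows < 2097152:
    rows *= 2
    doublings += 1
print(rows, doublings)  # 2097152 21
```

That gap (2K rows versus 2M) is exactly the objection raised in the next reply.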
Pailloncy Jean-Gerard <jg@rilk.com> writes:
> I redo the test, with a freshly installed data directory. Same result.

What "same result"?  You only ran it up to 2K rows, not 2M.

In any case, EXPLAIN without ANALYZE is pretty poor ammunition for
complaining that the planner made the wrong choice.  I ran the same test
case, and AFAICS the indexscan is the right choice at 2K rows:

regression=# explain analyze select count(*) from (select distinct on (val) * from test) as foo;
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=105.24..105.25 rows=1 width=0) (actual time=41.561..41.565 rows=1 loops=1)
   ->  Unique  (cost=0.00..79.63 rows=2048 width=8) (actual time=0.059..32.459 rows=2048 loops=1)
         ->  Index Scan using testval on test  (cost=0.00..74.51 rows=2048 width=8) (actual time=0.049..13.197 rows=2048 loops=1)
 Total runtime: 41.683 ms
(4 rows)

regression=# set enable_indexscan TO 0;
SET
regression=# explain analyze select count(*) from (select distinct on (val) * from test) as foo;
                                                       QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=179.96..179.97 rows=1 width=0) (actual time=59.567..59.571 rows=1 loops=1)
   ->  Unique  (cost=144.12..154.36 rows=2048 width=8) (actual time=21.438..50.434 rows=2048 loops=1)
         ->  Sort  (cost=144.12..149.24 rows=2048 width=8) (actual time=21.425..30.589 rows=2048 loops=1)
               Sort Key: test.val
               ->  Seq Scan on test  (cost=0.00..31.48 rows=2048 width=8) (actual time=0.014..9.902 rows=2048 loops=1)
 Total runtime: 60.265 ms
(6 rows)

			regards, tom lane
Tom Lane wrote:
> What "same result"?  You only ran it up to 2K rows, not 2M.  In any
> case, EXPLAIN without ANALYZE is pretty poor ammunition for complaining
> that the planner made the wrong choice.  I ran the same

Hello, sorry to jump in mid-stream, but this reminded me of something.

I have hit cases where I have a query for which there is a somewhat
"obvious" (to a human...) query plan that should make it possible to get
an answer pretty quickly, yet the query "never" finishes (or rather,
after hours of waiting I finally kill it).  I assume this is because of
a sub-optimal query plan.  But it appears that an EXPLAIN ANALYZE runs
the actual query, so it takes as long as the actual query.

In such a case, how can I go about tracking down the issue, up to and
including a complaint about the query planner?  :-)

(Overall, I'm pretty pleased with the PG query planner; it often gets
better results than another, popular commercial DBMS we use here.  That
is just a general impression, not the result of setting up the same
schema in each for a comparison.)

Kyle Cordes
www.kylecordes.com
On Thu, Nov 24, 2005 at 09:15:44PM -0600, Kyle Cordes wrote:
> I have hit cases where I have a query for which there is a somewhat
> "obvious" (to a human...) query plan that should make it possible to get
> a query answer pretty quickly. Yet the query "never" finishes (or
> rather, after hours of waiting I finally kill it). I assume this is
> because of a sub-optimal query plan. But, it appears that an EXPLAIN
> ANALYZE runs the actual query, so it takes as long as the actual query.

In this case, you probably can't do better than EXPLAIN.  Look at the
estimates and find out if the cost is way high somewhere.  If a simple
query estimates a billion disk page fetches, something is probably
wrong, i.e. the planner for some reason overlooked the query plan you
were thinking of.  (Common problems here used to include data type
mismatches leading to less efficient joins, lack of index scans, and
less efficient IN/NOT IN; most of that is fixed, but a few cases still
remain.)

If the query is estimated at a reasonable number of disk page fetches
but still takes forever, look at the number of estimated rows returned.
Do they make sense?  If you run subsets of your query, are they about
right?  If not, you probably want to fiddle with the statistics targets.

/* Steinar */
-- 
Homepage: http://www.sesse.net/
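One mechanical way to apply this advice is to compare the planner's `rows=` estimate with the `rows=` actual on each node of an EXPLAIN ANALYZE. A rough sketch (a simple regex over the plan text, illustrative only; the threshold and the helper name are arbitrary choices):

```python
import re

# Pair each node's estimated row count with its actual row count, as they
# appear in EXPLAIN ANALYZE output.
EST_ACT = re.compile(r"rows=(\d+) width=\d+\) \(actual time=\S+ rows=(\d+)")

def misestimates(plan_text, factor=10.0):
    """Return (estimated, actual) pairs that disagree by more than `factor`."""
    bad = []
    for est_s, act_s in EST_ACT.findall(plan_text):
        est, act = max(int(est_s), 1), max(int(act_s), 1)
        if max(est / act, act / est) > factor:
            bad.append((int(est_s), int(act_s)))
    return bad

# A node from the original report: the estimate matches the actual exactly,
# so nothing is flagged.
plan = ("->  Seq Scan on test  (cost=0.00..23537.52 rows=2097152 width=8) "
        "(actual time=11.550..3262.433 rows=2097152 loops=1)")
print(misestimates(plan))  # []
```

Nodes it flags are the places where raising the statistics target (or fixing a data type mismatch) is most likely to pay off.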
Steinar H. Gunderson wrote:
> On Thu, Nov 24, 2005 at 09:15:44PM -0600, Kyle Cordes wrote:
>> I have hit cases where I have a query for which there is a somewhat
>> "obvious" (to a human...) query plan that should make it possible to get
>> a query answer pretty quickly. Yet the query "never" finishes (or
>> rather, after hours of waiting I finally kill it). I assume this is
>> because of a sub-optimal query plan. But, it appears that an EXPLAIN
>> ANALYZE runs the actual query, so it takes as long as the actual query.
>
> In this case, you probably can't do better than EXPLAIN. Look at the
> estimates, find out if the cost is way high somewhere.

Also, you want to make absolutely sure all the involved tables have been
ANALYZEd recently.

If you have weird cases where there is an obvious query plan and the
optimizer is not using it, by all means submit it so that developers can
take a look at how to improve the optimizer.

-- 
Alvaro Herrera                        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
> What "same result"?  You only ran it up to 2K rows, not 2M.

Sorry, I did run it over and over up to xxx.000 rows, but I did not
include that in the mail.

I did it again: initdb, create table, insert, vacuum full analyze,
explain analyze at each stage.  And there was no problem.

So I made a copy of the offending data directory and tried again.  And I
got IndexScan only.  I will get a headache ;-)

Too big to be sent by mail: http://rilk.com/pg81.html

Cordialement,
Jean-Gérard Pailloncy
Hi,

After a few tests, the difference is explained by the
effective_cache_size parameter.

With effective_cache_size=1000 (the default), the planner chooses the
following plan:

postgres=# explain select count(*) from (select distinct on (val) * from test) as foo;
                                      QUERY PLAN
--------------------------------------------------------------------------------
 Aggregate  (cost=421893.64..421893.65 rows=1 width=0)
   ->  Unique  (cost=385193.48..395679.24 rows=2097152 width=8)
         ->  Sort  (cost=385193.48..390436.36 rows=2097152 width=8)
               Sort Key: test.val
               ->  Seq Scan on test  (cost=0.00..31252.52 rows=2097152 width=8)
(5 rows)

With effective_cache_size=15000, the planner chooses the following plan:

postgres=# explain select count(*) from (select distinct on (val) * from test) as foo;
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Aggregate  (cost=101720.39..101720.40 rows=1 width=0)
   ->  Unique  (cost=0.00..75505.99 rows=2097152 width=8)
         ->  Index Scan using testval on test  (cost=0.00..70263.11 rows=2097152 width=8)
(3 rows)

I tested some other values for effective_cache_size.  The switch from
seq scan to index scan happens between 9900 and 10000.

My SQL server is an OpenBSD 3.8 box with 1 GB of RAM and nothing else
running on it.  I set cachepercent to 25, so I expect 25% of 1 GB
(256 MB) as file cache.  effective_cache_size=15000 means 15000 x 8 KB
of OS cache = 120,000 KB, which is lower than my 256 MB of disk cache.

Recalling the results of my previous test:

#rows      2097152
IndexScan  1363396.581 s
SeqScan      98758.445 s
Ratio           13.805

So with effective_cache_size=15000 the planner chooses a plan that is 13
times slower than the seqscan one.  I do not understand where the
problem comes from.  Any help welcome.

Cordialement,
Jean-Gérard Pailloncy
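For what it's worth, effective_cache_size feeds the planner's estimate of how many heap pages an index scan will actually have to fetch, via a Mackert–Lohman-style approximation in costsize.c. A rough Python rendering of that idea (treat the exact formula as an assumption recalled from the source, not gospel):

```python
# Sketch of a Mackert-Lohman-style estimate: T table pages, D tuples
# fetched via the index, b = effective_cache_size in pages.  A larger
# cache estimate means more repeat fetches are assumed to hit cache,
# so fewer distinct page reads are charged to the index scan.
def pages_fetched(tuples_fetched, table_pages, effective_cache_size):
    T = float(table_pages)
    D = float(tuples_fetched)
    b = float(effective_cache_size)
    if T <= b:
        return min(2.0 * T * D / (2.0 * T + D), T)
    lim = 2.0 * T * b / (2.0 * T - b)
    if D <= lim:
        return 2.0 * T * D / (2.0 * T + D)
    return b + (D - lim) * (T - b) / T

# With the table at ~23537 pages and 2M tuples fetched, raising the cache
# estimate from 1000 to 15000 pages sharply cuts the predicted fetches,
# which is why the plan flips from seq scan to index scan:
print(pages_fetched(2097152, 23537, 1000) > pages_fetched(2097152, 23537, 15000))  # True
```

The flip is a pure cost-model artifact: the estimated page count drops, but the actual random I/O on a cold cache does not, which matches the 13x slowdown observed above.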