pg 9.3 exists subselect with limit brakes query plan - Mailing list pgsql-bugs
From | Kószó József |
---|---|
Subject | pg 9.3 exists subselect with limit brakes query plan |
Date | |
Msg-id | 53279529.2070902@freemail.hu Whole thread Raw |
Responses |
Re: pg 9.3 exists subselect with limit brakes query plan
(Tom Lane <tgl@sss.pgh.pa.us>)
|
List | pgsql-bugs |
Hello, We plan to upgrade from PostgreSQL 8.3 to 9.3 and experience some planner-related problems in our applications with subselects using EXISTS and LIMIT keywords. We have two similar environment with these postgres versions, in 8.3 does not matter, if the subselect using LIMIT or not, but in 9.3 the LIMIT keyword causes odd behavior, enforces seq scans. In 8.3 always get similar plans to both case, but in 9.3 the LIMIT keyword slows down the query. Additionally without index to subselect id field the plan and running time worse with one magnitude in 9.3 than 8.3. The two test case: SELECT ... WHERE EXISTS (SELECT ...); SELECT ... WHERE EXISTS (SELECT ... LIMIT 1); We try to make some application-independent, reproducible tests to show the difference between versions: ===== Initializations ===== # createdb limit1test -U postgres # pgbench -i -s 10 limit1test -U postgres limit1test=# select count(*) from pgbench_branches; count ------- 10 limit1test=# select count(*) from pgbench_accounts; count --------- 1000000 limit1test=# select B.bid, count(*) from pgbench_branches B join pgbench_accounts using (bid) group by 1 order by 1; bid | count -----+-------- 1 | 100000 2 | 100000 3 | 100000 4 | 100000 5 | 100000 6 | 100000 7 | 100000 8 | 100000 9 | 100000 10 | 100000 ===== PostgreSQL 9.3 ===== # explain analyze select bid from pgbench_branches B where exists (select bid from pgbench_accounts A where A.bid=B.bid); --------------------------------------------------------------------- Nested Loop Semi Join (cost=0.00..26402.35 rows=10width=4) (actual time=0.012..4972.868 rows=10 loops=1) Join Filter: (b.bid = a.bid) Rows Removed by Join Filter: 4500000 -> Seq Scanon pgbench_branches b (cost=0.00..1.10 rows=10 width=4) (actual time=0.003..0.013 rows=10 loops=1) -> Seq Scan on pgbench_accounts a (cost=0.00..26394.00 rows=1000000 width=4) (actual time=0.002..242.321 rows=450001 loops=10) Total runtime: 4972.904 ms limit1test=# explain analyze select bid from pgbench_branches B where exists (select bid from pgbench_accounts A where A.bid=B.bid limit 1); --------------------------------------------------------------------- Seq Scan on pgbench_branches b (cost=0.00..3.99 rows=5width=4) (actual time=0.018..382.774 rows=10 loops=1) Filter: (SubPlan 1) SubPlan 1 -> Limit (cost=0.00..0.29 rows=1 width=4)(actual time=38.269..38.269 rows=1 loops=10) -> Seq Scan on pgbench_accounts a (cost=0.00..28894.00 rows=100000 width=4) (actual time=38.261..38.261 rows=1 loops=10) Filter: (bid = b.bid) RowsRemoved by Filter: 450000 Total runtime: 382.816 ms # create index pgbench_accounts_bid on pgbench_accounts(bid); # explain analyze select bid from pgbench_branches B where exists (select bid from pgbench_accounts A where A.bid=B.bid); -------------------------------------------------------------------- Nested Loop Semi Join (cost=0.42..2739.24 rows=10 width=4)(actual time=0.030..0.131 rows=10 loops=1) -> Seq Scan on pgbench_branches b (cost=0.00..1.10 rows=10 width=4) (actual time=0.003..0.011 rows=10 loops=1) -> Index Only Scan using pgbench_accounts_bid on pgbench_accountsa (cost=0.42..2733.62 rows=100000 width=4) (actual time=0.010..0.010 rows=1 loops=10) Index Cond: (bid = b.bid) Heap Fetches: 0 Total runtime: 0.169 ms # explain analyze select bid from pgbench_branches B where exists (select bid from pgbench_accounts A where A.bid=B.bid limit 1); -------------------------------------------------------------------- Seq Scan on pgbench_branches b (cost=0.00..3.99 rows=5width=4) (actual time=0.016..384.974 rows=10 loops=1) Filter: (SubPlan 1) SubPlan 1 -> Limit (cost=0.00..0.29 rows=1 width=4)(actual time=38.490..38.490 rows=1 loops=10) -> Seq Scan on pgbench_accounts a (cost=0.00..28894.00 rows=100000 width=4) (actual time=38.482..38.482 rows=1 loops=10) Filter: (bid = b.bid) RowsRemoved by Filter: 450000 Total runtime: 385.015 ms # explain analyze select bid from pgbench_branches B where exists (select bid from pgbench_accounts A where A.bid=B.bid*100); -------------------------------------------------------------------- Nested Loop Semi Join (cost=0.43..2989.31 rows=10 width=4)(actual time=0.055..0.055 rows=0 loops=1) -> Seq Scan on pgbench_branches b (cost=0.00..1.10 rows=10 width=4) (actual time=0.004..0.009 rows=10 loops=1) -> Index Only Scan using pgbench_accounts_bid on pgbench_accountsa (cost=0.43..2983.63 rows=100000 width=4) (actual time=0.002..0.002 rows=0 loops=10) Index Cond: (bid = (b.bid * 100)) Heap Fetches: 0 Total runtime: 0.087 ms # explain analyze select bid from pgbench_branches B where exists (select bid from pgbench_accounts A where A.bid=B.bid*100 limit 1); ------------------------------------------------------------------- Seq Scan on pgbench_branches b (cost=0.00..4.24 rows=5width=4) (actual time=1019.139..1019.139 rows=0 loops=1) Filter: (SubPlan 1) Rows Removed by Filter: 10 SubPlan 1 -> Limit (cost=0.00..0.31 rows=1 width=4) (actual time=101.907..101.907 rows=0 loops=10) -> Seq Scan on pgbench_accounts a (cost=0.00..31394.00 rows=100000 width=4) (actual time=101.898..101.898 rows=0 loops=10) Filter: (bid = (b.bid * 100)) Rows Removed by Filter: 1000000 Total runtime: 1019.166 ms # explain analyze select bid from pgbench_branches B where exists (select bid from pgbench_accounts A where A.bid=B.bid offset 1); ------------------------------------------------------------------ Seq Scan on pgbench_branches b (cost=0.00..6.06 rows=5width=4) (actual time=0.042..0.166 rows=10 loops=1) Filter: (SubPlan 1) SubPlan 1 -> Limit (cost=0.46..3542.39 rows=99997width=4) (actual time=0.014..0.014 rows=1 loops=10) -> Index Only Scan using pgbench_accounts_bid on pgbench_accounts a (cost=0.42..3542.39 rows=99998 width=4) (actual time=0.011..0.012 rows=2 loops=10) Index Cond: (bid = b.bid) Heap Fetches: 0 Total runtime:0.200 ms ===== PostgreSQL 8.3 ===== # explain analyze select bid from branches B where exists (select bid from accounts A where A.bid=B.bid); --------------------------------------------------------------------- Seq Scan on branches b (cost=0.00..3.99 rows=5 width=4)(actual time=0.020..436.306 rows=10 loops=1) Filter: (subplan) SubPlan -> Seq Scan on accounts a (cost=0.00..28894.00 rows=100000 width=4) (actual time=43.618..43.618 rows=1 loops=10) Filter: (bid = $0) Total runtime: 436.348 ms # explain analyze select bid from branches B where exists (select bid from accounts A where A.bid=B.bid limit 1); ---------------------------------------------------------------------- Seq Scan on branches b (cost=0.00..3.99 rows=5 width=4)(actual time=0.019..442.392 rows=10 loops=1) Filter: (subplan) SubPlan -> Limit (cost=0.00..0.29 rows=1 width=4) (actual time=44.231..44.231 rows=1 loops=10) -> Seq Scan on accounts a (cost=0.00..28894.00 rows=100000 width=4) (actual time=44.223..44.223 rows=1 loops=10) Filter: (bid = $0) Total runtime: 442.440 ms # create index accounts_bid on accounts(bid); # explain analyze select bid from branches B where exists (select bid from accounts A where A.bid=B.bid); ---------------------------------------------------------------------- Seq Scan on branches b (cost=0.00..1.55 rows=5 width=4)(actual time=0.049..0.211 rows=10 loops=1) Filter: (subplan) SubPlan -> Index Scan using accounts_bid on accounts a (cost=0.00..4493.36 rows=100000 width=4) (actual time=0.017..0.017 rows=1 loops=10) Index Cond: (bid = $0) Total runtime: 0.258 ms # explain analyze select bid from branches B where exists (select bid from accounts A where A.bid=B.bid limit 1); --------------------------------------------------------------------- Seq Scan on branches b (cost=0.00..1.55 rows=5 width=4)(actual time=0.059..0.180 rows=10 loops=1) Filter: (subplan) SubPlan -> Limit (cost=0.00..0.04 rows=1 width=4) (actual time=0.015..0.015 rows=1 loops=10) -> Index Scan using accounts_bid on accounts a (cost=0.00..4493.36 rows=100000 width=4) (actual time=0.014..0.014 rows=1 loops=10) Index Cond: (bid = $0) Total runtime: 0.211 ms # explain analyze select bid from branches B where exists (select bid from accounts A where A.bid=B.bid*100); --------------------------------------------------------------------- Seq Scan on branches b (cost=0.00..1.57 rows=5 width=4)(actual time=0.050..0.050 rows=0 loops=1) Filter: (subplan) SubPlan -> Index Scan using accounts_bid on accounts a (cost=0.00..4493.36 rows=100000 width=4) (actual time=0.003..0.003 rows=0 loops=10) Index Cond: (bid = ($0 * 100)) Total runtime: 0.082 ms # explain analyze select bid from branches B where exists (select bid from accounts A where A.bid=B.bid*100 limit 1); --------------------------------------------------------------------- Seq Scan on branches b (cost=0.00..1.57 rows=5 width=4)(actual time=0.066..0.066 rows=0 loops=1) Filter: (subplan) SubPlan -> Limit (cost=0.00..0.05 rows=1 width=4) (actual time=0.005..0.005 rows=0 loops=10) -> Index Scan using accounts_bid on accounts a (cost=0.00..4493.36 rows=100000 width=4) (actual time=0.003..0.003 rows=0 loops=10) Index Cond: (bid = ($0 * 100)) Total runtime: 0.095 ms # explain analyze select bid from branches B where exists (select bid from accounts A where A.bid=B.bid offset 1); ---------------------------------------------------------------------- Seq Scan on branches b (cost=0.00..2.00 rows=5 width=4)(actual time=0.038..0.181 rows=10 loops=1) Filter: (subplan) SubPlan -> Limit (cost=0.04..4493.36 rows=99999 width=4) (actual time=0.016..0.016 rows=1 loops=10) -> Index Scan using accounts_bid on accounts a (cost=0.00..4493.36 rows=100000 width=4) (actual time=0.013..0.014 rows=2 loops=10) Index Cond: (bid = $0) Total runtime: 0.213 ms ===== Postgres settings ===== PostgreSQL 9.3: version ---------------------------------------------------------------------- PostgreSQL 9.3.3 on x86_64-iso-8859-1-linux-gnu, compiledby gcc (Debian 4.7.2-5) 4.7.2, 64-bit work_mem ---------- 128MB shared_buffers ---------------- 1536MB effective_cache_size ---------------------- 3GB default_statistics_target --------------------------- 100 PostgreSQL 8.3: version ----------------------------------------------------------------------- PostgreSQL 8.3.14 on x86_64-pc-linux-gnu, compiledby GCC gcc-4.3.real (Debian 4.3.2-1.1) 4.3.2 work_mem ---------- 128MB shared_buffers ---------------- 1GB effective_cache_size ---------------------- 2GB default_statistics_target --------------------------- 100 *_cost settings are all the same (default values), and vacuum, analyze or reindex does not make any difference. Any suggestions, what is causes this difference, and how to resolve it globally? Or are these examples are too special/simple, and seqscan plan wins in 9.3? # explain analyze select bid from pgbench_branches B where exists (select bid from pgbench_accounts A where A.bid=B.bid limit 1); ------------------------------------------------------------------- Seq Scan on pgbench_branches b (cost=0.00..3.99 rows=5width=4) (actual time=0.028..379.426 rows=10 loops=1) Filter: (SubPlan 1) SubPlan 1 -> Limit (cost=0.00..0.29 rows=1 width=4)(actual time=37.934..37.934 rows=1 loops=10) -> Seq Scan on pgbench_accounts a (cost=0.00..28894.00 rows=100000 width=4) (actual time=37.927..37.927 rows=1 loops=10) Filter: (bid = b.bid) RowsRemoved by Filter: 450000 Total runtime: 379.464 ms # set enable_seqscan=off; # explain analyze select bid from pgbench_branches B where exists (select bid from pgbench_accounts A where A.bid=B.bid limit 1); ------------------------------------------------------------------- Index Only Scan using pgbench_branches_pkey on pgbench_branchesb (cost=0.14..12.82 rows=5 width=4) (actual time=0.039..0.145 rows=10 loops=1) Filter: (SubPlan 1) Heap Fetches: 0 SubPlan1 -> Limit (cost=0.42..0.45 rows=1 width=4) (actual time=0.012..0.012 rows=1 loops=10) -> Index Only Scan using pgbench_accounts_bid on pgbench_accounts a (cost=0.42..2850.43 rows=100000 width=4) (actual time=0.011..0.011 rows=1 loops=10) Index Cond: (bid = b.bid) Heap Fetches: 0 Total runtime:0.178 ms Thanks, József Kószó
pgsql-bugs by date: