Another unexpected planner choice in simple JOIN - Mailing list pgsql-performance
| From | Mark Kirkwood |
|---|---|
| Subject | Another unexpected planner choice in simple JOIN |
| Date | |
| Msg-id | 5874da50-c9c3-4e66-907a-a5b564eabd85@gmail.com Whole thread Raw |
| Responses |
Re: Another unexpected planner choice in simple JOIN
|
| List | pgsql-performance |
I have encountered another one of these. However managed to work out what the issue was myself this time! However I figured it might be interesting/useful for folk generally. Same schema as before, with a few parameter changes to 16devel: $ cat postgresql.auto.conf # Do not edit this file manually! # It will be overwritten by the ALTER SYSTEM command. parallel_setup_cost = '30000' shared_buffers = '2GB' max_wal_size = '4GB' I don't think any of these actually effect this new query: test0=# EXPLAIN ANALYZE SELECT t0.id0, t1.val FROM tab0 AS t0 JOIN tab1 AS t1 ON (t0.id0 = t1.id0) WHERE t0.id0 < 1000; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------- Gather (cost=30041.99..2379226.30 rows=990000 width=98) (actual time=0.512..6261.801 rows=1000000.00 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=18285 read=1705870 -> Hash Join (cost=41.99..2250226.30 rows=412500 width=98) (actual time=158.814..4363.611 rows=333333.33 loops=3) Hash Cond: (t1.id0 = t0.id0) Buffers: shared hit=18285 read=1705870 -> Parallel Seq Scan on tab1 t1 (cost=0.00..2140804.67 rows=41666667 width=98) (actual time=0.055..1830.481 rows=33333333.33 loops=3) Buffers: shared hit=18268 read=1705870 -> Hash (cost=29.62..29.62 rows=990 width=4) (actual time=0.234..0.235 rows=1000.00 loops=3) Buckets: 1024 Batches: 1 Memory Usage: 44kB Buffers: shared hit=17 -> Index Only Scan using tab0_pkey on tab0 t0 (cost=0.29..29.62 rows=990 width=4) (actual time=0.029..0.119 rows=1000.00 loops=3) Index Cond: (id0 < 1000) Heap Fetches: 0 Index Searches: 3 Buffers: shared hit=17 Planning: Buffers: shared hit=222 Planning Time: 1.031 ms Execution Time: 6293.675 ms (21 rows) This seems like a pretty horrible plan....I'm wondering why it is not eliminating the vast majority of rows from tab0 1st in some way. After some head scratching I wondered if *in this case* I really needed to tell the planner I was on NVMe: test0=# SET random_page_cost=1; SET test0=# EXPLAIN ANALYZE SELECT t0.id0, t1.val FROM tab0 AS t0 JOIN tab1 AS t1 ON (t0.id0 = t1.id0) WHERE t0.id0 < 1000; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------- Gather (cost=30000.00..1668544.54 rows=990000 width=98) (actual time=0.198..163.617 rows=1000000.00 loops=1) Workers Planned: 1 Workers Launched: 1 Buffers: shared hit=32141 -> Nested Loop (cost=0.00..1539544.54 rows=582353 width=98) (actual time=0.027..110.626 rows=500000.00 loops=2) Buffers: shared hit=32141 -> Parallel Seq Scan on tab0 t0 (cost=0.00..2375.29 rows=582 width=4) (actual time=0.005..2.689 rows=500.00 loops=2) Filter: (id0 < 1000) Rows Removed by Filter: 49500 Buffers: shared hit=1640 -> Index Scan using tab1_id0_hash on tab1 t1 (cost=0.00..2608.85 rows=3233 width=98) (actual time=0.003..0.153 rows=1000.00 loops=1000) Index Cond: (id0 = t0.id0) Index Searches: 1000 Buffers: shared hit=30501 Planning: Buffers: shared hit=9 Planning Time: 0.189 ms Execution Time: 191.870 ms (18 rows) That is more like it! So ahem, note to self: remember to tell the planner you are using storage where random access is just (or nearly) as fast as sequential. Now I can get the planner to use the index on tab0 by making parallel operation more expensive: test0=# SET parallel_setup_cost=1000000; SET test0=# EXPLAIN ANALYZE SELECT t0.id0, t1.val FROM tab0 AS t0 JOIN tab1 AS t1 ON (t0.id0 = t1.id0) WHERE t0.id0 < 1000; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.29..2614793.04 rows=990000 width=98) (actual time=0.027..205.270 rows=1000000.00 loops=1) Buffers: shared hit=30505 -> Index Only Scan using tab0_pkey on tab0 t0 (cost=0.29..20.62 rows=990 width=4) (actual time=0.014..0.147 rows=1000.00 loops=1) Index Cond: (id0 < 1000) Heap Fetches: 0 Index Searches: 1 Buffers: shared hit=5 -> Index Scan using tab1_id0_hash on tab1 t1 (cost=0.00..2608.85 rows=3233 width=98) (actual time=0.003..0.143 rows=1000.00 loops=1000) Index Cond: (id0 = t0.id0) Index Searches: 1000 Buffers: shared hit=30500 Planning: Buffers: shared hit=9 Planning Time: 0.182 ms Execution Time: 232.360 ms (15 rows) But the parallel plan is better! I guess one question to consider is: is the slightly faster parallel plan way more expensive in terms of resources? It seems likely that it is, so if you had a system that needed to run many instances of this type of query would you be better off discouraging parallel execution of it? Anyway, I found this an interesting exercise, hopefully others might too. regards Mark
pgsql-performance by date: