Thread: Planner chooses slow index heap scan despite accurate row estimates
I am trying to insert rows that don't already exist from a temp table into
another table. I am using a LEFT JOIN on all the columns and checking for
nulls in the base table to know which rows to insert. The problem is that
the planner chooses a nested loop plan, which is much slower (~40x) than
the hash join. What's interesting is that all the row estimates appear to
be fairly accurate. I'm wondering if it has something to do with the GIN
indexes on bigint_array_1 and bigint_array_2. Perhaps it misestimates the
cost of each index scan?

Postgres 9.3.10 on 2.6.32-573.18.1.el6.x86_64 GNU/Linux

- base_table has been VACUUM ANALYZEd
- base_table has GIN indexes on bigint_array_1 and bigint_array_2
- base_table has a btree index on id
- base_table is 700k rows
- temp_table is 4k rows
- the bigint arrays are of type bigint[] and contain 0 to 5 elements, with a median of 1 element
- the time difference between the nested loop and the hash join is not a caching effect; I can reproduce it in either order

test_db=# BEGIN;
BEGIN
test_db=# EXPLAIN (ANALYZE, BUFFERS) INSERT INTO base_table (
    bigint_array_1, bigint_array_2, id
) (
    SELECT s.bigint_array_1, s.bigint_array_2, s.id
    FROM temp_rows_to_insert s
    LEFT JOIN base_table t
           ON s.bigint_array_1 = t.bigint_array_1
          AND s.bigint_array_2 = t.bigint_array_2
          AND s.id = t.id
    WHERE s.bigint_array_1 IS NOT NULL AND t.bigint_array_1 IS NULL
      AND s.bigint_array_2 IS NOT NULL AND t.bigint_array_2 IS NULL
      AND s.id IS NOT NULL AND t.id IS NULL
);
                                                                    QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
 Insert on base_table  (cost=2.97..67498.30 rows=8412 width=72) (actual time=40463.771..40463.771 rows=0 loops=1)
   Buffers: shared hit=2373347 read=129 dirtied=129, local hit=122
   ->  Nested Loop Anti Join  (cost=2.97..67498.30 rows=8412 width=72) (actual time=13.694..40410.273 rows=4389 loops=1)
         Buffers: shared hit=2338092, local hit=122
         ->  Seq Scan on temp_rows_to_insert s  (cost=0.00..219.60 rows=9614 width=72) (actual time=0.607..4.746 rows=4389 loops=1)
               Filter: ((bigint_array_1 IS NOT NULL) AND (bigint_array_2 IS NOT NULL) AND (id IS NOT NULL))
               Buffers: local hit=122
         ->  Bitmap Heap Scan on base_table t  (cost=2.97..6.98 rows=1 width=74) (actual time=9.201..9.201 rows=0 loops=4389)
               Recheck Cond: ((s.bigint_array_2 = bigint_array_2) AND (s.bigint_array_1 = bigint_array_1))
               Filter: (s.id = id)
               Buffers: shared hit=2333695
               ->  BitmapAnd  (cost=2.97..2.97 rows=1 width=0) (actual time=9.199..9.199 rows=0 loops=4389)
                     Buffers: shared hit=2333638
                     ->  Bitmap Index Scan on base_table_bigint_array_2_idx  (cost=0.00..1.04 rows=3 width=0) (actual time=2.582..2.582 rows=290 loops=4389)
                           Index Cond: (s.bigint_array_2 = bigint_array_2)
                           Buffers: shared hit=738261
                     ->  Bitmap Index Scan on base_table_bigint_array_1_idx  (cost=0.00..1.68 rows=3 width=0) (actual time=6.608..6.608 rows=2 loops=4389)
                           Index Cond: (s.bigint_array_1 = bigint_array_1)
                           Buffers: shared hit=1595377
 Total runtime: 40463.879 ms
(20 rows)

test_db=# rollback;
ROLLBACK
test_db=# BEGIN;
BEGIN
test_db=# SET enable_nestloop = false;
SET
test_db=# EXPLAIN (ANALYZE, BUFFERS) INSERT INTO base_table (
    bigint_array_1, bigint_array_2, id
) (
    SELECT s.bigint_array_1, s.bigint_array_2, s.id
    FROM temp_rows_to_insert s
    LEFT JOIN base_table t
           ON s.bigint_array_1 = t.bigint_array_1
          AND s.bigint_array_2 = t.bigint_array_2
          AND s.id = t.id
    WHERE s.bigint_array_1 IS NOT NULL AND t.bigint_array_1 IS NULL
      AND s.bigint_array_2 IS NOT NULL AND t.bigint_array_2 IS NULL
      AND s.id IS NOT NULL AND t.id IS NULL
);
                                                                    QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
 Insert on base_table  (cost=31711.89..71625.39 rows=8412 width=72) (actual time=1838.650..1838.650 rows=0 loops=1)
   Buffers: shared hit=50013 read=123 dirtied=410, local hit=122
   ->  Hash Anti Join  (cost=31711.89..71625.39 rows=8412 width=72) (actual time=1798.774..1812.872 rows=4389 loops=1)
         Hash Cond: ((s.bigint_array_1 = t.bigint_array_1) AND (s.bigint_array_2 = t.bigint_array_2) AND (s.id = t.id))
         Buffers: shared hit=14761 dirtied=287, local hit=122
         ->  Seq Scan on temp_rows_to_insert s  (cost=0.00..219.60 rows=9614 width=72) (actual time=0.046..3.033 rows=4389 loops=1)
               Filter: ((bigint_array_1 IS NOT NULL) AND (bigint_array_2 IS NOT NULL) AND (id IS NOT NULL))
               Buffers: local hit=122
         ->  Hash  (cost=18131.96..18131.96 rows=775996 width=74) (actual time=1798.528..1798.528 rows=768415 loops=1)
               Buckets: 131072  Batches: 1  Memory Usage: 84486kB
               Buffers: shared hit=10372 dirtied=287
               ->  Seq Scan on base_table t  (cost=0.00..18131.96 rows=775996 width=74) (actual time=0.007..490.851 rows=768415 loops=1)
                     Buffers: shared hit=10372 dirtied=287
 Total runtime: 1843.336 ms
(14 rows)

test_db=# rollback;
ROLLBACK

Thanks,
Jake
Jake Magner <jakemagner90@gmail.com> writes:
> I am trying to insert rows that don't already exist from a temp table into
> another table. I am using a LEFT JOIN on all the columns and checking for
> nulls in the base table to know which rows to insert. The problem is that
> the planner chooses a nested loop plan, which is much slower (~40x) than
> the hash join. What's interesting is that all the row estimates appear to
> be fairly accurate. I'm wondering if it has something to do with the GIN
> indexes on bigint_array_1 and bigint_array_2. Perhaps it misestimates the
> cost of each index scan?

I'm curious about what happens to the runtime if you repeatedly roll back
the INSERT and do it over (without vacuuming in between).

What I'm thinking is that as the INSERT proceeds, it'd be making entries
in those GIN indexes' pending-item lists, which the bitmap indexscan would
have to scan through, since it's examining the same table you're inserting
into. The pending-item list is unsorted, so it's relatively expensive to
scan. Since you're doing a fresh bitmap indexscan after each row you
insert from temp_rows_to_insert, the total cost of dealing with the
pending-item list would grow as O(N^2); so even though the cost per
pending item is not that large, N=4000 might be enough to hurt.

If this theory is correct, a second attempt to do the INSERT without
having flushed the pending list would be even more expensive; whereas if
I'm wrong and that cost is negligible, the time wouldn't change much.

The hashjoin approach avoids this problem by not using the index (and even
if it did, the index would be scanned only once, before any insertions
happen). The planner unfortunately has no idea about this interaction.

If this diagnosis is correct, there are a couple of ways you could get
around the problem (sketched below):

* Disable use of the pending list by turning off "fastupdate" for these
  indexes.

* Construct the set of rows to be inserted in a separate command, put them
  into a second temp table, and then insert into the main table.

The second choice is probably preferable; doing bulk GIN inserts without
fastupdate is kind of expensive itself.

			regards, tom lane
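In SQL terms, a minimal (untested) sketch of those two workarounds, using
the index names shown in the plan above and a placeholder name
(temp_rows_filtered) for the second temp table:

-- Workaround 1: disable the GIN pending lists, so new entries go straight
-- into the main index structure instead of the unsorted pending list.
ALTER INDEX base_table_bigint_array_1_idx SET (fastupdate = off);
ALTER INDEX base_table_bigint_array_2_idx SET (fastupdate = off);

-- Workaround 2: materialize the anti-join result before touching
-- base_table, so no index scan ever runs against in-progress insertions.
CREATE TEMP TABLE temp_rows_filtered AS
SELECT s.bigint_array_1, s.bigint_array_2, s.id
FROM temp_rows_to_insert s
LEFT JOIN base_table t
       ON s.bigint_array_1 = t.bigint_array_1
      AND s.bigint_array_2 = t.bigint_array_2
      AND s.id = t.id
WHERE s.bigint_array_1 IS NOT NULL AND t.bigint_array_1 IS NULL
  AND s.bigint_array_2 IS NOT NULL AND t.bigint_array_2 IS NULL
  AND s.id IS NOT NULL AND t.id IS NULL;

INSERT INTO base_table (bigint_array_1, bigint_array_2, id)
SELECT bigint_array_1, bigint_array_2, id FROM temp_rows_filtered;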
I tried without doing an INSERT at all, just running the SELECT queries,
and the result is the same: the nested loop is chosen but is much slower.
Jake Magner <jakemagner90@gmail.com> writes:
> I tried without doing an INSERT at all, just running the SELECT queries,
> and the result is the same: the nested loop is chosen but is much slower.

FWIW, I just noticed that the comparisons you're using are plain equality
of the arrays. While a GIN array index supports that, it's not exactly its
strong suit: the sort of questions that index type supports well are more
like "which arrays contain value X?". I wonder if it'd be worth creating
btree indexes on the array columns.

			regards, tom lane
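A sketch of that suggestion, with hypothetical index names; arrays have a
default btree operator class, so whole-array equality can be served by a
plain btree index:

-- per-column btree indexes (names are placeholders)
CREATE INDEX base_table_bigint_array_1_btree
    ON base_table USING btree (bigint_array_1);
CREATE INDEX base_table_bigint_array_2_btree
    ON base_table USING btree (bigint_array_2);

-- or, since the join compares all three columns, one composite index
-- covering the whole key is another option:
CREATE INDEX base_table_join_key_btree
    ON base_table (bigint_array_1, bigint_array_2, id);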
Tom Lane-2 wrote:
> Jake Magner <jakemagner90@gmail.com> writes:
>> I tried without doing an INSERT at all, just running the SELECT queries,
>> and the result is the same: the nested loop is chosen but is much slower.
>
> FWIW, I just noticed that the comparisons you're using are plain equality
> of the arrays. While a GIN array index supports that, it's not exactly its
> strong suit: the sort of questions that index type supports well are more
> like "which arrays contain value X?". I wonder if it'd be worth creating
> btree indexes on the array columns.

I added btree indexes, and now the nested loop uses those and is a bit
faster than the hash join. So the planner just misestimates the cost of
doing the equality comparisons? I'd prefer not to add more indexes; the
hash join performance is fast enough if the planner would just choose it,
but I'm reluctant to turn off nested loops in case the table gets a lot
bigger.
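For what it's worth, disabling nested loops does not have to be a global
or session-wide setting; a sketch of scoping it to the one statement:

BEGIN;
SET LOCAL enable_nestloop = off;  -- reverts automatically at COMMIT or ROLLBACK
-- the anti-join INSERT from the first message goes here
COMMIT;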
On Sat, May 28, 2016 at 5:38 PM, Jake Magner <jakemagner90@gmail.com> wrote:
> Tom Lane-2 wrote:
>> Jake Magner <jakemagner90@gmail.com> writes:
>>> I tried without doing an INSERT at all, just running the SELECT queries,
>>> and the result is the same: the nested loop is chosen but is much slower.
>>
>> FWIW, I just noticed that the comparisons you're using are plain equality
>> of the arrays. While a GIN array index supports that, it's not exactly its
>> strong suit: the sort of questions that index type supports well are more
>> like "which arrays contain value X?". I wonder if it'd be worth creating
>> btree indexes on the array columns.
>
> I added btree indexes, and now the nested loop uses those and is a bit
> faster than the hash join. So the planner just misestimates the cost of
> doing the equality comparisons?

I wonder how it would do in 9.4? Either in them actually being faster, or
in the planner doing a better job of realizing they won't be fast.

> I'd prefer not to add more indexes; the hash join performance is fast
> enough if the planner would just choose it, but I'm reluctant to turn off
> nested loops in case the table gets a lot bigger.

A large hash join just needs to divide the work into batches. It should
still be faster than the nested loop (as currently implemented) until you
run out of temp space.

But you already have a solution in hand. I agree you shouldn't add more
indexes without reason, but you do have a reason.

Cheers,

Jeff
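To make the batching point concrete: the Hash node in the plan above fit
in a single batch at about 84 MB, and the amount of memory a hash node may
use before spilling into multiple batches is governed by work_mem. A
sketch, with an illustrative value:

SHOW work_mem;          -- the hash above stayed in one batch at ~84 MB
SET work_mem = '16MB';  -- under a smaller limit, the same hash would be
                        -- split across multiple batches on temp files
-- re-run the EXPLAIN (ANALYZE, BUFFERS) query from the first message and
-- look for "Batches: N" on the Hash node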