Thread: Planner chooses slow index heap scan despite accurate row estimates
I am trying to insert rows that don't already exist from a temp table into
another table. I am using a LEFT JOIN on all the columns and checking for
nulls in the base table to know which rows to insert. The problem is that
the planner chooses a nested loop plan, which is much slower (~40x) than
the hash join. What's interesting is that all the row estimates appear to
be fairly accurate. I'm wondering if it has something to do with the GIN
indexes on bigint_array_1 and bigint_array_2. Perhaps it misestimates the
cost of each index scan?

Postgres 9.3.10 on 2.6.32-573.18.1.el6.x86_64 GNU/Linux

- base_table has been VACUUM ANALYZEd
- base_table has GIN indexes on bigint_array_1 and bigint_array_2
- base_table has a btree index on id
- base_table is 700k rows
- temp_table is 4k rows
- the bigint arrays are of type bigint[] and contain 0 to 5 elements, with a median of 1 element
- the time difference between the nested loop and the hash join is not a caching effect; I can reproduce it in either order

test_db=# BEGIN;
BEGIN
test_db=# EXPLAIN (ANALYZE, BUFFERS) INSERT INTO base_table (
    bigint_array_1, bigint_array_2, id
) (
    SELECT s.bigint_array_1, s.bigint_array_2, s.id
    FROM temp_rows_to_insert s
    LEFT JOIN base_table t
           ON s.bigint_array_1 = t.bigint_array_1
          AND s.bigint_array_2 = t.bigint_array_2
          AND s.id = t.id
    WHERE s.bigint_array_1 IS NOT NULL AND t.bigint_array_1 IS NULL
      AND s.bigint_array_2 IS NOT NULL AND t.bigint_array_2 IS NULL
      AND s.id IS NOT NULL AND t.id IS NULL
);
                                                                    QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
 Insert on base_table  (cost=2.97..67498.30 rows=8412 width=72) (actual time=40463.771..40463.771 rows=0 loops=1)
   Buffers: shared hit=2373347 read=129 dirtied=129, local hit=122
   ->  Nested Loop Anti Join  (cost=2.97..67498.30 rows=8412 width=72) (actual time=13.694..40410.273 rows=4389 loops=1)
         Buffers: shared hit=2338092, local hit=122
         ->  Seq Scan on temp_rows_to_insert s  (cost=0.00..219.60 rows=9614 width=72) (actual time=0.607..4.746 rows=4389 loops=1)
               Filter: ((bigint_array_1 IS NOT NULL) AND (bigint_array_2 IS NOT NULL) AND (id IS NOT NULL))
               Buffers: local hit=122
         ->  Bitmap Heap Scan on base_table t  (cost=2.97..6.98 rows=1 width=74) (actual time=9.201..9.201 rows=0 loops=4389)
               Recheck Cond: ((s.bigint_array_2 = bigint_array_2) AND (s.bigint_array_1 = bigint_array_1))
               Filter: (s.id = id)
               Buffers: shared hit=2333695
               ->  BitmapAnd  (cost=2.97..2.97 rows=1 width=0) (actual time=9.199..9.199 rows=0 loops=4389)
                     Buffers: shared hit=2333638
                     ->  Bitmap Index Scan on base_table_bigint_array_2_idx  (cost=0.00..1.04 rows=3 width=0) (actual time=2.582..2.582 rows=290 loops=4389)
                           Index Cond: (s.bigint_array_2 = bigint_array_2)
                           Buffers: shared hit=738261
                     ->  Bitmap Index Scan on base_table_bigint_array_1_idx  (cost=0.00..1.68 rows=3 width=0) (actual time=6.608..6.608 rows=2 loops=4389)
                           Index Cond: (s.bigint_array_1 = bigint_array_1)
                           Buffers: shared hit=1595377
 Total runtime: 40463.879 ms
(20 rows)

test_db=# rollback;
ROLLBACK
test_db=# BEGIN;
BEGIN
test_db=# SET enable_nestloop = false;
SET
test_db=# EXPLAIN (ANALYZE, BUFFERS) INSERT INTO base_table (
    bigint_array_1, bigint_array_2, id
) (
    SELECT s.bigint_array_1, s.bigint_array_2, s.id
    FROM temp_rows_to_insert s
    LEFT JOIN base_table t
           ON s.bigint_array_1 = t.bigint_array_1
          AND s.bigint_array_2 = t.bigint_array_2
          AND s.id = t.id
    WHERE s.bigint_array_1 IS NOT NULL AND t.bigint_array_1 IS NULL
      AND s.bigint_array_2 IS NOT NULL AND t.bigint_array_2 IS NULL
      AND s.id IS NOT NULL AND t.id IS NULL
);
                                                                    QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
 Insert on base_table  (cost=31711.89..71625.39 rows=8412 width=72) (actual time=1838.650..1838.650 rows=0 loops=1)
   Buffers: shared hit=50013 read=123 dirtied=410, local hit=122
   ->  Hash Anti Join  (cost=31711.89..71625.39 rows=8412 width=72) (actual time=1798.774..1812.872 rows=4389 loops=1)
         Hash Cond: ((s.bigint_array_1 = t.bigint_array_1) AND (s.bigint_array_2 = t.bigint_array_2) AND (s.id = t.id))
         Buffers: shared hit=14761 dirtied=287, local hit=122
         ->  Seq Scan on temp_rows_to_insert s  (cost=0.00..219.60 rows=9614 width=72) (actual time=0.046..3.033 rows=4389 loops=1)
               Filter: ((bigint_array_1 IS NOT NULL) AND (bigint_array_2 IS NOT NULL) AND (id IS NOT NULL))
               Buffers: local hit=122
         ->  Hash  (cost=18131.96..18131.96 rows=775996 width=74) (actual time=1798.528..1798.528 rows=768415 loops=1)
               Buckets: 131072  Batches: 1  Memory Usage: 84486kB
               Buffers: shared hit=10372 dirtied=287
               ->  Seq Scan on base_table t  (cost=0.00..18131.96 rows=775996 width=74) (actual time=0.007..490.851 rows=768415 loops=1)
                     Buffers: shared hit=10372 dirtied=287
 Total runtime: 1843.336 ms
(14 rows)

test_db=# rollback;
ROLLBACK

Thanks,
Jake
Jake Magner <jakemagner90@gmail.com> writes:
> I am trying to insert rows that don't already exist from a temp table into
> another table. I am using a LEFT JOIN on all the columns and checking for
> nulls in the base table to know which rows to insert. The problem is that
> the planner chooses a nested loop plan, which is much slower (~40x) than
> the hash join. What's interesting is that all the row estimates appear to
> be fairly accurate. I'm wondering if it has something to do with the GIN
> indexes on bigint_array_1 and bigint_array_2. Perhaps it misestimates the
> cost of each index scan?

I'm curious about what happens to the runtime if you repeatedly roll back
the INSERT and do it over (without vacuuming in between).

What I'm thinking is that as the INSERT proceeds, it'd be making entries
in those GIN indexes' pending-item lists, which the bitmap indexscan would
have to scan through, since it's examining the same table you're inserting
into. The pending-item list is unsorted, so it's relatively expensive to
scan. Since you're doing a fresh bitmap indexscan after each row you
insert from temp_rows_to_insert, the total cost of dealing with the
pending-item list would grow as O(N^2); so even though the cost per
pending item is not that large, N=4000 might be enough to hurt.

If this theory is correct, a second attempt to do the INSERT without
having flushed the pending list would be even more expensive; whereas if
I'm wrong and that cost is negligible, the time wouldn't change much.

The hashjoin approach avoids this problem by not using the index (and even
if it did, the index would be scanned only once, before any insertions
happen). The planner unfortunately has no idea about this interaction.

If this diagnosis is correct, there are a couple of ways you could get
around the problem (sketched below):

* Disable use of the pending list by turning off "fastupdate" for these
  indexes.

* Construct the set of rows to be inserted in a separate command, put them
  into a second temp table, and then insert into the main table.

The second choice is probably preferable; doing bulk GIN inserts without
fastupdate is kind of expensive itself.

			regards, tom lane
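In SQL terms, a minimal (untested) sketch of those two workarounds, using
the index names shown in the plan above and a placeholder name
(temp_rows_filtered) for the second temp table:

-- Workaround 1: disable the GIN pending lists, so new entries go straight
-- into the main index structure instead of the unsorted pending list.
ALTER INDEX base_table_bigint_array_1_idx SET (fastupdate = off);
ALTER INDEX base_table_bigint_array_2_idx SET (fastupdate = off);

-- Workaround 2: materialize the anti-join result before touching
-- base_table, so no index scan ever runs against in-progress insertions.
CREATE TEMP TABLE temp_rows_filtered AS
SELECT s.bigint_array_1, s.bigint_array_2, s.id
FROM temp_rows_to_insert s
LEFT JOIN base_table t
       ON s.bigint_array_1 = t.bigint_array_1
      AND s.bigint_array_2 = t.bigint_array_2
      AND s.id = t.id
WHERE s.bigint_array_1 IS NOT NULL AND t.bigint_array_1 IS NULL
  AND s.bigint_array_2 IS NOT NULL AND t.bigint_array_2 IS NULL
  AND s.id IS NOT NULL AND t.id IS NULL;

INSERT INTO base_table (bigint_array_1, bigint_array_2, id)
SELECT bigint_array_1, bigint_array_2, id FROM temp_rows_filtered;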
I tried without doing an INSERT at all, just running the SELECT queries,
and the result is the same: the nested loop is chosen but is much slower.
Jake Magner <jakemagner90@gmail.com> writes:
> I tried without doing an INSERT at all, just running the SELECT queries,
> and the result is the same: the nested loop is chosen but is much slower.

FWIW, I just noticed that the comparisons you're using are plain equality
of the arrays. While a GIN array index supports that, it's not exactly its
strong suit: the sort of questions that index type supports well are more
like "which arrays contain value X?". I wonder if it'd be worth creating
btree indexes on the array columns.

			regards, tom lane
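A sketch of that suggestion, with hypothetical index names; arrays have a
default btree operator class, so whole-array equality can be served by a
plain btree index:

-- per-column btree indexes (names are placeholders)
CREATE INDEX base_table_bigint_array_1_btree
    ON base_table USING btree (bigint_array_1);
CREATE INDEX base_table_bigint_array_2_btree
    ON base_table USING btree (bigint_array_2);

-- or, since the join compares all three columns, one composite index
-- covering the whole key is another option:
CREATE INDEX base_table_join_key_btree
    ON base_table (bigint_array_1, bigint_array_2, id);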
Tom Lane-2 wrote:
> Jake Magner <jakemagner90@gmail.com> writes:
>> I tried without doing an INSERT at all, just running the SELECT queries,
>> and the result is the same: the nested loop is chosen but is much slower.
>
> FWIW, I just noticed that the comparisons you're using are plain equality
> of the arrays. While a GIN array index supports that, it's not exactly its
> strong suit: the sort of questions that index type supports well are more
> like "which arrays contain value X?". I wonder if it'd be worth creating
> btree indexes on the array columns.

I added btree indexes, and now the nested loop uses those and is a bit
faster than the hash join. So the planner just misestimates the cost of
doing the equality comparisons? I'd prefer not to add more indexes; the
hash join performance is fast enough if the planner would just choose it,
but I'm reluctant to turn off nested loops in case the table gets a lot
bigger.
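For what it's worth, disabling nested loops does not have to be a global
or session-wide setting; a sketch of scoping it to the one statement:

BEGIN;
SET LOCAL enable_nestloop = off;  -- reverts automatically at COMMIT or ROLLBACK
-- the anti-join INSERT from the first message goes here
COMMIT;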
On Sat, May 28, 2016 at 5:38 PM, Jake Magner <jakemagner90@gmail.com> wrote:
> Tom Lane-2 wrote:
>> Jake Magner <jakemagner90@gmail.com> writes:
>>> I tried without doing an INSERT at all, just running the SELECT queries,
>>> and the result is the same: the nested loop is chosen but is much slower.
>>
>> FWIW, I just noticed that the comparisons you're using are plain equality
>> of the arrays. While a GIN array index supports that, it's not exactly its
>> strong suit: the sort of questions that index type supports well are more
>> like "which arrays contain value X?". I wonder if it'd be worth creating
>> btree indexes on the array columns.
>
> I added btree indexes, and now the nested loop uses those and is a bit
> faster than the hash join. So the planner just misestimates the cost of
> doing the equality comparisons?

I wonder how it would do in 9.4? Either in them actually being faster, or
in the planner doing a better job of realizing they won't be fast.

> I'd prefer not to add more indexes; the hash join performance is fast
> enough if the planner would just choose it, but I'm reluctant to turn off
> nested loops in case the table gets a lot bigger.

A large hash join just needs to divide the work into batches. It should
still be faster than the nested loop (as currently implemented) until you
run out of temp space.

But you already have a solution in hand. I agree you shouldn't add more
indexes without reason, but you do have a reason.

Cheers,

Jeff
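To make the batching point concrete: the Hash node in the plan above fit
in a single batch at about 84 MB, and the amount of memory a hash node may
use before spilling into multiple batches is governed by work_mem. A
sketch, with an illustrative value:

SHOW work_mem;          -- the hash above stayed in one batch at ~84 MB
SET work_mem = '16MB';  -- under a smaller limit, the same hash would be
                        -- split across multiple batches on temp files
-- re-run the EXPLAIN (ANALYZE, BUFFERS) query from the first message and
-- look for "Batches: N" on the Hash node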