Home > mailing lists

Re: Re: Planner chooses slow index heap scan despite accurate row estimates - Mailing list pgsql-performance

From	Jeff Janes
Subject	Re: Re: Planner chooses slow index heap scan despite accurate row estimates
Date	May 30, 2016 19:34:39
Msg-id	CAMkU=1w80Cz9h3er1gdaDi4JTzX-7D6qfrCp6th41B936OAM9A@mail.gmail.com Whole thread
In response to	Re: Planner chooses slow index heap scan despite accurate row estimates (Jake Magner <jakemagner90@gmail.com>)
List	pgsql-performance

Tree view

On Sat, May 28, 2016 at 5:38 PM, Jake Magner <jakemagner90@gmail.com> wrote:
> Tom Lane-2 wrote
>> Jake Magner <
>
>> jakemagner90@
>
>> > writes:
>>> I tried without doing an INSERT at all, just running the SELECT queries
>>> and
>>> the result is the same. Nested loop is chosen but is much slower.
>>
>> FWIW, I just noticed that the comparisons you're using are plain equality
>> of the arrays.  While a GIN array index supports that, it's not exactly
>> its strong suit: the sort of questions that index type supports well are
>> more like "which arrays contain value X?".  I wonder if it'd be worth
>> creating btree indexes on the array column.
>
> I added btree indexes and now the nested loop uses those and is a bit faster
> than the hash join. So the planner just misestimates the cost of doing the
> equality comparisons?

I wonder how it would do in 9.4?  Either in them actually being
faster, or the planner doing
a better job of realizing they won't be fast.

> I'd prefer not to add more indexes, the hash join
> performance is fast enough if it would just choose that but I'm reluctant to
> turn off nested loops in case the table gets a lot bigger.

A large hash join just needs to divide it up into batches.  It should
still be faster than the nested loop (as currently implemented) ,
until you run out of temp space.

But, you already have a solution in hand.  I agree you shouldn't add
more indexes without reason, but you do have a reason.

Cheers,

Jeff

pgsql-performance by date:

From: "David G. Johnston"
Date: 30 May 2016, 18:20:43
Subject: Re: similarity and operator '%'

From: Jeff Janes
Date: 30 May 2016, 20:05:50
Subject: Re: similarity and operator '%'

Re: Re: Planner chooses slow index heap scan despite accurate row estimates - Mailing list pgsql-performance

Previous

Next