On Sat, May 28, 2016 at 5:38 PM, Jake Magner <jakemagner90@gmail.com> wrote:
> Tom Lane-2 wrote
>> Jake Magner <
>
>> jakemagner90@
>
>> > writes:
>>> I tried without doing an INSERT at all, just running the SELECT queries
>>> and
>>> the result is the same. Nested loop is chosen but is much slower.
>>
>> FWIW, I just noticed that the comparisons you're using are plain equality
>> of the arrays. While a GIN array index supports that, it's not exactly
>> its strong suit: the sort of questions that index type supports well are
>> more like "which arrays contain value X?". I wonder if it'd be worth
>> creating btree indexes on the array column.
>
> I added btree indexes and now the nested loop uses those and is a bit faster
> than the hash join. So the planner just misestimates the cost of doing the
> equality comparisons?
I wonder how it would do in 9.4? Either in them actually being
faster, or the planner doing
a better job of realizing they won't be fast.
> I'd prefer not to add more indexes, the hash join
> performance is fast enough if it would just choose that but I'm reluctant to
> turn off nested loops in case the table gets a lot bigger.
A large hash join just needs to divide it up into batches. It should
still be faster than the nested loop (as currently implemented) ,
until you run out of temp space.
But, you already have a solution in hand. I agree you shouldn't add
more indexes without reason, but you do have a reason.
Cheers,
Jeff