Re: Index is not used for "IN (non-correlated subquery)" - Mailing list pgsql-general

From Merlin Moncure
Subject Re: Index is not used for "IN (non-correlated subquery)"
Date
Msg-id CAHyXU0yvg9T2+UQn7Suqn_CbKUYYJxUFqAeOhBas5h1=7fduXg@mail.gmail.com
Whole thread Raw
In response to Re: Index is not used for "IN (non-correlated subquery)"  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Index is not used for "IN (non-correlated subquery)"  (George <pinkisntwell@gmail.com>)
List pgsql-general
On Wed, Nov 30, 2016 at 10:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> George <pinkisntwell@gmail.com> writes:
>> explain select * from wg3ppbm_transaction where partner_uuid in (
>>             select p.uuid
>>             from wg3ppbm_userpartner up
>>             join wg3ppbm_partner p on p.id = up.partner_id
>> );
>
>> "Hash Semi Join  (cost=2.07..425.72 rows=2960 width=482)"
>> "  Hash Cond: ((wg3ppbm_transaction.partner_uuid)::text = (p.uuid)::text)"
>> "  ->  Seq Scan on wg3ppbm_transaction  (cost=0.00..375.19 rows=5919 width=482)"
>> "  ->  Hash  (cost=2.06..2.06 rows=1 width=37)"
>> "        ->  Nested Loop  (cost=0.00..2.06 rows=1 width=37)"
>> "              Join Filter: (up.partner_id = p.id)"
>> "              ->  Seq Scan on wg3ppbm_userpartner up
>> (cost=0.00..1.01 rows=1 width=4)"
>> "              ->  Seq Scan on wg3ppbm_partner p  (cost=0.00..1.02
>> rows=2 width=41)"
>
> This plan is expecting to have to return about half of the rows in
> wg3ppbm_transaction, a situation for which an indexscan would NOT
> be a better choice.  The usual rule of thumb is that you need to be
> retrieving at most one or two percent of a table's rows for an indexscan
> on it to be faster than a seqscan.
>
> I think however that the "half" may be a default estimate occasioned
> by the other tables being empty and therefore not having any statistics.
> Another rule of thumb is that the plans you get for tiny tables have
> little to do with what happens once there's lots of data.

Yeah, don't make query plan assumptions against empty or nearly empty
tables.  As the data grows, the plans will suitably change.  Perhaps
OP just recently loaded a bunch of data and the tables haven't been
analyzed yet?

merlin


pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Index is not used for "IN (non-correlated subquery)"
Next
From: Melvin Davidson
Date:
Subject: Re: Monitoring Replication - Postgres 9.2