Thread: Why is PostgreSQL 9.1 not using index for simple equality select

Why is PostgreSQL 9.1 not using index for simple equality select

From
Yang Zhang
Date:
Any hints with this question I had posted to SO?

http://stackoverflow.com/questions/15965785/why-is-postgresql-9-1-not-using-index-for-simple-equality-select

Pasted here as well.  Thanks.

My table `lead` has an index:

    \d lead
    ...
    Indexes:
        "lead_pkey" PRIMARY KEY, btree (id)
        "lead_account__c" btree (account__c)
        ...
        "lead_email" btree (email)
        "lead_id_prefix" btree (id text_pattern_ops)

Why doesn't PG (9.1) use the index for this straightforward equality
selection?  Emails are almost all unique....

    db=> explain select * from lead where email = 'blah';
                             QUERY PLAN
    ------------------------------------------------------------
     Seq Scan on lead  (cost=0.00..319599.38 rows=1 width=5108)
       Filter: (email = 'blah'::text)
    (2 rows)

Other index-hitting queries seem to be OK (though I don't know why
this one doesn't just use the pkey index):

    db=> explain select * from lead where id = '';
                                      QUERY PLAN
    ------------------------------------------------------------------------------
     Index Scan using lead_id_prefix on lead  (cost=0.00..8.57 rows=1
width=5108)
       Index Cond: (id = ''::text)
    (2 rows)

    db=> explain select * from lead where account__c = '';
                                        QUERY PLAN
    ----------------------------------------------------------------------------------
     Index Scan using lead_account__c on lead  (cost=0.00..201.05
rows=49 width=5108)
       Index Cond: (account__c = ''::text)
    (2 rows)

At first I thought it may be due to not enough distinct values of
`email`.  For instance, if the stats claim that `email` is `blah` for
most of the table, then a seq scan is faster.  But that's not the
case:

    db=> select count(*), count(distinct email) from lead;
     count  | count
    --------+--------
     749148 | 733416
    (1 row)

Even if I force seq scans to be off, the planner behaves as if it has
no other choice:

    db=> set enable_seqscan = off;
    SET
    db=> show enable_seqscan;
     enable_seqscan
    ----------------
     off
    (1 row)

    db=> explain select * from lead where email = 'foo@blah.com';
                                QUERY PLAN
    ---------------------------------------------------------------------------
     Seq Scan on lead  (cost=10000000000.00..10000319599.38 rows=1 width=5108)
       Filter: (email = 'foo@blah.com'::text)
    (2 rows)

I searched over a good number of past SO questions but none were about
a simple equality query like this one.


Re: Why is PostgreSQL 9.1 not using index for simple equality select

From
John R Pierce
Date:
On 4/12/2013 1:03 AM, Yang Zhang wrote:
>      db=> explain select * from lead where email = 'blah';
>                               QUERY PLAN
>      ------------------------------------------------------------
>       Seq Scan on lead  (cost=0.00..319599.38 rows=1 width=5108)
>         Filter: (email = 'blah'::text)


try EXPLAIN ANALYZE .....    its more useful.

my guess is, there's no statistics on this table, and doing an ANALYZE
lead;  would rectify this.



--
john r pierce                                      37N 122W
somewhere on the middle of the left coast



Re: Why is PostgreSQL 9.1 not using index for simple equality select

From
Yang Zhang
Date:
Doesn't seem to be the case.  This table has been around for a while
and should have been auto-analyzed by now.  But anyway:

db=> analyze lead;
ANALYZE
db=> explain select * from lead where email = 'foo@blah.com';
                                QUERY PLAN
---------------------------------------------------------------------------
 Seq Scan on lead  (cost=10000000000.00..10000319666.99 rows=1 width=5208)
   Filter: (email = 'foo@blah.com'::text)
(2 rows)

On Fri, Apr 12, 2013 at 1:13 AM, John R Pierce <pierce@hogranch.com> wrote:
> On 4/12/2013 1:03 AM, Yang Zhang wrote:
>>
>>      db=> explain select * from lead where email = 'blah';
>>                               QUERY PLAN
>>      ------------------------------------------------------------
>>       Seq Scan on lead  (cost=0.00..319599.38 rows=1 width=5108)
>>         Filter: (email = 'blah'::text)
>
>
>
> try EXPLAIN ANALYZE .....    its more useful.
>
> my guess is, there's no statistics on this table, and doing an ANALYZE lead;
> would rectify this.
>
>
>
> --
> john r pierce                                      37N 122W
> somewhere on the middle of the left coast
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general



--
Yang Zhang
http://yz.mit.edu/


Re: Why is PostgreSQL 9.1 not using index for simple equality select

From
Alban Hertroys
Date:
On 12 April 2013 10:45, Yang Zhang <yanghatespam@gmail.com> wrote:
explain select * from lead where email = 'foo@blah.com';

What about:
explain analyze select * from lead where email = 'foo@blah.com';


--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.

Re: Why is PostgreSQL 9.1 not using index for simple equality select

From
John R Pierce
Date:
On 4/12/2013 1:45 AM, Yang Zhang wrote:
db=> explain select * from lead where email = 'foo@blah.com';

can you try
explain analyze select * from lead where email = 'foo@blah.com';  
?


-- 
john r pierce                                      37N 122W
somewhere on the middle of the left coast

Re: Why is PostgreSQL 9.1 not using index for simple equality select

From
Tom Lane
Date:
Yang Zhang <yanghatespam@gmail.com> writes:
>     db=> explain select * from lead where email = 'blah';
>                              QUERY PLAN
>     ------------------------------------------------------------
>      Seq Scan on lead  (cost=0.00..319599.38 rows=1 width=5108)
>        Filter: (email = 'blah'::text)
>     (2 rows)

That's awfully odd.  What data type is the "email" column?

It seems possible also that the index on it is marked invalid.  I'd have
expected \d to tell you so, but maybe you're using a version of psql that
doesn't know about that.  It'd be interesting to look at
select * from pg_index where indexrelid = 'index name here'::regclass;


            regards, tom lane


Re: Why is PostgreSQL 9.1 not using index for simple equality select

From
Yang Zhang
Date:
It's actually just `text`.

I updated my SO question with some more info including explain analyze
(no difference), \d,
and your last incantation.

Thanks!

On Fri, Apr 12, 2013 at 7:11 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yang Zhang <yanghatespam@gmail.com> writes:
>>     db=> explain select * from lead where email = 'blah';
>>                              QUERY PLAN
>>     ------------------------------------------------------------
>>      Seq Scan on lead  (cost=0.00..319599.38 rows=1 width=5108)
>>        Filter: (email = 'blah'::text)
>>     (2 rows)
>
> That's awfully odd.  What data type is the "email" column?
>
> It seems possible also that the index on it is marked invalid.  I'd have
> expected \d to tell you so, but maybe you're using a version of psql that
> doesn't know about that.  It'd be interesting to look at
> select * from pg_index where indexrelid = 'index name here'::regclass;
>
>
>                         regards, tom lane



--
Yang Zhang
http://yz.mit.edu/


Re: Why is PostgreSQL 9.1 not using index for simple equality select

From
Tom Lane
Date:
Yang Zhang <yanghatespam@gmail.com> writes:
> I updated my SO question with some more info including explain analyze
> (no difference), \d,
> and your last incantation.

The question is being asked here, not in SO, and I find it rather
impolite of you to expect me to go chasing off to some other forum
to answer your question.

            regards, tom lane


Re: Why is PostgreSQL 9.1 not using index for simple equality select

From
Yang Zhang
Date:
Apologies for that Tom.  I will paste the information in line once I'm
back at my computer.  I do appreciate your help.

On Fri, Apr 12, 2013 at 10:24 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yang Zhang <yanghatespam@gmail.com> writes:
>> I updated my SO question with some more info including explain analyze
>> (no difference), \d,
>> and your last incantation.
>
> The question is being asked here, not in SO, and I find it rather
> impolite of you to expect me to go chasing off to some other forum
> to answer your question.
>
>                         regards, tom lane



--
Yang Zhang
http://yz.mit.edu/