Re: : PostgreSQL Index behavior - Mailing list pgsql-performance

From Jeff Janes
Subject Re: : PostgreSQL Index behavior
Date
Msg-id CAMkU=1wziuZJMD8Aat6+rwMtWdg25qCb75Len6LaFrSJ-SKnqg@mail.gmail.com
Whole thread Raw
In response to : PostgreSQL Index behavior  (Venkat Balaji <venkat.balaji@verse.in>)
Responses Re: : PostgreSQL Index behavior
List pgsql-performance
On Mon, Sep 10, 2012 at 5:39 AM, Venkat Balaji <venkat.balaji@verse.in> wrote:
> Hello Community,
>
> I intend to understand further on PostgreSQL Index behavior on a "SELECT"
> statement.
>
> We have a situation where-in Index on unique column is not being picked up
> as expected when used with-in the WHERE clause with other non-unique columns
> using AND operator.
>
> explain  SELECT tv.short_code, tv.chn as pkg_subscription_chn,
>                tv.vert as pkg_vert, ubs.campaign_id as campaign,
> 'none'::varchar as referer,
>                CAST('CAMPAIGNWISE_SUBSCRIBER_BASE' AS VARCHAR) as vn,
> count(tv.msisdn) as n_count, '0'::numeric AS tot_revenue
>                FROM campaign_base ubs
>                JOIN tab_current_day_v2 tv
>                ON ubs.ubs_seq_id = tv.ubs_seq_id
>              AND tv.dt = CAST('2012-09-08' AS DATE)
>              GROUP BY tv.short_code, tv.vert, tv.chn, ubs.campaign_id, vn;
...
>
> The above plan shows "seq scan" on tab_current_day_v2 table, though there is
> an index on "ubs_seq_id" column which is an unique column.
>
> Can anyone please help us understand, why PostgreSQL optimizer is not
> prioritizing the unique column and hitting ubs_seq_id_idx Index here ?

The query where clause does not specify a constant value for
ubs_seq_id.  So it is likely that the only way to use that index would
be to reverse the order of the nested loop and seq scan the other
table.  Is there any reason to think that doing that would be faster?


>
> Later -
>
> We have created composite Index on "dt" (one distinct value) and
> "ubs_seq_id" (no duplicate values) and the index has been picked up.

Postgres seems to think that "dt" has no duplicate values, the
opposite of having one distinct value.
That is based on the estimates given in the explain plan, that teh seq
scan will return only one row after the filter on Filter: "(dt =
'2012-09-08'::date)".   This does seem to conflict with what you
report from pg_stats, but I'm not familiar with that view, and you
haven't told us what version of pgsql you are using.



> Below is the scenario where-in the same query's plan picking up the
> composite Index.

It is only using the first column of that composite index.  So if you
built a single column index just on dt, it would be picked up as well.

Cheers,

Jeff


pgsql-performance by date:

Previous
From: Willy-Bas Loos
Date:
Subject: force defaults
Next
From: Jesper Krogh
Date:
Subject: Re: Planner selects different execution plans depending on limit