Thread: Incorrect index used in few cases..

Incorrect index used in few cases..

From
AminPG Jaffer
Date:
Hi

We recently upgraded one of the instances from 9.6.3 to 9.6.12 and seeing following issue that occurs for few cases.  

I have tried running analyze on the table with different values from 1000 - 5000 but it doesn't seem to help the issue.     There is some skew in a_id but the combination index  i_tc_adid_tid btree (a_id, id) makes the index unique as it includes primary key.  

Is there an explanation why it is using incorrect index?

SQL:
SELECT count(*) FROM tc WHERE ((tc.a_id = $1)) AND ((tc.m_id = $2)) AND ((tc.ag_id is not null)) AND ((tc.id in ($3))) AND ((tc.pt in ($4, $5, $6)))

Indexes on the table:
   i_tc_adid_tid btree (a_id, id)
   pk_id PRIMARY KEY, btree (id)
   i_agi_tc_tcn btree (ag_id, tname)  ---> index that gets used


duration: 49455.649 ms  execute S_10: SELECT count(*) FROM tc WHERE ((tc.a_id = $1)) AND ((tc.m_id = $2)) AND ((tc.ag_id is not null)) AND ((tc.id in ($3))) AND ((tc.pt in ($4, $5, $6)))
DETAIL:  parameters: $1 = '11786959222', $2 = '6', $3 = '54460816501', $4 = '3', $5 = '6', $6 = '103'
LOG:  duration: 49455.639 ms  plan:
        Query Text: SELECT count(*) FROM tc WHERE ((tc.a_id = $1)) AND ((tc.m_id = $2)) AND ((tc.ag_id is not null)) AND ((tc.id in ($3))) AND ((tc.pt in ($4, $5, $6)))
        Aggregate  (cost=5009342.34..5009342.35 rows=1 width=8) (actual time=49455.626..49455.626 rows=1 loops=1)
          Output: count(*)
          Buffers: shared hit=56288997
          ->  Index Scan using i_agi_tc_tcn on b.tc  (cost=0.57..5009342.34 rows=1 width=0) (actual time=46452.555..49455.616 rows=1 loops=1)
                Output: id, tname, ...
                Index Cond: (tc.ag_id IS NOT NULL)
                Filter: ((tc.a_id = '11786959222'::numeric) AND (tc.m_id = '6'::numeric) AND (tc.id = '54460816501'::numeric) AND (tc.pt = ANY ('{3,6,103}'::numeric[])))
                Rows Removed by Filter: 70996637
                Buffers: shared hit=56288997

Thanks

Re: Incorrect index used in few cases..

From
Tom Lane
Date:
AminPG Jaffer <aminjaffer.pg@gmail.com> writes:
> Is there an explanation why it is using incorrect index?

> SQL:
> SELECT count(*) FROM tc WHERE ((tc.a_id = $1)) AND ((tc.m_id = $2)) AND
> ((tc.ag_id is not null)) AND ((tc.id in ($3))) AND ((tc.pt in ($4, $5, $6)))

What data types are these columns?  For that matter, could we see the
whole schema for the table (psql \d+ output or equivalent)?

            regards, tom lane



Re: Incorrect index used in few cases..

From
AminPG Jaffer
Date:

Here is the table structure.

        Column         |            Type             |                         Modifiers
-----------------------+-----------------------------+-----------------------------------------------------------
 id                    | numeric(38,0)               | not null
 tname                 | character varying(255)      | not null
 ag_id                 | numeric(38,0)               |
 tc                    | character varying(255)      | not null
 status                | numeric(10,0)               | not null
 internal_status       | numeric(10,0)               | not null
 create_date           | timestamp(6) with time zone | not null
 version               | numeric(38,0)               | not null
 match_type            | numeric(10,0)               | not null default 0
 c_id                  | numeric(38,0)               | not null
 m_id                  | numeric(38,0)               | not null
 a_id                  | numeric(38,0)               | not null
 maxb                  | numeric(18,6)               |
 b_cc                  | character varying(10)       |
 ui_status             | numeric(10,0)               | not null default 0
 destination_url       | character varying(2084)     |
 created_by            | character varying(64)       | not null
 creation_date         | timestamp(0) with time zone | not null default timezone('UTC'::text, clock_timestamp())
 last_updated_by       | character varying(64)       | not null
 last_updated_date     | timestamp(0) with time zone | not null
 pr                    | numeric(5,0)                | not null default 0
 ts                    | numeric(1,0)                | not null default 0
 uniqueness_hash_v2    | numeric(29,0)               | not null
 pt                    | numeric(5,0)                |
 history               | bigint                      |
 t_secondary           | text                        |

Indexes:
    "pk_id" PRIMARY KEY, btree (id)
    "i_agi_tc_tcn" btree (ag_id, tname)
    "i_cid_agid_tcn" btree (c_id, ag_id, tname)
    "i_tc_adid_tid" btree (a_id, id)
    "i_tc_advertiser_id" btree (a_id)
    "i_tc_campaign_id" btree (c_id)
    "i_tc_lud_agi" btree (last_updated_date, ag_id)
    "i_tc_uniqueness_hash_v2" btree (uniqueness_hash_v2)
Check constraints:
    "tc_secondary" CHECK (length(t_secondary) <= 4500)


On Tue, Jun 18, 2019 at 6:35 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
AminPG Jaffer <aminjaffer.pg@gmail.com> writes:
> Is there an explanation why it is using incorrect index?

> SQL:
> SELECT count(*) FROM tc WHERE ((tc.a_id = $1)) AND ((tc.m_id = $2)) AND
> ((tc.ag_id is not null)) AND ((tc.id in ($3))) AND ((tc.pt in ($4, $5, $6)))

What data types are these columns?  For that matter, could we see the
whole schema for the table (psql \d+ output or equivalent)?

                        regards, tom lane

Re: Incorrect index used in few cases..

From
Tom Lane
Date:
AminPG Jaffer <aminjaffer.pg@gmail.com> writes:
> Here is the table structure.

Hpmh.  I thought it was just barely possible that you had a datatype
mismatch between the columns and the parameters, but nope, the columns
are "numeric" just like the parameters.

I'm pretty baffled.  I tried to duplicate the problem with some dummy
data (as attached) and could not.  In my hands, it wants to use the
i_tc_adid_tid index, or if I drop that then the pkey index, and any
other possible plan is orders of magnitude more expensive than those.

Another far-fetched theory is that the theoretically-better indexes
are so badly bloated as to discourage the planner from using them.
You could eliminate that one by checking the index sizes with "\di+".

Are you perhaps running with non-default values for any planner cost
parameters?  Or it's not a stock build of Postgres?

If you could find a way to adjust the attached example so that it
produces the same misbehavior you see with live data, that would be
very interesting ...

            regards, tom lane

drop table tc;

create table tc(
 id                     numeric(38,0)                primary key,
 tname                  character varying(255)       not null,
 ag_id                  numeric(38,0)               ,
 tc                     character varying(255)       not null,
 status                 numeric(10,0)                not null,
 internal_status        numeric(10,0)                not null,
 create_date            timestamp(6) with time zone  not null,
 version                numeric(38,0)                not null,
 match_type             numeric(10,0)                not null default 0,
 c_id                   numeric(38,0)                not null,
 m_id                   numeric(38,0)                not null,
 a_id                   numeric(38,0)                not null,
 maxb                   numeric(18,6)               ,
 b_cc                   character varying(10)       ,
 ui_status              numeric(10,0)                not null default 0,
 destination_url        character varying(2084)     ,
 created_by             character varying(64)        not null,
 creation_date          timestamp(0) with time zone  not null default timezone('UTC'::text, clock_timestamp()),
 last_updated_by        character varying(64)        not null,
 last_updated_date      timestamp(0) with time zone  not null,
 pr                     numeric(5,0)                 not null default 0,
 ts                     numeric(1,0)                 not null default 0,
 uniqueness_hash_v2     numeric(29,0)                not null,
 pt                     numeric(5,0)                ,
 history                bigint                      ,
 t_secondary            text            CHECK (length(t_secondary) <= 4500)
);

insert into tc
select
  x as id,
  'tname_' || x as tname,
  case when x % 10 = 0 then null else x end as ag_id,
  'tc' as tc,
  0 as status,
  0 as internal_status,
  now() as create_date,
  42 as version,
  0 as match_type,
  (random()*1000)::int as c_id,
  x/4 as m_id,
  (random()*100)::int as a_id,
  0 as maxb,
  null as b_cc,
  0 as ui_status,
  null as destination_url,
  'c_b' as created_by,
  now() as creation_date,
  'l_u' as last_updated_by,
  now() as last_updated_date,
  0 as pr,
  0 as ts,
  x as uniqueness_hash_v2,
  x % 100 as pt,
  0 as history,
  null as t_secondary
from generate_series(1,1000000) x;

create index    "i_agi_tc_tcn" on tc (ag_id, tname);
create index    "i_cid_agid_tcn" on tc (c_id, ag_id, tname);
create index    "i_tc_adid_tid" on tc (a_id, id);
create index    "i_tc_advertiser_id" on tc (a_id);
create index    "i_tc_campaign_id" on tc (c_id);
create index    "i_tc_lud_agi" on tc (last_updated_date, ag_id);
create index    "i_tc_uniqueness_hash_v2" on tc (uniqueness_hash_v2);

vacuum analyze tc;

prepare p as
SELECT count(*) FROM tc WHERE ((tc.a_id = $1)) AND ((tc.m_id = $2)) AND
((tc.ag_id is not null)) AND ((tc.id in ($3))) AND ((tc.pt in ($4, $5, $6)));

explain verbose execute p(0, 1, 42, 4,5,6);

Re: Incorrect index used in few cases..

From
Peter Geoghegan
Date:
On Tue, Jun 18, 2019 at 2:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Are you perhaps running with non-default values for any planner cost
> parameters?  Or it's not a stock build of Postgres?
>
> If you could find a way to adjust the attached example so that it
> produces the same misbehavior you see with live data, that would be
> very interesting ...

FWIW, if you move the CREATE INDEX statements before the INSERT, and
compared earlier versions of Postgres to 12, you'll see that the size
of some of the indexes are a lot smaller on 12.

v11 (representative of 9.6):

pg@tc:5411 [1067]=# \di+ i_*
                               List of relations
 Schema │          Name           │ Type  │ Owner │ Table │ Size  │ Description
────────┼─────────────────────────┼───────┼───────┼───────┼───────┼─────────────
 public │ i_agi_tc_tcn            │ index │ pg    │ tc    │ 74 MB │
 public │ i_cid_agid_tcn          │ index │ pg    │ tc    │ 82 MB │
 public │ i_tc_adid_tid           │ index │ pg    │ tc    │ 57 MB │
 public │ i_tc_advertiser_id      │ index │ pg    │ tc    │ 27 MB │
 public │ i_tc_campaign_id        │ index │ pg    │ tc    │ 28 MB │
 public │ i_tc_lud_agi            │ index │ pg    │ tc    │ 57 MB │
 public │ i_tc_uniqueness_hash_v2 │ index │ pg    │ tc    │ 21 MB │
(7 rows)

v12/master:

pg@regression:5432 [1022]=# \di+ i_*
                               List of relations
 Schema │          Name           │ Type  │ Owner │ Table │ Size  │ Description
────────┼─────────────────────────┼───────┼───────┼───────┼───────┼─────────────
 public │ i_agi_tc_tcn            │ index │ pg    │ tc    │ 69 MB │
 public │ i_cid_agid_tcn          │ index │ pg    │ tc    │ 78 MB │
 public │ i_tc_adid_tid           │ index │ pg    │ tc    │ 36 MB │
 public │ i_tc_advertiser_id      │ index │ pg    │ tc    │ 20 MB │
 public │ i_tc_campaign_id        │ index │ pg    │ tc    │ 24 MB │
 public │ i_tc_lud_agi            │ index │ pg    │ tc    │ 30 MB │
 public │ i_tc_uniqueness_hash_v2 │ index │ pg    │ tc    │ 21 MB │
(7 rows)

Note that i_tc_lud_agi is 30 MB, not 57MB, and that i_tc_adid_tid is
36 MB, not 57 MB.

I can see that both i_tc_lud_agi and i_tc_adid_tid consistently use
the "split after new tuple" optimization on v12.

-- 
Peter Geoghegan

Re: Incorrect index used in few cases..

From
Andres Freund
Date:
Hi,

On 2019-06-18 06:11:54 -0700, AminPG Jaffer wrote:
> We recently upgraded one of the instances from 9.6.3 to 9.6.12 and seeing
> following issue that occurs for few cases.
> 
> We recently upgraded one of the instances from 9.6.3 to 9.6.12 and seeing
> following issue that occurs for few cases.
> 
> I have tried running analyze on the table with different values from 1000 -
> 5000 but it doesn't seem to help the issue.     There is some skew in a_id
> but the combination index  i_tc_adid_tid btree (a_id, id) makes the index
> unique as it includes primary key.
> 
> Is there an explanation why it is using incorrect index?
> 
> SQL:
> SELECT count(*) FROM tc WHERE ((tc.a_id = $1)) AND ((tc.m_id = $2)) AND
> ((tc.ag_id is not null)) AND ((tc.id in ($3))) AND ((tc.pt in ($4, $5, $6)))
> 
> Indexes on the table:
>    i_tc_adid_tid btree (a_id, id)
>    pk_id PRIMARY KEY, btree (id)
>    i_agi_tc_tcn btree (ag_id, tname)  ---> index that gets used

Are those indexes used for other queries? Any chance they've been
recently created?

SELECT indexrelid::regclass, xmin, indcheckxmin, indisvalid, indisready,
indislive, txid_current(), txid_current_snapshot()
FROM pg_index WHERE indrelid = 'tc'::regclass;

might tell us.


On 2019-06-18 17:07:55 -0400, Tom Lane wrote:
> I'm pretty baffled.  I tried to duplicate the problem with some dummy
> data (as attached) and could not.  In my hands, it wants to use the
> i_tc_adid_tid index, or if I drop that then the pkey index, and any
> other possible plan is orders of magnitude more expensive than those.

> Another far-fetched theory is that the theoretically-better indexes
> are so badly bloated as to discourage the planner from using them.
> You could eliminate that one by checking the index sizes with "\di+".
> 
> Are you perhaps running with non-default values for any planner cost
> parameters?  Or it's not a stock build of Postgres?
> 
> If you could find a way to adjust the attached example so that it
> produces the same misbehavior you see with live data, that would be
> very interesting ...

Amin, might be worth to see what the query plan is if you disable that
index. I assume it's too big to quickly drop (based on the ?

Something like:

BEGIN;
LOCK tc;
UPDATE pg_index SET indisvalid = false WHERE indexrelid = 'name_of_index'::regclass AND indisvalid;
EXPLAIN yourquery;
ROLLBACK;

might allow to test that without actually dropping the index. But that
of course requires superuser access.

Greetings,

Andres Freund



Re: Incorrect index used in few cases..

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> Are those indexes used for other queries? Any chance they've been
> recently created?

> SELECT indexrelid::regclass, xmin, indcheckxmin, indisvalid, indisready,
> indislive, txid_current(), txid_current_snapshot()
> FROM pg_index WHERE indrelid = 'tc'::regclass;

> might tell us.

Oh, that's a good idea.

> Amin, might be worth to see what the query plan is if you disable that
> index. I assume it's too big to quickly drop (based on the ?

Considering that the "right" query plan would have a cost estimate in
the single digits or close to it, I have to suppose that the planner is
rejecting that index as unusable, not making a cost-based decision not
to use it.  (Well, maybe if it's bloated by three orders of magnitude
compared to the other indexes, it'd lose on cost.  Doesn't seem likely
though.)

So I think we're looking for a hard "can't use the index" reason, and
now we've eliminated datatype mismatch which'd be the most obvious
such reason.  But index-isnt-valid or index-isnt-ready might do the
trick.

            regards, tom lane



Re: Incorrect index used in few cases..

From
AminPG Jaffer
Date:

Sorry for late reply.

The initial values before upgrade for seq_page_cost=1, random_page_cost=4 and after upgrading when we started to see the issues as we were seeing "Seq Scan" we change them  seq_page_cost=1, random_page_cost=1 

The issue happens only in production so making the index invalid would affect service so it isn't something we can do.  
I have tried to rebuild the PK index to see it helps or not but it doesn't seem help.

Related to the same issue we sometimes see following Seq Scan on update when querying by PK alone which appears to be related.

   update tc set...where id = $1 and version <$2
       Update on tc  (cost=10000000000.00..10003184001.52 rows=1 width=1848)
          ->  Seq Scan on tc  (cost=10000000000.00..10003184001.52 rows=1 width=1848)
                Filter: ((version < '38'::numeric) AND (id = '53670604704'::numeric))

I was trying to find where the cost=10000000000 is set in the source code but wasn't able to find it, do anyone where it is set?
And if you someone can point me to the code where it goes through the execution plans when SQL is sent i can try to go through the code to see if can figure out what it is doing behind to scene in it's calculation?

Thanks

On Tue, Jun 18, 2019 at 3:23 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@anarazel.de> writes:
> Are those indexes used for other queries? Any chance they've been
> recently created?

> SELECT indexrelid::regclass, xmin, indcheckxmin, indisvalid, indisready,
> indislive, txid_current(), txid_current_snapshot()
> FROM pg_index WHERE indrelid = 'tc'::regclass;

> might tell us.

Oh, that's a good idea.

> Amin, might be worth to see what the query plan is if you disable that
> index. I assume it's too big to quickly drop (based on the ?

Considering that the "right" query plan would have a cost estimate in
the single digits or close to it, I have to suppose that the planner is
rejecting that index as unusable, not making a cost-based decision not
to use it.  (Well, maybe if it's bloated by three orders of magnitude
compared to the other indexes, it'd lose on cost.  Doesn't seem likely
though.)

So I think we're looking for a hard "can't use the index" reason, and
now we've eliminated datatype mismatch which'd be the most obvious
such reason.  But index-isnt-valid or index-isnt-ready might do the
trick.

                        regards, tom lane

Re: Incorrect index used in few cases..

From
AminPG Jaffer
Date:
Hi

I didn't see my following response got posted on the mailing list so not sure if it is duplicate.

Sorry for late reply.

The initial values before upgrade for seq_page_cost=1, random_page_cost=4 and after upgrading when we started to see the issues as we were seeing "Seq Scan" we change them  seq_page_cost=1, random_page_cost=1 

The issue happens only in production so making the index invalid would affect service so it isn't something we can do.  
I have tried to rebuild the PK index to see it helps or not but it doesn't seem help.

Related to the same issue we sometimes see following Seq Scan on update when querying by PK alone which appears to be related.

   update tc set...where id = $1 and version <$2
       Update on tc  (cost=10000000000.00..10003184001.52 rows=1 width=1848)
          ->  Seq Scan on tc  (cost=10000000000.00..10003184001.52 rows=1 width=1848)
                Filter: ((version < '38'::numeric) AND (id = '53670604704'::numeric))

I was trying to find where the cost=10000000000 is set in the source code but wasn't able to find it, do anyone where it is set?
And if you someone can point me to the code where it goes through the execution plans when SQL is sent i can try to go through the code to see if can figure out what it is doing behind to scene in it's calculation?

On Sun, Jun 23, 2019 at 8:07 AM AminPG Jaffer <aminjaffer.pg@gmail.com> wrote:

Sorry for late reply.

The initial values before upgrade for seq_page_cost=1, random_page_cost=4 and after upgrading when we started to see the issues as we were seeing "Seq Scan" we change them  seq_page_cost=1, random_page_cost=1 

The issue happens only in production so making the index invalid would affect service so it isn't something we can do.  
I have tried to rebuild the PK index to see it helps or not but it doesn't seem help.

Related to the same issue we sometimes see following Seq Scan on update when querying by PK alone which appears to be related.

   update tc set...where id = $1 and version <$2
       Update on tc  (cost=10000000000.00..10003184001.52 rows=1 width=1848)
          ->  Seq Scan on tc  (cost=10000000000.00..10003184001.52 rows=1 width=1848)
                Filter: ((version < '38'::numeric) AND (id = '53670604704'::numeric))

I was trying to find where the cost=10000000000 is set in the source code but wasn't able to find it, do anyone where it is set?
And if you someone can point me to the code where it goes through the execution plans when SQL is sent i can try to go through the code to see if can figure out what it is doing behind to scene in it's calculation?

Thanks

On Tue, Jun 18, 2019 at 3:23 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@anarazel.de> writes:
> Are those indexes used for other queries? Any chance they've been
> recently created?

> SELECT indexrelid::regclass, xmin, indcheckxmin, indisvalid, indisready,
> indislive, txid_current(), txid_current_snapshot()
> FROM pg_index WHERE indrelid = 'tc'::regclass;

> might tell us.

Oh, that's a good idea.

> Amin, might be worth to see what the query plan is if you disable that
> index. I assume it's too big to quickly drop (based on the ?

Considering that the "right" query plan would have a cost estimate in
the single digits or close to it, I have to suppose that the planner is
rejecting that index as unusable, not making a cost-based decision not
to use it.  (Well, maybe if it's bloated by three orders of magnitude
compared to the other indexes, it'd lose on cost.  Doesn't seem likely
though.)

So I think we're looking for a hard "can't use the index" reason, and
now we've eliminated datatype mismatch which'd be the most obvious
such reason.  But index-isnt-valid or index-isnt-ready might do the
trick.

                        regards, tom lane