Thread: Slow planning, fast execution for particular 3-table query
We’re having trouble working out why the planning time for this particular query is slow (~2.5s vs 0.9ms execution time). As you can see below, there are only 3 tables involved so it’s hard to imagine what decisions the planner has to make that take so long. After 5 runs the prepared-statement code kicks in and it becomes quick, but it’s quite infuriating for the first 5 runs given the execution is so much faster.
Are you able to give any tips what might be taking so long (and how we might improve it)?
We read elsewhere that someone had a “catalog stats file leak”, which I’m taking to mean a big pg_statistic table. Ours is 10mb, which doesn’t seem particularly large to me, but I don’t have much context for it. https://www.postgresql.org/message-id/CABWW-d21z_WgawkjXFQQviqm16oAx0KQvR6bLkRxvYQmhdByfg%40mail.gmail.com
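(The 10mb figure comes from a quick size check along these lines, if anyone wants to compare:)

```sql
-- Rough check of the statistics catalog's on-disk size (~10 MB in our case)
SELECT pg_size_pretty(pg_relation_size('pg_statistic'));
```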
Other queries (with 3 or more tables) in the same db seem to be planning much quicker.
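(By "the prepared-statement code kicks in" I mean the standard plan-cache behaviour; a rough sketch of what our data layer does, with illustrative statement and parameter names:)

```sql
-- Sketch only: 'sub_by_binding' is a name made up for this example
PREPARE sub_by_binding (bigint, uuid) AS
SELECT subscription_binding.subscription_binding, subscription_binding.tid
FROM jackpot.binding
JOIN jackpot.subscription_binding USING (tid, binding)
WHERE tid = $1 AND binding_uuid = $2;

-- The first five executions are each planned individually (paying the ~2.5s);
-- after that the server may switch to a cached generic plan and skip planning.
EXECUTE sub_by_binding(2082003407, '4f61dcd5-97a0-4098-b9ae-c1546c31b2e6');
```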
The query:
explain (analyse)
SELECT subscription_binding.subscription_binding,
       subscription_binding.tid,
       subscription.subscription_uuid,
       subscription_binding.subscription_binding_uuid,
       binding.binding_uuid,
       subscription_binding.start_time,
       subscription_binding.end_time,
       subscription_binding.timezone,
       now() >= subscription_binding.start_time
           AND (subscription_binding.end_time IS NULL
                OR now() <= subscription_binding.end_time) AS active
FROM jackpot.binding
JOIN jackpot.subscription_binding USING (tid, binding)
JOIN jackpot.subscription USING (tid, subscription)
WHERE (tid = 2082003407)
  AND (binding_uuid = '4f61dcd5-97a0-4098-b9ae-c1546c31b2e6'::uuid)
OFFSET 0 LIMIT 1000;
                                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.29..25.38 rows=1 width=80) (actual time=0.770..0.771 rows=1 loops=1)
   ->  Nested Loop  (cost=1.29..25.38 rows=1 width=80) (actual time=0.770..0.771 rows=1 loops=1)
         ->  Nested Loop  (cost=0.86..16.91 rows=1 width=76) (actual time=0.697..0.698 rows=1 loops=1)
               ->  Index Scan using binding_tid_binding_uuid_key on binding  (cost=0.43..8.45 rows=1 width=28) (actual time=0.647..0.647 rows=1 loops=1)
                     Index Cond: ((tid = 2082003407) AND (binding_uuid = '4f61dcd5-97a0-4098-b9ae-c1546c31b2e6'::uuid))
               ->  Index Scan using subscription_binding_idx on subscription_binding  (cost=0.43..8.45 rows=1 width=64) (actual time=0.045..0.046 rows=1 loops=1)
                     Index Cond: ((tid = 2082003407) AND (binding = binding.binding))
         ->  Index Scan using subscription_pkey on subscription  (cost=0.43..8.45 rows=1 width=28) (actual time=0.068..0.068 rows=1 loops=1)
               Index Cond: ((tid = 2082003407) AND (subscription = subscription_binding.subscription))
 Planning time: 2429.682 ms
 Execution time: 0.914 ms
(11 rows)
Postgres version 9.5.19
Each of the tables has between 3-4 indexes, and all the indexes include tid as first parameter. No partitions, no sign of a stray replication slot / uncommitted transaction / prepared transaction that may be holding up autovac, no sign of bloated indexes.
TIA!
Best regards,
David Wheeler
General Manager Bali Office
Bali T +62 361 475 2333 M +62 819 3660 9180
Jl. Pura Mertasari No. 7, Sunset Road Abian Base
Kuta, Badung – Bali 80361, Indonesia
On Mon, 2019-11-04 at 03:04 +0000, David Wheeler wrote:
> We’re having trouble working out why the planning time for this particular query is slow
> (~2.5s vs 0.9ms execution time). As you can see below, there are only 3 tables involved
> so it’s hard to imagine what decisions the planner has to make that take so long. After
> 5 runs the prepared-statement code kicks in and it becomes quick, but it’s quite
> infuriating for the first 5 runs given the execution is so much faster.
>
> Are you able to give any tips what might be taking so long (and how we might improve it)?
> [...]
> Planning time: 2429.682 ms
>
> Execution time: 0.914 ms

Strange.

Are any of your catalog tables unusually large?

SELECT pg_relation_size(t.oid), t.relname
FROM pg_class AS t
   JOIN pg_namespace AS n ON t.relnamespace = n.oid
WHERE t.relkind = 'r'
ORDER BY pg_relation_size(t.oid) DESC
LIMIT 10;

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
I'm not sure what "unusually large" is, but they're all < 1mb which is a little larger than some of our other comparable databases (mostly <300kb) but seems reasonable to me.

Regards,
David

On 4/11/19, 3:37 pm, "Laurenz Albe" <laurenz.albe@cybertec.at> wrote:

> Strange.
>
> Are any of your catalog tables unusually large?
> [...]
David Wheeler wrote:
> I'm not sure what "unusually large" is, but they're all < 1mb which is a little larger
> than some of our other comparable databases (mostly <300kb) but seems reasonable to me.

I forgot the condition "AND n.nspname = 'pg_catalog'"...

But if all your tables are small, catalog bloat is probably not your problem.

Do you have many tables in the database?
Partitioning?

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
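For the record, the query with the forgotten condition restored:

```sql
SELECT pg_relation_size(t.oid), t.relname
FROM pg_class AS t
   JOIN pg_namespace AS n ON t.relnamespace = n.oid
WHERE t.relkind = 'r'
  AND n.nspname = 'pg_catalog'   -- the condition missing from the earlier version
ORDER BY pg_relation_size(t.oid) DESC
LIMIT 10;
```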
David Wheeler <dwheeler@dgitsystems.com> writes:
> We’re having trouble working out why the planning time for this
> particular query is slow (~2.5s vs 0.9ms execution time). As you can see
> below, there are only 3 tables involved so it’s hard to imagine what
> decisions the planner has to make that take so long.

I wonder whether this traces to the cost of trying to estimate the
largest/smallest value of an indexed column by looking into the index.
Normally that's pretty cheap, but if you have a lot of recently-inserted
or recently-deleted values at the end of the index, it can get painful.
AFAIR this only happens for columns that are equijoin keys, so the fact
that your query is a join is significant.

I'm not convinced that this is the problem, because it's a corner case
that few people hit.  To see this issue, you have to have recently
inserted or deleted a bunch of extremal values of the indexed join-key
column.  And the problem only persists until those values become known
committed-good, or known dead-to-everybody.  (Maybe you've got a
long-running transaction somewhere, postponing the dead-to-everybody
condition?)

> Postgres version 9.5.19

If this *is* the cause, v11 and up have a performance improvement that
you need:

https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=3ca930fc3

			regards, tom lane
> To see this issue, you have to have recently
> inserted or deleted a bunch of extremal values of the indexed join-key
> column. And the problem only persists until those values become known
> committed-good, or known dead-to-everybody. (Maybe you've got a
> long-running transaction somewhere, postponing the dead-to-everybody
> condition?)
There are no long-running transactions that have backend_xmin set in pg_stat_activity, if that's what you mean here. There are also no open prepared transactions or replication slots which I understand have a similar keeping-things-alive issue.
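(For completeness, those checks were roughly the following; a sketch, not the exact statements we ran:)

```sql
-- Transactions that could be holding back the xmin horizon
SELECT pid, backend_xmin, state, query
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC;

-- Open prepared transactions
SELECT * FROM pg_prepared_xacts;

-- Replication slots
SELECT slot_name, active, xmin FROM pg_replication_slots;
```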
These tables are biggish (hundreds of mb), but not changing so frequently that I'd expect large quantities of data to be inserted or deleted before autovac can get in there and clean it up. And certainly not in a single uncommitted transaction.
I'll try reindexing each of the tables just to make sure it's not strange index imbalance or something causing the issue.
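(The reindex would be along these lines, using the schema from the query above; REINDEX takes heavy locks, so we'd run it off-peak:)

```sql
REINDEX TABLE jackpot.binding;
REINDEX TABLE jackpot.subscription_binding;
REINDEX TABLE jackpot.subscription;
```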
Regards,
David
On 4/11/19, 4:01 pm, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
David Wheeler <dwheeler@dgitsystems.com> writes:
> We’re having trouble working out why the planning time for this
> particular query is slow (~2.5s vs 0.9ms execution time). As you can see
> below, there are only 3 tables involved so it’s hard to imagine what
> decisions the planner has to make that take so long.
I wonder whether this traces to the cost of trying to estimate the
largest/smallest value of an indexed column by looking into the index.
Normally that's pretty cheap, but if you have a lot of recently-inserted
or recently-deleted values at the end of the index, it can get painful.
AFAIR this only happens for columns that are equijoin keys, so the fact
that your query is a join is significant.
I'm not convinced that this is the problem, because it's a corner case
that few people hit. To see this issue, you have to have recently
inserted or deleted a bunch of extremal values of the indexed join-key
column. And the problem only persists until those values become known
committed-good, or known dead-to-everybody. (Maybe you've got a
long-running transaction somewhere, postponing the dead-to-everybody
condition?)
> Postgres version 9.5.19
If this *is* the cause, v11 and up have a performance improvement that
you need:
https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=3ca930fc3
regards, tom lane
>> I'll try reindexing each of the tables just to make sure it's not strange index imbalance or something causing the issue.
> I've seen this issue a few times - and reindex helps.
Alas our reindex doesn’t seem to have helped. I’m going to see if we can reproduce this on a non-prod environment so we can muck about a bit more. If we can reproduce it in a safe place, is there a tool we can use to get more info out of the query planner to find what it’s doing to take so long?
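(One thing we might try in the meantime is the planner's own resource-usage logging; just a guess at a useful starting point:)

```sql
-- Logs planner CPU time and memory usage to the server log
-- for statements planned in this session
SET log_planner_stats = on;
```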
Regards,
David
From: Pavel Stehule <pavel.stehule@gmail.com>
Date: Monday, 4 November 2019 at 4:53 pm
To: David Wheeler <dwheeler@dgitsystems.com>
Cc: Tom Lane <tgl@sss.pgh.pa.us>, "pgsql-performance@lists.postgresql.org" <pgsql-performance@lists.postgresql.org>, Cameron Redpath <credpath@dgitsystems.com>
Subject: Re: Slow planning, fast execution for particular 3-table query
On Mon, 4 Nov 2019 at 6:17, David Wheeler <dwheeler@dgitsystems.com> wrote:
> > To see this issue, you have to have recently
> > inserted or deleted a bunch of extremal values of the indexed join-key
> > column. And the problem only persists until those values become known
> > committed-good, or known dead-to-everybody. (Maybe you've got a
> > long-running transaction somewhere, postponing the dead-to-everybody
> > condition?)
>
> There are no long-running transactions that have backend_xmin set in pg_stat_activity, if that's what you mean here. There are also no open prepared transactions or replication slots which I understand have a similar keeping-things-alive issue.
>
> These tables are biggish (hundreds of mb), but not changing so frequently that I'd expect large quantities of data to be inserted or deleted before autovac can get in there and clean it up. And certainly not in a single uncommitted transaction.
>
> I'll try reindexing each of the tables just to make sure it's not strange index imbalance or something causing the issue.
I've seen this issue a few times - and reindex helps.
Pavel
On Mon, Nov 04, 2019 at 03:04:45AM +0000, David Wheeler wrote:
> Postgres version 9.5.19
> Each of the tables has between 3-4 indexes, and all the indexes include tid as first parameter.

On Mon, Nov 04, 2019 at 12:00:59AM -0500, Tom Lane wrote:
> If this *is* the cause, v11 and up have a performance improvement that
> you need:

But note that index definition will be prohibited since:

https://www.postgresql.org/docs/9.6/release-9-6.html
|Disallow creation of indexes on system columns, except for OID columns (David Rowley)
|Such indexes were never considered supported, and would very possibly misbehave since the system might change the system-column fields of a tuple without updating indexes. However, previously there were no error checks to prevent them from being created.

Justin
Is default_statistics_target set above default 100? I would assume that would reflect in the size of pg_statistic, but wanted to ask since increasing that from 100 to 1000 was the only time I have seen planning time explode. Are other queries slow to plan?
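A quick way to check, including any per-column overrides (attstattarget is -1 when a column just uses the default):

```sql
SHOW default_statistics_target;

-- Columns given an explicit target via ALTER TABLE ... ALTER COLUMN ... SET STATISTICS
SELECT attrelid::regclass, attname, attstattarget
FROM pg_attribute
WHERE attstattarget > 0;
```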
Looks like you’ve found it! Someone has set the target to 10k so that’s going to wildly increase planning time.
Thanks for your help! And thanks to the others who chipped in along the way 😊
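(For the archives: winding it back looked roughly like this. The database name is a placeholder, and the setting could equally live in postgresql.conf or on individual columns:)

```sql
-- 'ourdb' is a placeholder; new sessions pick up the lower target
ALTER DATABASE ourdb SET default_statistics_target = 100;

-- Then, in a fresh session, rebuild the (much smaller) statistics:
ANALYZE;
```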
Regards,
David
On Thu, 7 Nov 2019 at 11:59, Justin Pryzby <pryzby@telsasoft.com> wrote:
> On Mon, Nov 04, 2019 at 03:04:45AM +0000, David Wheeler wrote:
> > Postgres version 9.5.19
> > Each of the tables has between 3-4 indexes, and all the indexes include tid as first parameter.
>
> But note that index definition will be prohibited since:
>
> https://www.postgresql.org/docs/9.6/release-9-6.html
> |Disallow creation of indexes on system columns, except for OID columns (David Rowley)
> |Such indexes were never considered supported, and would very possibly misbehave since the system might change the system-column fields of a tuple without updating indexes. However, previously there were no error checks to prevent them from being created.

David will have meant the user column named "tid" rather than the
system column named "ctid".

--
David Rowley                 http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Nov 07, 2019 at 01:15:30PM +1300, David Rowley wrote:
> On Thu, 7 Nov 2019 at 11:59, Justin Pryzby <pryzby@telsasoft.com> wrote:
> > But note that index definition will be prohibited since:
> > [...]
>
> David will have meant the user column named "tid" rather than the
> system column named "ctid".

Ah.  And David must have meant David W :)

Justin