Thread: Temporarily very slow planning time after a big delete
We had a mysterious (to us) slowdown today that I'm hoping someone can explain just based on PG's principles of operation. It got better by itself so it seems like it was "normal" behavior -- I just don't know what behavior it was exhibiting.
We have a table of user notifications containing about 80 million rows. It gets a lot of inserts continually, and is cleaned up once a day. There are no updates. In all history there have been about 330 million rows created.
Today we deleted about 15 million rows in one transaction from this table. Immediately afterwards, a particular SELECT started running very slowly -- 500 to 3000 ms rather than the usual <1ms.
We did an EXPLAIN ANALYZE on this select and it was still doing an index scan as usual. The *planning time* for the query is what had gotten slow. The query itself was still executing in <1ms.
Over the next few hours the time slowly improved, until it returned to the former performance. You can see a graph at https://imgur.com/a/zIfqkF5.
Is this sort of thing expected after a large delete, and if so, can someone explain the mechanism behind it? I've looked for an explanation of what could cause this kind of excess planning time and haven't found one. I'm hoping someone will just recognize what's going on here.
Here is the pg_class data for the table and index:
relname=notifications
relpages=2799880
reltuples=7.15229e+07
relallvisible=1219791
relkind=r
relnatts=11
relhassubclass=f
reloptions=
pg_table_size=22943326208
relname=index_notifications_on_person_id_and_created_at
relpages=473208
reltuples=7.03404e+07
relallvisible=0
relkind=i
relnatts=2
relhassubclass=f
reloptions=
pg_table_size=3877494784
Thanks,
Walter
On Tue, 21 May 2019 at 12:44, Walter Smith <walter@carezone.com> wrote:
> We had a mysterious (to us) slowdown today that I'm hoping someone can
> explain just based on PG's principles of operation. It got better by
> itself so it seems like it was "normal" behavior -- I just don't know
> what behavior it was exhibiting.
>
> We have a table of user notifications containing about 80 million rows.
> It gets a lot of inserts continually, and is cleaned up once a day.
> There are no updates. In all history there have been about 330 million
> rows created.
>
> Today we deleted about 15 million rows in one transaction from this
> table. Immediately afterwards, a particular SELECT started running very
> slowly -- 500 to 3000 ms rather than the usual <1ms.
>
> We did an EXPLAIN ANALYZE on this select and it was still doing an
> index scan as usual. The *planning time* for the query is what had
> gotten slow. The query itself was still executing in <1ms.

It would be good to know which version you're running here.

It basically sounds very much like get_actual_variable_range() will be
the culprit. Basically, if a constant value that the planner is using to
determine row estimates falls outside the statistics histogram, and a
btree index exists that can be used to look up the actual bound of the
data, then the planner does so in that function. If you've just deleted
a bunch of rows, then that index scan may have to traverse a bunch of
dead tuples before finding the first live tuple.

This code has changed a few times recently -- see commits fccebe421 and
3ca930fc3 -- which is why your version is of interest.

Apart from that, if you want to confirm that's the issue and you just
want it fixed, just VACUUM the table. You should likely be doing that
anyway, directly after your bulk delete.

-- 
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
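[Editorial note: the pattern David describes can be sketched as follows. This is an illustration only -- the DELETE predicate and the "90 days" cutoff are invented, not taken from the thread:]

```sql
-- Sketch: a bulk delete leaves dead tuples clustered at one end of an
-- index. Until they are vacuumed away, get_actual_variable_range() may
-- scan through them during planning.
BEGIN;
DELETE FROM notifications WHERE created_at < now() - interval '90 days';
COMMIT;

-- "Planning time" in EXPLAIN ANALYZE output balloons here, while
-- "Execution time" stays small.
EXPLAIN ANALYZE
SELECT * FROM notifications WHERE created_at > now() - interval '1 day';

-- Removing the dead index entries restores normal planning time.
VACUUM notifications;
```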
Walter Smith <walter@carezone.com> writes:
> Today we deleted about 15 million rows in one transaction from this table.
> Immediately afterwards, a particular SELECT started running very slowly --
> 500 to 3000 ms rather than the usual <1ms.
> We did an EXPLAIN ANALYZE on this select and it was still doing an index
> scan as usual. The *planning time* for the query is what had gotten slow.
> The query itself was still executing in <1ms.
> Over the next few hours the time slowly improved, until it returned to the
> former performance. You can see a graph at https://imgur.com/a/zIfqkF5.

Were the deleted rows all at one end of the index in question?

If so, this is likely down to the planner trying to use the index to
identify the extremal live value of the column, which it wants to know
in connection with planning mergejoins (so I'm assuming your problem
query involved a join on the indexed column --- whether or not the
final plan did a mergejoin, the planner would consider this).  As
long as there's a live value near the end of the index, this is pretty
cheap.  If the planner has to trawl through a bunch of dead entries
to find the nearest-to-the-end live one, not so much.

Subsequent vacuuming would eventually delete the dead index entries
and return things to normal; although usually the performance comes
back all-of-a-sudden at the next (auto)VACUUM of the table.  So I'm
a bit intrigued by your seeing it "gradually" improve.  Maybe you
had old open transactions that were limiting VACUUM's ability to
remove rows?

We've made a couple of rounds of adjustments of the behavior to try
to avoid/reduce this penalty, but since you didn't say what PG version
you're running, it's hard to tell whether an upgrade would help.

			regards, tom lane
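[Editorial note: whether (auto)vacuum has already processed the table, and how many dead tuples remain, can be checked with a standard statistics-view query like this (not from the thread):]

```sql
-- pg_stat_user_tables tracks dead/live tuple counts and the last time
-- each table was vacuumed, manually or by autovacuum.
SELECT relname, n_dead_tup, n_live_tup,
       last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'notifications';
```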
I'm so sorry -- I meant to give the version, of course. It's 9.6.13.
Thanks,
Walter
On Mon, May 20, 2019 at 6:05 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Walter Smith <walter@carezone.com> writes:
> > Today we deleted about 15 million rows in one transaction from this table.
> > Immediately afterwards, a particular SELECT started running very slowly --
> > 500 to 3000 ms rather than the usual <1ms.
> > We did an EXPLAIN ANALYZE on this select and it was still doing an index
> > scan as usual. The *planning time* for the query is what had gotten slow.
> > The query itself was still executing in <1ms.
> > Over the next few hours the time slowly improved, until it returned to the
> > former performance. You can see a graph at https://imgur.com/a/zIfqkF5.
>
> Were the deleted rows all at one end of the index in question?
> If so, this is likely down to the planner trying to use the index to
> identify the extremal live value of the column, which it wants to know
> in connection with planning mergejoins (so I'm assuming your problem
> query involved a join on the indexed column --- whether or not the
> final plan did a mergejoin, the planner would consider this).  As
> long as there's a live value near the end of the index, this is pretty
> cheap.  If the planner has to trawl through a bunch of dead entries
> to find the nearest-to-the-end live one, not so much.
>
> Subsequent vacuuming would eventually delete the dead index entries
> and return things to normal; although usually the performance comes
> back all-of-a-sudden at the next (auto)VACUUM of the table.  So I'm
> a bit intrigued by your seeing it "gradually" improve.  Maybe you
> had old open transactions that were limiting VACUUM's ability to
> remove rows?
>
> We've made a couple of rounds of adjustments of the behavior to try
> to avoid/reduce this penalty, but since you didn't say what PG version
> you're running, it's hard to tell whether an upgrade would help.
>
> 			regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes:
> I'm assuming your problem
> query involved a join on the indexed column --- whether or not the
> final plan did a mergejoin, the planner would consider this

There's no join -- the query is:
SELECT "notifications".*
FROM "notifications"
WHERE "notifications"."person_id" = ? AND
"notifications"."app_category" = ? AND
(id > ?)
ORDER BY created_at DESC
LIMIT ?
And the whole query plan is one step:
Index Scan using index_notifications_on_person_id_and_created_at on notifications (cost=0.57..212.16 rows=52 width=231)
> Subsequent vacuuming would eventually delete the dead index entries
> and return things to normal; although usually the performance comes
> back all-of-a-sudden at the next (auto)VACUUM of the table.  So I'm
> a bit intrigued by your seeing it "gradually" improve.  Maybe you
> had old open transactions that were limiting VACUUM's ability to
> remove rows?
We shouldn't have any long-running transactions at all, certainly not open for a couple of hours.
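[Editorial note: open transactions old enough to hold back VACUUM can be listed with a query along these lines (standard pg_stat_activity columns; the one-hour cutoff is an arbitrary example):]

```sql
-- Any session whose transaction started more than an hour ago could be
-- preventing VACUUM from removing dead tuples newer than that point.
SELECT pid, usename, state, xact_start, query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND xact_start < now() - interval '1 hour'
ORDER BY xact_start;
```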
Thanks,
Walter
On Tue, 21 May 2019 at 14:04, Walter Smith <walter@carezone.com> wrote:
> I'm so sorry -- I meant to give the version, of course. It's 9.6.13.

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=3ca930fc3
has been applied since then.

It would be good if you could confirm the problem is resolved after a
vacuum. Maybe run VACUUM VERBOSE on the table and double check
there's not some large amount of tuples that are "nonremovable".

-- 
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Mon, May 20, 2019 at 7:15 PM David Rowley <david.rowley@2ndquadrant.com> wrote:
> It would be good if you could confirm the problem is resolved after a
> vacuum. Maybe run VACUUM VERBOSE on the table and double check
> there's not some large amount of tuples that are "nonremovable".
As I say, the problem resolved itself over the next couple of hours. Perhaps something autovacuum did? Or if the index extrema hypothesis is correct, perhaps the new rows being inserted for various users slowly changed that situation?
I did a VACUUM overnight and got the following. The thing that stands out to me is that one index (index_unproc_notifications_on_notifiable_type) took 100x longer to scan than the others. That's not the index used in the slow query, though.
INFO: vacuuming "public.notifications"
INFO: scanned index "notifications_pkey" to remove 16596527 row versions
DETAIL: CPU 12.11s/11.04u sec elapsed 39.62 sec
INFO: scanned index "index_notifications_on_person_id_and_created_at" to remove 16596527 row versions
DETAIL: CPU 15.86s/49.85u sec elapsed 92.07 sec
INFO: scanned index "index_unproc_notifications_on_notifiable_type" to remove 16596527 row versions
DETAIL: CPU 224.08s/10934.81u sec elapsed 11208.37 sec
INFO: scanned index "index_notifications_on_person_id_id" to remove 16596527 row versions
DETAIL: CPU 11.58s/59.54u sec elapsed 91.40 sec
INFO: scanned index "index_notifications_on_about_id" to remove 16596527 row versions
DETAIL: CPU 11.70s/57.75u sec elapsed 87.81 sec
INFO: scanned index "index_notifications_on_notifiable_type_and_notifiable_id" to remove 16596527 row versions
DETAIL: CPU 19.95s/70.46u sec elapsed 126.08 sec
INFO: scanned index "index_notifications_on_created_at" to remove 16596527 row versions
DETAIL: CPU 5.87s/13.07u sec elapsed 30.69 sec
INFO: "notifications": removed 16596527 row versions in 2569217 pages
DETAIL: CPU 84.77s/35.24u sec elapsed 295.30 sec
INFO: index "notifications_pkey" now contains 56704023 row versions in 930088 pages
DETAIL: 902246 index row versions were removed.
570997 index pages have been deleted, 570906 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO: index "index_notifications_on_person_id_and_created_at" now contains 56704024 row versions in 473208 pages
DETAIL: 902246 index row versions were removed.
8765 index pages have been deleted, 8743 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "index_unproc_notifications_on_notifiable_type" now contains 56705182 row versions in 1549089 pages
DETAIL: 13354803 index row versions were removed.
934133 index pages have been deleted, 182731 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.02 sec.
INFO: index "index_notifications_on_person_id_id" now contains 56705323 row versions in 331156 pages
DETAIL: 16594039 index row versions were removed.
4786 index pages have been deleted, 1674 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO: index "index_notifications_on_about_id" now contains 56705325 row versions in 332666 pages
DETAIL: 16596527 index row versions were removed.
11240 index pages have been deleted, 2835 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "index_notifications_on_notifiable_type_and_notifiable_id" now contains 56705325 row versions in 666755 pages
DETAIL: 16596527 index row versions were removed.
52936 index pages have been deleted, 2693 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "index_notifications_on_created_at" now contains 56705331 row versions in 196271 pages
DETAIL: 14874162 index row versions were removed.
37884 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "notifications": found 890395 removable, 56698057 nonremovable row versions in 2797452 out of 2799880 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 29497175 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 452.97s/11252.42u sec elapsed 12186.90 sec.
INFO: vacuuming "pg_toast.pg_toast_27436"
INFO: index "pg_toast_27436_index" now contains 72 row versions in 2 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_27436": found 0 removable, 2 nonremovable row versions in 1 out of 36 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 3 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
Thanks,
Walter
On Tue, May 21, 2019 at 11:12 AM Walter Smith <walter@carezone.com> wrote:
> I did a VACUUM overnight and got the following. The thing that stands
> out to me is that one index (index_unproc_notifications_on_notifiable_type)
> took 100x longer to scan than the others. That's not the index used in
> the slow query, though.

What columns are indexed by
index_unproc_notifications_on_notifiable_type, and what are their
datatypes?

-- 
Peter Geoghegan
On Tue, May 21, 2019 at 11:15 AM Peter Geoghegan <pg@bowt.ie> wrote:
> On Tue, May 21, 2019 at 11:12 AM Walter Smith <walter@carezone.com> wrote:
> > I did a VACUUM overnight and got the following. The thing that stands out to me is that one index (index_unproc_notifications_on_notifiable_type) took 100x longer to scan than the others. That's not the index used in the slow query, though.
>
> What columns are indexed by
> index_unproc_notifications_on_notifiable_type, and what are their
> datatypes?
It occurs to me that this is a somewhat unusual index -- it tracks unprocessed notifications, so it gets an insert and a delete for every row, and is normally almost empty.
Index "public.index_unproc_notifications_on_notifiable_type"
Column | Type | Definition
-----------------+------------------------+-----------------
notifiable_type | character varying(255) | notifiable_type
btree, for table "public.notifications", predicate (processed = false)
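[Editorial note: from that \d output, the index was presumably created with DDL along these lines -- a reconstruction, since the original CREATE INDEX statement isn't shown in the thread:]

```sql
-- Partial btree index: only rows with processed = false are indexed,
-- which is why it is normally almost empty.
CREATE INDEX index_unproc_notifications_on_notifiable_type
    ON notifications (notifiable_type)
    WHERE processed = false;
```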
Thanks,
Walter
On Tue, May 21, 2019 at 11:16 AM Walter Smith <walter@carezone.com> wrote:
> It occurs to me that is a somewhat unusual index -- it tracks
> unprocessed notifications so it gets an insert and delete for every
> row, and is normally almost empty.

Is it a very low cardinality index? In other words, is the total number
of distinct keys rather low? Not just at any given time, but over time?

-- 
Peter Geoghegan
On Tue, May 21, 2019 at 11:17 AM Peter Geoghegan <pg@bowt.ie> wrote:
> On Tue, May 21, 2019 at 11:16 AM Walter Smith <walter@carezone.com> wrote:
> > It occurs to me that is a somewhat unusual index -- it tracks unprocessed notifications so it gets an insert and delete for every row, and is normally almost empty.
>
> Is it a very low cardinality index? In other words, is the total
> number of distinct keys rather low? Not just at any given time, but
> over time?
Very low. Probably less than ten over all time. I suspect the only use of the index is to rapidly find the processed=false rows, so the notifiable_type value isn’t important, really. It would probably work just as well on any other column.
— Walter
On Tue, May 21, 2019 at 11:27 AM Walter Smith <walter@carezone.com> wrote:
> Very low. Probably less than ten over all time. I suspect the only use
> of the index is to rapidly find the processed=false rows, so the
> notifiable_type value isn't important, really. It would probably work
> just as well on any other column.

This problem has been fixed in Postgres 12, which treats heap TID as a
tiebreaker column within B-Tree indexes. It sounds like you have the
right idea about how to work around the problem.

VACUUM will need to kill tuples in random locations in the low
cardinality index, since the order of tuples is unspecified between
duplicate tuples -- it is more or less random. VACUUM will tend to
dirty far more pages than is truly necessary in this scenario, because
there is no natural temporal locality that concentrates dead tuples in
one or two particular places in the index. This has a far more
noticeable impact on VACUUM duration than you might expect, since
autovacuum is throttled by delays that vary according to how many pages
were dirtied (and other such factors).

-- 
Peter Geoghegan
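[Editorial note: on releases before Postgres 12, one commonly suggested mitigation -- not something proposed in this thread -- is to append a high-cardinality column such as the primary key to the index, so that entries with equal keys have a deterministic order and dead tuples stay clustered rather than scattered; the index name below is invented:]

```sql
-- Sketch: with id as an explicit tiebreaker, equal-keyed entries sort
-- by insertion order, so VACUUM's index kills are far less scattered.
CREATE INDEX index_unproc_notifications_on_notifiable_type_id
    ON notifications (notifiable_type, id)
    WHERE processed = false;
```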