Thread: Temporarily very slow planning time after a big delete
We had a mysterious (to us) slowdown today that I'm hoping someone can explain just based on PG's principles of operation. It got better by itself so it seems like it was "normal" behavior -- I just don't know what behavior it was exhibiting.
We have a table of user notifications containing about 80 million rows. It gets a lot of inserts continually, and is cleaned up once a day. There are no updates. In all history there have been about 330 million rows created.
Today we deleted about 15 million rows in one transaction from this table. Immediately afterwards, a particular SELECT started running very slowly -- 500 to 3000 ms rather than the usual <1ms.
We did an EXPLAIN ANALYZE on this select and it was still doing an index scan as usual. The *planning time* for the query is what had gotten slow. The query itself was still executing in <1ms.
Over the next few hours the time slowly improved, until it returned to the former performance. You can see a graph at https://imgur.com/a/zIfqkF5.
Is this sort of thing expected after a large delete, and if so, can someone explain the mechanism behind it? I've looked for an explanation of what could cause this kind of excess planning time and haven't found one. I'm hoping someone will just recognize what's going on here.
Here is the pg_class data for the table and index:
relname=notifications
relpages=2799880
reltuples=7.15229e+07
relallvisible=1219791
relkind=r
relnatts=11
relhassubclass=f
reloptions=
pg_table_size=22943326208
relname=index_notifications_on_person_id_and_created_at
relpages=473208
reltuples=7.03404e+07
relallvisible=0
relkind=i
relnatts=2
relhassubclass=f
reloptions=
pg_table_size=3877494784
Thanks,
Walter
On Tue, 21 May 2019 at 12:44, Walter Smith <walter@carezone.com> wrote:
> We had a mysterious (to us) slowdown today that I'm hoping someone can
> explain just based on PG's principles of operation. It got better by
> itself so it seems like it was "normal" behavior -- I just don't know
> what behavior it was exhibiting.
>
> We have a table of user notifications containing about 80 million rows.
> It gets a lot of inserts continually, and is cleaned up once a day.
> There are no updates. In all history there have been about 330 million
> rows created.
>
> Today we deleted about 15 million rows in one transaction from this
> table. Immediately afterwards, a particular SELECT started running very
> slowly -- 500 to 3000 ms rather than the usual <1ms.
>
> We did an EXPLAIN ANALYZE on this select and it was still doing an
> index scan as usual. The *planning time* for the query is what had
> gotten slow. The query itself was still executing in <1ms.

It would be good to know which version you're running here.

It basically sounds very much like get_actual_variable_range() will be
the culprit. Basically, if a constant value that the planner is using to
determine row estimates falls outside the statistics histogram, and a
btree index exists that can be used to look up the actual bound of the
data, then the planner does so in that function. If you've just deleted
a bunch of rows, then that index scan may have to traverse a bunch of
dead tuples before finding the first live tuple.

This code has changed a few times recently -- see commits fccebe421 and
3ca930fc3 -- which is why your version is of interest.

Apart from that, if you want to confirm that's the issue and you just
want it fixed, just VACUUM the table. You should likely be doing that
anyway, directly after your bulk delete.

-- 
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
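[Editorial note: the pattern David describes can be sketched as follows. This is an illustration only -- the DELETE predicate and the "90 days" cutoff are invented, not taken from the thread:]

```sql
-- Sketch: a bulk delete leaves dead tuples clustered at one end of an
-- index. Until they are vacuumed away, get_actual_variable_range() may
-- scan through them during planning.
BEGIN;
DELETE FROM notifications WHERE created_at < now() - interval '90 days';
COMMIT;

-- "Planning time" in EXPLAIN ANALYZE output balloons here, while
-- "Execution time" stays small.
EXPLAIN ANALYZE
SELECT * FROM notifications WHERE created_at > now() - interval '1 day';

-- Removing the dead index entries restores normal planning time.
VACUUM notifications;
```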
Walter Smith <walter@carezone.com> writes:
> Today we deleted about 15 million rows in one transaction from this table.
> Immediately afterwards, a particular SELECT started running very slowly --
> 500 to 3000 ms rather than the usual <1ms.
> We did an EXPLAIN ANALYZE on this select and it was still doing an index
> scan as usual. The *planning time* for the query is what had gotten slow.
> The query itself was still executing in <1ms.
> Over the next few hours the time slowly improved, until it returned to the
> former performance. You can see a graph at https://imgur.com/a/zIfqkF5.

Were the deleted rows all at one end of the index in question?

If so, this is likely down to the planner trying to use the index to
identify the extremal live value of the column, which it wants to know
in connection with planning mergejoins (so I'm assuming your problem
query involved a join on the indexed column --- whether or not the
final plan did a mergejoin, the planner would consider this).  As
long as there's a live value near the end of the index, this is pretty
cheap.  If the planner has to trawl through a bunch of dead entries
to find the nearest-to-the-end live one, not so much.

Subsequent vacuuming would eventually delete the dead index entries
and return things to normal; although usually the performance comes
back all-of-a-sudden at the next (auto)VACUUM of the table.  So I'm
a bit intrigued by your seeing it "gradually" improve.  Maybe you
had old open transactions that were limiting VACUUM's ability to
remove rows?

We've made a couple of rounds of adjustments of the behavior to try
to avoid/reduce this penalty, but since you didn't say what PG version
you're running, it's hard to tell whether an upgrade would help.

			regards, tom lane
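[Editorial note: whether (auto)vacuum has already processed the table, and how many dead tuples remain, can be checked with a standard statistics-view query like this (not from the thread):]

```sql
-- pg_stat_user_tables tracks dead/live tuple counts and the last time
-- each table was vacuumed, manually or by autovacuum.
SELECT relname, n_dead_tup, n_live_tup,
       last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'notifications';
```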
I'm so sorry -- I meant to give the version, of course. It's 9.6.13.
Thanks,
Walter
On Mon, May 20, 2019 at 6:05 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Walter Smith <walter@carezone.com> writes:
> > Today we deleted about 15 million rows in one transaction from this table.
> > Immediately afterwards, a particular SELECT started running very slowly --
> > 500 to 3000 ms rather than the usual <1ms.
> > We did an EXPLAIN ANALYZE on this select and it was still doing an index
> > scan as usual. The *planning time* for the query is what had gotten slow.
> > The query itself was still executing in <1ms.
> > Over the next few hours the time slowly improved, until it returned to the
> > former performance. You can see a graph at https://imgur.com/a/zIfqkF5.
>
> Were the deleted rows all at one end of the index in question?
> If so, this is likely down to the planner trying to use the index to
> identify the extremal live value of the column, which it wants to know
> in connection with planning mergejoins (so I'm assuming your problem
> query involved a join on the indexed column --- whether or not the
> final plan did a mergejoin, the planner would consider this).  As
> long as there's a live value near the end of the index, this is pretty
> cheap.  If the planner has to trawl through a bunch of dead entries
> to find the nearest-to-the-end live one, not so much.
>
> Subsequent vacuuming would eventually delete the dead index entries
> and return things to normal; although usually the performance comes
> back all-of-a-sudden at the next (auto)VACUUM of the table.  So I'm
> a bit intrigued by your seeing it "gradually" improve.  Maybe you
> had old open transactions that were limiting VACUUM's ability to
> remove rows?
>
> We've made a couple of rounds of adjustments of the behavior to try
> to avoid/reduce this penalty, but since you didn't say what PG version
> you're running, it's hard to tell whether an upgrade would help.
>
> 			regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes:
> I'm assuming your problem
> query involved a join on the indexed column --- whether or not the
> final plan did a mergejoin, the planner would consider this

There's no join -- the query is:
SELECT "notifications".*
FROM "notifications"
WHERE "notifications"."person_id" = ? AND
"notifications"."app_category" = ? AND
(id > ?)
ORDER BY created_at DESC
LIMIT ?
And the whole query plan is one step:
Index Scan using index_notifications_on_person_id_and_created_at on notifications (cost=0.57..212.16 rows=52 width=231)
> Subsequent vacuuming would eventually delete the dead index entries
> and return things to normal; although usually the performance comes
> back all-of-a-sudden at the next (auto)VACUUM of the table.  So I'm
> a bit intrigued by your seeing it "gradually" improve.  Maybe you
> had old open transactions that were limiting VACUUM's ability to
> remove rows?
We shouldn't have any long-running transactions at all, certainly not open for a couple of hours.
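[Editorial note: open transactions old enough to hold back VACUUM can be listed with a query along these lines (standard pg_stat_activity columns; the one-hour cutoff is an arbitrary example):]

```sql
-- Any session whose transaction started more than an hour ago could be
-- preventing VACUUM from removing dead tuples newer than that point.
SELECT pid, usename, state, xact_start, query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND xact_start < now() - interval '1 hour'
ORDER BY xact_start;
```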
Thanks,
Walter
On Tue, 21 May 2019 at 14:04, Walter Smith <walter@carezone.com> wrote:
> I'm so sorry -- I meant to give the version, of course. It's 9.6.13.

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=3ca930fc3
has been applied since then.

It would be good if you could confirm the problem is resolved after a
vacuum. Maybe run VACUUM VERBOSE on the table and double check
there's not some large amount of tuples that are "nonremovable".

-- 
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Mon, May 20, 2019 at 7:15 PM David Rowley <david.rowley@2ndquadrant.com> wrote:
> It would be good if you could confirm the problem is resolved after a
> vacuum. Maybe run VACUUM VERBOSE on the table and double check
> there's not some large amount of tuples that are "nonremovable".
As I say, the problem resolved itself over the next couple of hours. Perhaps something autovacuum did? Or if the index extrema hypothesis is correct, perhaps the new rows being inserted for various users slowly changed that situation?
I did a VACUUM overnight and got the following. The thing that stands out to me is that one index (index_unproc_notifications_on_notifiable_type) took 100x longer to scan than the others. That's not the index used in the slow query, though.
INFO: vacuuming "public.notifications"
INFO: scanned index "notifications_pkey" to remove 16596527 row versions
DETAIL: CPU 12.11s/11.04u sec elapsed 39.62 sec
INFO: scanned index "index_notifications_on_person_id_and_created_at" to remove 16596527 row versions
DETAIL: CPU 15.86s/49.85u sec elapsed 92.07 sec
INFO: scanned index "index_unproc_notifications_on_notifiable_type" to remove 16596527 row versions
DETAIL: CPU 224.08s/10934.81u sec elapsed 11208.37 sec
INFO: scanned index "index_notifications_on_person_id_id" to remove 16596527 row versions
DETAIL: CPU 11.58s/59.54u sec elapsed 91.40 sec
INFO: scanned index "index_notifications_on_about_id" to remove 16596527 row versions
DETAIL: CPU 11.70s/57.75u sec elapsed 87.81 sec
INFO: scanned index "index_notifications_on_notifiable_type_and_notifiable_id" to remove 16596527 row versions
DETAIL: CPU 19.95s/70.46u sec elapsed 126.08 sec
INFO: scanned index "index_notifications_on_created_at" to remove 16596527 row versions
DETAIL: CPU 5.87s/13.07u sec elapsed 30.69 sec
INFO: "notifications": removed 16596527 row versions in 2569217 pages
DETAIL: CPU 84.77s/35.24u sec elapsed 295.30 sec
INFO: index "notifications_pkey" now contains 56704023 row versions in 930088 pages
DETAIL: 902246 index row versions were removed.
570997 index pages have been deleted, 570906 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO: index "index_notifications_on_person_id_and_created_at" now contains 56704024 row versions in 473208 pages
DETAIL: 902246 index row versions were removed.
8765 index pages have been deleted, 8743 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "index_unproc_notifications_on_notifiable_type" now contains 56705182 row versions in 1549089 pages
DETAIL: 13354803 index row versions were removed.
934133 index pages have been deleted, 182731 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.02 sec.
INFO: index "index_notifications_on_person_id_id" now contains 56705323 row versions in 331156 pages
DETAIL: 16594039 index row versions were removed.
4786 index pages have been deleted, 1674 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO: index "index_notifications_on_about_id" now contains 56705325 row versions in 332666 pages
DETAIL: 16596527 index row versions were removed.
11240 index pages have been deleted, 2835 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "index_notifications_on_notifiable_type_and_notifiable_id" now contains 56705325 row versions in 666755 pages
DETAIL: 16596527 index row versions were removed.
52936 index pages have been deleted, 2693 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "index_notifications_on_created_at" now contains 56705331 row versions in 196271 pages
DETAIL: 14874162 index row versions were removed.
37884 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "notifications": found 890395 removable, 56698057 nonremovable row versions in 2797452 out of 2799880 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 29497175 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 452.97s/11252.42u sec elapsed 12186.90 sec.
INFO: vacuuming "pg_toast.pg_toast_27436"
INFO: index "pg_toast_27436_index" now contains 72 row versions in 2 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_27436": found 0 removable, 2 nonremovable row versions in 1 out of 36 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 3 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
Thanks,
Walter
On Tue, May 21, 2019 at 11:12 AM Walter Smith <walter@carezone.com> wrote:
> I did a VACUUM overnight and got the following. The thing that stands
> out to me is that one index (index_unproc_notifications_on_notifiable_type)
> took 100x longer to scan than the others. That's not the index used in
> the slow query, though.

What columns are indexed by
index_unproc_notifications_on_notifiable_type, and what are their
datatypes?

-- 
Peter Geoghegan
On Tue, May 21, 2019 at 11:15 AM Peter Geoghegan <pg@bowt.ie> wrote:
> On Tue, May 21, 2019 at 11:12 AM Walter Smith <walter@carezone.com> wrote:
> > I did a VACUUM overnight and got the following. The thing that stands out to me is that one index (index_unproc_notifications_on_notifiable_type) took 100x longer to scan than the others. That's not the index used in the slow query, though.
>
> What columns are indexed by
> index_unproc_notifications_on_notifiable_type, and what are their
> datatypes?
It occurs to me that this is a somewhat unusual index -- it tracks unprocessed notifications, so it gets an insert and a delete for every row, and is normally almost empty.
Index "public.index_unproc_notifications_on_notifiable_type"
Column | Type | Definition
-----------------+------------------------+-----------------
notifiable_type | character varying(255) | notifiable_type
btree, for table "public.notifications", predicate (processed = false)
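[Editorial note: from that \d output, the index was presumably created with DDL along these lines -- a reconstruction, since the original CREATE INDEX statement isn't shown in the thread:]

```sql
-- Partial btree index: only rows with processed = false are indexed,
-- which is why it is normally almost empty.
CREATE INDEX index_unproc_notifications_on_notifiable_type
    ON notifications (notifiable_type)
    WHERE processed = false;
```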
Thanks,
Walter
On Tue, May 21, 2019 at 11:16 AM Walter Smith <walter@carezone.com> wrote:
> It occurs to me that is a somewhat unusual index -- it tracks
> unprocessed notifications so it gets an insert and delete for every
> row, and is normally almost empty.

Is it a very low cardinality index? In other words, is the total number
of distinct keys rather low? Not just at any given time, but over time?

-- 
Peter Geoghegan
On Tue, May 21, 2019 at 11:17 AM Peter Geoghegan <pg@bowt.ie> wrote:
> On Tue, May 21, 2019 at 11:16 AM Walter Smith <walter@carezone.com> wrote:
> > It occurs to me that is a somewhat unusual index -- it tracks unprocessed notifications so it gets an insert and delete for every row, and is normally almost empty.
>
> Is it a very low cardinality index? In other words, is the total
> number of distinct keys rather low? Not just at any given time, but
> over time?
Very low. Probably less than ten over all time. I suspect the only use of the index is to rapidly find the processed=false rows, so the notifiable_type value isn’t important, really. It would probably work just as well on any other column.
— Walter
On Tue, May 21, 2019 at 11:27 AM Walter Smith <walter@carezone.com> wrote:
> Very low. Probably less than ten over all time. I suspect the only use
> of the index is to rapidly find the processed=false rows, so the
> notifiable_type value isn't important, really. It would probably work
> just as well on any other column.

This problem has been fixed in Postgres 12, which treats heap TID as a
tiebreaker column within B-Tree indexes. It sounds like you have the
right idea about how to work around the problem.

VACUUM will need to kill tuples in random locations in the low
cardinality index, since the order of tuples is unspecified between
duplicate tuples -- it is more or less random. VACUUM will tend to
dirty far more pages than is truly necessary in this scenario, because
there is no natural temporal locality that concentrates dead tuples in
one or two particular places in the index. This has a far more
noticeable impact on VACUUM duration than you might expect, since
autovacuum is throttled by delays that vary according to how many pages
were dirtied (and other such factors).

-- 
Peter Geoghegan
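[Editorial note: on releases before Postgres 12, one commonly suggested mitigation -- not something proposed in this thread -- is to append a high-cardinality column such as the primary key to the index, so that entries with equal keys have a deterministic order and dead tuples stay clustered rather than scattered; the index name below is invented:]

```sql
-- Sketch: with id as an explicit tiebreaker, equal-keyed entries sort
-- by insertion order, so VACUUM's index kills are far less scattered.
CREATE INDEX index_unproc_notifications_on_notifiable_type_id
    ON notifications (notifiable_type, id)
    WHERE processed = false;
```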