On Sat, Jan 23, 2021 at 5:27 AM Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Thu, Jan 21, 2021 at 9:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > Slowing down non-HOT updaters in these extreme cases may actually be a
> > > good thing, even when bottom-up deletion finally becomes ineffective.
> > > It can be thought of as backpressure. I am not worried about slowing
> > > down something that is already hopelessly inefficient and
> > > unsustainable. I'd go even further than that, in fact -- I now wonder
> > > if we should *deliberately* slow them down some more!
> > >
> >
> > Do you have something specific in mind for this?
>
> Maybe something a bit like the VACUUM cost delay stuff could be
> applied at the point that we realize that a given bottom-up deletion
> pass is entirely ineffective purely due to a long running transaction,
> that gets applied by nbtree caller once it splits the page.
>
> This isn't something that I plan to work on anytime soon. My point was
> mostly that it really would make sense to deliberately throttle
> non-HOT updates at the point that they trigger page splits that are
> believed to be more or less caused by a long running transaction.
> They're so incredibly harmful to the general responsiveness of the
> system that having a last line of defense like that
> (backpressure/throttling) really does make sense.
>
> > I have briefly tried that but numbers were not consistent probably
> > because at that time autovacuum was also 'on'. So, I tried switching
> > off autovacuum and dropping/recreating the tables.
>
> It's not at all surprising that they weren't consistent. Clearly
> bottom-up deletion wastes cycles on the first execution (it is wasted
> effort in at least one sense) -- you showed that already. Subsequent
> executions will actually manage to delete some tuples (probably a
> great many tuples), and so will have totally different performance
> profiles/characteristics. Isn't that obvious?
>
Yeah, that sounds obvious, but what I remember happening is that at
some point during/before the second update, autovacuum kicks in
and removes the bloat incurred by the previous update. In a few cases,
autovacuum seems to clean up the bloat and yet we still seem to take
additional time, maybe because of some unhelpful cycles spent by
bottom-up clean-up in the new pass (like a second bulk-update for which
we can't clean up anything). Now, this is more of a speculation based on
a few runs, so I don't expect any response or any action based on it.
I need to spend more time on benchmarking to study the behavior, and I
think without that it would be difficult to draw a conclusion in this
regard. So, let's not consider any action on this front till I have
spent more time finding out the details.

I agree with the other points you mentioned in the email.
--
With Regards,
Amit Kapila.