Re: WAL insert delay settings - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: WAL insert delay settings
Msg-id 20190221105025.GI6197@tamriel.snowman.net
In response to Re: WAL insert delay settings  (Ants Aasma <ants.aasma@eesti.ee>)
Responses Re: WAL insert delay settings
List pgsql-hackers
Greetings,

* Ants Aasma (ants.aasma@eesti.ee) wrote:
> On Thu, Feb 21, 2019 at 2:20 AM Stephen Frost <sfrost@snowman.net> wrote:
> > The issue with hitting those bandwidth limits is that you end up with
> > queues outside of your control and therefore are unable to prioritize
> > the data going through them.  I agree, that's an issue and it might be
> > necessary to ask the admin to provide what the bandwidth limit is, so
> > that we could then avoid running into issues with downstream queues that
> > are outside of our control causing unexpected/unacceptable lag.
>
> If there is a global rate limit on WAL throughput it could be adjusted by a
> control loop, measuring replication queue length and/or apply delay. I
> don't see any sane way how one would tune a per command rate limit, or even
> worse, a cost-delay parameter. It would have the same problems as work_mem
> settings.

Yeah, having some kind of feedback loop would be interesting.  I agree
that a per-command rate limit would have similar problems to work_mem,
and that's definitely one problem we have with the way VACUUM is tuned
today, but the ship has more-or-less sailed on that; I don't think we're
going to be able to simply remove the VACUUM settings.  Avoiding new
per-command settings would be good though, if we can sort out how.
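
To make the feedback-loop idea a bit more concrete, here is a rough
sketch of what such a control loop might look like.  It is only an
illustration: the additive-increase/multiplicative-decrease policy, the
class name, and every parameter are assumptions of mine, not anything
from the patch or from PostgreSQL itself.

```python
from dataclasses import dataclass


@dataclass
class WalRateController:
    """Hypothetical AIMD controller for a global WAL rate limit (bytes/sec)."""

    rate: float                    # current allowed WAL rate
    min_rate: float                # never throttle below this
    max_rate: float                # never allow more than this
    target_lag: float              # acceptable replication lag, in seconds
    increase_step: float           # additive increase when lag is acceptable
    decrease_factor: float = 0.5   # multiplicative decrease when lag is too high

    def update(self, measured_lag: float) -> float:
        """Adjust the rate limit from one lag measurement and return it."""
        if measured_lag > self.target_lag:
            # The replica is falling behind: back off quickly.
            self.rate *= self.decrease_factor
        else:
            # The replica is keeping up: probe for more throughput slowly.
            self.rate += self.increase_step
        # Keep the limit inside the configured bounds.
        self.rate = max(self.min_rate, min(self.max_rate, self.rate))
        return self.rate
```

The point is just that a single global limit gives the loop one knob to
turn, which is what makes this easier to reason about than a set of
per-command limits.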

> Rate limit in front of WAL insertion would allow for allocating the
> throughput between foreground and background tasks, and even allow for
> priority inheritance to alleviate priority inversion due to locks.

I'm not sure how much we have to worry about priority inversion here as
you need to have conflicts for that and if there's actually a conflict,
then it seems like we should just press on.

That is, a non-concurrent REINDEX is going to prevent an UPDATE from
modifying anything in the table, which if the UPDATE is a higher
priority than the REINDEX would be priority inversion, but that doesn't
mean we should slow down the REINDEX to allow the UPDATE to happen
because the UPDATE simply can't happen until the REINDEX is complete.
Now, we might slow down the REINDEX because there's UPDATEs against
*other* tables that aren't conflicting and we want those UPDATEs to be
prioritized over the REINDEX but then that isn't priority inversion.

Basically, I'm not sure that there's anything we can do, or need to do,
differently from what we do today when it comes to priority inversion
risk, at least as it relates to this discussion.  There's an interesting
discussion to be had about whether we should delay the REINDEX taking
the lock at all when there's an UPDATE pending, but in that case you run
the risk of starving the REINDEX from ever getting the lock and being
able to run.  A better approach is what we're already working on:
arrange for the REINDEX to not require a conflicting lock, so that both
can run concurrently.

> There is also an implicit assumption here that a maintenance command is a
> background task and a normal DML query is a foreground task. This is not
> true for all cases, users may want to throttle transactions doing lots of
> DML to keep synchronous commit latencies for smaller transactions within
> reasonable limits.

Agreed, that was something that I was contemplating too, and one could
possibly argue in the other direction as well (maybe that REINDEX is on
a small table but has a very high priority and we're willing to accept
that some regular DML is delayed a bit to allow that REINDEX to finish).
Still, I would think we'd basically want to use the heuristic that DDL
is bulk and DML is a higher priority for a baseline/default position,
but then provide users with a way to change the priority on a
per-session level, presumably with a GUC or similar, if they have a case
where that heuristic is wrong.
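
As a rough illustration of that baseline-plus-override idea, something
along these lines is what I have in mind.  The priority levels, the list
of commands treated as bulk, and the session_override argument (standing
in for whatever GUC we might add) are all made up for the example.

```python
from enum import IntEnum
from typing import Optional


class WalPriority(IntEnum):
    BULK = 0        # throttled first when WAL bandwidth is scarce
    FOREGROUND = 1  # throttled last


# Illustrative only: commands treated as bulk by the default heuristic.
BULK_COMMANDS = {"REINDEX", "CREATE INDEX", "CLUSTER", "ALTER TABLE"}


def effective_priority(
    command_tag: str,
    session_override: Optional[WalPriority] = None,
) -> WalPriority:
    """Return the WAL priority for a command.

    A per-session override (a hypothetical GUC) wins when set; otherwise
    fall back to the heuristic that DDL is bulk and everything else is
    foreground work.
    """
    if session_override is not None:
        return session_override
    if command_tag.upper() in BULK_COMMANDS:
        return WalPriority.BULK
    return WalPriority.FOREGROUND
```

With that shape, the urgent REINDEX on a small table and the big
throttled DML transaction are both just a session setting the override
before running the command.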

Again, just to be clear, this is all really 'food for thought' and
interesting discussion and shouldn't keep us from doing something simple
now, if we can, to help alleviate the immediate practical issue that
bulk commands can cause serious WAL lag.  I think it's good to have
these discussions since they may help us to craft the simple solution in
a way that could later be extended (or at least won't get in the way)
for these much larger changes, but even if that's not possible, we
should be open to accepting a simpler, short-term, improvement, as these
larger changes would very likely be multiple major releases away if
they're able to be done at all.

> As a wild idea for how to handle the throttling, what if when all our wal
> insertion credits are used up XLogInsert() sets InterruptPending and the
> actual sleep is done inside ProcessInterrupts()?

This comment might be better if it were made higher up in the thread,
closer to where the discussion was happening about the issues with
critical sections and the current patch's approach for throttle-based
rate limiting.  I'm afraid that it might get lost in this sub-thread
about these much larger and loftier ideas around where we might want to
go in the future.
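
For anyone skimming, the shape of the quoted idea is roughly the toy
model below.  It is not PostgreSQL code: the class, the credit
accounting, and the refill policy are assumptions for illustration, and
the two methods only stand in for XLogInsert() and ProcessInterrupts().

```python
from typing import Callable


class WalThrottle:
    """Toy model of deferring the throttle sleep to an interrupt check."""

    def __init__(self, credits_per_refill: int,
                 sleep_fn: Callable[[], None] = lambda: None) -> None:
        self.credits_per_refill = credits_per_refill
        self.credits = credits_per_refill
        self.throttle_pending = False  # stands in for InterruptPending
        self.sleeps = 0                # number of sleeps taken so far
        self._sleep = sleep_fn         # a real version would actually sleep

    def wal_insert(self, nbytes: int) -> None:
        """Stand-in for XLogInsert(): charge credits, never sleep here."""
        self.credits -= nbytes
        if self.credits <= 0:
            # Out of credits: only record that a sleep is owed.
            self.throttle_pending = True

    def process_interrupts(self) -> None:
        """Stand-in for ProcessInterrupts(): pay off the owed sleep."""
        if not self.throttle_pending:
            return
        # Sleep once per refill until the credit debt is paid off.
        while self.credits <= 0:
            self._sleep()
            self.sleeps += 1
            self.credits += self.credits_per_refill
        self.throttle_pending = False
```

The attraction is that the insert path itself never sleeps, so the delay
would only ever happen at a point where we already accept being
interrupted.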

Thanks!

Stephen
