Re: WAL insert delay settings - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: WAL insert delay settings
Date
Msg-id 20190221002015.GD6197@tamriel.snowman.net
In response to Re: WAL insert delay settings  (Andres Freund <andres@anarazel.de>)
Responses Re: WAL insert delay settings
List pgsql-hackers
Greetings,

* Andres Freund (andres@anarazel.de) wrote:
> On 2019-02-20 18:46:09 -0500, Stephen Frost wrote:
> > * Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:
> > > On 2/20/19 10:43 PM, Stephen Frost wrote:
> > > > Just to share a few additional thoughts after pondering this for a
> > > > while, but the comment Andres made up-thread really struck a chord- we
> > > > don't necessarily want to throttle anything, what we'd really rather do
> > > > is *prioritize* things, whereby foreground work (regular queries and
> > > > such) has a higher priority than background/bulk work (VACUUM, REINDEX,
> > > > etc) but otherwise we use the system to its full capacity.  We don't
> > > > actually want to throttle a VACUUM run any more than a CREATE INDEX, we
> > > > just don't want those to hurt the performance of regular queries that
> > > > are happening.
> > >
> > > I think you're forgetting the motivation of this very patch was to
> > > prevent replication lag caused by a command generating large amounts of
> > > WAL (like CREATE INDEX / ALTER TABLE etc.). That has almost nothing to
> > > do with prioritization or foreground/background split.
> > >
> > > I'm not arguing against the ability to prioritize stuff, but I disagree
> > > that it somehow replaces throttling.
> >
> > Why is replication lag an issue though?  I would contend it's an issue
> > because with sync replication, it makes foreground processes wait, and
> > with async replication, it makes the actions of foreground processes
> > show up late on the replicas.
>
> I think reaching the bandwidth limit of either the replication stream,
> or of the startup process is actually more common than these. And for
> that prioritization doesn't help, unless it somehow reduces the total
> amount of WAL.

The issue with hitting those bandwidth limits is that you end up with
queues outside of your control and therefore are unable to prioritize
the data going through them.  I agree, that's an issue and it might be
necessary to ask the admin to provide what the bandwidth limit is, so
that we could then avoid running into issues with downstream queues that
are outside of our control causing unexpected/unacceptable lag.
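
To make the queueing point concrete, here is a toy sketch in Python. It is
purely illustrative and is not how the WAL sender works: the function name,
the record tuples, and the bandwidth_limit parameter are all hypothetical,
and it ignores the fact that WAL is streamed strictly in order. It only
compares how long small foreground items wait when a bulk item sits ahead of
them in a plain FIFO queue (the downstream case we can't control) versus a
queue we hold locally and can reorder, given an admin-provided limit.

```python
import math


def completion_ticks(records, bandwidth_limit, prioritize):
    """Return {name: tick at which the record has been fully sent}.

    records: list of (name, kind, size_in_bytes), all queued at tick 0.
    bandwidth_limit: bytes the link carries per tick (hypothetical,
        admin-provided value).
    prioritize: True  -> foreground records are sent first (local queue);
                False -> strict FIFO (a downstream queue we don't control).
    """
    order = list(records)
    if prioritize:
        # Stable sort: foreground first, arrival order kept within each kind.
        order.sort(key=lambda rec: rec[1] != "foreground")

    sent = 0
    done = {}
    for name, _kind, size in order:
        sent += size
        done[name] = math.ceil(sent / bandwidth_limit)
    return done


# Hypothetical workload: one bulk operation queued ahead of two small commits.
records = [
    ("create_index", "background", 900),
    ("commit_1", "foreground", 50),
    ("commit_2", "foreground", 50),
]

fifo = completion_ticks(records, bandwidth_limit=100, prioritize=False)
prio = completion_ticks(records, bandwidth_limit=100, prioritize=True)
print("FIFO:       ", fifo)
print("Prioritized:", prio)
```

In this toy model the total time to drain everything is the same either way;
only the wait seen by the foreground items changes, which is the distinction
between prioritizing and throttling being drawn here.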

> > If the actual WAL records for the foreground processes got priority and
> > were pushed out earlier than the background ones, that would eliminate
> > both of those issues with replication lag.  Perhaps there are other issues
> > that replication lag causes but which aren't solved by prioritizing the
> > actual WAL records that you care about getting to the replicas faster,
> > but if so, I'd like to hear what those are.
>
> Wait, what? Are you actually suggesting that different sources of WAL
> records should be streamed out in different order? You're blowing a
> somewhat reasonably doable project up into "let's rewrite a large chunk
> of all of the durability layer in postgres".
>
> Stephen, we gotta stop blowing up projects into something that can't
> ever realistically be finished.

I started this sub-thread specifically the way I did because I was
trying to make it clear that these were just ideas for possible
discussion- I'm *not* suggesting, nor saying, that we have to go
implement this right now instead of implementing the throttling that
started this thread.  I'm also, to be clear, not objecting to
implementing the throttling discussed (though, as mentioned but
seemingly ignored, I'd see it maybe configurable in different ways than
originally suggested).

If there's a way I can get that across more clearly than saying "Just to
share a few additional thoughts", I'm happy to try and do so, but I
don't agree that I should be required to simply keep such thoughts to
myself; indeed, I'll admit that I don't know how large a project this
would actually be and while I figured it'd be *huge*, I wanted to share
the thought in case someone might see a way that we could implement it
with much less work and have a better solution as a result.

Thanks!

Stephen
