Re: [Patch] ALTER SYSTEM READ ONLY - Mailing list pgsql-hackers

From Amul Sul
Subject Re: [Patch] ALTER SYSTEM READ ONLY
Msg-id CAAJ_b97LWmMO=9a9XP7GsjBMH+Yi7w06F18qMTFZTRrVjNpREg@mail.gmail.com
In response to Re: [Patch] ALTER SYSTEM READ ONLY  (Soumyadeep Chakraborty <soumyadeep2007@gmail.com>)
Responses Re: [Patch] ALTER SYSTEM READ ONLY
List pgsql-hackers
On Thu, Jul 23, 2020 at 3:33 AM Soumyadeep Chakraborty
<soumyadeep2007@gmail.com> wrote:
>
> Hello,
>
> I think we should really term this feature, as it stands, as a means to
> solely stop WAL writes from happening.
>

True.

> The feature doesn't truly make the system read-only (e.g. dirty buffer
> flushes may still occur after the system is put into a read-only state),
> which does make it confusing to a degree.
>
> Ideally, if we were to have a read-only system, we should be able to run
> pg_checksums on it, or take file-system snapshots etc, without the need
> to shut down the cluster. It would also enable an interesting use case:
> we should also be able to do a live upgrade on any running cluster and
> entertain read-only queries at the same time, given that all the
> cluster's files will be immutable?
>

Read-only here applies to queries.

The aim of this feature is to prevent new WAL records from being generated, not
to prevent existing ones from being flushed to disk, streamed to standbys, or
anything else. Everything else should happen as normal.

If you can't flush WAL, then you might not be able to evict some number of
buffers, which in the worst case could be large. That's because you can't evict
a dirty buffer until WAL has been flushed up to the buffer's LSN (otherwise,
you wouldn't be following the WAL-before-data rule). And having a potentially
large number of unevictable buffers around sounds terrible, not only for
performance, but also for having the system keep working at all.
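
To illustrate the constraint above, here is a minimal sketch in C. All of the
names (SketchWal, SketchBuffer, sketch_evict, and so on) are hypothetical
stand-ins invented for this example; they are not the buffer manager's or the
patch's actual structures. The sketch only assumes what is stated above: new
WAL records are refused once WAL is prohibited, and a dirty buffer can be
written out only after WAL has been flushed up to its LSN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not PostgreSQL's real structures. */
typedef uint64_t Lsn;

typedef struct {
    Lsn  page_lsn;  /* LSN of the last WAL record that modified this page */
    bool dirty;     /* page has changes not yet written to disk */
} SketchBuffer;

typedef struct {
    Lsn  insert_lsn;      /* end of WAL generated so far */
    Lsn  flushed_lsn;     /* end of WAL known to be on disk */
    bool wal_prohibited;  /* set once the system is made read-only */
} SketchWal;

/* Generating a new WAL record is the only thing the read-only state blocks. */
static bool sketch_wal_insert(SketchWal *wal, SketchBuffer *buf, Lsn record_len)
{
    if (wal->wal_prohibited)
        return false;
    wal->insert_lsn += record_len;
    buf->page_lsn = wal->insert_lsn;
    buf->dirty = true;
    return true;
}

/* Flushing WAL that was already generated; never moves flushed_lsn backward. */
static void sketch_wal_flush(SketchWal *wal, Lsn upto)
{
    assert(upto <= wal->insert_lsn);
    if (upto > wal->flushed_lsn)
        wal->flushed_lsn = upto;
}

/*
 * WAL-before-data: a dirty page may be written out only after WAL has been
 * flushed up to the page's LSN. If flushing is not allowed, the buffer
 * cannot be evicted and the function returns false.
 */
static bool sketch_evict(SketchWal *wal, SketchBuffer *buf, bool wal_flush_allowed)
{
    if (buf->dirty) {
        if (buf->page_lsn > wal->flushed_lsn) {
            if (!wal_flush_allowed)
                return false;           /* buffer stays pinned in memory */
            sketch_wal_flush(wal, buf->page_lsn);
        }
        buf->dirty = false;             /* stands in for writing the page */
    }
    return true;
}
```

With wal_flush_allowed set to false, every buffer dirtied before the switch to
read-only would stay unevictable, which is the situation described above.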

> So if we are not going to address those cases, we should change the
> syntax and remove the notion of read-only. It could be:
>
> ALTER SYSTEM SET wal_writes TO off|on;
> or
> ALTER SYSTEM SET prohibit_wal TO off|on;
>
> If we are going to try to make it truly read-only, and cater to the
> other use cases, we have to:
>
> Perform a checkpoint before declaring the system read-only (i.e. before
> the command returns). This may be expensive of course, as Andres has
> pointed out in this thread, but it is a price that has to be paid. If we
> do this checkpoint, then we can avoid an additional shutdown checkpoint
> and an end-of-recovery checkpoint (if we restart the primary after a
> crash while in read-only mode). Also, we would have to prevent any
> operation that touches control files, which I am not sure we do today in
> the current patch.
>

The intention is to switch the system to read-only as quickly as possible; a
checkpoint would make that much slower.

I don't think we can skip the control file updates, since they are needed to
make the read-only state persistent across a restart.
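
As a rough sketch of why the control file still has to be written, the C
fragment below stores a single flag in a small file and reads it back, the way
a restarted server would need to. The struct layout, the magic number, and the
function names are all assumptions made up for this example; they do not
reflect the real pg_control format or the patch's code, and a real
implementation would also need to fsync the file.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical on-disk layout, for illustration only. */
#define SKETCH_MAGIC 0x50524F48u

typedef struct {
    unsigned int  magic;           /* guards against reading a foreign file */
    unsigned char wal_prohibited;  /* 1 if the system was made read-only */
} SketchControlFile;

/* Persist the flag so that it survives a restart. */
static bool sketch_write_control(const char *path, bool wal_prohibited)
{
    SketchControlFile cf;
    FILE *fp;
    bool ok;

    memset(&cf, 0, sizeof cf);     /* zero padding bytes before writing */
    cf.magic = SKETCH_MAGIC;
    cf.wal_prohibited = wal_prohibited ? 1 : 0;

    fp = fopen(path, "wb");
    if (fp == NULL)
        return false;
    ok = fwrite(&cf, sizeof cf, 1, fp) == 1;
    if (fclose(fp) != 0)
        ok = false;
    return ok;
}

/* Read the flag back, as startup code would after a restart. */
static bool sketch_read_control(const char *path, bool *wal_prohibited)
{
    SketchControlFile cf;
    FILE *fp = fopen(path, "rb");
    bool ok;

    if (fp == NULL)
        return false;
    ok = fread(&cf, sizeof cf, 1, fp) == 1 && cf.magic == SKETCH_MAGIC;
    fclose(fp);
    if (ok)
        *wal_prohibited = cf.wal_prohibited != 0;
    return ok;
}
```

If the write were skipped, the flag would exist only in memory and the server
would come back up in read-write mode after a restart.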

> Why not have the best of both worlds? Consider:
>
> ALTER SYSTEM SET read_only to {off, on, wal};
>
> -- on: wal writes off + no writes to disk
> -- off: default
> -- wal: only wal writes off
>
> Of course, there can probably be better syntax for the above.
>

Sure, thanks for the suggestions. Changing the syntax is not the hard part; we
can choose a better one later.

Regards,
Amul


