Re: [Patch] ALTER SYSTEM READ ONLY - Mailing list pgsql-hackers
From | Amul Sul |
---|---|
Subject | Re: [Patch] ALTER SYSTEM READ ONLY |
Date | |
Msg-id | CAAJ_b97LWmMO=9a9XP7GsjBMH+Yi7w06F18qMTFZTRrVjNpREg@mail.gmail.com |
In response to | Re: [Patch] ALTER SYSTEM READ ONLY (Soumyadeep Chakraborty <soumyadeep2007@gmail.com>) |
Responses | Re: [Patch] ALTER SYSTEM READ ONLY |
List | pgsql-hackers |
On Thu, Jul 23, 2020 at 3:33 AM Soumyadeep Chakraborty
<soumyadeep2007@gmail.com> wrote:
>
> Hello,
>
> I think we should really term this feature, as it stands, as a means to
> solely stop WAL writes from happening.
>

True.

> The feature doesn't truly make the system read-only (e.g. dirty buffer
> flushes may succeed the system being put into a read-only state), which
> does make it confusing to a degree.
>
> Ideally, if we were to have a read-only system, we should be able to run
> pg_checksums on it, or take file-system snapshots etc, without the need
> to shut down the cluster. It would also enable an interesting use case:
> we should also be able to do a live upgrade on any running cluster and
> entertain read-only queries at the same time, given that all the
> cluster's files will be immutable?
>

Read-only applies to the queries. The aim of this feature is to prevent
new WAL records from being generated, not to prevent them from being
flushed to disk, streamed to standbys, or anything else. The rest should
happen as normal.

If you can't flush WAL, then you might not be able to evict some number
of buffers, which in the worst case could be large. That's because you
can't evict a dirty buffer until WAL has been flushed up to the buffer's
LSN (otherwise, you wouldn't be following the WAL-before-data rule). And
having a potentially large number of unevictable buffers around sounds
terrible, not only for performance, but also for having the system keep
working at all.

> So if we are not going to address those cases, we should change the
> syntax and remove the notion of read-only. It could be:
>
> ALTER SYSTEM SET wal_writes TO off|on;
> or
> ALTER SYSTEM SET prohibit_wal TO off|on;
>
> If we are going to try to make it truly read-only, and cater to the
> other use cases, we have to:
>
> Perform a checkpoint before declaring the system read-only (i.e. before
> the command returns). This may be expensive of course, as Andres has
> pointed out in this thread, but it is a price that has to be paid. If we
> do this checkpoint, then we can avoid an additional shutdown checkpoint
> and an end-of-recovery checkpoint (if we restart the primary after a
> crash while in read-only mode). Also, we would have to prevent any
> operation that touches control files, which I am not sure we do today in
> the current patch.
>

The intention is to change the system to read-only ASAP; a checkpoint
would make that much slower.

I don't think we can skip control file updates, since they are needed to
make the read-only state persistent across a restart.

> Why not have the best of both worlds? Consider:
>
> ALTER SYSTEM SET read_only to {off, on, wal};
>
> -- on: wal writes off + no writes to disk
> -- off: default
> -- wal: only wal writes off
>
> Of course, there can probably be better syntax for the above.
>

Sure, thanks for the suggestions. Changing the syntax is not the hard
part; we can choose the better one later.

Regards,
Amul
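
A toy sketch of the WAL-before-data argument made above, in Python rather than PostgreSQL's C, and not taken from the patch. Every name here (Buffer, ToyWal, wal_prohibited, modify, write_out, allow_wal_flush) is made up for illustration. It assumes only what the mail states: new WAL records are refused once the system is read only, flushing already-generated WAL stays allowed, and a dirty buffer cannot be written out until WAL is flushed up to its LSN.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    page_id: int
    lsn: int = 0         # LSN of the last WAL record that touched this page
    dirty: bool = False


@dataclass
class ToyWal:
    insert_lsn: int = 0           # end of WAL generated so far
    flushed_lsn: int = 0          # end of WAL known to be on disk
    wal_prohibited: bool = False  # hypothetical "read only" switch

    def insert(self) -> int:
        """Generate one WAL record; refused once WAL is prohibited."""
        if self.wal_prohibited:
            raise RuntimeError("cannot generate WAL: system is read only")
        self.insert_lsn += 1
        return self.insert_lsn

    def flush(self, upto: int) -> None:
        """Flush existing WAL up to `upto`; never creates new records."""
        self.flushed_lsn = max(self.flushed_lsn, min(upto, self.insert_lsn))


def modify(buf: Buffer, wal: ToyWal) -> None:
    """Change a page: log first, then stamp the page with the record's LSN."""
    buf.lsn = wal.insert()
    buf.dirty = True


def write_out(buf: Buffer, wal: ToyWal, allow_wal_flush: bool = True) -> bool:
    """Try to write a page so it can be evicted; return True on success."""
    if buf.dirty:
        if buf.lsn > wal.flushed_lsn:
            if not allow_wal_flush:
                return False  # WAL-before-data: page must stay in memory
            wal.flush(buf.lsn)
        buf.dirty = False     # page written to disk
    return True


# Dirty three pages, then switch the toy system to read only.
wal = ToyWal()
buffers = [Buffer(i) for i in range(3)]
for b in buffers:
    modify(b, wal)
wal.wal_prohibited = True

# If WAL flushing were blocked too, every dirty page would be stuck.
stuck = [b.page_id for b in buffers
         if not write_out(b, wal, allow_wal_flush=False)]

# With flushing still allowed, the same pages can be written and evicted.
evicted = [b.page_id for b in buffers if write_out(b, wal)]
print("stuck without WAL flush:", stuck)
print("evicted with WAL flush:", evicted)
```

In this model, prohibiting only WAL generation leaves every buffer evictable, while also blocking the flush pins each dirty page in memory, which is the failure mode described above.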