Re: [Patch] ALTER SYSTEM READ ONLY - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: [Patch] ALTER SYSTEM READ ONLY |
Date | |
Msg-id | CA+TgmoYz9Cx=hFsbG1V5P6-UmF6hcJM5HLd9EW+4HRro5kWRwg@mail.gmail.com |
In response to | Re: [Patch] ALTER SYSTEM READ ONLY (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: [Patch] ALTER SYSTEM READ ONLY
Re: [Patch] ALTER SYSTEM READ ONLY |
List | pgsql-hackers |
On Thu, Jun 18, 2020 at 5:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> For buffer replacement, many-a-times we have to also perform
> XLogFlush, what do we do for that? We can't proceed without doing
> that and erroring out from there means stopping read-only query from
> the user perspective.

I think we should stop WAL writes, then XLogFlush() once, then declare
the system R/O. After that there might be more XLogFlush() calls but
there won't be any new WAL, so they won't do anything.

> > But there's no reason for the checkpointer to do it: it shouldn't try
> > to checkpoint, and therefore it shouldn't write dirty pages either.
>
> What is the harm in doing the checkpoint before we put the system into
> READ ONLY state? The advantage is that we can at least reduce the
> recovery time if we allow writing checkpoint record.

Well, as Andres says in
http://postgr.es/m/20200617180546.yucxtiupvxghxss6@alap3.anarazel.de
it can take a really long time.

> > Interesting question. I was thinking that we should probably teach the
> > autovacuum launcher to stop launching workers while the system is in a
> > READ ONLY state, but what about existing workers? Anything that
> > generates invalidation messages, acquires an XID, or writes WAL has to
> > be blocked in a read-only state; but I'm not sure to what extent the
> > first two of those things would be a problem for vacuuming an unlogged
> > table. I think you couldn't truncate it, at least, because that
> > acquires an XID.
> >
>
> If the truncate operation errors out, then won't the system will again
> trigger a new autovacuum worker for the same relation as we update
> stats at the end?

Not if we do what I said in that paragraph. If we're not launching new
workers we can't again trigger a worker for the same relation.

> Also, in general for regular tables, if there is an
> error while it tries to WAL, it could again trigger the autovacuum
> worker for the same relation. If this is true then unnecessarily it
> will generate a lot of dirty pages and don't think it will be good for
> the system to behave that way?

I don't see how this would happen. VACUUM can't really dirty pages
without writing WAL, can it? And, anyway, if there's an error, we're
not going to try again for the same relation unless we launch new
workers.

> > What I think should happen is that the end-of-recovery checkpoint
> > should be skipped, and then if the system is put back into read-write
> > mode later we should do it then.
>
> But then if we have to perform recovery again, it will start from the
> previous checkpoint. I think we have to live with it.

Yeah. I don't think it's that bad. The case where you shut down the
system while it's read-only should be a somewhat unusual one. Normally
you would mark it read only and then promote a standby and shut the
old master down (or demote it). But what you want is that if it does
happen to go down for some reason before all the WAL is streamed, you
can bring it back up and finish streaming the WAL without generating
any new WAL.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
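
Illustrative note (not part of the original message): the ordering suggested above for going read-only (stop WAL writes, XLogFlush() once, then declare the system R/O) can be sketched as a toy model. Everything here is hypothetical: ToyWal, its fields, and its methods are invented for illustration and are not PostgreSQL code or APIs. The sketch assumes only that a flush request for WAL that is already durable does no work, which is the property the argument relies on.

```python
# Toy model only: ToyWal and all of its members are hypothetical names,
# not PostgreSQL internals.
class ToyWal:
    def __init__(self):
        self.insert_pos = 0        # end of WAL generated so far
        self.flushed_pos = 0       # end of WAL made durable so far
        self.writes_allowed = True
        self.read_only = False
        self.physical_flushes = 0  # how many flushes actually did work

    def insert(self, nbytes):
        """Generate new WAL; refused once WAL writes are prohibited."""
        if not self.writes_allowed:
            raise RuntimeError("WAL writes are prohibited")
        self.insert_pos += nbytes
        return self.insert_pos

    def flush(self, upto):
        """Make WAL durable up to `upto`; a no-op if already durable."""
        target = min(upto, self.insert_pos)
        if target <= self.flushed_pos:
            return False
        self.flushed_pos = target
        self.physical_flushes += 1
        return True

    def go_read_only(self):
        self.writes_allowed = False   # 1. stop WAL writes
        self.flush(self.insert_pos)   # 2. flush once
        self.read_only = True         # 3. declare the system R/O


wal = ToyWal()
page_lsn = wal.insert(100)      # a dirty page stamped with this position
wal.go_read_only()

# Later, buffer replacement asks for a flush before evicting the page.
did_work = wal.flush(page_lsn)  # nothing left to flush
```

In this model the later flush request returns without doing anything because no WAL can have been generated after the single flush, so a read-only query that triggers buffer replacement is not forced to error out.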