Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints |
Date | |
Msg-id | CA+TgmoYOR+nkaVNreJyGxUqDh7NnVp6DYM3G=w3zcQOvuvNahA@mail.gmail.com Whole thread Raw |
In response to | Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints (Dilip Kumar <dilipbalaut@gmail.com>) |
Responses |
Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints |
List | pgsql-hackers |
On Mon, Mar 21, 2022 at 11:21 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > I tried to debug the case but I realized that somehow > CHECK_FOR_INTERRUPTS() is not calling the > AcceptInvalidationMessages() and I could not find the same while > looking into the code as well. While debugging I noticed that > AcceptInvalidationMessages() is called multiple times but that is only > through LockRelationId() but while locking the relation we had already > closed the previous smgr because at a time we keep only one smgr open. > And that's the reason it is not hitting the issue which we think it > could. Is there any condition under which it will call > AcceptInvalidationMessages() through CHECK_FOR_INTERRUPTS() ? because > I could not see while debugging as well as in code. Yeah, I think the reason you can't find it is that it's not there. I was confused in what I wrote earlier. I think we only process sinval catchups when we're idle, not at every CHECK_FOR_INTERRUPTS(). And I think the reason for that is precisely that it would be hard to write correct code otherwise, since invalidations might then get processed in a lot more places. So ... I guess all we really need to do here is avoid assuming that the results of smgropen() are valid across any code that might acquire relation locks. Which I think the code already does. But on a related note, why doesn't CreateDatabaseUsingWalLog() acquire locks on both the source and destination relations? It looks like you're only taking locks for the source template database ... but I thought the intention here was to make sure that we didn't pull pages into shared_buffers without holding a lock on the relation and/or the database? I suppose the point is that while the template database might be concurrently dropped, nobody can be doing anything concurrently to the target database because nobody knows that it exists yet. Still, I think that this would be the only case where we let pages into shared_buffers without a relation or database lock, though maybe I'm confused about this point, too. If not, perhaps we should consider locking the target database OID and each relation OID as we are copying it? I guess I'm imagining that there might be more code pathways in the future that want to ensure that there are no remaining buffers for some particular database or relation OID. It seems natural to want to be able to take some lock that prevents buffers from being added, and then go and get rid of all the ones that are there already. But I admit I can't quite think of a concrete case where we'd want to do something like this where the patch as coded would be a problem. I'm just thinking perhaps taking locks is fairly harmless and might avoid some hypothetical problem later. Thoughts? -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: