Re: .ready and .done files considered harmful - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: .ready and .done files considered harmful |
Date | |
Msg-id | CA+TgmoaOD+CcEeXmOo2+vFLL0K7jc9=HHF0t=8nJqRaiRJndSw@mail.gmail.com Whole thread Raw |
In response to | Re: .ready and .done files considered harmful ("Bossart, Nathan" <bossartn@amazon.com>) |
Responses |
Re: .ready and .done files considered harmful
Re: .ready and .done files considered harmful |
List | pgsql-hackers |
On Tue, Aug 17, 2021 at 12:33 PM Bossart, Nathan <bossartn@amazon.com> wrote: > Sorry, I think my note was not very clear. I agree that a flag should > be used for this purpose, but I think we should just use a regular > bool protected by a spinlock or LWLock instead of an atomic. The file > atomics.h has the following note: > > * Use higher level functionality (lwlocks, spinlocks, heavyweight locks) > * whenever possible. Writing correct code using these facilities is hard. > > IOW I don't think the extra complexity is necessary. From a > performance standpoint, contention seems unlikely. We only need to > read the flag roughly once per WAL segment, and we only ever set it in > uncommon scenarios such as a timeline switch or the creation of an > out-of-order .ready file. In the interest of full disclosure, I think that I was probably the one who suggested to Dipesh that he should look into using atomics, although I can't quite remember right now why I thought we might want to do that. I do not on general principle very much like code that does LWLockAcquire(whatever); exactly-one-assignment-statement-that-modifies-a-1-2-or-4-byte-quantity; LWLockRelease(whatever). If you had two assignments in there, then you know why you have a lock: it's to make those behave as an atomic, indivisible unit. But when you only have one, what are you protecting against? You're certainly not making anything atomic that would not have been anyway, so you must be using the LWLock as a memory barrier. But then you really kind of have to think about memory barriers anyway: why do you need one at all, and what things need to be separated? It's not clear that spelling pg_memory_barrier() as LWLockAcquire() and/or LWLockRelease() is actually saving you anything in terms of notional complexity. In this patch, it appears to me that the atomic flag is only ever being read unlocked, so I think that we're actually getting no benefit at all from the use of pg_atomic_flag here. We're not making anything atomic, because there's only one bit of shared state, and we're not getting any memory barrier semantics, because it looks to me like the flag is only ever tested using pg_atomic_unlocked_test_flag, which is documented not to have barrier semantics. So as far as I can see, there's no point in using either an LWLock or atomics here. We could just use bool with no lock and the code would do exactly what it does now. So I guess the question is whether that's correct or whether we need some kind of synchronization and, if so, of what sort. I can't actually see that there's any kind of hard synchronization requirement here at all. What we're trying to do is guarantee that if the timeline changes, we'll pick up the timeline history for the new timeline next, and that if files are archived out of order, we'll switch to archiving the oldest file that is now present rather than continuing with consecutive files. But suppose we just use an unsynchronized bool. The worst case is that we'll archive one extra file proceeding in order before we jump to the file that we were supposed to archive next. It's not evident to me that this is all that bad. The same thing would have happened if the previous file had been archived slightly faster than it actually was, so that we began archiving the next file just before, rather than just after, the notification was sent. And if it is bad, wrapping an LWLock around the accesses to the flag variable, or using an atomic, does nothing to stop it. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: