On Fri, Nov 17, 2017 at 5:12 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Nov 16, 2017 at 2:41 PM, Andres Freund <andres@anarazel.de> wrote:
>>> To me, it seems like SnapBuildWaitSnapshot() is fundamentally
>>> misdesigned
>>
>> Maybe I'm confused, but why is it fundamentally misdesigned? It's not
>> such an absurd idea to wait for an xid in a WAL record. I get that
>> there's a race condition here, which is obviously bad, but I don't really
>> see it as evidence of the above claim.
>>
>> I actually think this code used to be safe because ProcArrayLock used to
>> be held while generating and logging the running snapshots record. That
>> was removed when fixing some other bug, but perhaps that shouldn't have
>> been done...
>
> OK. Well, I might be overstating the case. My comment about
> fundamental misdesign was really just based on the assumption that
> XactLockTableWait() could be used to wait for an XID the instant it
> was generated. That was never gonna work and there's no obvious clean
> workaround for the problem. Getting snapshot building to work
> properly seems to be Hard (TM).

The patches discussed here deserve tracking, so please note that I
have added an entry in the CF app:
https://commitfest.postgresql.org/16/1381/
--
Michael