> On 12 Apr 2016, at 15:47, Michael Paquier <michael.paquier@gmail.com> wrote:
>
> On Mon, Apr 11, 2016 at 7:16 PM, Stas Kelvich wrote:
>> Michael, it looks like you are the only person who can reproduce that
>> bug. I’ve tried on a bunch of OSes and didn’t observe that behaviour.
>> Also, looking at your backtraces, I can’t tell who is holding this lock
>> (and all of that happens before the first prepare record is replayed).
>
> Where did you try it? FWIW, I can reproduce that on Linux and OSX, and
> only manually though:
Thanks a lot, Michael! Now I am able to reproduce it. It seems that when
you were performing the manual setup, the master instance issued a checkpoint,
but in my script that didn’t happen because of the shorter timing. There are
tests with a checkpoint between prepare and commit in the proposed test suite,
but none of them issues DDL.
> It looks to be the case... The PREPARE phase replayed after the
> standby is restarted in recovery creates a series of exclusive locks
> on the table created and those are not taken on HEAD. Once those are
> replayed the LOCK_STANDBY record is conflicting with it. In the case
> of the failure, the COMMIT PREPARED record cannot be fetched from
> master via the WAL stream so the relation never becomes visible.
Yep, it is. It is okay for a prepared xact to hold locks on created/changed
tables, but the code in standby_redo() was written on the assumption that there
are no prepared xacts at the time of recovery. I’ll look closer at the
checkpointer code and send an updated patch.
And thanks again.
--
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company