On Thu, Mar 31, 2016 at 2:02 PM, Noah Misch <noah@leadboat.com> wrote:
> On Thu, Mar 10, 2016 at 01:04:11AM +0900, Masahiko Sawada wrote:
>> After looking into the code around recovery, ISTM that the
>> cause is related to clearing the relation cache.
>> In heap_xlog_visible, if the standby server receives a WAL record, the
>> relation cache is eventually cleared in vm_extend, but if the standby
>> server receives an FPI, the relation cache is not cleared.
>> For example, after I applied the attached patch to HEAD (it might not be
>> the right way to fix it), this problem seems to be resolved.
>>
>> Is this a bug or not?
>
> It's a bug. I don't expect it causes queries to return wrong answers, because
> visibilitymap.c says "it's always safe to clear a bit in the map from
> correctness point of view." (The bug makes a visibility map bit temporarily
> appear to have been cleared.) I still call it a bug, because recovery
> behavior becomes too difficult to verify when xlog replay produces conditions
> that don't happen outside of recovery. Even if there's no way to get a wrong
> query answer today, this would be too easy to break later. I wonder if we
> make the same omission in other xlog replay functions. Similar omissions may
> cause wrong query answers, even if this particular one does not.
>
> Would you like to bisect for the commit, or at least the major release, at
> which the bug first appeared?
>
> I wonder if your discovery has any relationship to this recently-reported case
> of insufficient smgr invalidation:
> http://www.postgresql.org/message-id/flat/CAB7nPqSBFmh5cQjpRbFBp9Rkv1nF=Nh2o1FxKkJ6yvOBtvYDBA@mail.gmail.com
>
I'm not sure whether this bug is related to the other issue you mentioned,
but after further investigation, it seems to be reproducible on
much older versions as well.
At the least, I reproduced it on 9.0.0.
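
To illustrate what I mean by the cache not being cleared, here is a toy
model in C. It is only a sketch and not PostgreSQL code: ToyFile,
ToyRelCache and all toy_* functions are hypothetical names made up for
this mail, and the model assumes that a reader treats a visibility map
bit as clear whenever its cached size says the page does not exist.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: NOT PostgreSQL source. All names are hypothetical. */

typedef struct ToyFile
{
    int nblocks;            /* real size of the map file on disk */
} ToyFile;

typedef struct ToyRelCache
{
    bool valid;             /* false until the size has been looked up */
    int  cached_nblocks;    /* size remembered from the last lookup */
} ToyRelCache;

/* Return the size as the reader sees it, filling the cache if needed. */
static int
toy_cached_nblocks(ToyRelCache *cache, const ToyFile *file)
{
    if (!cache->valid)
    {
        cache->cached_nblocks = file->nblocks;
        cache->valid = true;
    }
    return cache->cached_nblocks;
}

/* Replay path 1: the file is extended and the cache is invalidated. */
static void
toy_replay_regular(ToyFile *file, ToyRelCache *cache, int blkno)
{
    if (blkno >= file->nblocks)
    {
        file->nblocks = blkno + 1;
        cache->valid = false;
    }
}

/* Replay path 2 (FPI): the file grows, but the cache is left untouched. */
static void
toy_replay_fpi(ToyFile *file, ToyRelCache *cache, int blkno)
{
    (void) cache;           /* the omission this model is about */
    if (blkno >= file->nblocks)
        file->nblocks = blkno + 1;
}

/* Assumption: a bit can look set only if the reader believes the page exists. */
static bool
toy_bit_visible(ToyRelCache *cache, const ToyFile *file, int blkno)
{
    return blkno < toy_cached_nblocks(cache, file);
}
```

In this model, if a reader looks at block 0 of an empty map and
toy_replay_fpi() then adds block 0, toy_bit_visible() keeps returning
false until something else invalidates the cache, for example a later
toy_replay_regular() call that extends the file.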
Regards,
--
Masahiko Sawada