Re: [HACKERS] segfault in hot standby for hash indexes - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [HACKERS] segfault in hot standby for hash indexes
Msg-id CAA4eK1LyzMEbDNNGLVW8SWBsg3ZoBfYWT9mHc4ThHTQFU4inHA@mail.gmail.com
In response to Re: [HACKERS] segfault in hot standby for hash indexes  (Ashutosh Sharma <ashu.coek88@gmail.com>)
Responses Re: [HACKERS] segfault in hot standby for hash indexes  (Ashutosh Sharma <ashu.coek88@gmail.com>)
List pgsql-hackers
On Tue, Mar 21, 2017 at 11:49 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>>
>> I can confirm that that fixes the seg faults for me.
>
> Thanks for confirmation.
>
>>
>> Did you mean you couldn't reproduce the problem in the first place, or that
>> you could reproduce it and now the patch fixes it?  If the first of those, I
>> forgot to say that you have to wait for the hot standby to reach consistency
>> and open for connections, and then connect to the standby ("psql -p 9874"),
>> before the seg fault will be triggered.
>
> I meant that I was not able to reproduce the issue on HEAD.
>
>>
>> But there are places where hash_xlog_vacuum_get_latestRemovedXid diverges
>> from btree_xlog_delete_get_latestRemovedXid, and I don't understand the
>> reason for the divergence.  Is there a reason we dropped the PANIC if we
>> have not reached consistency?
>
> Well, I'm not quite sure how the standby would allow any backend to
> connect to it until it has reached a consistent state. If you look at
> the definition of btree_xlog_delete_get_latestRemovedXid(), just
> before the consistency check there is an if-condition 'if
> (CountDBBackends(InvalidOid) == 0)', which means
> we check for a consistent state only after knowing that there are
> some backends connected to the standby. So, is there a possibility of
> having some backend connected to the standby server without it being in
> a consistent state?
>

I don't think so, but I think we should still have a reachedConsistency
check and elog(PANIC,..), as btree does.  If you look at the other
conditions where we PANIC in the btree or hash xlog code, you will notice
that those are also theoretically impossible cases.  The point seems to be
to keep the database from getting corrupted or behaving insanely if, for
some reason (such as a coding error), the assumption does not hold.
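
To make the shape of that check concrete, here is a minimal standalone
sketch of the control flow being discussed: a fast path when no backends
are connected, followed by a defensive consistency check.  It is not the
PostgreSQL source.  Every sketch_* name is a hypothetical stand-in for
server state, and sketch_panic() only records the event, whereas the real
elog(PANIC, ...) does not return.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical stand-ins for server state; not the real PostgreSQL globals. */
static bool sketch_reached_consistency = false;
static int  sketch_backend_count = 0;
static int  sketch_panic_count = 0;

/* Assumption: records the event instead of aborting, so it can be tested. */
static void
sketch_panic(const char *msg)
{
    (void) msg;
    sketch_panic_count++;
}

/*
 * Mirrors only the ordering of the two checks.  xid_from_heap stands in
 * for the value the real function would derive from index and heap pages.
 */
static TransactionId
sketch_get_latest_removed_xid(TransactionId xid_from_heap)
{
    /* Fast path: nobody is connected, so there is nothing to conflict with. */
    if (sketch_backend_count == 0)
        return InvalidTransactionId;

    /* Theoretically unreachable: backends exist before consistency. */
    if (!sketch_reached_consistency)
    {
        sketch_panic("cannot operate with inconsistent data");
        return InvalidTransactionId;
    }

    return xid_from_heap;
}
```

With no backends the function returns early and never reaches the
consistency check, which is why the PANIC branch should fire only if
something else has gone wrong.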

On a quick look, I don't find any other divergence between the two
functions.  Is there any other?  If so, I think we should at the very
least mention it in the function header.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


