Re: [HACKERS] segfault in hot standby for hash indexes - Mailing list pgsql-hackers

From Ashutosh Sharma
Subject Re: [HACKERS] segfault in hot standby for hash indexes
Date
Msg-id CAE9k0Pn9zNHgbbd7LCgGT=PzK0S0kiE4aTowv5=1xfyqyUMg4w@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] segfault in hot standby for hash indexes  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [HACKERS] segfault in hot standby for hash indexes  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
Hi,

On Wed, Mar 22, 2017 at 8:41 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Tue, Mar 21, 2017 at 11:49 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>>>
>>> I can confirm that that fixes the seg faults for me.
>>
>> Thanks for confirmation.
>>
>>>
>>> Did you mean you couldn't reproduce the problem in the first place, or that
>>> you could reproduce it and now the patch fixes it?  If the first of those, I
>>> forget to say you do have to wait for hot standby to reach a consistency and
>>> open for connections, and then connect to the standby ("psql -p 9874"),
>>> before the seg fault will be triggered.
>>
>> I meant that I was not able to reproduce the issue on HEAD.
>>
>>>
>>> But, there are places where hash_xlog_vacuum_get_latestRemovedXid diverges
>>> from btree_xlog_delete_get_latestRemovedXid, which I don't understand the
>>> reason for the divergence.  Is there a reason we dropped the PANIC if we
>>> have not reached consistency?
>>
>> Well, I'm not quite sure how would standby allow any backend to
>> connect to it until it has reached to a consistent state. If you see
>> the definition of btree_xlog_delete_get_latestRemovedXid(), just
>> before consistency check there is a if-condition 'if
>> (CountDBBackends(InvalidOid) == 0)' which means
>> we are checking for consistent state only after knowing that there are
>> some backends connected to the standby. So, Is there a possibility of
>> having some backend connected to standby server without having it in
>> consistent state.
>>
>
> I don't think so, but I think we should have reachedConsistency check
> and elog(PANIC,..) similar to btree.  If you see other conditions
> where we PANIC in btree or hash xlog code, you will notice that those
> are also theoretically not possible cases.  It seems this is to save
> database from getting corrupt or behaving insanely if due to some
> reason (like a coding error or others) the check fails.

okay, agreed. I have included it in the attached patch.

However,  I am still able to see the crash reported by Jeff upthread -
[1]. I think there are couple of things that needs to be corrected
inside hash_xlog_vacuum_get_latestRemovedXid().

1) As Amit mentioned in his earlier mail [2], the block id used for
registering deleted items in XLOG_HASH_VACUUM_ONE_PAGE is different
than the block id used for fetching those items. I had fixed this and
shared the patch earlier. With this patch I still see the crash Jeff
reported yesterday [1].

2) When a full page image of registered block is taken, the modified
data associated with that block is not included in the WAL record. In
such a case, we won't be able to fetch the items array that we have
tried to include during xlog record (XLOG_HASH_VACUUM_ONE_PAGE)
creation using following function.

   XLogRegisterBufData(0, (char *) deletable, ndeletable *
sizeof(OffsetNumber));

If above is not included in the WAL record, then fetching the same
using 'XLogRecGetBlockData(record, 0, &len);' will return NULL
pointer.

 ptr = XLogRecGetBlockData(record, 0, &len);
 unused = (OffsetNumber *) ptr;
 ............
 iitemid = PageGetItemId(ipage, unused[i]);

Here, if ptr is NULL, reference to unused will cause a crash.

To fix this, I think we should pass 'REGBUF_KEEP_DATA' while
registering the buffer. Something like this,

-                       XLogRegisterBuffer(0, buf, REGBUF_STANDARD);
+                       XLogRegisterBuffer(0, buf, REGBUF_STANDARD |
REGBUF_KEEP_DATA);

Attached is the patch that fixes this issue. Please have a look and
let me know if you still have any concerns. Thank you.

[1] - https://www.postgresql.org/message-id/CAMkU%3D1w-9Qe%3DFf1o6bSaXpNO9wqpo7_9GL8_CVhw4BoVVHasqg%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAA4eK1%2BYUedok0%2Bmeynnf8K3fqFsfdMpEFz1JiKLyyNv46hjaA%40mail.gmail.com

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgsql-hackers by date:

Previous
From: Ashutosh Bapat
Date:
Subject: Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables
Next
From: Ashutosh Bapat
Date:
Subject: [HACKERS] create_unique_path and GEQO