Re: Write Ahead Logging for Hash Indexes - Mailing list pgsql-hackers

From	Mark Kirkwood
Subject	Re: Write Ahead Logging for Hash Indexes
Date	September 8, 2016 04:32:24
Msg-id	844a643b-fe3d-7d9a-5040-570674fe9321@catalyst.net.nz
In response to	Re: Write Ahead Logging for Hash Indexes (Amit Kapila <amit.kapila16@gmail.com>)
Responses	Re: Write Ahead Logging for Hash Indexes
List	pgsql-hackers

On 07/09/16 21:58, Amit Kapila wrote:

> On Wed, Aug 24, 2016 at 10:32 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> On Tue, Aug 23, 2016 at 10:05 PM, Amit Kapila <amit.kapila16@gmail.com>
>> wrote:
>>> On Wed, Aug 24, 2016 at 2:37 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>>
>>>> After an intentionally created crash, I get an Assert triggering:
>>>>
>>>> TRAP: FailedAssertion("!(((freep)[(bitmapbit)/32] &
>>>> (1<<((bitmapbit)%32))))", File: "hashovfl.c", Line: 553)
>>>>
>>>> freep[0] is zero and bitmapbit is 16.
>>>>
>>> Here what is happening is that when it tries to clear the bitmapbit,
>>> it expects it to be set.  Now, I think the reason for why it didn't
>>> find the bit as set could be that after the new overflow page is added
>>> and the bit corresponding to it is set, you might have crashed the
>>> system and the replay would not have set the bit.  Then while freeing
>>> the overflow page it can hit the Assert as mentioned by you.  I think
>>> the problem here could be that I am using REGBUF_STANDARD to log the
>>> bitmap page updates which seems to be causing the issue.  As bitmap
>>> page doesn't follow the standard page layout, it would have omitted
>>> the actual contents while taking full page image and then during
>>> replay, it would not have set the bit, because page doesn't need REDO.
>>> I think here the fix is to use REGBUF_NO_IMAGE as we use for vm
>>> buffers.
>>>
>>> If you can send me the detailed steps for how you have produced the
>>> problem, then I can verify after fixing whether you are seeing the
>>> same problem or something else.
>>
>>
>> The test is rather awkward, it might be easier to just have me test it.
>>
> Okay, I have fixed this issue as explained above.  Apart from that, I
> have fixed another issue reported by Mark Kirkwood upthread and few
> other issues found during internal testing by Ashutosh Sharma.
>
> The locking issue reported by Mark and Ashutosh is that the patch
> didn't maintain the locking order while adding overflow page as it
> maintains in other write operations (lock the bucket pages first and
> then metapage to perform the write operation).  I have added the
> comments in _hash_addovflpage() to explain the locking order used in
> modified patch.
>
> During stress testing with pgbench using master-standby setup, we
> found an issue which indicates that WAL replay machinery doesn't
> expect completely zeroed pages (See explanation of RBM_NORMAL mode
> atop XLogReadBufferExtended).  Previously before freeing the overflow
> page we were zeroing it, now I have changed it to just initialize the
> page such that the page will be empty.
>
> Apart from above, I have added support for old snapshot threshold in
> the hash index code.
>
> Thanks to Ashutosh Sharma for doing the testing of the patch and
> helping me in analyzing some of the above issues.
>
> I forgot to mention in my initial mail that Robert and I had some
> off-list discussions about the design of this patch, many thanks to
> him for providing inputs.
>
>

Repeating my tests with these new patches applied indicates that the
hang issue is solved. I tested several 10-minute runs (any one of which
was enough to elicit the hang previously). I'll do some longer ones, but
it looks good so far!
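
For anyone following the thread who has not looked at the bitmap code: the Assert Jeff hit upthread is a bit test on an array of 32-bit words, and it fires when the bit for the overflow page being freed is not set. Below is a minimal standalone sketch of that test, assuming 32-bit words as in the Assert text. The function names are made up for illustration and are not the ones used in hashovfl.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bits per bitmap word, matching the "/32" and "%32" in the Assert text. */
#define BITS_PER_WORD 32

/* Hypothetical helper: is bit `bitno` set in the word array `map`? */
bool
bit_is_set(const uint32_t *map, uint32_t bitno)
{
    uint32_t mask = UINT32_C(1) << (bitno % BITS_PER_WORD);

    return (map[bitno / BITS_PER_WORD] & mask) != 0;
}

/* Hypothetical helper: mark bit `bitno` as in use. */
void
set_bit(uint32_t *map, uint32_t bitno)
{
    map[bitno / BITS_PER_WORD] |= UINT32_C(1) << (bitno % BITS_PER_WORD);
}

/*
 * Hypothetical helper: clear bit `bitno`, insisting that it was set first.
 * This mirrors the shape of the failing Assert: with map[0] == 0 and
 * bitno == 16, the assertion below would trip.
 */
void
clear_bit_checked(uint32_t *map, uint32_t bitno)
{
    assert(bit_is_set(map, bitno));
    map[bitno / BITS_PER_WORD] &= ~(UINT32_C(1) << (bitno % BITS_PER_WORD));
}
```

With freep[0] equal to zero and bitmapbit equal to 16, as Jeff reported, the test in `bit_is_set` returns false. That fits Amit's explanation that replay never set the bit on the bitmap page.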

regards

Mark
