Re: [HACKERS] Write Ahead Logging for Hash Indexes - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: [HACKERS] Write Ahead Logging for Hash Indexes
Date	March 12, 2017 08:36:02
Msg-id	CA+TgmoYgQbhohoSHvMZrgAQWvyhxR2qGpKnid6wWbW4v6+hU1Q@mail.gmail.com
In response to	Re: [HACKERS] Write Ahead Logging for Hash Indexes (Amit Kapila <amit.kapila16@gmail.com>)
Responses	Re: [HACKERS] Write Ahead Logging for Hash Indexes (Amit Kapila <amit.kapila16@gmail.com>)
List	pgsql-hackers

On Sat, Mar 11, 2017 at 12:20 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>                  /*
>> +                 * Change the shared buffer state in critical section,
>> +                 * otherwise any error could make it unrecoverable after
>> +                 * recovery.
>> +                 */
>> +                START_CRIT_SECTION();
>> +
>> +                /*
>>                   * Insert tuple on new page, using _hash_pgaddtup to ensure
>>                   * correct ordering by hashkey.  This is a tad inefficient
>>                   * since we may have to shuffle itempointers repeatedly.
>>                   * Possible future improvement: accumulate all the items for
>>                   * the new page and qsort them before insertion.
>>                   */
>>                  (void) _hash_pgaddtup(rel, nbuf, itemsz, new_itup);
>>
>> +                END_CRIT_SECTION();
>>
>> No way.  You have to start the critical section before making any page
>> modifications and keep it alive until all changes have been logged.
>>
>
> I think what we need to do here is to accumulate all the tuples that
> need to be added to new bucket page till either that page has no more
> space or there are no more tuples remaining in an old bucket.  Then in
> a critical section, add them to the page using _hash_pgaddmultitup and
> log the entire new bucket page contents as is currently done in patch
> log_split_page().

I agree.
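
As a rough illustration of the shape being discussed (collect tuples first, then apply and log them in one critical section), here is a toy model. It is not PostgreSQL code: ToyPage, toy_page_add, toy_flush_batch, the in_crit_section flag and the wal_records counter are all hypothetical stand-ins for the real page, _hash_pgaddmultitup, START_CRIT_SECTION()/END_CRIT_SECTION() and the WAL insert.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_CAPACITY 4     /* tuples per page; arbitrary for the sketch */

/* Hypothetical stand-in for a bucket page: just sorted hash keys. */
typedef struct ToyPage
{
    int hashkeys[TOY_PAGE_CAPACITY];
    int ntuples;
} ToyPage;

bool in_crit_section = false;   /* models START/END_CRIT_SECTION */
int  wal_records = 0;           /* counts "WAL records" emitted */

/* Add one tuple, keeping hashkey order. Only legal inside the critical section. */
void
toy_page_add(ToyPage *page, int hashkey)
{
    int i;

    assert(in_crit_section);
    assert(page->ntuples < TOY_PAGE_CAPACITY);

    /* Shift larger keys right to make room for the new one. */
    i = page->ntuples;
    while (i > 0 && page->hashkeys[i - 1] > hashkey)
    {
        page->hashkeys[i] = page->hashkeys[i - 1];
        i--;
    }
    page->hashkeys[i] = hashkey;
    page->ntuples++;
}

/*
 * Apply as many already-accumulated tuples as fit on the page, then log once.
 * Every page modification and the log step sit inside one critical section.
 * Returns the number of pending tuples consumed.
 */
size_t
toy_flush_batch(ToyPage *page, const int *pending, size_t npending)
{
    size_t room = (size_t) (TOY_PAGE_CAPACITY - page->ntuples);
    size_t n = npending < room ? npending : room;
    size_t i;

    if (n == 0)
        return 0;

    in_crit_section = true;     /* before the first page modification */
    for (i = 0; i < n; i++)
        toy_page_add(page, pending[i]);
    wal_records++;              /* one record covering the whole page */
    in_crit_section = false;    /* only after the change has been logged */

    return n;
}
```

The caller would loop: accumulate tuples from the old bucket, call toy_flush_batch when the page is full or the old bucket is exhausted, and carry any leftover tuples to the next page.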

> Now, here we can choose to log the individual
> tuples as well instead of a complete page, however not sure if there
> is any benefit for doing the same because XLogRecordAssemble() will
> anyway remove the empty space from the page.  Let me know if you have
> something else in mind.

Well, if you have two pages that are 75% full, and you move a third of
the tuples from one of them into the other, it's going to be about
four times more efficient to log only the moved tuples than the whole
page.  But logging the whole filled page wouldn't be wrong, just
somewhat inefficient.
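
To make that arithmetic concrete, here is a small sketch. It assumes the default 8192-byte block size and ignores page headers, line pointers and per-record WAL overhead; the function names are hypothetical, not PostgreSQL functions.

```c
#include <assert.h>

#define TOY_BLCKSZ 8192         /* assumed default block size */

/* Bytes of tuple data moved when a third of the source page's tuples move. */
int
moved_tuple_bytes(int src_used)
{
    return src_used / 3;
}

/*
 * Bytes logged if the whole destination page is logged after the move.
 * The unused hole is not logged, so this is just the used space.
 */
int
page_image_bytes(int dst_used, int moved)
{
    int used = dst_used + moved;

    assert(used <= TOY_BLCKSZ); /* the moved tuples must fit on the page */
    return used;
}
```

With both pages 75% full (6144 bytes used), a third of one page's tuples is 2048 bytes. Moving them fills the other page to 8192 bytes, so the page image is 8192 / 2048 = 4 times the size of the moved tuples alone.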

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
