Re: [HACKERS] Setting pd_lower in GIN metapage - Mailing list pgsql-hackers

From Amit Langote
Subject Re: [HACKERS] Setting pd_lower in GIN metapage
Date
Msg-id fb626119-94bd-cbfa-fc6b-9ca260df714f@lab.ntt.co.jp
Whole thread Raw
In response to Re: [HACKERS] Setting pd_lower in GIN metapage  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On 2017/06/19 22:59, Amit Kapila wrote:
> On Mon, Jun 19, 2017 at 11:37 AM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> What are some arguments against setting pd_lower in the GIN metapage as
>> follows?
>>
>> GinMetaPageData *metad = GinPageGetMeta(page);
>>
>> ((PageHeader) page)->pd_lower =
>>     ((char *) metad + sizeof(GinMetaPageData)) - (char *) page;
>>
>> I saw that _bt_initmetapage() does it, so was wondering why doesn't GIN.
>>
> 
> Actually, hash index also has similar code (See _hash_init_metabuffer)
> and I see no harm in doing this at similar other places.  This helps
> in compressing the hole in metapage during WAL writing.  I think that
> in itself might not be an argument in favor of doing this because
> there is only one meta page for index and saving ~7K WAL is not huge
> but OTOH I don't see a reason for not doing it.

I agree.  Will write a patch.

>> How about porting such a change to the back-branches if we do this at all?
>>  I couldn't find any discussion in the archives about this.  I read in
>> comments that server versions older than 9.4 didn't set pd_lower correctly
>> in any of GIN index pages, so relying on pd_lower value of GIN pages is
>> unreliable in general.
>>
>> The reason I'm asking is that a certain backup tool relies on pd_lower
>> values of data pages (disk blocks in relation files that are known to have
>> a valid PageHeaderData) to be correct to discard the portion of every page
>> that supposedly does not contain any useful information.  The assumption
>> doesn't hold in the case of GIN metapage, so any GIN indexes contain
>> corrupted metapage after recovery (metadata overwritten with zeros).
>>
> 
> Why can't you do a special check for metapage identification?  It
> should be easy to add such a check.  I have seen one of such tools
> (pg_filedump) has similar check to skip metapage in certain code
> paths.

I tried to add a metapage check, but getting such code to compile requires
it to include headers like gin_private.h (in PG versions < 10), which in
turn includes other headers that are forbidden to be included in what's
supposed to be FRONTEND code.    (thread at [1] seems relevant in this
regard.)  Another way to fix this might be to copy the GinMetaPageData
struct definition and a few other symbols into the tool's code and make
necessary checks using the same, instead of including gin_private.h.

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/CA%2BTgmoZ%3DF%3DGkxV0YEv-A8tb%2BAEGy_Qa7GSiJ8deBKFATnzfEug%40mail.gmail.com




pgsql-hackers by date:

Previous
From: Amit Khandekar
Date:
Subject: Re: [HACKERS] UPDATE of partition key
Next
From: Amit Langote
Date:
Subject: Re: [HACKERS] Setting pd_lower in GIN metapage