Re: [HACKERS] pageinspect and hash indexes - Mailing list pgsql-hackers

From Ashutosh Sharma
Subject Re: [HACKERS] pageinspect and hash indexes
Date
Msg-id CAE9k0P=vtL4+30b5E=OkgRay-T+0LidRTR4p1jSKhPTd2y3BTQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] pageinspect and hash indexes  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [HACKERS] pageinspect and hash indexes  (Ashutosh Sharma <ashu.coek88@gmail.com>)
Re: [HACKERS] pageinspect and hash indexes  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Mon, Mar 20, 2017 at 9:31 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sat, Mar 18, 2017 at 5:13 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>> On Sat, Mar 18, 2017 at 1:34 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> On Sat, Mar 18, 2017 at 12:12 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>>>> On Fri, Mar 17, 2017 at 10:54 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>>>> While trying to figure out some bloating in the newly logged hash indexes,
>>>>> I'm looking into the type of each page in the index.  But I get an error:
>>>>>
>>>>> psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) from
>>>>> generate_series(1650,1650) f(x)"
>>>>>
>>>>> ERROR:  page is not a hash page
>>>>> DETAIL:  Expected 0000ff80, got 00000000.
>>>>>
>>>>> The contents of the page are:
>>>>>
>>>>> \xa4000000d8f203bf65c900001800f01ff01f0420...
>>>>>
>>>>> (where the elided characters at the end are all zero)
>>>>>
>>>>> What kind of page is that actually?
>>>>
>>>> it is basically either a newly allocated bucket page or a freed overflow page.
>>>>
>>>
>>> What makes you think that it can be a newly allocated page?
>>> Basically, we always initialize the special space of newly allocated
>>> page, so not sure what makes you deduce that it can be newly allocated
>>> page.
>>
>> I came to know this from the following experiment.
>>
>> I  created a hash index and kept on inserting data in it till the split happens.
>>
>> When split happened, I could see following values for firstblock and
>> lastblock in _hash_alloc_buckets()
>>
>> Breakpoint 1, _hash_alloc_buckets (rel=0x7f6ac951ee30, firstblock=34,
>> nblocks=32) at hashpage.c:993
>> (gdb) n
>> (gdb) p    firstblock
>> $15 = 34
>> (gdb) p    nblocks
>> $16 = 32
>> (gdb) n
>> (gdb) p    lastblock
>> $17 = 65
>>
>> AFAIU, this bucket split resulted in creation of new bucket pages from
>> block number 34 to 65.
>>
>> The contents for metap are as follows,
>>
>> (gdb) p    *metap
>> $18 = {hashm_magic = 105121344,    hashm_version = 2, hashm_ntuples =
>> 2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096,
>> hashm_bmshift = 15,
>>   hashm_maxbucket = 32,    hashm_highmask = 63, hashm_lowmask = 31,
>> hashm_ovflpoint = 6, hashm_firstfree = 0, hashm_nmaps = 1,
>> hashm_procid = 450,
>>   hashm_spares = {0, 0,    0, 0, 0, 1, 1, 0 <repeats 25 times>},
>> hashm_mapp = {33,    0 <repeats 127 times>}}
>>
>> Now, if i try to check the page type for block number 65, this is what i see,
>>
>> test=# select * from hash_page_type(get_raw_page('con_hash_index', 65));
>> ERROR:  page is not a hash page
>> DETAIL:  Expected 0000ff80, got 00000000.
>> test=#
>>
>
> The contents of such a page should be zero and Jeff has reported some
> valid-looking contents of the page.  If you see this page contents as
> zero, then we can conclude what Jeff is seeing was an freed overflow
> page.

As shown in the mail thread-[1], the contents of metapage are,

(gdb) p    *metap
$18 = {hashm_magic = 105121344,    hashm_version = 2, hashm_ntuples
=2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096,
hashm_bmshift = 15, hashm_maxbucket = 32,    hashm_highmask = 63,
hashm_lowmask = 31, hashm_ovflpoint = 6, hashm_firstfree = 0,
hashm_nmaps = 1,
hashm_procid = 450, hashm_spares = {0, 0,    0, 0, 0, 1, 1, 0 <repeats
25 times>}, hashm_mapp = {33,    0 <repeats 127 times>}}

postgres=# select spares from
hash_metapage_info(get_raw_page('con_hash_index', 0));
                              spares
-------------------------------------------------------------------
 {0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
(1 row)

Here, if you see the spare page count  is just 1 which corresponds to
bitmap page. So, that means there is no overflow page in hash index
table and neither I have ran any DELETE statements in my experiment
that would result in freeing an overflow page.

Also, the page header content for such a page is,

$9 = {pd_lsn = {xlogid = 0, xrecoff = 23638120}, pd_checksum = 0,
pd_flags = 0,    pd_lower = 24, pd_upper = 8176,    pd_special = 8176,
  pd_pagesize_version = 8196, pd_prune_xid = 0,    pd_linp = 0x1f3aa88}

From values of pd_lower and pd_upper it is clear that it is an empty page.

The content of this page is,

\x00000000b0308a01000000001800f01ff01f042000.....

I think it is not just happening for freed overflow but also for newly
allocated bucket page. It would be good if we could mark  freed
overflow page as UNUSED page rather than just initialising it's header
portion and leaving the page type in special area as it is. Attached
is the patch '0001-mark_freed_ovflpage_as_UNUSED_pagetype.patch' that
marks a freed overflow page as an unused page.

Also, I have now changed pageinspect module to handle unused and empty
hash index page. Attached is the patch
(0002-allow_pageinspect_handle_UNUSED_OR_EMPTY_hash_pages.patch) for
the same.

[1] -
https://www.postgresql.org/message-id/CAE9k0P%3DN%2BJjzqnHqrURE7ZQMgySRpv%3DBkjafbz%3DpeD4cbCgbhA%40mail.gmail.com

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: [HACKERS] Inadequate traces in TAP tests
Next
From: Stephen Frost
Date:
Subject: [HACKERS] Re: [COMMITTERS] pgsql: Improve pg_dump regression tests and codecoverage