Thread: [HACKERS] pageinspect and hash indexes

[HACKERS] pageinspect and hash indexes

From
Jeff Janes
Date:
While trying to figure out some bloating in the newly logged hash indexes, I'm looking into the type of each page in the index.  But I get an error:

psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) from generate_series(1650,1650) f(x)"

ERROR:  page is not a hash page
DETAIL:  Expected 0000ff80, got 00000000.

The contents of the page are:

\xa4000000d8f203bf65c900001800f01ff01f0420...

(where the elided characters at the end are all zero)

What kind of page is that actually?  And isn't it unhelpful to have the pageinspect module throw errors, rather than returning a dummy value to indicate there was an error?

Thanks,

Jeff

Re: [HACKERS] pageinspect and hash indexes

From
Ashutosh Sharma
Date:
On Fri, Mar 17, 2017 at 10:54 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> While trying to figure out some bloating in the newly logged hash indexes,
> I'm looking into the type of each page in the index.  But I get an error:
>
> psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) from
> generate_series(1650,1650) f(x)"
>
> ERROR:  page is not a hash page
> DETAIL:  Expected 0000ff80, got 00000000.
>
> The contents of the page are:
>
> \xa4000000d8f203bf65c900001800f01ff01f0420...
>
> (where the elided characters at the end are all zero)
>
> What kind of page is that actually?

it is basically either a newly allocated bucket page or a freed overflow page.

You are seeing this error because of following change in the WAL patch
for hash index,

Basically, when we started working on WAL for hash index, we found
that WAL routine 'XLogReadBufferExtended' does not expect a page to be
completely zero page else it returns Invalid Buffer. To fix this, we
started initializing freed overflow page or new bucket pages using
_hash_pageinit() which basically initialises page header portion but
not it's special area where page type information is present. That's
why you are seeing an ERROR saying 'page is not a hash page'. Actually
pageinspect module needs to handle this type of page. Currently it is
just handling zero pages but not an empty pages. I will submit a patch
for this.

And isn't it unhelpful to have the
> pageinspect module throw errors, rather than returning a dummy value to
> indicate there was an error?

Well, this is something not specific to hash index. So, I have no answer :)

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com





--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com



Re: [HACKERS] pageinspect and hash indexes

From
Peter Eisentraut
Date:
On 3/17/17 13:24, Jeff Janes wrote:
> And isn't it unhelpful to have the pageinspect module throw errors,
> rather than returning a dummy value to indicate there was an error?

Even if one agreed with that premise, it would be hard to satisfy
reliably, because the user-facing pageinspect functions might reach into
lower-level code to do some of the decoding, which are free to error out.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] pageinspect and hash indexes

From
Tom Lane
Date:
Ashutosh Sharma <ashu.coek88@gmail.com> writes:
> Basically, when we started working on WAL for hash index, we found
> that WAL routine 'XLogReadBufferExtended' does not expect a page to be
> completely zero page else it returns Invalid Buffer. To fix this, we
> started initializing freed overflow page or new bucket pages using
> _hash_pageinit() which basically initialises page header portion but
> not it's special area where page type information is present. That's
> why you are seeing an ERROR saying 'page is not a hash page'. Actually
> pageinspect module needs to handle this type of page. Currently it is
> just handling zero pages but not an empty pages. I will submit a patch
> for this.

That seems like entirely the wrong approach.  You should make the special
space valid, instead, so that tools like pg_filedump can make sense of
the page.
        regards, tom lane



Re: [HACKERS] pageinspect and hash indexes

From
Amit Kapila
Date:
On Sat, Mar 18, 2017 at 1:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Ashutosh Sharma <ashu.coek88@gmail.com> writes:
>> Basically, when we started working on WAL for hash index, we found
>> that WAL routine 'XLogReadBufferExtended' does not expect a page to be
>> completely zero page else it returns Invalid Buffer. To fix this, we
>> started initializing freed overflow page or new bucket pages using
>> _hash_pageinit() which basically initialises page header portion but
>> not it's special area where page type information is present. That's
>> why you are seeing an ERROR saying 'page is not a hash page'. Actually
>> pageinspect module needs to handle this type of page. Currently it is
>> just handling zero pages but not an empty pages. I will submit a patch
>> for this.
>
> That seems like entirely the wrong approach.  You should make the special
> space valid, instead, so that tools like pg_filedump can make sense of
> the page.
>

We were not aware that external tools like pg_filedump are dependent
on special space of index, but after looking at the code of
pg_filedump, I agree with you that we need to initialize the special
space in this case (free overflow page).  I think we can mark such a
page type as LH_UNUSED_PAGE  and then initialize the other fields of
special space.  Nonetheless, I think we still need modifications in
hashfuncs.c so that it can understand this type of page.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] pageinspect and hash indexes

From
Amit Kapila
Date:
On Sat, Mar 18, 2017 at 12:12 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> On Fri, Mar 17, 2017 at 10:54 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> While trying to figure out some bloating in the newly logged hash indexes,
>> I'm looking into the type of each page in the index.  But I get an error:
>>
>> psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) from
>> generate_series(1650,1650) f(x)"
>>
>> ERROR:  page is not a hash page
>> DETAIL:  Expected 0000ff80, got 00000000.
>>
>> The contents of the page are:
>>
>> \xa4000000d8f203bf65c900001800f01ff01f0420...
>>
>> (where the elided characters at the end are all zero)
>>
>> What kind of page is that actually?
>
> it is basically either a newly allocated bucket page or a freed overflow page.
>

What makes you think that it can be a newly allocated page?
Basically, we always initialize the special space of newly allocated
page, so not sure what makes you deduce that it can be newly allocated
page.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] pageinspect and hash indexes

From
Ashutosh Sharma
Date:
On Sat, Mar 18, 2017 at 1:34 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sat, Mar 18, 2017 at 12:12 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>> On Fri, Mar 17, 2017 at 10:54 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>> While trying to figure out some bloating in the newly logged hash indexes,
>>> I'm looking into the type of each page in the index.  But I get an error:
>>>
>>> psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) from
>>> generate_series(1650,1650) f(x)"
>>>
>>> ERROR:  page is not a hash page
>>> DETAIL:  Expected 0000ff80, got 00000000.
>>>
>>> The contents of the page are:
>>>
>>> \xa4000000d8f203bf65c900001800f01ff01f0420...
>>>
>>> (where the elided characters at the end are all zero)
>>>
>>> What kind of page is that actually?
>>
>> it is basically either a newly allocated bucket page or a freed overflow page.
>>
>
> What makes you think that it can be a newly allocated page?
> Basically, we always initialize the special space of newly allocated
> page, so not sure what makes you deduce that it can be newly allocated
> page.

I came to know this from the following experiment.

I  created a hash index and kept on inserting data in it till the split happens.

When split happened, I could see following values for firstblock and
lastblock in _hash_alloc_buckets()

Breakpoint 1, _hash_alloc_buckets (rel=0x7f6ac951ee30, firstblock=34,
nblocks=32) at hashpage.c:993
(gdb) n
(gdb) p    firstblock
$15 = 34
(gdb) p    nblocks
$16 = 32
(gdb) n
(gdb) p    lastblock
$17 = 65

AFAIU, this bucket split resulted in creation of new bucket pages from
block number 34 to 65.

The contents for metap are as follows,

(gdb) p    *metap
$18 = {hashm_magic = 105121344,    hashm_version = 2, hashm_ntuples =
2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096,
hashm_bmshift = 15, hashm_maxbucket = 32,    hashm_highmask = 63, hashm_lowmask = 31,
hashm_ovflpoint = 6, hashm_firstfree = 0, hashm_nmaps = 1,
hashm_procid = 450, hashm_spares = {0, 0,    0, 0, 0, 1, 1, 0 <repeats 25 times>},
hashm_mapp = {33,    0 <repeats 127 times>}}

Now, if i try to check the page type for block number 65, this is what i see,

test=# select * from hash_page_type(get_raw_page('con_hash_index', 65));
ERROR:  page is not a hash page
DETAIL:  Expected 0000ff80, got 00000000.
test=#

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com



Re: [HACKERS] pageinspect and hash indexes

From
Amit Kapila
Date:
On Sat, Mar 18, 2017 at 5:13 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> On Sat, Mar 18, 2017 at 1:34 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Sat, Mar 18, 2017 at 12:12 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>>> On Fri, Mar 17, 2017 at 10:54 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>>> While trying to figure out some bloating in the newly logged hash indexes,
>>>> I'm looking into the type of each page in the index.  But I get an error:
>>>>
>>>> psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) from
>>>> generate_series(1650,1650) f(x)"
>>>>
>>>> ERROR:  page is not a hash page
>>>> DETAIL:  Expected 0000ff80, got 00000000.
>>>>
>>>> The contents of the page are:
>>>>
>>>> \xa4000000d8f203bf65c900001800f01ff01f0420...
>>>>
>>>> (where the elided characters at the end are all zero)
>>>>
>>>> What kind of page is that actually?
>>>
>>> it is basically either a newly allocated bucket page or a freed overflow page.
>>>
>>
>> What makes you think that it can be a newly allocated page?
>> Basically, we always initialize the special space of newly allocated
>> page, so not sure what makes you deduce that it can be newly allocated
>> page.
>
> I came to know this from the following experiment.
>
> I  created a hash index and kept on inserting data in it till the split happens.
>
> When split happened, I could see following values for firstblock and
> lastblock in _hash_alloc_buckets()
>
> Breakpoint 1, _hash_alloc_buckets (rel=0x7f6ac951ee30, firstblock=34,
> nblocks=32) at hashpage.c:993
> (gdb) n
> (gdb) p    firstblock
> $15 = 34
> (gdb) p    nblocks
> $16 = 32
> (gdb) n
> (gdb) p    lastblock
> $17 = 65
>
> AFAIU, this bucket split resulted in creation of new bucket pages from
> block number 34 to 65.
>
> The contents for metap are as follows,
>
> (gdb) p    *metap
> $18 = {hashm_magic = 105121344,    hashm_version = 2, hashm_ntuples =
> 2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096,
> hashm_bmshift = 15,
>   hashm_maxbucket = 32,    hashm_highmask = 63, hashm_lowmask = 31,
> hashm_ovflpoint = 6, hashm_firstfree = 0, hashm_nmaps = 1,
> hashm_procid = 450,
>   hashm_spares = {0, 0,    0, 0, 0, 1, 1, 0 <repeats 25 times>},
> hashm_mapp = {33,    0 <repeats 127 times>}}
>
> Now, if i try to check the page type for block number 65, this is what i see,
>
> test=# select * from hash_page_type(get_raw_page('con_hash_index', 65));
> ERROR:  page is not a hash page
> DETAIL:  Expected 0000ff80, got 00000000.
> test=#
>

The contents of such a page should be zero and Jeff has reported some
valid-looking contents of the page.  If you see this page contents as
zero, then we can conclude what Jeff is seeing was an freed overflow
page.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] pageinspect and hash indexes

From
Ashutosh Sharma
Date:
On Mon, Mar 20, 2017 at 9:31 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sat, Mar 18, 2017 at 5:13 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>> On Sat, Mar 18, 2017 at 1:34 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> On Sat, Mar 18, 2017 at 12:12 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>>>> On Fri, Mar 17, 2017 at 10:54 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>>>> While trying to figure out some bloating in the newly logged hash indexes,
>>>>> I'm looking into the type of each page in the index.  But I get an error:
>>>>>
>>>>> psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) from
>>>>> generate_series(1650,1650) f(x)"
>>>>>
>>>>> ERROR:  page is not a hash page
>>>>> DETAIL:  Expected 0000ff80, got 00000000.
>>>>>
>>>>> The contents of the page are:
>>>>>
>>>>> \xa4000000d8f203bf65c900001800f01ff01f0420...
>>>>>
>>>>> (where the elided characters at the end are all zero)
>>>>>
>>>>> What kind of page is that actually?
>>>>
>>>> it is basically either a newly allocated bucket page or a freed overflow page.
>>>>
>>>
>>> What makes you think that it can be a newly allocated page?
>>> Basically, we always initialize the special space of newly allocated
>>> page, so not sure what makes you deduce that it can be newly allocated
>>> page.
>>
>> I came to know this from the following experiment.
>>
>> I  created a hash index and kept on inserting data in it till the split happens.
>>
>> When split happened, I could see following values for firstblock and
>> lastblock in _hash_alloc_buckets()
>>
>> Breakpoint 1, _hash_alloc_buckets (rel=0x7f6ac951ee30, firstblock=34,
>> nblocks=32) at hashpage.c:993
>> (gdb) n
>> (gdb) p    firstblock
>> $15 = 34
>> (gdb) p    nblocks
>> $16 = 32
>> (gdb) n
>> (gdb) p    lastblock
>> $17 = 65
>>
>> AFAIU, this bucket split resulted in creation of new bucket pages from
>> block number 34 to 65.
>>
>> The contents for metap are as follows,
>>
>> (gdb) p    *metap
>> $18 = {hashm_magic = 105121344,    hashm_version = 2, hashm_ntuples =
>> 2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096,
>> hashm_bmshift = 15,
>>   hashm_maxbucket = 32,    hashm_highmask = 63, hashm_lowmask = 31,
>> hashm_ovflpoint = 6, hashm_firstfree = 0, hashm_nmaps = 1,
>> hashm_procid = 450,
>>   hashm_spares = {0, 0,    0, 0, 0, 1, 1, 0 <repeats 25 times>},
>> hashm_mapp = {33,    0 <repeats 127 times>}}
>>
>> Now, if i try to check the page type for block number 65, this is what i see,
>>
>> test=# select * from hash_page_type(get_raw_page('con_hash_index', 65));
>> ERROR:  page is not a hash page
>> DETAIL:  Expected 0000ff80, got 00000000.
>> test=#
>>
>
> The contents of such a page should be zero and Jeff has reported some
> valid-looking contents of the page.  If you see this page contents as
> zero, then we can conclude what Jeff is seeing was an freed overflow
> page.

As shown in the mail thread-[1], the contents of metapage are,

(gdb) p    *metap
$18 = {hashm_magic = 105121344,    hashm_version = 2, hashm_ntuples
=2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096,
hashm_bmshift = 15, hashm_maxbucket = 32,    hashm_highmask = 63,
hashm_lowmask = 31, hashm_ovflpoint = 6, hashm_firstfree = 0,
hashm_nmaps = 1,
hashm_procid = 450, hashm_spares = {0, 0,    0, 0, 0, 1, 1, 0 <repeats
25 times>}, hashm_mapp = {33,    0 <repeats 127 times>}}

postgres=# select spares from
hash_metapage_info(get_raw_page('con_hash_index', 0));
                              spares
-------------------------------------------------------------------
 {0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
(1 row)

Here, if you see the spare page count  is just 1 which corresponds to
bitmap page. So, that means there is no overflow page in hash index
table and neither I have ran any DELETE statements in my experiment
that would result in freeing an overflow page.

Also, the page header content for such a page is,

$9 = {pd_lsn = {xlogid = 0, xrecoff = 23638120}, pd_checksum = 0,
pd_flags = 0,    pd_lower = 24, pd_upper = 8176,    pd_special = 8176,
  pd_pagesize_version = 8196, pd_prune_xid = 0,    pd_linp = 0x1f3aa88}

From values of pd_lower and pd_upper it is clear that it is an empty page.

The content of this page is,

\x00000000b0308a01000000001800f01ff01f042000.....

I think it is not just happening for freed overflow but also for newly
allocated bucket page. It would be good if we could mark  freed
overflow page as UNUSED page rather than just initialising it's header
portion and leaving the page type in special area as it is. Attached
is the patch '0001-mark_freed_ovflpage_as_UNUSED_pagetype.patch' that
marks a freed overflow page as an unused page.

Also, I have now changed pageinspect module to handle unused and empty
hash index page. Attached is the patch
(0002-allow_pageinspect_handle_UNUSED_OR_EMPTY_hash_pages.patch) for
the same.

[1] -
https://www.postgresql.org/message-id/CAE9k0P%3DN%2BJjzqnHqrURE7ZQMgySRpv%3DBkjafbz%3DpeD4cbCgbhA%40mail.gmail.com

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] pageinspect and hash indexes

From
Ashutosh Sharma
Date:
On Mon, Mar 20, 2017 at 6:53 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> On Mon, Mar 20, 2017 at 9:31 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Sat, Mar 18, 2017 at 5:13 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>>> On Sat, Mar 18, 2017 at 1:34 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>> On Sat, Mar 18, 2017 at 12:12 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>>>>> On Fri, Mar 17, 2017 at 10:54 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>>>>> While trying to figure out some bloating in the newly logged hash indexes,
>>>>>> I'm looking into the type of each page in the index.  But I get an error:
>>>>>>
>>>>>> psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) from
>>>>>> generate_series(1650,1650) f(x)"
>>>>>>
>>>>>> ERROR:  page is not a hash page
>>>>>> DETAIL:  Expected 0000ff80, got 00000000.
>>>>>>
>>>>>> The contents of the page are:
>>>>>>
>>>>>> \xa4000000d8f203bf65c900001800f01ff01f0420...
>>>>>>
>>>>>> (where the elided characters at the end are all zero)
>>>>>>
>>>>>> What kind of page is that actually?
>>>>>
>>>>> it is basically either a newly allocated bucket page or a freed overflow page.
>>>>>
>>>>
>>>> What makes you think that it can be a newly allocated page?
>>>> Basically, we always initialize the special space of newly allocated
>>>> page, so not sure what makes you deduce that it can be newly allocated
>>>> page.
>>>
>>> I came to know this from the following experiment.
>>>
>>> I  created a hash index and kept on inserting data in it till the split happens.
>>>
>>> When split happened, I could see following values for firstblock and
>>> lastblock in _hash_alloc_buckets()
>>>
>>> Breakpoint 1, _hash_alloc_buckets (rel=0x7f6ac951ee30, firstblock=34,
>>> nblocks=32) at hashpage.c:993
>>> (gdb) n
>>> (gdb) p    firstblock
>>> $15 = 34
>>> (gdb) p    nblocks
>>> $16 = 32
>>> (gdb) n
>>> (gdb) p    lastblock
>>> $17 = 65
>>>
>>> AFAIU, this bucket split resulted in creation of new bucket pages from
>>> block number 34 to 65.
>>>
>>> The contents for metap are as follows,
>>>
>>> (gdb) p    *metap
>>> $18 = {hashm_magic = 105121344,    hashm_version = 2, hashm_ntuples =
>>> 2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096,
>>> hashm_bmshift = 15,
>>>   hashm_maxbucket = 32,    hashm_highmask = 63, hashm_lowmask = 31,
>>> hashm_ovflpoint = 6, hashm_firstfree = 0, hashm_nmaps = 1,
>>> hashm_procid = 450,
>>>   hashm_spares = {0, 0,    0, 0, 0, 1, 1, 0 <repeats 25 times>},
>>> hashm_mapp = {33,    0 <repeats 127 times>}}
>>>
>>> Now, if i try to check the page type for block number 65, this is what i see,
>>>
>>> test=# select * from hash_page_type(get_raw_page('con_hash_index', 65));
>>> ERROR:  page is not a hash page
>>> DETAIL:  Expected 0000ff80, got 00000000.
>>> test=#
>>>
>>
>> The contents of such a page should be zero and Jeff has reported some
>> valid-looking contents of the page.  If you see this page contents as
>> zero, then we can conclude what Jeff is seeing was an freed overflow
>> page.
>
> As shown in the mail thread-[1], the contents of metapage are,
>
> (gdb) p    *metap
> $18 = {hashm_magic = 105121344,    hashm_version = 2, hashm_ntuples
> =2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096,
> hashm_bmshift = 15, hashm_maxbucket = 32,    hashm_highmask = 63,
> hashm_lowmask = 31, hashm_ovflpoint = 6, hashm_firstfree = 0,
> hashm_nmaps = 1,
> hashm_procid = 450, hashm_spares = {0, 0,    0, 0, 0, 1, 1, 0 <repeats
> 25 times>}, hashm_mapp = {33,    0 <repeats 127 times>}}
>
> postgres=# select spares from
> hash_metapage_info(get_raw_page('con_hash_index', 0));
>                               spares
> -------------------------------------------------------------------
>  {0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
> (1 row)
>
> Here, if you see the spare page count  is just 1 which corresponds to
> bitmap page. So, that means there is no overflow page in hash index
> table and neither I have ran any DELETE statements in my experiment
> that would result in freeing an overflow page.
>
> Also, the page header content for such a page is,
>
> $9 = {pd_lsn = {xlogid = 0, xrecoff = 23638120}, pd_checksum = 0,
> pd_flags = 0,    pd_lower = 24, pd_upper = 8176,    pd_special = 8176,
>   pd_pagesize_version = 8196, pd_prune_xid = 0,    pd_linp = 0x1f3aa88}
>
> From values of pd_lower and pd_upper it is clear that it is an empty page.
>
> The content of this page is,
>
> \x00000000b0308a01000000001800f01ff01f042000.....
>
> I think it is not just happening for freed overflow but also for newly
> allocated bucket page. It would be good if we could mark  freed
> overflow page as UNUSED page rather than just initialising it's header
> portion and leaving the page type in special area as it is. Attached
> is the patch '0001-mark_freed_ovflpage_as_UNUSED_pagetype.patch' that
> marks a freed overflow page as an unused page.
>
> Also, I have now changed pageinspect module to handle unused and empty
> hash index page. Attached is the patch
> (0002-allow_pageinspect_handle_UNUSED_OR_EMPTY_hash_pages.patch) for
> the same.
>
> [1] -
https://www.postgresql.org/message-id/CAE9k0P%3DN%2BJjzqnHqrURE7ZQMgySRpv%3DBkjafbz%3DpeD4cbCgbhA%40mail.gmail.com
>

I think when expanding hash index table we are only initialising the
pages that will be in-use after split or the last block in the index.
The initial few pages where tuples will be moved from old to new pages
has no issues but the last block that is just initialised and is not
marked as hash page needs to be handled along with the freed overflow
page. Please let me know thoughts on this. Thanks.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com



Re: [HACKERS] pageinspect and hash indexes

From
Amit Kapila
Date:
On Tue, Mar 21, 2017 at 10:15 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>>
>> I think it is not just happening for freed overflow but also for newly
>> allocated bucket page. It would be good if we could mark  freed
>> overflow page as UNUSED page rather than just initialising it's header
>> portion and leaving the page type in special area as it is. Attached
>> is the patch '0001-mark_freed_ovflpage_as_UNUSED_pagetype.patch' that
>> marks a freed overflow page as an unused page.
>>
>> Also, I have now changed pageinspect module to handle unused and empty
>> hash index page. Attached is the patch
>> (0002-allow_pageinspect_handle_UNUSED_OR_EMPTY_hash_pages.patch) for
>> the same.
>>
>> [1] -
https://www.postgresql.org/message-id/CAE9k0P%3DN%2BJjzqnHqrURE7ZQMgySRpv%3DBkjafbz%3DpeD4cbCgbhA%40mail.gmail.com
>>

I have yet to review your patches, let's first try to clear other
things before doing so.

>
> I think when expanding hash index table we are only initialising the
> pages that will be in-use after split or the last block in the index.
> The initial few pages where tuples will be moved from old to new pages
> has no issues but the last block that is just initialised and is not
> marked as hash page needs to be handled along with the freed overflow
> page.
>

I think PageIsEmpty() check in hashfuncs.c is sufficient for same.  I
don't see any value in treating the last page allocated during
_hash_alloc_buckets() any different from other pages which are prior
to that but will get allocated when required.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] pageinspect and hash indexes

From
Amit Kapila
Date:
On Mon, Mar 20, 2017 at 6:53 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> I think it is not just happening for freed overflow but also for newly
> allocated bucket page. It would be good if we could mark  freed
> overflow page as UNUSED page rather than just initialising it's header
> portion and leaving the page type in special area as it is. Attached
> is the patch '0001-mark_freed_ovflpage_as_UNUSED_pagetype.patch' that
> marks a freed overflow page as an unused page.
>
 _hash_pageinit(ovflpage, BufferGetPageSize(ovflbuf));
+
+ ovflopaque = (HashPageOpaque) PageGetSpecialPointer(ovflpage);
+
+ ovflopaque->hasho_prevblkno = InvalidBlockNumber;
+ ovflopaque->hasho_nextblkno = InvalidBlockNumber;
+ ovflopaque->hasho_bucket = -1;
+ ovflopaque->hasho_flag = LH_UNUSED_PAGE;
+ ovflopaque->hasho_page_id = HASHO_PAGE_ID;
+

You need similar initialization in replay routine as well.

> Also, I have now changed pageinspect module to handle unused and empty
> hash index page. Attached is the patch
> (0002-allow_pageinspect_handle_UNUSED_OR_EMPTY_hash_pages.patch) for
> the same.
>

1.
@@ -70,6 +70,17 @@ verify_hash_page(bytea *raw_page, int flags) pageopaque = (HashPageOpaque)
PageGetSpecialPointer(page);
+
+ /* Check if it is an unused hash page. */
+ if (pageopaque->hasho_flag == LH_UNUSED_PAGE)
+ return page;


I don't see we need to have a separate check for unused page, why
can't it be checked with other page types in below check.

if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE &&
pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE)

2.
+ /* Check if it is an empty hash page. */
+ if (PageIsEmpty(page))
+ ereport(ERROR,
+ (errcode(ERRCODE_INDEX_CORRUPTED),
+ errmsg("index table contains empty page")));


Do we want to give a separate message for EMPTY and NEW pages?  Isn't
it better that the same error message can be given for both of them as
from user perspective there is not much difference between both the
messages?


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] pageinspect and hash indexes

From
Ashutosh Sharma
Date:
>> I think it is not just happening for freed overflow but also for newly
>> allocated bucket page. It would be good if we could mark  freed
>> overflow page as UNUSED page rather than just initialising it's header
>> portion and leaving the page type in special area as it is. Attached
>> is the patch '0001-mark_freed_ovflpage_as_UNUSED_pagetype.patch' that
>> marks a freed overflow page as an unused page.
>>
>
>   _hash_pageinit(ovflpage, BufferGetPageSize(ovflbuf));
> +
> + ovflopaque = (HashPageOpaque) PageGetSpecialPointer(ovflpage);
> +
> + ovflopaque->hasho_prevblkno = InvalidBlockNumber;
> + ovflopaque->hasho_nextblkno = InvalidBlockNumber;
> + ovflopaque->hasho_bucket = -1;
> + ovflopaque->hasho_flag = LH_UNUSED_PAGE;
> + ovflopaque->hasho_page_id = HASHO_PAGE_ID;
> +
>
> You need similar initialization in replay routine as well.

I will do that and share the patch shortly. Thanks.

>
>> Also, I have now changed pageinspect module to handle unused and empty
>> hash index page. Attached is the patch
>> (0002-allow_pageinspect_handle_UNUSED_OR_EMPTY_hash_pages.patch) for
>> the same.
>>
>
> 1.
> @@ -70,6 +70,17 @@ verify_hash_page(bytea *raw_page, int flags)
>   pageopaque = (HashPageOpaque) PageGetSpecialPointer(page);
> +
> + /* Check if it is an unused hash page. */
> + if (pageopaque->hasho_flag == LH_UNUSED_PAGE)
> + return page;
>
>
> I don't see we need to have a separate check for unused page, why
> can't it be checked with other page types in below check.
>
> if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE &&
> pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE)

That is because UNUSED page is as good as an empty page except that
empty page doesn't have any pagetype. If we add condition for checking
UNUSED page in above if check it will never show unused page as an
unsed page rather it will always show it as an empty page. To avoid
this, at the start of verify_hash_page function itself if we recognise
page as UNUSED page we return immediately.

>
> 2.
> + /* Check if it is an empty hash page. */
> + if (PageIsEmpty(page))
> + ereport(ERROR,
> + (errcode(ERRCODE_INDEX_CORRUPTED),
> + errmsg("index table contains empty page")));
>
>
> Do we want to give a separate message for EMPTY and NEW pages?  Isn't
> it better that the same error message can be given for both of them as
> from user perspective there is not much difference between both the
> messages?

I think we should show separate message because they are two different
type of pages. In the sense like, one is initialised whereas other is
completely zero.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com



Re: [HACKERS] pageinspect and hash indexes

From
Amit Kapila
Date:
On Thu, Mar 23, 2017 at 10:02 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>>>
>>
>> 1.
>> @@ -70,6 +70,17 @@ verify_hash_page(bytea *raw_page, int flags)
>>   pageopaque = (HashPageOpaque) PageGetSpecialPointer(page);
>> +
>> + /* Check if it is an unused hash page. */
>> + if (pageopaque->hasho_flag == LH_UNUSED_PAGE)
>> + return page;
>>
>>
>> I don't see we need to have a separate check for unused page, why
>> can't it be checked with other page types in below check.
>>
>> if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE &&
>> pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE)
>
> That is because UNUSED page is as good as an empty page except that
> empty page doesn't have any pagetype. If we add condition for checking
> UNUSED page in above if check it will never show unused page as an
> unsed page rather it will always show it as an empty page.
>

Oh, okay, but my main objection was that we should not check hash page
type (hasho_flag) without ensuring whether it is a hash page.  Can you
try to adjust the above code so that this check can be moved after
hasho_page_id check?

> To avoid
> this, at the start of verify_hash_page function itself if we recognise
> page as UNUSED page we return immediately.
>
>>
>> 2.
>> + /* Check if it is an empty hash page. */
>> + if (PageIsEmpty(page))
>> + ereport(ERROR,
>> + (errcode(ERRCODE_INDEX_CORRUPTED),
>> + errmsg("index table contains empty page")));
>>
>>
>> Do we want to give a separate message for EMPTY and NEW pages?  Isn't
>> it better that the same error message can be given for both of them as
>> from user perspective there is not much difference between both the
>> messages?
>
> I think we should show separate message because they are two different
> type of pages. In the sense like, one is initialised whereas other is
> completely zero.
>

I understand your point, but not sure if it makes any difference to user.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] pageinspect and hash indexes

From
Ashutosh Sharma
Date:
Hi Amit,

>>> +
>>> + /* Check if it is an unused hash page. */
>>> + if (pageopaque->hasho_flag == LH_UNUSED_PAGE)
>>> + return page;
>>>
>>>
>>> I don't see we need to have a separate check for unused page, why
>>> can't it be checked with other page types in below check.
>>>
>>> if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE &&
>>> pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE)
>>
>> That is because UNUSED page is as good as an empty page except that
>> empty page doesn't have any pagetype. If we add condition for checking
>> UNUSED page in above if check it will never show unused page as an
>> unsed page rather it will always show it as an empty page.
>>
>
> Oh, okay, but my main objection was that we should not check hash page
> type (hasho_flag) without ensuring whether it is a hash page.  Can you
> try to adjust the above code so that this check can be moved after
> hasho_page_id check?

Yes, I got your point. I have done that but then i had to remove the
check for PageIsEmpty(). Anyways, I think PageIsEmpty() condition will
only be true for one page in entire hash index table and can be
ignored. If you wish, I could mention about it in the documentation.

>
>> To avoid
>> this, at the start of verify_hash_page function itself if we recognise
>> page as UNUSED page we return immediately.
>>
>>>
>>> 2.
>>> + /* Check if it is an empty hash page. */
>>> + if (PageIsEmpty(page))
>>> + ereport(ERROR,
>>> + (errcode(ERRCODE_INDEX_CORRUPTED),
>>> + errmsg("index table contains empty page")));
>>>
>>>
>>> Do we want to give a separate message for EMPTY and NEW pages?  Isn't
>>> it better that the same error message can be given for both of them as
>>> from user perspective there is not much difference between both the
>>> messages?
>>
>> I think we should show separate message because they are two different
>> type of pages. In the sense like, one is initialised whereas other is
>> completely zero.
>>
>
> I understand your point, but not sure if it makes any difference to user.
>

okay, I have now anyways removed the check for PageIsEmpty. Please
refer to the attached '0002
allow_pageinspect_handle_UNUSED_hash_pages.patch'


Also, I have attached
'0001-Mark-freed-overflow-page-as-UNUSED-pagetype-v2.patch' that
handles your comment mentioned in [1].

[1] - https://www.postgresql.org/message-id/CAA4eK1%2BVE_TDRLWpyeOf%2B7%2B6if68kgPNwO4guKo060rm_t3O5w%40mail.gmail.com

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] pageinspect and hash indexes

From
Amit Kapila
Date:
On Thu, Mar 23, 2017 at 6:46 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>>
>> Oh, okay, but my main objection was that we should not check hash page
>> type (hasho_flag) without ensuring whether it is a hash page.  Can you
>> try to adjust the above code so that this check can be moved after
>> hasho_page_id check?
>
> Yes, I got your point. I have done that but then i had to remove the
> check for PageIsEmpty(). Anyways, I think PageIsEmpty() condition will
> only be true for one page in entire hash index table and can be
> ignored. If you wish, I could mention about it in the documentation.
>

Yeah, I think it is worth adding a Note in docs, but we can do that
separately if required.

>>
>>> To avoid
>>> this, at the start of verify_hash_page function itself if we recognise
>>> page as UNUSED page we return immediately.
>>>
>>>>
>>>> 2.
>>>> + /* Check if it is an empty hash page. */
>>>> + if (PageIsEmpty(page))
>>>> + ereport(ERROR,
>>>> + (errcode(ERRCODE_INDEX_CORRUPTED),
>>>> + errmsg("index table contains empty page")));
>>>>
>>>>
>>>> Do we want to give a separate message for EMPTY and NEW pages?  Isn't
>>>> it better that the same error message can be given for both of them as
>>>> from user perspective there is not much difference between both the
>>>> messages?
>>>
>>> I think we should show separate message because they are two different
>>> type of pages. In the sense like, one is initialised whereas other is
>>> completely zero.
>>>
>>
>> I understand your point, but not sure if it makes any difference to user.
>>
>
> okay, I have now anyways removed the check for PageIsEmpty. Please
> refer to the attached '0002
> allow_pageinspect_handle_UNUSED_hash_pages.patch'
>

+       if (pageopaque->hasho_page_id != HASHO_PAGE_ID)               ereport(ERROR,

spurious white space.

>
> Also, I have attached
> '0001-Mark-freed-overflow-page-as-UNUSED-pagetype-v2.patch' that
> handles your comment mentioned in [1].
>

In general, we have to initialize prevblock with max_bucket, but here
it is okay, because we anyway initialize it after this page is
allocated as overflow page.

Your patches look good to me except for small white space change.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] pageinspect and hash indexes

From
Ashutosh Sharma
Date:
On Fri, Mar 24, 2017 at 9:21 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Thu, Mar 23, 2017 at 6:46 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>>>
>>> Oh, okay, but my main objection was that we should not check hash page
>>> type (hasho_flag) without ensuring whether it is a hash page.  Can you
>>> try to adjust the above code so that this check can be moved after
>>> hasho_page_id check?
>>
>> Yes, I got your point. I have done that but then i had to remove the
>> check for PageIsEmpty(). Anyways, I think PageIsEmpty() condition will
>> only be true for one page in entire hash index table and can be
>> ignored. If you wish, I could mention about it in the documentation.
>>
>
> Yeah, I think it is worth adding a Note in docs, but we can do that
> separately if required.
>
>>>
>>>> To avoid
>>>> this, at the start of verify_hash_page function itself if we recognise
>>>> page as UNUSED page we return immediately.
>>>>
>>>>>
>>>>> 2.
>>>>> + /* Check if it is an empty hash page. */
>>>>> + if (PageIsEmpty(page))
>>>>> + ereport(ERROR,
>>>>> + (errcode(ERRCODE_INDEX_CORRUPTED),
>>>>> + errmsg("index table contains empty page")));
>>>>>
>>>>>
>>>>> Do we want to give a separate message for EMPTY and NEW pages?  Isn't
>>>>> it better that the same error message can be given for both of them as
>>>>> from user perspective there is not much difference between both the
>>>>> messages?
>>>>
>>>> I think we should show separate message because they are two different
>>>> type of pages. In the sense like, one is initialised whereas other is
>>>> completely zero.
>>>>
>>>
>>> I understand your point, but not sure if it makes any difference to user.
>>>
>>
>> okay, I have now anyways removed the check for PageIsEmpty. Please
>> refer to the attached '0002
>> allow_pageinspect_handle_UNUSED_hash_pages.patch'
>>
>
> +
>         if (pageopaque->hasho_page_id != HASHO_PAGE_ID)
>                 ereport(ERROR,
>
> spurious white space.
>
>>
>> Also, I have attached
>> '0001-Mark-freed-overflow-page-as-UNUSED-pagetype-v2.patch' that
>> handles your comment mentioned in [1].
>>
>
> In general, we have to initialize prevblock with max_bucket, but here
> it is okay, because we anyway initialize it after this page is
> allocated as overflow page.
>
> Your patches look good to me except for small white space change.

Thanks for reviewing my patch. I have removed the extra white space.
Attached are both the patches.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] pageinspect and hash indexes

From
Ashutosh Sharma
Date:
On Fri, Mar 24, 2017 at 9:46 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> On Fri, Mar 24, 2017 at 9:21 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Thu, Mar 23, 2017 at 6:46 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>>>>
>>>> Oh, okay, but my main objection was that we should not check hash page
>>>> type (hasho_flag) without ensuring whether it is a hash page.  Can you
>>>> try to adjust the above code so that this check can be moved after
>>>> hasho_page_id check?
>>>
>>> Yes, I got your point. I have done that but then i had to remove the
>>> check for PageIsEmpty(). Anyways, I think PageIsEmpty() condition will
>>> only be true for one page in entire hash index table and can be
>>> ignored. If you wish, I could mention about it in the documentation.
>>>
>>
>> Yeah, I think it is worth adding a Note in docs, but we can do that
>> separately if required.
>>
>>>>
>>>>> To avoid
>>>>> this, at the start of verify_hash_page function itself if we recognise
>>>>> page as UNUSED page we return immediately.
>>>>>
>>>>>>
>>>>>> 2.
>>>>>> + /* Check if it is an empty hash page. */
>>>>>> + if (PageIsEmpty(page))
>>>>>> + ereport(ERROR,
>>>>>> + (errcode(ERRCODE_INDEX_CORRUPTED),
>>>>>> + errmsg("index table contains empty page")));
>>>>>>
>>>>>>
>>>>>> Do we want to give a separate message for EMPTY and NEW pages?  Isn't
>>>>>> it better that the same error message can be given for both of them as
>>>>>> from user perspective there is not much difference between both the
>>>>>> messages?
>>>>>
>>>>> I think we should show separate message because they are two different
>>>>> type of pages. In the sense like, one is initialised whereas other is
>>>>> completely zero.
>>>>>
>>>>
>>>> I understand your point, but not sure if it makes any difference to user.
>>>>
>>>
>>> okay, I have now anyways removed the check for PageIsEmpty. Please
>>> refer to the attached '0002
>>> allow_pageinspect_handle_UNUSED_hash_pages.patch'
>>>
>>
>> +
>>         if (pageopaque->hasho_page_id != HASHO_PAGE_ID)
>>                 ereport(ERROR,
>>
>> spurious white space.
>>
>>>
>>> Also, I have attached
>>> '0001-Mark-freed-overflow-page-as-UNUSED-pagetype-v2.patch' that
>>> handles your comment mentioned in [1].
>>>
>>
>> In general, we have to initialize prevblock with max_bucket, but here
>> it is okay, because we anyway initialize it after this page is
>> allocated as overflow page.
>>
>> Your patches look good to me except for small white space change.
>
> Thanks for reviewing my patch. I have removed the extra white space.
> Attached are both the patches.

Sorry, I have mistakenly attached wrong patch. Here are the correct
set of patches.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] pageinspect and hash indexes

From
Amit Kapila
Date:
On Fri, Mar 24, 2017 at 9:50 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> On Fri, Mar 24, 2017 at 9:46 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>>
>> Thanks for reviewing my patch. I have removed the extra white space.
>> Attached are both the patches.
>
> Sorry, I have mistakenly attached wrong patch. Here are the correct
> set of patches.
>

The latest version of patches looks fine to me.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: pageinspect and hash indexes

From
Robert Haas
Date:
On Fri, Mar 24, 2017 at 3:54 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Mar 24, 2017 at 9:50 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>> On Fri, Mar 24, 2017 at 9:46 AM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>>>
>>> Thanks for reviewing my patch. I have removed the extra white space.
>>> Attached are both the patches.
>>
>> Sorry, I have mistakenly attached wrong patch. Here are the correct
>> set of patches.
>
> The latest version of patches looks fine to me.

I don't really like 0002.  What about this, instead?

--- a/contrib/pageinspect/hashfuncs.c
+++ b/contrib/pageinspect/hashfuncs.c
@@ -80,7 +80,8 @@ verify_hash_page(bytea *raw_page, int flags)    /* Check that page type is sane. */    pagetype =
pageopaque->hasho_flag& LH_PAGE_TYPE;    if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE &&
 
-        pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE)
+        pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE &&
+        pagetype != LH_UNUSED_PAGE)        ereport(ERROR,                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
            errmsg("invalid hash page type %08x", pagetype)));
 

The advantage of that is (1) it won't get confused if in the future we
have an unused page that has some flag bit not in LH_PAGE_TYPE set,
and (2) if in the future we want to add any other checks to this
function which should apply to unused pages also, they won't get
bypassed by an early return statement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pageinspect and hash indexes

From
Ashutosh Sharma
Date:
>>>> Thanks for reviewing my patch. I have removed the extra white space.
>>>> Attached are both the patches.
>>>
>>> Sorry, I have mistakenly attached wrong patch. Here are the correct
>>> set of patches.
>>
>> The latest version of patches looks fine to me.
>
> I don't really like 0002.  What about this, instead?
>
> --- a/contrib/pageinspect/hashfuncs.c
> +++ b/contrib/pageinspect/hashfuncs.c
> @@ -80,7 +80,8 @@ verify_hash_page(bytea *raw_page, int flags)
>      /* Check that page type is sane. */
>      pagetype = pageopaque->hasho_flag & LH_PAGE_TYPE;
>      if (pagetype != LH_OVERFLOW_PAGE && pagetype != LH_BUCKET_PAGE &&
> -        pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE)
> +        pagetype != LH_BITMAP_PAGE && pagetype != LH_META_PAGE &&
> +        pagetype != LH_UNUSED_PAGE)
>          ereport(ERROR,
>                  (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>                   errmsg("invalid hash page type %08x", pagetype)));
>
> The advantage of that is (1) it won't get confused if in the future we
> have an unused page that has some flag bit not in LH_PAGE_TYPE set,
> and (2) if in the future we want to add any other checks to this
> function which should apply to unused pages also, they won't get
> bypassed by an early return statement.

Agreed. Moreover, previous approach might even change the current
behaviour of functions like hash_page_items() and hash_page_stats()
basically when we pass UNUSED pages to these functions. Attached is
the newer version of patch.

Attachment

Re: pageinspect and hash indexes

From
Robert Haas
Date:
On Fri, Mar 24, 2017 at 3:44 PM, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
> Agreed. Moreover, previous approach might even change the current
> behaviour of functions like hash_page_items() and hash_page_stats()
> basically when we pass UNUSED pages to these functions. Attached is
> the newer version of patch.

This patch doesn't completely solve the problem:

pgbench -i -s 40
alter table pgbench_accounts drop constraint pgbench_accounts_pkey;
create index on pgbench_accounts using hash (aid);
insert into pgbench_accounts select * from pgbench_accounts;
select hash_page_type, min(n), max(n), count(*) from (select n,
hash_page_type(get_raw_page('pgbench_accounts_aid_idx'::text, n)) from
generate_series(0,
(pg_relation_size('pgbench_accounts_aid_idx')/8192)::integer - 1) n) x
group by 1;
ERROR:  index table contains zero page

This happens because of the sparse allocation forced by
_hash_alloc_buckets.  Perhaps the real fix is to have that function
initialize all of the pages instead of just the last one, but unless
and until we decide to do that, we've got to cope with zero pages in
the index.  Actually, even if we did fix that I think we'd still need
to do this, because the way relation extension works in general is
that we first write a page of zeroes and then go back and fill in the
data; an intervening crash can leave behind the intermediate state.

A second problem is that the one page initialized by
_hash_alloc_buckets needs to end up with a valid special space;
otherwise, you get the same error Jeff complained about originally
when you try to use hash_page_type() on it.

I fixed those issues and committed this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company