Thread: [HACKERS] PATCH: pageinspect / add page_checksum and bt_page_items(bytea)

[HACKERS] PATCH: pageinspect / add page_checksum and bt_page_items(bytea)

From
Tomas Vondra
Date:
Hi,

while investigating some checksum-related issues, I needed to perform 
some forensics on a copy of a btree page (taken from another instance 
using 'dd').

But I've ran into two pageinspect limitations, hopefully addressed by 
this patch:

1) bt_page_items(bytea) not defined

We have heap_page_items(bytea) but not bt_page_items(bytea). I suspect 
this is partially historical API inconsistence, and partially due to the 
need to handle the btree metapage explicitly.

The original function simply threw an error with blkno=0, the new 
function simply checks for BTP_META page.

I believe this is sufficient, assuming the instance without data 
corruption (which pageinspect assumes anyway). With data corruption all 
bets are off anyway - for example the metapage might be written to a 
different block (essentially what I saw in the investigated issue). 
Actually, the flag check is better in this case - it detects the 
metapage, while the blkno=0 check fails to do that (leading to crash).

2) page_checksum()

When everything is fine, you can do page_header() which also includes 
the checksum. When the checksum gets broken, you may still dump the page 
using 'dd+pg_read_binary_file' to see the header, but clearly that 
checksum is wrong - and it's interesting to see the correct one and 
compare it to the checksum in the header.

This function makes it possible - it accepts the bytea image of the 
page, and blkno (so it's possible to see how would the block look if it 
was written somewhere else, for example).


BTW I've noticed the pageinspect version is 1.6, but we only have 
pageinspect--1.5.sql (and upgrade script to 1.6). Not sure that's 
entirely intentional?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment
On Mon, Feb 20, 2017 at 9:43 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> BTW I've noticed the pageinspect version is 1.6, but we only have
> pageinspect--1.5.sql (and upgrade script to 1.6). Not sure that's entirely
> intentional?

Actually, that's the New Way.  See 40b449ae84dcf71177d7749a7b0c582b64dc15f0.

+extern Datum bt_metap(PG_FUNCTION_ARGS);
+extern Datum bt_page_items(PG_FUNCTION_ARGS);
+extern Datum bt_page_items_bytea(PG_FUNCTION_ARGS);
+extern Datum bt_page_stats(PG_FUNCTION_ARGS);

Not needed.  PG_FUNCTION_INFO_V1 now does it.

-    values[j++] = psprintf("%d", stat.blkno);
-    values[j++] = psprintf("%c", stat.type);
-    values[j++] = psprintf("%d", stat.live_items);
-    values[j++] = psprintf("%d", stat.dead_items);
-    values[j++] = psprintf("%d", stat.avg_item_size);
-    values[j++] = psprintf("%d", stat.page_size);
-    values[j++] = psprintf("%d", stat.free_size);
-    values[j++] = psprintf("%d", stat.btpo_prev);
-    values[j++] = psprintf("%d", stat.btpo_next);
-    values[j++] = psprintf("%d", (stat.type == 'd') ? stat.btpo.xact : stat.btp
o.level);
-    values[j++] = psprintf("%d", stat.btpo_flags);
+    values[j] = palloc(32);
+    snprintf(values[j++], 32, "%d", stat.blkno);
+    values[j] = palloc(32);
+    snprintf(values[j++], 32, "%c", stat.type);
+    values[j] = palloc(32);
+    snprintf(values[j++], 32, "%d", stat.live_items);
+    values[j] = palloc(32);
+    snprintf(values[j++], 32, "%d", stat.dead_items);
+    values[j] = palloc(32);
+    snprintf(values[j++], 32, "%d", stat.avg_item_size);
+    values[j] = palloc(32);
+    snprintf(values[j++], 32, "%d", stat.page_size);
+    values[j] = palloc(32);
+    snprintf(values[j++], 32, "%d", stat.free_size);
+    values[j] = palloc(32);
+    snprintf(values[j++], 32, "%d", stat.btpo_prev);
+    values[j] = palloc(32);
+    snprintf(values[j++], 32, "%d", stat.btpo_next);
+    values[j] = palloc(32);
+    if (stat.type == 'd')
+        snprintf(values[j++], 32, "%d", stat.btpo.xact);
+    else
+        snprintf(values[j++], 32, "%d", stat.btpo.level);
+    values[j] = palloc(32);
+    snprintf(values[j++], 32, "%d", stat.btpo_flags);

This does not seem like a good idea in any way, and the patch has
several instances of it.

I don't object to the concept, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
Tomas Vondra
Date:
On 03/03/2017 05:09 AM, Robert Haas wrote:
> On Mon, Feb 20, 2017 at 9:43 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> BTW I've noticed the pageinspect version is 1.6, but we only have
>> pageinspect--1.5.sql (and upgrade script to 1.6). Not sure that's entirely
>> intentional?
>
> Actually, that's the New Way.  See 40b449ae84dcf71177d7749a7b0c582b64dc15f0.
>

Ah, great! Didn't notice that change.

> +extern Datum bt_metap(PG_FUNCTION_ARGS);
> +extern Datum bt_page_items(PG_FUNCTION_ARGS);
> +extern Datum bt_page_items_bytea(PG_FUNCTION_ARGS);
> +extern Datum bt_page_stats(PG_FUNCTION_ARGS);
>
> Not needed.  PG_FUNCTION_INFO_V1 now does it.

OK.

>
> -    ...
>
> This does not seem like a good idea in any way, and the patch has
> several instances of it.
>

Damn. In my defense, the patch was originally created for an older 
PostgreSQL version (to investigate issue on a production system), which 
used that approach to building values. Should have notice it, though.

Attached is v2, fixing both issues.

regard

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
Peter Eisentraut
Date:
On 3/3/17 09:03, Tomas Vondra wrote:
> Damn. In my defense, the patch was originally created for an older 
> PostgreSQL version (to investigate issue on a production system), which 
> used that approach to building values. Should have notice it, though.
> 
> Attached is v2, fixing both issues.

Can we have a test case for page_checksum(), or is that too difficult to
get running deterministicly?

Also, perhaps page_checksum() doesn't need to be superuser-only, but I
can see arguments for keeping it that way for consistency.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
Tomas Vondra
Date:
On 03/04/2017 02:08 PM, Peter Eisentraut wrote:
> On 3/3/17 09:03, Tomas Vondra wrote:
>> Damn. In my defense, the patch was originally created for an older
>> PostgreSQL version (to investigate issue on a production system), which
>> used that approach to building values. Should have notice it, though.
>>
>> Attached is v2, fixing both issues.
>
> Can we have a test case for page_checksum(), or is that too difficult to
> get running deterministicly?
>

I'm not sure it can be made deterministic. Certainly not on heap pages, 
because then it'd be susceptible to xmin/xmax changes, but we have other 
bits that change even on index pages (say pd_lsn).

So I'm afraid that's not going to fly.
>
> Also, perhaps page_checksum() doesn't need to be superuser-only, but
> I can see arguments for keeping it that way for consistency.
>

Yes, I'll change that.

It won't have much impact in practice, because all functions providing 
the page data (get_raw_page etc.) do require superuser. But you can 
still input the page as a bytea directly, and it's good practice not to 
add unnecessary superuser checks.

regard

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
Peter Eisentraut
Date:
On 3/4/17 12:39, Tomas Vondra wrote:
>> Can we have a test case for page_checksum(), or is that too difficult to
>> get running deterministicly?
> 
> I'm not sure it can be made deterministic. Certainly not on heap pages, 
> because then it'd be susceptible to xmin/xmax changes, but we have other 
> bits that change even on index pages (say pd_lsn).
> 
> So I'm afraid that's not going to fly.

Then just run it and throw away the result.  See sql/page.sql for some
examples.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
Peter Eisentraut
Date:
On 3/3/17 09:03, Tomas Vondra wrote:
> Attached is v2, fixing both issues.

I wonder if

+   bytea      *raw_page = PG_GETARG_BYTEA_P(0);

+       uargs->page = VARDATA(raw_page);

is expected to work reliably, without copying the argument to a
different memory context.

I think it would be better not to maintain so much duplicate code
between bt_page_items(text, int) and bt_page_items(bytea).  How about
just redefining bt_page_items(text, int) as an SQL-language function
calling bt_page_items(get_raw_page($1, $2))?

For page_checksum(), the checksums are internally unsigned, so maybe
return type int on the SQL level, so that there is no confusion about
the sign?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
Tomas Vondra
Date:
On 03/06/2017 10:13 PM, Peter Eisentraut wrote:
> On 3/3/17 09:03, Tomas Vondra wrote:
>> Attached is v2, fixing both issues.
>
> I wonder if
>
> +   bytea      *raw_page = PG_GETARG_BYTEA_P(0);
>
> +       uargs->page = VARDATA(raw_page);
>
> is expected to work reliably, without copying the argument to a
> different memory context.
>
> I think it would be better not to maintain so much duplicate code
> between bt_page_items(text, int) and bt_page_items(bytea).  How about
> just redefining bt_page_items(text, int) as an SQL-language function
> calling bt_page_items(get_raw_page($1, $2))?
>

Maybe. We can also probably share the code at the C level, so I'll look 
into that.
>
> For page_checksum(), the checksums are internally unsigned, so maybe
> return type int on the SQL level, so that there is no confusion about
> the sign?
>

That ship already sailed, I'm afraid. We already have checksum in the 
page_header() output, and it's defined as smallint there. So using int 
here would be inconsistent.

A similar issue is that get_raw_page() uses int for block number, while 
internally it's uint32. That often requires a fair amount of gymnastics 
on large tables.

I'm not against changing either of those bits - using int for checksums 
and bigint for block numbers, but I think it should be a separate patch.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
Peter Eisentraut
Date:
On 3/6/17 16:33, Tomas Vondra wrote:
>> I think it would be better not to maintain so much duplicate code
>> between bt_page_items(text, int) and bt_page_items(bytea).  How about
>> just redefining bt_page_items(text, int) as an SQL-language function
>> calling bt_page_items(get_raw_page($1, $2))?
>>
> 
> Maybe. We can also probably share the code at the C level, so I'll look 
> into that.

I think SQL would be easier, but either way some reduction in
duplication would be good.

>> For page_checksum(), the checksums are internally unsigned, so maybe
>> return type int on the SQL level, so that there is no confusion about
>> the sign?
>>
> 
> That ship already sailed, I'm afraid. We already have checksum in the 
> page_header() output, and it's defined as smallint there. So using int 
> here would be inconsistent.

OK, no worries then.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] PATCH: pageinspect / add page_checksum and bt_page_items(bytea)

From
Ashutosh Sharma
Date:
Hi,

I had a look into this patch and would like to share some of my review
comments that requires author's attention.

1) The comment for page_checksum() needs to be corrected. It seems
like it has been copied from page_header and not edited it further.

/** page_header** Allows inspection of page header fields of a raw page*/

PG_FUNCTION_INFO_V1(page_checksum);

Datum
page_checksum(PG_FUNCTION_ARGS)

2) It seems like you have choosen wrong datatype for page_checksum. I
am getting negative checksum value when trying to run below query. I
think the return type for the SQL function page_checksum should be
'integer' instead of 'smallint'.

postgres=# SELECT page_checksum(get_raw_page('pg_class', 0), 100);page_checksum
---------------       -19532
(1 row)

Above problem also exist in case of page_header. I am facing similar
problem with page_header as well.

postgres=# SELECT page_header(get_raw_page('pg_class', 0));                page_header
---------------------------------------------(0/2891538,-28949,1,220,7216,8192,8192,4,0)
(1 row)


3) Any reason for not marking bt_page_items or page_checksum as PARALLEL_SAFE.

4) Error messages in new bt_page_items are not consistent with old
bt_page_items. For eg. if i pass meta page blocknumber as input to
bt_page_items the error message thrown by new and old bt_page_items
are different.

postgres=# SELECT * FROM bt_page_items('btree_index', 0) LIMIT 1;
ERROR:  block 0 is a meta page

postgres=# SELECT * FROM bt_page_items(get_raw_page('btree_index', 0)) LIMIT 1;
ERROR:  block is a meta page


postgres=# SELECT * FROM bt_page_items(get_raw_page('btree_index',
1024)) LIMIT 1;
ERROR:  block number 1024 is out of range for relation "btree_index"


postgres=# SELECT * FROM bt_page_items('btree_index', 1024) LIMIT 1;
ERROR:  block number out of range

5) Code duplication in bt_page_items() and bt_page_items_bytea() needs
to be handled.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

On Tue, Mar 7, 2017 at 9:39 PM, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> On 3/6/17 16:33, Tomas Vondra wrote:
>>> I think it would be better not to maintain so much duplicate code
>>> between bt_page_items(text, int) and bt_page_items(bytea).  How about
>>> just redefining bt_page_items(text, int) as an SQL-language function
>>> calling bt_page_items(get_raw_page($1, $2))?
>>>
>>
>> Maybe. We can also probably share the code at the C level, so I'll look
>> into that.
>
> I think SQL would be easier, but either way some reduction in
> duplication would be good.
>
>>> For page_checksum(), the checksums are internally unsigned, so maybe
>>> return type int on the SQL level, so that there is no confusion about
>>> the sign?
>>>
>>
>> That ship already sailed, I'm afraid. We already have checksum in the
>> page_header() output, and it's defined as smallint there. So using int
>> here would be inconsistent.
>
> OK, no worries then.
>
> --
> Peter Eisentraut              http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
Tomas Vondra
Date:
On 03/13/2017 06:49 AM, Ashutosh Sharma wrote:
> Hi,
>
> I had a look into this patch and would like to share some of my review
> comments that requires author's attention.
>
> 1) The comment for page_checksum() needs to be corrected. It seems
> like it has been copied from page_header and not edited it further.
>
> /*
>  * page_header
>  *
>  * Allows inspection of page header fields of a raw page
>  */
>
> PG_FUNCTION_INFO_V1(page_checksum);
>
> Datum
> page_checksum(PG_FUNCTION_ARGS)
>

True, will fix.

> 2) It seems like you have choosen wrong datatype for page_checksum. I
> am getting negative checksum value when trying to run below query. I
> think the return type for the SQL function page_checksum should be
> 'integer' instead of 'smallint'.
>
> postgres=# SELECT page_checksum(get_raw_page('pg_class', 0), 100);
>  page_checksum
> ---------------
>         -19532
> (1 row)
>
> Above problem also exist in case of page_header. I am facing similar
> problem with page_header as well.
>
> postgres=# SELECT page_header(get_raw_page('pg_class', 0));
>                  page_header
> ---------------------------------------------
>  (0/2891538,-28949,1,220,7216,8192,8192,4,0)
> (1 row)
>

No. This is consistent with page_header() which is also using smallint 
for the checksum value.

>
> 3) Any reason for not marking bt_page_items or page_checksum as PARALLEL_SAFE.
>

No, not really. It's an oversight.

> 4) Error messages in new bt_page_items are not consistent with old
> bt_page_items. For eg. if i pass meta page blocknumber as input to
> bt_page_items the error message thrown by new and old bt_page_items
> are different.
>
> postgres=# SELECT * FROM bt_page_items('btree_index', 0) LIMIT 1;
> ERROR:  block 0 is a meta page
>
> postgres=# SELECT * FROM bt_page_items(get_raw_page('btree_index', 0)) LIMIT 1;
> ERROR:  block is a meta page
>

Well, the new function does not actually know the block number, so how 
could it include it in the error message? You only get the page itself, 
and it might be read from anywhere. Granted, the meta page should only 
be stored in block 0, but when the storage mixes up the pages somehow, 
that's not reliable.

>
> postgres=# SELECT * FROM bt_page_items(get_raw_page('btree_index',
> 1024)) LIMIT 1;
> ERROR:  block number 1024 is out of range for relation "btree_index"
>
>
> postgres=# SELECT * FROM bt_page_items('btree_index', 1024) LIMIT 1;
> ERROR:  block number out of range
>

Well, that error message does not come from the new function, it comes 
from get_raw_page(), so I'm not sure what am I supposed to do with that? 
Similarly to the previous case, the new function does not actually know 
the block number at all.

> 5) Code duplication in bt_page_items() and bt_page_items_bytea() needs
> to be handled.
>

Yes. If we adopt the approach proposed by Peter Eisentraut (redirecting 
the old bt_page_items using a SQL function calling the new one), it will 
also make the error messages consistent.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] PATCH: pageinspect / add page_checksum and bt_page_items(bytea)

From
Ashutosh Sharma
Date:
Hi,

>
>> 2) It seems like you have choosen wrong datatype for page_checksum. I
>> am getting negative checksum value when trying to run below query. I
>> think the return type for the SQL function page_checksum should be
>> 'integer' instead of 'smallint'.
>>
>> postgres=# SELECT page_checksum(get_raw_page('pg_class', 0), 100);
>>  page_checksum
>> ---------------
>>         -19532
>> (1 row)
>>
>> Above problem also exist in case of page_header. I am facing similar
>> problem with page_header as well.
>>
>> postgres=# SELECT page_header(get_raw_page('pg_class', 0));
>>                  page_header
>> ---------------------------------------------
>>  (0/2891538,-28949,1,220,7216,8192,8192,4,0)
>> (1 row)
>>
>
> No. This is consistent with page_header() which is also using smallint for
> the checksum value.
>

Yes. But, as i said earlier I am getting negative checksum value for
page_header as well. Isn't that wrong. For eg. When I debug the
following query, i could pd_checksum value as '40074' in gdb where
page_header shows it as '-25462'.

SELECT page_header(get_raw_page('pg_class', 0));

(gdb) p    page->pd_checksum
$2 = 40074

postgres=# SELECT page_header(get_raw_page('pg_class', 0));                page_header
---------------------------------------------(0/304EDE0,-25462,1,220,7432,8192,8192,4,0)
(1 row)

I think pd_checksum in PageHeaderData is defined as uint16 (0 to
65,535) whereas in SQL function for page_header it is defined as
smallint (-32768 to +32767).

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com



Re: [HACKERS] PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
Alvaro Herrera
Date:
Ashutosh Sharma wrote:

> Yes. But, as i said earlier I am getting negative checksum value for
> page_header as well. Isn't that wrong. For eg. When I debug the
> following query, i could pd_checksum value as '40074' in gdb where
> page_header shows it as '-25462'.

Yes; the point is that this is a pre-existing problem, not one being
introduced with this patch.  In SQL we don't have unsigned types, and we
have historically chosen a type with the same width instead of one with
enough width to represent all possible unsigned values.  IMO this patch
is doing the right thing; if in the future we want to change that
decision, that's fine but it's a separate patch.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] PATCH: pageinspect / add page_checksum and bt_page_items(bytea)

From
Ashutosh Sharma
Date:
On Mar 14, 2017 5:37 PM, "Alvaro Herrera" <alvherre@2ndquadrant.com> wrote:
Ashutosh Sharma wrote:

> Yes. But, as i said earlier I am getting negative checksum value for
> page_header as well. Isn't that wrong. For eg. When I debug the
> following query, i could pd_checksum value as '40074' in gdb where
> page_header shows it as '-25462'.

Yes; the point is that this is a pre-existing problem, not one being
introduced with this patch.  In SQL we don't have unsigned types, and we
have historically chosen a type with the same width instead of one with
enough width to represent all possible unsigned values.  IMO this patch
is doing the right thing; if in the future we want to change that
decision, that's fine but it's a separate patch

Okay, understood . Thank you.

With Regards,
Ashutosh Sharma

Re: [HACKERS] PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
Peter Eisentraut
Date:
I have committed the page_checksum function, will work on the bt stuff next.

I left in the superuser check, because I was not confident how well
pg_checksum_page() would handle messed up data.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
Peter Eisentraut
Date:
I'm struggling to find a good way to share code between
bt_page_items(text, int4) and bt_page_items(bytea).

If we do it via the SQL route, as I had suggested, it makes the
extension non-relocatable, and it will also create a bit of a mess
during upgrades.

If doing it in C, it will be a bit tricky to pass the SRF context
around.  There is no "DirectFunctionCall within SRF context", AFAICT.

I'm half tempted to just rip out the (text, int4) variants.

In any case, I think we should add bytea variants to all the btree
functions, not just the bt_page_items one.

Attached is the remaining patch after the previous commits.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
Tomas Vondra
Date:
Hi,

On 03/17/2017 05:23 PM, Peter Eisentraut wrote:
> I'm struggling to find a good way to share code between
> bt_page_items(text, int4) and bt_page_items(bytea).
>
> If we do it via the SQL route, as I had suggested, it makes the
> extension non-relocatable, and it will also create a bit of a mess
> during upgrades.
>
> If doing it in C, it will be a bit tricky to pass the SRF context
> around.  There is no "DirectFunctionCall within SRF context", AFAICT.
>

Not sure what it has to do with DirectFunctionCall? You want to call the 
bytea variant from the existing one? Wouldn't it be easier to simply 
define a static function with the shared parts, and pass around the 
fctx/fcinfo? Not quite pretty, but should work.
>
> I'm half tempted to just rip out the (text, int4) variants.
>

Perhaps. I see pageinspect as a tool for ad-hoc investigations, and I 
can't really imagine it being hard-wired into something.
>
> In any case, I think we should add bytea variants to all the btree
> functions, not just the bt_page_items one.
>

I agree, but I think we need to find a way to share the code between the 
text/bytea variants. Unless we rip the text ones out, obviously.

Thanks for the work on the patch, BTW.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
Peter Eisentraut
Date:
On 3/17/17 18:35, Tomas Vondra wrote:
> On 03/17/2017 05:23 PM, Peter Eisentraut wrote:
>> I'm struggling to find a good way to share code between
>> bt_page_items(text, int4) and bt_page_items(bytea).
>>
>> If we do it via the SQL route, as I had suggested, it makes the
>> extension non-relocatable, and it will also create a bit of a mess
>> during upgrades.
>>
>> If doing it in C, it will be a bit tricky to pass the SRF context
>> around.  There is no "DirectFunctionCall within SRF context", AFAICT.
> 
> Not sure what it has to do with DirectFunctionCall? You want to call the 
> bytea variant from the existing one? Wouldn't it be easier to simply 
> define a static function with the shared parts, and pass around the 
> fctx/fcinfo? Not quite pretty, but should work.

Perhaps what was added in
<http://git.postgresql.org/pg/commitdiff/29bf5016835a2c2c23786f7cda347716c083d95f>
would actually work here.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
David Steele
Date:
On 3/23/17 11:27 PM, Peter Eisentraut wrote:
> On 3/17/17 18:35, Tomas Vondra wrote:
>> On 03/17/2017 05:23 PM, Peter Eisentraut wrote:
>>> I'm struggling to find a good way to share code between
>>> bt_page_items(text, int4) and bt_page_items(bytea).
>>>
>>> If we do it via the SQL route, as I had suggested, it makes the
>>> extension non-relocatable, and it will also create a bit of a mess
>>> during upgrades.
>>>
>>> If doing it in C, it will be a bit tricky to pass the SRF context
>>> around.  There is no "DirectFunctionCall within SRF context", AFAICT.
>>
>> Not sure what it has to do with DirectFunctionCall? You want to call the
>> bytea variant from the existing one? Wouldn't it be easier to simply
>> define a static function with the shared parts, and pass around the
>> fctx/fcinfo? Not quite pretty, but should work.
>
> Perhaps what was added in
> <http://git.postgresql.org/pg/commitdiff/29bf5016835a2c2c23786f7cda347716c083d95f>
> would actually work here.

This thread has been idle for five days.  Please respond with a new 
patch by 2017-03-30 00:00 AoE (UTC-12) or this submission will be marked 
"Returned with Feedback".

-- 
-David
david@pgmasters.net



Re: PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
Tomas Vondra
Date:

On 03/24/2017 04:27 AM, Peter Eisentraut wrote:
> On 3/17/17 18:35, Tomas Vondra wrote:
>> On 03/17/2017 05:23 PM, Peter Eisentraut wrote:
>>> I'm struggling to find a good way to share code between
>>> bt_page_items(text, int4) and bt_page_items(bytea).
>>>
>>> If we do it via the SQL route, as I had suggested, it makes the
>>> extension non-relocatable, and it will also create a bit of a mess
>>> during upgrades.
>>>
>>> If doing it in C, it will be a bit tricky to pass the SRF context
>>> around.  There is no "DirectFunctionCall within SRF context", AFAICT.
>>
>> Not sure what it has to do with DirectFunctionCall? You want to call the
>> bytea variant from the existing one? Wouldn't it be easier to simply
>> define a static function with the shared parts, and pass around the
>> fctx/fcinfo? Not quite pretty, but should work.
> 
> Perhaps what was added in
> <http://git.postgresql.org/pg/commitdiff/29bf5016835a2c2c23786f7cda347716c083d95f>
> would actually work here.
>

I've tried to refactor the code using this, but the result was rather 
ugly because (a) it really is quite tricky to pass around the contexts 
and (b) the sanity checks are quite different for the two input types, 
so mixing those parts (essentially the SRF_IS_FIRSTCALL bits) does not 
make much sense IMHO.

The attached patch is the best I came up with - it essentially shares 
just the tuple-forming part, which is exactly the same in both cases.

It also adds the P_ISMETA(opaque) check to the original function, which 
seems like a useful defense against page written to a different place 
(which is essentially the issue I was originally investigating).

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
David Steele
Date:
On 3/29/17 11:08 AM, Tomas Vondra wrote:
>
> The attached patch is the best I came up with - it essentially shares
> just the tuple-forming part, which is exactly the same in both cases.

I have marked this submission "Needs Review".

-- 
-David
david@pgmasters.net



Re: PATCH: pageinspect / add page_checksum and bt_page_items(bytea)

From
Ashutosh Sharma
Date:
Hi,

On Wed, Mar 29, 2017 at 8:38 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>
>
> On 03/24/2017 04:27 AM, Peter Eisentraut wrote:
>>
>> On 3/17/17 18:35, Tomas Vondra wrote:
>>>
>>> On 03/17/2017 05:23 PM, Peter Eisentraut wrote:
>>>>
>>>> I'm struggling to find a good way to share code between
>>>> bt_page_items(text, int4) and bt_page_items(bytea).
>>>>
>>>> If we do it via the SQL route, as I had suggested, it makes the
>>>> extension non-relocatable, and it will also create a bit of a mess
>>>> during upgrades.
>>>>
>>>> If doing it in C, it will be a bit tricky to pass the SRF context
>>>> around.  There is no "DirectFunctionCall within SRF context", AFAICT.
>>>
>>>
>>> Not sure what it has to do with DirectFunctionCall? You want to call the
>>> bytea variant from the existing one? Wouldn't it be easier to simply
>>> define a static function with the shared parts, and pass around the
>>> fctx/fcinfo? Not quite pretty, but should work.
>>
>>
>> Perhaps what was added in
>>
>> <http://git.postgresql.org/pg/commitdiff/29bf5016835a2c2c23786f7cda347716c083d95f>
>> would actually work here.
>>
>
> I've tried to refactor the code using this, but the result was rather ugly
> because (a) it really is quite tricky to pass around the contexts and (b)
> the sanity checks are quite different for the two input types, so mixing
> those parts (essentially the SRF_IS_FIRSTCALL bits) does not make much sense
> IMHO.
>
> The attached patch is the best I came up with - it essentially shares just
> the tuple-forming part, which is exactly the same in both cases.
>
> It also adds the P_ISMETA(opaque) check to the original function, which
> seems like a useful defense against page written to a different place (which
> is essentially the issue I was originally investigating).
>

It seems like the latest patch(v4) shared by Tomas upthread is an
empty patch. If I am not wrong, please share the correct patch.
Thanks.

-- 
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com



Re: PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
Tomas Vondra
Date:
On 03/31/2017 06:01 PM, Ashutosh Sharma wrote:
> 
> It seems like the latest patch(v4) shared by Tomas upthread is an
> empty patch. If I am not wrong, please share the correct patch.
> Thanks.
> 

OMG, this is the second time I managed to generate an empty patch. I 
really need to learn not to do that ..

Attached is v5 of the patch, hopefully correct this time ;-)

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: PATCH: pageinspect / add page_checksum and bt_page_items(bytea)

From
Ashutosh Sharma
Date:
Hi Tomas,

On Fri, Mar 31, 2017 at 11:05 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> On 03/31/2017 06:01 PM, Ashutosh Sharma wrote:
>>
>>
>> It seems like the latest patch(v4) shared by Tomas upthread is an
>> empty patch. If I am not wrong, please share the correct patch.
>> Thanks.
>>
>
> OMG, this is the second time I managed to generate an empty patch. I really
> need to learn not to do that ..
>
> Attached is v5 of the patch, hopefully correct this time ;-)

Yes, it was correct. I have spent some time on reviewing your patch today and the review comments are as follows.

1. It would be good to have a blank line in between following set of lines describing bt_page_print_tuples.

+/*
+ * bt_page_print_tuples
+ * form a tuple describing index tuple at a given offset
+ */


Something like this,

+/* ---------------------------------------------------------------------
+ * bt_page_print_tuples()
+ *
+ * Form a tuple describing index tuple at a given offset
+ * ---------------------------------------------------------------------
+ */

Please note that, if you do not agree with my suggestion, you may ignore it :)

2. I think the following comments for bt_page_items needs to be moved to the right place in the code.

/*-------------------------------------------------------------------------
 * bt_page_items()
 *
 * Get IndexTupleData set in a btree page
 *
 * Usage: SELECT * FROM bt_page_items('t1_pkey', 1);
 *-------------------------------------------------------------------------
 */

/*
 * cross-call data structure for SRF
 */
struct user_args
{
    Page        page;
    OffsetNumber offset;
};


With your patch, above structure definition is followed by the definition for bt_page_print_tuples() not bt_page_items(), hence you may need to put the comments for bt_page_items just above it's definition.

3. I am seeing a server crash when running the sql statement 'SELECT * FROM bt_page_items('bt_i4_index', 1);'. Here is the backtrace.

I think this crash is happening because in bt_page_items, without reading opaque region of a page, you have added this if statement,

        if (P_ISMETA(opaque))
            ereport(ERROR,
                    (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                  errmsg("block is a meta page")));


Please pull back above if statement below the following line to get rid of this crash.

opaque = (BTPageOpaque) PageGetSpecialPointer(uargs->page);

OR, the best thing to do would be to retain the earlier if statement for checking meta page because that way you can avoid reading a buffer unnecessarily.

#0  0x00007f592975030c in bt_page_items (fcinfo=0x7fffe5fdb860) at btreefuncs.c:352
352 if (P_ISMETA(opaque))
Missing separate debuginfos, use: debuginfo-install glibc-2.17-78.el7.x86_64
(gdb) bt
#0  0x00007f592975030c in bt_page_items (fcinfo=0x7fffe5fdb860) at btreefuncs.c:352
#1  0x00000000006797e0 in ExecMakeTableFunctionResult (setexpr=0x21269f8, econtext=0x2126738, argContext=0x2136c98, expectedDesc=0x2155178,
    randomAccess=0 '\000') at execSRF.c:230
#2  0x0000000000689dda in FunctionNext (node=0x2126628) at nodeFunctionscan.c:94
#3  0x0000000000678f0e in ExecScanFetch (node=0x2126628, accessMtd=0x689d23 <FunctionNext>, recheckMtd=0x68a108 <FunctionRecheck>) at execScan.c:95
#4  0x0000000000678f7d in ExecScan (node=0x2126628, accessMtd=0x689d23 <FunctionNext>, recheckMtd=0x68a108 <FunctionRecheck>) at execScan.c:143

4. If you agree with the reason mentioned by me in comment #3, please remove the following if statement from bt_page_items(),

        if (P_ISMETA(opaque))
            ereport(ERROR,
                    (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                  errmsg("block is a meta page")));


Apart from above comments, your patch looks good to me. I have also marked this patch as 'Waiting for Author' in the commitfest. Thanks.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

Re: PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
David Steele
Date:
Hi Tomas,

On 4/1/17 5:40 AM, Ashutosh Sharma wrote:
> 
> Apart from above comments, your patch looks good to me. I have also
> marked this patch as 'Waiting for Author' in the commitfest. Thanks.

The CF has been extended until April 7 but time is still growing short
and this thread has been idle for three days.  Please respond with a new
patch by 2017-04-05 00:00 AoE (UTC-12) or this submission will be marked
"Returned with Feedback".

Thank,
-- 
-David
david@pgmasters.net



Re: Re: PATCH: pageinspect / add page_checksum and bt_page_items(bytea)

From
Simon Riggs
Date:
On 4 April 2017 at 09:05, David Steele <david@pgmasters.net> wrote:
> Hi Tomas,
>
> On 4/1/17 5:40 AM, Ashutosh Sharma wrote:
>>
>> Apart from above comments, your patch looks good to me. I have also
>> marked this patch as 'Waiting for Author' in the commitfest. Thanks.
>
> The CF has been extended until April 7 but time is still growing short
> and this thread has been idle for three days.  Please respond with a new
> patch by 2017-04-05 00:00 AoE (UTC-12) or this submission will be marked
> "Returned with Feedback".

We know time is growing short. Is it necessary to say that on every
single thread? What benefit do we get from that?

We voted to extend the time until the deadline. In some cases we may
well take it all the way to the deadline, so fighting you or others
trying to reject things *before* the deadline *because* of the
deadline seems weird.

How about we just leave everything until the deadline, then apply the
sword swiftly to anything that remains?

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Re: PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
David Steele
Date:
On 4/4/17 9:11 AM, Simon Riggs wrote:
> On 4 April 2017 at 09:05, David Steele <david@pgmasters.net> wrote:
>> Hi Tomas,
>>
>> On 4/1/17 5:40 AM, Ashutosh Sharma wrote:
>>>
>>> Apart from above comments, your patch looks good to me. I have also
>>> marked this patch as 'Waiting for Author' in the commitfest. Thanks.
>>
>> The CF has been extended until April 7 but time is still growing short
>> and this thread has been idle for three days.  Please respond with a new
>> patch by 2017-04-05 00:00 AoE (UTC-12) or this submission will be marked
>> "Returned with Feedback".
> 
> We know time is growing short. Is it necessary to say that on every
> single thread? What benefit do we get from that?

Hackers has become such a fire hose that I don't take it for granted
that all messages are read by everyone.  This may be news to some
authors, especially those that have not responded on a thread since
March 31.

> We voted to extend the time until the deadline. In some cases we may
> well take it all the way to the deadline, so fighting you or others
> trying to reject things *before* the deadline *because* of the
> deadline seems weird.

My goal is to help people focus on patches that have a chance.  At this
point I think that includes poking authors who are not being responsive
using the limited means at my disposal.

> How about we just leave everything until the deadline, then apply the
> sword swiftly to anything that remains?

I didn't take the extension to mean that I should stop CF management.
However, If you think I have exceeded my mandate over the course of the
CF, then I'm happy to discuss that.

Regards,
-- 
-David
david@pgmasters.net



Re: Re: PATCH: pageinspect / add page_checksum and bt_page_items(bytea)

From
Robert Haas
Date:
On Tue, Apr 4, 2017 at 9:32 AM, David Steele <david@pgmasters.net> wrote:
> My goal is to help people focus on patches that have a chance.  At this
> point I think that includes poking authors who are not being responsive
> using the limited means at my disposal.

+1.  Pings on specific threads can help clear things up when, for
example, the author and reviewer are each waiting for the other.  And,
as you say, they also help avoid the situation where a patch just
falls off the radar and misses the release for no especially good
reason, which naturally causes people frustration.

I think your pings have been quite helpful, although I think it would
have been better in some cases if you'd done them sooner.  Pinging
after a week, with a 3 day deadline, when there are only a few days
left in the CommitFest isn't really leaving a lot of room to maneuver.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Re: PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
David Steele
Date:
On 4/4/17 9:43 AM, Robert Haas wrote:
> On Tue, Apr 4, 2017 at 9:32 AM, David Steele <david@pgmasters.net> wrote:
>> My goal is to help people focus on patches that have a chance.  At this
>> point I think that includes poking authors who are not being responsive
>> using the limited means at my disposal.
> 
> +1.  Pings on specific threads can help clear things up when, for
> example, the author and reviewer are each waiting for the other.  And,
> as you say, they also help avoid the situation where a patch just
> falls off the radar and misses the release for no especially good
> reason, which naturally causes people frustration.
> 
> I think your pings have been quite helpful, although I think it would
> have been better in some cases if you'd done them sooner.  Pinging
> after a week, with a 3 day deadline, when there are only a few days
> left in the CommitFest isn't really leaving a lot of room to maneuver.

Thanks for the feedback!  My thinking is that I don't want to bug people
too soon, but there's a maximum number of days a thread should be idle.
Over the course of the CF that has gone from 10 days, to 7, 5, and 3.

I don't look at all patches every day so it can be a bit uneven, i.e.,
all patches are allowed certain amount of idle time, but some may get a
bit more depending on when I check up on them.

Thanks,
-- 
-David
david@pgmasters.net



Re: Re: PATCH: pageinspect / add page_checksum and bt_page_items(bytea)

From
Ashutosh Sharma
Date:
Hi,

As I am not seeing any response from Tomas for last 2-3 days and since
the commit-fest is coming towards end, I have planned to work on the
review comments that I had given few days back and submit the updated
patch. PFA new version of patch that takes care of all the review
comments given by me. I have also ran pgindent on btreefuncs.c file
and have run some basic test cases. All looks fine to me now!

Please note that this patch still belongs to Tomas not me. I still
remain the reviewer of this patch. The same thing has been very
clearly mentioned in the attached patch. The only intention behind
Ashutosh (reviewer) working on this patch is to ensure that we do not
miss the things that can easily get committed in this commit-fest.
Thanks.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

On Tue, Apr 4, 2017 at 7:23 PM, David Steele <david@pgmasters.net> wrote:
> On 4/4/17 9:43 AM, Robert Haas wrote:
>> On Tue, Apr 4, 2017 at 9:32 AM, David Steele <david@pgmasters.net> wrote:
>>> My goal is to help people focus on patches that have a chance.  At this
>>> point I think that includes poking authors who are not being responsive
>>> using the limited means at my disposal.
>>
>> +1.  Pings on specific threads can help clear things up when, for
>> example, the author and reviewer are each waiting for the other.  And,
>> as you say, they also help avoid the situation where a patch just
>> falls off the radar and misses the release for no especially good
>> reason, which naturally causes people frustration.
>>
>> I think your pings have been quite helpful, although I think it would
>> have been better in some cases if you'd done them sooner.  Pinging
>> after a week, with a 3 day deadline, when there are only a few days
>> left in the CommitFest isn't really leaving a lot of room to maneuver.
>
> Thanks for the feedback!  My thinking is that I don't want to bug people
> too soon, but there's a maximum number of days a thread should be idle.
> Over the course of the CF that has gone from 10 days, to 7, 5, and 3.
>
> I don't look at all patches every day so it can be a bit uneven, i.e.,
> all patches are allowed certain amount of idle time, but some may get a
> bit more depending on when I check up on them.
>
> Thanks,
> --
> -David
> david@pgmasters.net

Attachment

Re: Re: PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
Tomas Vondra
Date:
Thanks. I planned to look into this today, but you've been faster ;-)

regards
Tomas

On 04/04/2017 06:55 PM, Ashutosh Sharma wrote:
> Hi,
>
> As I am not seeing any response from Tomas for last 2-3 days and since
> the commit-fest is coming towards end, I have planned to work on the
> review comments that I had given few days back and submit the updated
> patch. PFA new version of patch that takes care of all the review
> comments given by me. I have also ran pgindent on btreefuncs.c file
> and have run some basic test cases. All looks fine to me now!
>
> Please note that this patch still belongs to Tomas not me. I still
> remain the reviewer of this patch. The same thing has been very
> clearly mentioned in the attached patch. The only intention behind
> Ashutosh (reviewer) working on this patch is to ensure that we do not
> miss the things that can easily get committed in this commit-fest.
> Thanks.
>

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Re: PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
David Steele
Date:
On 4/4/17 12:55 PM, Ashutosh Sharma wrote:
> 
> As I am not seeing any response from Tomas for last 2-3 days and since
> the commit-fest is coming towards end, I have planned to work on the
> review comments that I had given few days back and submit the updated
> patch. PFA new version of patch that takes care of all the review
> comments given by me. I have also ran pgindent on btreefuncs.c file
> and have run some basic test cases. All looks fine to me now!

Awesome, Ashutosh!

-- 
-David
david@pgmasters.net



Re: Re: PATCH: pageinspect / add page_checksum andbt_page_items(bytea)

From
Peter Eisentraut
Date:
On 4/4/17 12:55, Ashutosh Sharma wrote:
> As I am not seeing any response from Tomas for last 2-3 days and since
> the commit-fest is coming towards end, I have planned to work on the
> review comments that I had given few days back and submit the updated
> patch. PFA new version of patch that takes care of all the review
> comments given by me. I have also ran pgindent on btreefuncs.c file
> and have run some basic test cases. All looks fine to me now!

committed

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services