Thread: pageinspect patch, for showing tuple data

pageinspect patch, for showing tuple data

From
Nikolay Shaplov
Date:
Hi!

I've created a patch for pageinspect that allows to see data stored in the
tuple.

This patch has two main purposes:

1. Practical: Make manual DB recovery more simple
2. Educational: Seeing what data is actually stored in tuple, allows to get
better understanding of how does postgres actually works.

This patch adds several new functions, available from SQL queries. All these
functions are based on heap_page_items, but accept slightly different
arguments and has one additional column at the result set:

heap_page_tuples - accepts relation name, and bulkno, and returns usual
heap_page_items set with additional column that contain snapshot of tuple data
area stored in bytea.

heap_page_tuples_attributes - same as heap_page_tuples, but instead of single
tuple data bytea snapshot, it has array of bytea values, that were splitted
into attributes as they would be spitted by nocachegetattr function (I
actually reimplemented this function main algorithm to get this done)

heap_page_tuples_attrs_detoasted - same as heap_page_tuples_attrs, but all
varlen attributes values that were compressed or TOASTed, are replaced with
unTOASTed and uncompressed values.


There is also one strange function: _heap_page_items it is useless for
practical purposes. As heap_page_items it accepts page data as bytea, but also
get a relation name. It tries to apply tuple descriptor of that relation to
that page data.
This would allow you to try to read page data from one table using tuple
descriptor from anther. A strange idea, one should say. But this will allow
you: a) See how wrong data can be interpreted (educational purpose).
b) I have plenty of sanity check while reading parsing that tuple, for this
function I've changed error level from ERROR to WARNING. This function will
allow to write proper tests that all these checks work as they were designed
(I hope to write these tests sooner or later)

I've also added raw tuple data output to original heap_page_items function,
thought I am not sure if it is good idea. I just can add it there so I did it.
May be it would be better to change it back for better backward compatibility.

Attached patched is in "almost ready" state. It has some formatting issues.
I'd like to hear HACKER's opinion before finishing it and sending to
commitfest.

--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachment

Re: pageinspect patch, for showing tuple data

From
Michael Paquier
Date:
On Mon, Aug 3, 2015 at 1:03 AM, Nikolay Shaplov
<n.shaplov@postgrespro.ru> wrote:
> Hi!
>
> I've created a patch for pageinspect that allows to see data stored in the
> tuple.
>
> This patch has two main purposes:
>
> 1. Practical: Make manual DB recovery more simple

To what are you referring to in this case? Manual manipulation of
on-disk data manually?

> b) I have plenty of sanity check while reading parsing that tuple, for this
> function I've changed error level from ERROR to WARNING. This function will
> allow to write proper tests that all these checks work as they were designed
> (I hope to write these tests sooner or later)

+                ereport(inter_call_data->error_level,
+                        (errcode(ERRCODE_DATA_CORRUPTED),
+                            errmsg("Data corruption: Iterating over
tuple data reached out of actual tuple size")));
I don't think that the possibility to raise a WARNING is a good thing
in those code paths. If data is corrupted this may crash, and I am not
sure that anybody would want that even for educational purposes.
-- 
Michael



Re: pageinspect patch, for showing tuple data

From
Nikolay Shaplov
Date:
В письме от 3 августа 2015 14:30:46 пользователь Michael Paquier написал:
> On Mon, Aug 3, 2015 at 1:03 AM, Nikolay Shaplov
>
> <n.shaplov@postgrespro.ru> wrote:
> > Hi!
> >
> > I've created a patch for pageinspect that allows to see data stored in the
> > tuple.
> >
> > This patch has two main purposes:
> >
> > 1. Practical: Make manual DB recovery more simple
>
> To what are you referring to in this case? Manual manipulation of
> on-disk data manually?
Yes, when DB is broken for example

>
> > b) I have plenty of sanity check while reading parsing that tuple, for
> > this
> > function I've changed error level from ERROR to WARNING. This function
> > will
> > allow to write proper tests that all these checks work as they were
> > designed (I hope to write these tests sooner or later)
>
> +                ereport(inter_call_data->error_level,
> +                        (errcode(ERRCODE_DATA_CORRUPTED),
> +                            errmsg("Data corruption: Iterating over
> tuple data reached out of actual tuple size")));
> I don't think that the possibility to raise a WARNING is a good thing
> in those code paths. If data is corrupted this may crash, and I am not
> sure that anybody would want that even for educational purposes.
Hm... I considered _heap_page_items really very dangerous function, with big
red "Do not call it if you not sure what are you doing" warning. Reading data
with not proper attribute descriptor is dangerous any way. But when I wrote
that code, I did not have that checks at first, and it was really interesting
to subst one data with another and see what will happen. And I thought that
may be other explorers will like to do the same. And it is really possible
only in warning mode. So I left warnings only in _heap_page_items, and set
errors for all other cases.


--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: pageinspect patch, for showing tuple data

From
Alvaro Herrera
Date:
Nikolay Shaplov wrote:

> This patch adds several new functions, available from SQL queries. All these 
> functions are based on heap_page_items, but accept slightly different 
> arguments and has one additional column at the result set:
> 
> heap_page_tuples - accepts relation name, and bulkno, and returns usual 
> heap_page_items set with additional column that contain snapshot of tuple data 
> area stored in bytea.

I think the API around get_raw_page is more useful generally.  You might
have table blocks stored in a bytea column for instance, because you
extracted from some remote server and inserted into a local table for
examination.  If you only accept relname/blkno, there's no way to
examine data other than blocks directly in the database dir, which is
limiting.

> There is also one strange function: _heap_page_items it is useless for 
> practical purposes. As heap_page_items it accepts page data as bytea, but also 
> get a relation name. It tries to apply tuple descriptor of that relation to 
> that page data. 

For BRIN, I added something similar (brin_page_items), but it receives
the index OID rather than name, and constructs a tuple descriptor to
read the data.  I think OID is better than name generally.  (You can
cast the relation name to regclass).

It's easy to misuse, but these functions are superuser-only, so there
should be no security issue at least.  The possibility of a crash
remains real, though, so maybe we should try to come up with a way to
harden that.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pageinspect patch, for showing tuple data

From
Nikolay Shaplov
Date:
В письме от 3 августа 2015 15:35:23 Вы написали:
> Nikolay Shaplov wrote:
> > This patch adds several new functions, available from SQL queries. All
> > these functions are based on heap_page_items, but accept slightly
> > different arguments and has one additional column at the result set:
> >
> > heap_page_tuples - accepts relation name, and bulkno, and returns usual
> > heap_page_items set with additional column that contain snapshot of tuple
> > data area stored in bytea.
>
> I think the API around get_raw_page is more useful generally.  You might
> have table blocks stored in a bytea column for instance, because you
> extracted from some remote server and inserted into a local table for
> examination.  If you only accept relname/blkno, there's no way to
> examine data other than blocks directly in the database dir, which is
> limiting.
>
> > There is also one strange function: _heap_page_items it is useless for
> > practical purposes. As heap_page_items it accepts page data as bytea, but
> > also get a relation name. It tries to apply tuple descriptor of that
> > relation to that page data.
>
> For BRIN, I added something similar (brin_page_items), but it receives
> the index OID rather than name, and constructs a tuple descriptor to
> read the data.  I think OID is better than name generally.  (You can
> cast the relation name to regclass).
>
> It's easy to misuse, but these functions are superuser-only, so there
> should be no security issue at least.  The possibility of a crash
> remains real, though, so maybe we should try to come up with a way to
> harden that.

Hmm... may be you are right. I'll try to change it would take raw page and
OID.

--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: pageinspect patch, for showing tuple data

From
Nikolay Shaplov
Date:
В письме от 3 августа 2015 15:35:23 пользователь Alvaro Herrera написал:
> Nikolay Shaplov wrote:
> > This patch adds several new functions, available from SQL queries. All
> > these functions are based on heap_page_items, but accept slightly
> > different arguments and has one additional column at the result set:
> >
> > heap_page_tuples - accepts relation name, and bulkno, and returns usual
> > heap_page_items set with additional column that contain snapshot of tuple
> > data area stored in bytea.
>
> I think the API around get_raw_page is more useful generally.  You might
> have table blocks stored in a bytea column for instance, because you
> extracted from some remote server and inserted into a local table for
> examination.  If you only accept relname/blkno, there's no way to
> examine data other than blocks directly in the database dir, which is
> limiting.
>
> > There is also one strange function: _heap_page_items it is useless for
> > practical purposes. As heap_page_items it accepts page data as bytea, but
> > also get a relation name. It tries to apply tuple descriptor of that
> > relation to that page data.
>
> For BRIN, I added something similar (brin_page_items), but it receives
> the index OID rather than name, and constructs a tuple descriptor to
> read the data.  I think OID is better than name generally.  (You can
> cast the relation name to regclass).
>
> It's easy to misuse, but these functions are superuser-only, so there
> should be no security issue at least.  The possibility of a crash
> remains real, though, so maybe we should try to come up with a way to
> harden that.

I've done as you've said: Now patch adds two functions heap_page_item_attrs
and heap_page_item_detoast_attrs and these function takes raw_page and
relation OID as arguments. They also have third optional parameter that allows
to change error mode from ERROR to WARNING.

Is it ok now?


--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachment

Re: pageinspect patch, for showing tuple data

From
Michael Paquier
Date:


On Wed, Sep 2, 2015 at 6:58 PM, Nikolay Shaplov <n.shaplov@postgrespro.ru> wrote:
В письме от 3 августа 2015 15:35:23 пользователь Alvaro Herrera написал:
> Nikolay Shaplov wrote:
> > This patch adds several new functions, available from SQL queries. All
> > these functions are based on heap_page_items, but accept slightly
> > different arguments and has one additional column at the result set:
> >
> > heap_page_tuples - accepts relation name, and bulkno, and returns usual
> > heap_page_items set with additional column that contain snapshot of tuple
> > data area stored in bytea.
>
> I think the API around get_raw_page is more useful generally.  You might
> have table blocks stored in a bytea column for instance, because you
> extracted from some remote server and inserted into a local table for
> examination.  If you only accept relname/blkno, there's no way to
> examine data other than blocks directly in the database dir, which is
> limiting.
>
> > There is also one strange function: _heap_page_items it is useless for
> > practical purposes. As heap_page_items it accepts page data as bytea, but
> > also get a relation name. It tries to apply tuple descriptor of that
> > relation to that page data.
>
> For BRIN, I added something similar (brin_page_items), but it receives
> the index OID rather than name, and constructs a tuple descriptor to
> read the data.  I think OID is better than name generally.  (You can
> cast the relation name to regclass).
>
> It's easy to misuse, but these functions are superuser-only, so there
> should be no security issue at least.  The possibility of a crash
> remains real, though, so maybe we should try to come up with a way to
> harden that.

I've done as you've said: Now patch adds two functions heap_page_item_attrs
and heap_page_item_detoast_attrs and these function takes raw_page and
relation OID as arguments. They also have third optional parameter that allows
to change error mode from ERROR to WARNING.

Is it ok now?

Yeah, I think that's acceptable to have a switch, defaulting to ERROR if caller specifies nothing.

Here are some observations after looking at the code, no functional testing.

+       int             error_mode = PG_NARGS()==3 ? PG_GETARG_BOOL(2) :NULL;

When handling additional arguments, it seems to me that it is more readable to use something like that:
if (PG_NARGS >= 3)
{
     arg = PG_GETARG_XXX(2);
     //etc.
}

+       error_mode = error_mode?WARNING:ERROR;

This generates warnings at compilation.

+       if (SRF_IS_FIRSTCALL() && (error_mode == WARNING))
+       {
+               ereport(WARNING,
+                       (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+                               (errmsg("Runnung heap_page_item_attrs in WARNING mode."))));
+       }

I am not convinced that this is necessary.

+               inter_call_data->raw_page_size = VARSIZE(raw_page) - VARHDRSZ;
+               if (inter_call_data->raw_page_size < SizeOfPageHeaderData)
Including raw_page_size in the status data does not seem necessary to me. The page data is already available for each loop so you could just extract it from it.

+               ereport(inter_call_data->error_level,
+                               (errcode(ERRCODE_DATA_CORRUPTED),
+                                       errmsg("Data corruption: number of attributes in tuple header is greater than number of attributes in tuple descripor")));
I'd rather switch the error reports related to data corruption from ereport to elog, which is more suited for internal errors, and it seems to me that it is the case here.

heapfuncs.c:370:7: warning: variable 'raw_attr' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
                if (!is_null)
                    ^~~~~~~~
heapfuncs.c:419:43: note: uninitialized use occurs here
                raw_attrs = accumArrayResult(raw_attrs, raw_attr, is_null, BYTEAOID, fctx->multi_call_memory_ctx);
                                                        ^~~~~~~~
heapfuncs.c:370:3: note: remove the 'if' if its condition is always true
                if (!is_null)
                ^~~~~~~~~~~~~
heapfuncs.c:357:17: note: initialize the variable 'raw_attr' to silence this warning
                Datum raw_attr;
My compiler complains about uninitialized variables.

+--
+-- _heap_page_items()
+--
+CREATE FUNCTION _heap_page_items(IN page bytea, IN relname text,
I am not sure why this is necessary and visibly it overlaps with the existing declaration of heap_page_items. The last argument is different though, and I recall that we decided not to use anymore the relation name specified as text, but its OID instead in this module.

Documentation is missing, that would be good to have to understand what each function is intended to do. Code has some whitespaces.
--
Michael

Re: pageinspect patch, for showing tuple data

From
Nikolay Shaplov
Date:
В письме от 4 сентября 2015 14:58:29 пользователь Michael Paquier написал:

> Yeah, I think that's acceptable to have a switch, defaulting to ERROR if
> caller specifies nothing.
>
> Here are some observations after looking at the code, no functional testing.
>
> +       int             error_mode = PG_NARGS()==3 ? PG_GETARG_BOOL(2)
>
> :NULL;
>
> When handling additional arguments, it seems to me that it is more readable
> to use something like that:
> if (PG_NARGS >= 3)
> {
>      arg = PG_GETARG_XXX(2);
>      //etc.
> }
>
> +       error_mode = error_mode?WARNING:ERROR;
>
> This generates warnings at compilation.
>
yeah, what I have done here is too complex with no reason. I've simplified this
code now.

> +       if (SRF_IS_FIRSTCALL() && (error_mode == WARNING))
> +       {
> +               ereport(WARNING,
> +                       (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> +                               (errmsg("Runnung heap_page_item_attrs in
> WARNING mode."))));
> +       }
>
> I am not convinced that this is necessary.
I've removed it.

>
> +               inter_call_data->raw_page_size = VARSIZE(raw_page) -
> VARHDRSZ;
> +               if (inter_call_data->raw_page_size < SizeOfPageHeaderData)
> Including raw_page_size in the status data does not seem necessary to me.
> The page data is already available for each loop so you could just extract
> it from it.
>
> +               ereport(inter_call_data->error_level,
> +                               (errcode(ERRCODE_DATA_CORRUPTED),
> +                                       errmsg("Data corruption: number of
> attributes in tuple header is greater than number of attributes in tuple
> descripor")));
> I'd rather switch the error reports related to data corruption from ereport
> to elog, which is more suited for internal errors, and it seems to me that
> it is the case here.
Hm... First, since we have proper error code, do not see why not give it.

Second, this is not exactly internal error, in some cases user tries to parse
corrupted data on purpose. So for user it is not an internal error, error from
deep below, but error on the level he is currently working at. So I would not
call it internal error.

> heapfuncs.c:370:7: warning: variable 'raw_attr' is used uninitialized
> whenever 'if' condition is false [-Wsometimes-uninitialized]
>                 if (!is_null)
>                     ^~~~~~~~
> heapfuncs.c:419:43: note: uninitialized use occurs here
>                 raw_attrs = accumArrayResult(raw_attrs, raw_attr, is_null,
> BYTEAOID, fctx->multi_call_memory_ctx);
>                                                         ^~~~~~~~
> heapfuncs.c:370:3: note: remove the 'if' if its condition is always true
>                 if (!is_null)
>                 ^~~~~~~~~~~~~
> heapfuncs.c:357:17: note: initialize the variable 'raw_attr' to silence
> this warning
>                 Datum raw_attr;
> My compiler complains about uninitialized variables.
Fixed it

> +--
> +-- _heap_page_items()
> +--
> +CREATE FUNCTION _heap_page_items(IN page bytea, IN relname text,
> I am not sure why this is necessary and visibly it overlaps with the
> existing declaration of heap_page_items. The last argument is different
> though, and I recall that we decided not to use anymore the relation name
> specified as text, but its OID instead in this module.
Oh! This should not go to the final patch :-( Sorry. Removed it.


> Documentation is missing, that would be good to have to understand what
> each function is intended to do.

I were going to add documentation when this patch is commited, or at least
approved for commit. So I would know for sure that function definition would
not change, so had not to rewrite it again and again.

So if it is possible I would write documentation and test when this patch is
already approved.

> Code has some whitespaces.
I've found and removed some. Hope that was all of them...


--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachment

Re: pageinspect patch, for showing tuple data

From
Michael Paquier
Date:


On Sat, Sep 5, 2015 at 1:05 AM, Nikolay Shaplov wrote:
В письме от 4 сентября 2015 14:58:29 пользователь Michael Paquier написал:
> Documentation is missing, that would be good to have to understand what
> each function is intended to do.

I were going to add documentation when this patch is commited, or at least
approved for commit. So I would know for sure that function definition would
not change, so had not to rewrite it again and again.

I doubt that this patch would be committed without documentation, this is a requirement for any patch.
 
So if it is possible I would write documentation and test when this patch is
already approved.

I'm fine with that. Let's discuss its shape and come up with an agreement. It would have been good to post test cases of what this stuff actually does though, people reviewing this thread are limited to guess on what is tried to be achieved just by reading the code. That's actually where documentation, even a draft of documentation helps a lot in the review to see if what is expected by the developer matches what the code actually does.

> Code has some whitespaces.
I've found and removed some. Hope that was all of them...

Yeah, it looks that you took completely rid of them.

In details, this patch introduces two independent concepts:
- add tuple data as a new bytea column of heap_page_items. This is indeed where it should be, and note that this is where the several corruption checks are done by checking the state of the tuple data.
- add heap_page_item_attrs and heap_page_item_detoast_attrs, which is very similar to heap_page_items, at the difference that it can take an OID to be able to use a TupleDesc and return a bytea array with the data of each attribute.

Honestly, heap_page_item_attrs and heap_page_item_detoast_attrs are way too similar to what heap_page_items does, leading to a code maze that is going to make future extensions more difficult, which is what lead to the refactoring your patch does.
Hence, instead of all those complications, why not simply introducing two functions that take as input the tuple data and the OID of the relation, returning those bytea arrays? It seems to be that this would be a more handy interface, and as this is for educational purposes I guess that the addition of the overhead induced by the extra function call would be acceptable.

Regards,
--
Michael

Re: pageinspect patch, for showing tuple data

From
Michael Paquier
Date:
On Tue, Sep 8, 2015 at 11:53 AM, Michael Paquier wrote:
> Honestly, heap_page_item_attrs and heap_page_item_detoast_attrs are way too
> similar to what heap_page_items does, leading to a code maze that is going
> to make future extensions more difficult, which is what lead to the
> refactoring your patch does.
> Hence, instead of all those complications, why not simply introducing two
> functions that take as input the tuple data and the OID of the relation,
> returning those bytea arrays? It seems to be that this would be a more handy
> interface, and as this is for educational purposes I guess that the addition
> of the overhead induced by the extra function call would be acceptable.

Actually not two functions, but just one, with an extra flag to be
able to enforce detoasting on the field values where this can be done.
-- 
Michael



Re: pageinspect patch, for showing tuple data

From
Nikolay Shaplov
Date:
В письме от 8 сентября 2015 11:53:24 Вы написали:
> On Sat, Sep 5, 2015 at 1:05 AM, Nikolay Shaplov wrote:
> > В письме от 4 сентября 2015 14:58:29 пользователь Michael Paquier написал:
> > > Documentation is missing, that would be good to have to understand what
> > > each function is intended to do.
> >
> > I were going to add documentation when this patch is commited, or at least
> > approved for commit. So I would know for sure that function definition
> > would
> > not change, so had not to rewrite it again and again.
>
> I doubt that this patch would be committed without documentation, this is a
> requirement for any patch.
Ok. I can do it.

> > So if it is possible I would write documentation and test when this patch
> > is
> > already approved.
>
> I'm fine with that. Let's discuss its shape and come up with an agreement.
> It would have been good to post test cases of what this stuff actually does
> though, people reviewing this thread are limited to guess on what is tried
> to be achieved just by reading the code.

To test non detoasted functions one should do:

CREATE EXTENSION pageinspect;

create table test (a int, b smallint,c varchar);
insert into test VALUES
(-1,2,'111'),
(2,null,'222'),
(3,8,'333'),
(null,16,null);

Then

# select * from heap_page_items(get_raw_page('test', 0));lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 |
t_ctid| t_infomask2 | t_infomask | t_hoff |  t_bits  | t_oid |         t_data          

----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+----------+-------+------------------------
1|   8152 |        1 |     34 |    763 |      0 |        0 | (0,1)  |           3 |       2050 |     24 |          |
  | \xffffffff020009313131 2 |   8120 |        1 |     32 |    763 |      0 |        0 | (0,2)  |           3 |
2051|     24 | 10100000 |       | \x0200000009323232 3 |   8080 |        1 |     34 |    763 |      0 |        0 |
(0,3) |           3 |       2050 |     24 |          |       | \x03000000080009333333 4 |   8048 |        1 |     26 |
 763 |      0 |        0 | (0,4)  |           3 |       2049 |     24 | 01000000 |       | \x1000 

or

# select * from heap_page_item_attrs(get_raw_page('test', 0),'test'::regclass);lp | lp_off | lp_flags | lp_len | t_xmin
|t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff |  t_bits  | t_oid |                 t_attrs
     

----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+----------+-------+-----------------------------------------
1|   8152 |        1 |     34 |    763 |      0 |        0 | (0,1)  |           3 |       2050 |     24 |          |
  | {"\\xffffffff","\\x0200","\\x09313131"} 2 |   8120 |        1 |     32 |    763 |      0 |        0 | (0,2)  |
    3 |       2051 |     24 | 10100000 |       | {"\\x02000000",NULL,"\\x09323232"} 3 |   8080 |        1 |     34 |
763|      0 |        0 | (0,3)  |           3 |       2050 |     24 |          |       |
{"\\x03000000","\\x0800","\\x09333333"}4 |   8048 |        1 |     26 |    763 |      0 |        0 | (0,4)  |
3|       2049 |     24 | 01000000 |       | {NULL,"\\x1000",NULL} 
(4 rows)

For detoasted function you should do something like this:

CREATE EXTENSION pageinspect;

create table test2 (c varchar);

insert into test2 VALUES ('++');
insert into test2 VALUES (repeat('+',2100));

Then  heap_page_item_attrs will show you compressed attr data:

# select * from heap_page_item_attrs(get_raw_page('test2', 0),'test2'::regclass);lp | lp_off | lp_flags | lp_len |
t_xmin| t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid |
    t_attrs                                     

----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+-------------------------------------------------------------------------------
1|   8160 |        1 |     27 |    768 |      0 |        0 | (0,1)  |           1 |       2050 |     24 |        |
| {"\\x072b2b"} 2 |   8096 |        1 |     59 |    769 |      0 |        0 | (0,2)  |           1 |       2050 |
24|        |       | {"\\x8e00000034080000fe2b0f01ff0f01ff0f01ff0f01ff0f01ff0f01ff0f01ff010f01aa"} 
(2 rows)

and heap_page_item_detoast_attrs will show you original data

# select * from heap_page_item_detoast_attrs(get_raw_page('test2', 0),'test2'::regclass);
[will not paste output here it is too long, see it for yourself]


> That's actually where
> documentation, even a draft of documentation helps a lot in the review to
> see if what is expected by the developer matches what the code actually
> does.
>
> > Code has some whitespaces.
> > I've found and removed some. Hope that was all of them...
>
> Yeah, it looks that you took completely rid of them.
>
> In details, this patch introduces two independent concepts:
> - add tuple data as a new bytea column of heap_page_items. This is indeed
> where it should be, and note that this is where the several corruption
> checks are done by checking the state of the tuple data.
Sorry do not understand what do you want to say in the second part of the last
sentence. About corruption checks.

> - add heap_page_item_attrs and heap_page_item_detoast_attrs, which is very
> similar to heap_page_items, at the difference that it can take an OID to be
> able to use a TupleDesc and return a bytea array with the data of each
> attribute.
That's right.

And more:

- heap_page_item_attrs and heap_page_item_detoast_attrs has third optional
attribute, set it to true if you want these functions to report about problems in
WARNING mode. This will allow to force attribute parsing even if some consistency
checks are failed. This is necessary for educational purposes, so everyone can
make fake data and see how postgres will try to parse it. Also this will be needed
for test. So we can test that pageinspect will is properly doing all checks. And it is
much easier to do this when it is done in warning mode.

> Honestly, heap_page_item_attrs and heap_page_item_detoast_attrs are way too
> similar to what heap_page_items does, leading to a code maze that is going
> to make future extensions more difficult, which is what lead to the
> refactoring your patch does.
> Hence, instead of all those complications, why not simply introducing two
> functions that take as input the tuple data and the OID of the relation,
> returning those bytea arrays? It seems to be that this would be a more
> handy interface, and as this is for educational purposes I guess that the
> addition of the overhead induced by the extra function call would be
> acceptable.

When I looked at this task I considered doing the same thing. Bun unfortunately it is
not possible. (Or if be more correct it is possible, but if I do so, it would be hard to us
e it)

The thing is, that to properly split tuple data into attributes, we need some values from
tuple header:

t_infomask2: where postgres store actual number of stored attributes, that may differ
from one in tuple data. That will allow to properly parse tuples after
ALTER TABLE ADD COLUMN when it was done without SET DEFAULT option
t_infomask: has bit that indicates that there is some null attributes in tuple.
t_bits: has a bit mask that shows what attributes were set to null.

So if move tuple data parsing into separate function, then we have to pass these
values alongside the tuple data. This is not convenient at all.
So I did it in a way you see in the patch.


> Actually not two functions, but just one, with an extra flag to be
> able to enforce detoasting on the field values where this can be done.

Yeah, I also thought about it. But did not come to any final conclusion. Should
we add forth argument, that enables detoasting, instead of adding separate
function?

--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: pageinspect patch, for showing tuple data

From
Michael Paquier
Date:
On Wed, Sep 9, 2015 at 8:39 PM, Nikolay Shaplov
<n.shaplov@postgrespro.ru> wrote:
> В письме от 8 сентября 2015 11:53:24 Вы написали:
>> Hence, instead of all those complications, why not simply introducing two
>> functions that take as input the tuple data and the OID of the relation,
>> returning those bytea arrays? It seems to be that this would be a more
>> handy interface, and as this is for educational purposes I guess that the
>> addition of the overhead induced by the extra function call would be
>> acceptable.
>
> When I looked at this task I considered doing the same thing. Bun unfortunately it is
> not possible. (Or if be more correct it is possible, but if I do so, it would be hard to us
> e it)
> The thing is, that to properly split tuple data into attributes, we need some values from
> tuple header:
> t_infomask2: where postgres store actual number of stored attributes, that may differ
> from one in tuple data. That will allow to properly parse tuples after
> ALTER TABLE ADD COLUMN when it was done without SET DEFAULT option
> t_infomask: has bit that indicates that there is some null attributes in tuple.
> t_bits: has a bit mask that shows what attributes were set to null.
>
> So if move tuple data parsing into separate function, then we have to pass these
> values alongside the tuple data. This is not convenient at all.
> So I did it in a way you see in the patch.

Why is it not convenient at all? Yes, you have a point, we need those
fields to be able to parse the t_data properly. Still the possibility
to show individual fields of a tuple as a bytea array either with
toasted or detoasted values is a concept completely different from
simply showing the page items, which is what, it seems to me,
heap_page_items is aimed to only do. Hence, As t_infomask2, t_infomask
and t_bits are already available as return fields of heap_page_items,
we should simply add a function like that:
heap_page_item_parse(Oid relid, bytea data, t_infomask2 int,
t_infomask int, t_bits int, bool force_detoast, warning_mode bool)
returns bytea[]

Note that the data corruption checks apply only to this function as
far as I understand, so I think that things could be actually split
into two independent patches:
1) Upgrade heap_page_items to add the tuple data as bytea.
2) Add the new function able to parse those fields appropriately.

As this code, as you justly mentioned, is aimed mainly for educational
purposes to understand a page structure, we should definitely make it
as simple as possible at code level, and it seems to me that this
comes with a separate SQL interface to control tuple data parsing as a
bytea[]. We are doing no favor to our users to complicate the code of
pageinspect.c as this patch is doing in its current state, especially
if beginners want to have a look at it.

>> Actually not two functions, but just one, with an extra flag to be
>> able to enforce detoasting on the field values where this can be done.
>
> Yeah, I also thought about it. But did not come to any final conclusion. Should
> we add forth argument, that enables detoasting, instead of adding separate
> function?

This is definitely something you want to control with a switch.
--
Michael



Re: pageinspect patch, for showing tuple data

From
Nikolay Shaplov
Date:
В письме от 10 сентября 2015 15:46:25 пользователь Michael Paquier написал:

> > So if move tuple data parsing into separate function, then we have to pass
> > these values alongside the tuple data. This is not convenient at all.
> > So I did it in a way you see in the patch.
>
> Why is it not convenient at all? Yes, you have a point, we need those
> fields to be able to parse the t_data properly. Still the possibility
> to show individual fields of a tuple as a bytea array either with
> toasted or detoasted values is a concept completely different from
> simply showing the page items, which is what, it seems to me,
> heap_page_items is aimed to only do. Hence, As t_infomask2, t_infomask
> and t_bits are already available as return fields of heap_page_items,
> we should simply add a function like that:
> heap_page_item_parse(Oid relid, bytea data, t_infomask2 int,
> t_infomask int, t_bits int, bool force_detoast, warning_mode bool)
> returns bytea[]

Just compare two expressions:

select * from heap_page_item_attrs(get_raw_page('test', 0),'test'::regclass);

and

select  lp, lp_off, lp_flags, lp_len, t_xmin, t_xmax, t_field3, t_ctid,
t_infomask2, t_infomask, t_hoff, t_bits, t_oid,  tuple_data_parse (
t_data,  t_infomask, t_infomask2, t_bits, 'test'::regclass, true, false)
from heap_page_item_attrs(get_raw_page('test', 0));

The second variant is a total mess and almost unsable...

Though I've discussed this issue with friedns, and we came to conclusion that
it might be good to implement tuple_data_parse and then implement
easy to use heap_page_item_attrs in pure SQL, using heap_page_items and
tuple_data_parse.

That would keep usage simplicity, and make code more simple as you offer.

The only one argument against it is that in internal representation t_bist is
binary, and in sql output it is string with '0' and '1' characters. We will
have to convert it back to binary mode. This is not difficult, but just useless
convertations there and back again.

What do you think about this solution?


> Note that the data corruption checks apply only to this function as
> far as I understand, so I think that things could be actually split
> into two independent patches:
> 1) Upgrade heap_page_items to add the tuple data as bytea.
> 2) Add the new function able to parse those fields appropriately.
>
> As this code, as you justly mentioned, is aimed mainly for educational
> purposes to understand a page structure, we should definitely make it
> as simple as possible at code level, and it seems to me that this
> comes with a separate SQL interface to control tuple data parsing as a
> bytea[]. We are doing no favor to our users to complicate the code of
> pageinspect.c as this patch is doing in its current state, especially
> if beginners want to have a look at it.
>
> >> Actually not two functions, but just one, with an extra flag to be
> >> able to enforce detoasting on the field values where this can be done.
> >
> > Yeah, I also thought about it. But did not come to any final conclusion.
> > Should we add forth argument, that enables detoasting, instead of adding
> > separate function?
>
> This is definitely something you want to control with a switch.
Ok.Let's come to the final decision with tuple_data_parse, and i will add this
switch there and to pure sql heap_page_item_attrs


--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: pageinspect patch, for showing tuple data

From
Bruce Momjian
Date:
On Thu, Sep 10, 2015 at 03:46:25PM +0900, Michael Paquier wrote:
> Why is it not convenient at all? Yes, you have a point, we need those
> fields to be able to parse the t_data properly. Still the possibility
> to show individual fields of a tuple as a bytea array either with
> toasted or detoasted values is a concept completely different from
> simply showing the page items, which is what, it seems to me,
> heap_page_items is aimed to only do. Hence, As t_infomask2, t_infomask
> and t_bits are already available as return fields of heap_page_items,
> we should simply add a function like that:
> heap_page_item_parse(Oid relid, bytea data, t_infomask2 int,
> t_infomask int, t_bits int, bool force_detoast, warning_mode bool)
> returns bytea[]

Should pageinspect create a table that contains some of the constants
used to interpret infomask?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +



Re: pageinspect patch, for showing tuple data

From
Michael Paquier
Date:
On Fri, Sep 11, 2015 at 12:08 AM, Nikolay Shaplov
<n.shaplov@postgrespro.ru> wrote:
> В письме от 10 сентября 2015 15:46:25 пользователь Michael Paquier написал:
>
>> > So if move tuple data parsing into separate function, then we have to pass
>> > these values alongside the tuple data. This is not convenient at all.
>> > So I did it in a way you see in the patch.
>>
>> Why is it not convenient at all? Yes, you have a point, we need those
>> fields to be able to parse the t_data properly. Still the possibility
>> to show individual fields of a tuple as a bytea array either with
>> toasted or detoasted values is a concept completely different from
>> simply showing the page items, which is what, it seems to me,
>> heap_page_items is aimed to only do. Hence, As t_infomask2, t_infomask
>> and t_bits are already available as return fields of heap_page_items,
>> we should simply add a function like that:
>> heap_page_item_parse(Oid relid, bytea data, t_infomask2 int,
>> t_infomask int, t_bits int, bool force_detoast, warning_mode bool)
>> returns bytea[]
>
> Just compare two expressions:
>
> select * from heap_page_item_attrs(get_raw_page('test', 0),'test'::regclass);
>
> and
>
> select  lp, lp_off, lp_flags, lp_len, t_xmin, t_xmax, t_field3, t_ctid,
> t_infomask2, t_infomask, t_hoff, t_bits, t_oid,  tuple_data_parse (
> t_data,  t_infomask, t_infomask2, t_bits, 'test'::regclass, true, false)
> from heap_page_item_attrs(get_raw_page('test', 0));
>
> The second variant is a total mess and almost unsable...

It is hard to believe as well that any sane application would use * as
well in a SELECT query. Users would though and we are talking about
user's education here :)

> Though I've discussed this issue with friedns, and we came to conclusion that
> it might be good to implement tuple_data_parse and then implement
> easy to use heap_page_item_attrs in pure SQL, using heap_page_items and
> tuple_data_parse.
> That would keep usage simplicity, and make code more simple as you offer.

Yep. That's doable with a simple SQL function. I am not sure that it
is worth including in pageinspect though.

> The only one argument against it is that in internal representation t_bits is
> binary, and in sql output it is string with '0' and '1' characters. We will
> have to convert it back to binary mode. This is not difficult, but just useless
> convertations there and back again.

The reason why this is visibly converted from bit to text is that the
in-core bit data type has a fixed length, and in the case of
HeapTupleHeaderData there is a variable length.

> What do you think about this solution?

For code simplicity's sake this seems worth the cost.

>> Note that the data corruption checks apply only to this function as
>> far as I understand, so I think that things could be actually split
>> into two independent patches:
>> 1) Upgrade heap_page_items to add the tuple data as bytea.
>> 2) Add the new function able to parse those fields appropriately.
>>
>> As this code, as you justly mentioned, is aimed mainly for educational
>> purposes to understand a page structure, we should definitely make it
>> as simple as possible at code level, and it seems to me that this
>> comes with a separate SQL interface to control tuple data parsing as a
>> bytea[]. We are doing no favor to our users to complicate the code of
>> pageinspect.c as this patch is doing in its current state, especially
>> if beginners want to have a look at it.
>>
>> >> Actually not two functions, but just one, with an extra flag to be
>> >> able to enforce detoasting on the field values where this can be done.
>> >
>> > Yeah, I also thought about it. But did not come to any final conclusion.
>> > Should we add forth argument, that enables detoasting, instead of adding
>> > separate function?
>>
>> This is definitely something you want to control with a switch.
> Ok.Let's come to the final decision with tuple_data_parse, and i will add this
> switch there and to pure sql heap_page_item_attrs

Fine for me.
--
Michael



Re: pageinspect patch, for showing tuple data

From
Michael Paquier
Date:
On Fri, Sep 11, 2015 at 12:12 AM, Bruce Momjian <bruce@momjian.us> wrote:
> On Thu, Sep 10, 2015 at 03:46:25PM +0900, Michael Paquier wrote:
>> Why is it not convenient at all? Yes, you have a point, we need those
>> fields to be able to parse the t_data properly. Still the possibility
>> to show individual fields of a tuple as a bytea array either with
>> toasted or detoasted values is a concept completely different from
>> simply showing the page items, which is what, it seems to me,
>> heap_page_items is aimed to only do. Hence, As t_infomask2, t_infomask
>> and t_bits are already available as return fields of heap_page_items,
>> we should simply add a function like that:
>> heap_page_item_parse(Oid relid, bytea data, t_infomask2 int,
>> t_infomask int, t_bits int, bool force_detoast, warning_mode bool)
>> returns bytea[]
>
> Should pageinspect create a table that contains some of the constants
> used to interpret infomask?

Interesting idea. It may be indeed useful to show to a user mappings
between t_infomask flags <=> textual meaning. I guess that we could
have an SRF function with a view on top of it that returns such a
list. The same can apply to t_infomask2.
-- 
Michael



Re: pageinspect patch, for showing tuple data

From
Nikolay Shaplov
Date:
В письме от 11 сентября 2015 15:12:04 Вы написали:

> > Ok.Let's come to the final decision with tuple_data_parse, and i will add
> > this switch there and to pure sql heap_page_item_attrs
>
> Fine for me.

So I've modified the code, now we have:

heap_page_items - have a column with raw tuple data

tuple_data_split - takes oid, raw tuple data, infomask, infomask2 and bits
parsed as string and returns bytea[] with attribute raw values. It also have
two optional arguments do_detoast that forces function to detoast attribute,
and warning_mode that allows to set this function to warning mode, and do not
stop working if some inconsistency were found.

there is also pure sql function heap_page_item_attrs that takes page data, and
table oid, and returns same data as heap_page_items but bytea[] of attributes
instead of one whole piece of raw data. It also has optional argument
do_detoast that allows to get bytea[] of detoasted attribute data.

I've decided that there is no real need in warning_mode in
heap_page_item_attrs so there is no such argument there.

So now it is still RFC. Final patch with documentation will come soon

--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachment

Re: pageinspect patch, for showing tuple data

From
Nikolay Shaplov
Date:
В письме от 11 сентября 2015 15:12:04 пользователь Michael Paquier написал:

> > Ok.Let's come to the final decision with tuple_data_parse, and i will add
> > this switch there and to pure sql heap_page_item_attrs
>
> Fine for me.

Here is final version with documentation.

Hope it will be the last one. :-)

--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachment

Re: pageinspect patch, for showing tuple data

From
Michael Paquier
Date:
On Fri, Sep 25, 2015 at 8:30 PM, Nikolay Shaplov wrote:
> Here is final version with documentation.

Thanks! I just had a short look at it:
- I am not convinced that it is worth declaring 3 versions of tuple_data_split.
- The patch does not respect the project code style, particularly
one-line "if condition {foo;}" are not adapted, code line is limited
to 80 characters, etc. The list is basically here:
http://www.postgresql.org/docs/current/static/source.html
- Be aware of typos: s/whitch/which is one.

+        <entry><structfield>t_infomask2</structfield></entry>
+        <entry><type>integer</type></entry>
+        <entry>stores number of attributes (11 bits of the word). The
rest are used for flag bits:
+<screen>
+0x2000 - tuple was updated and key cols modified, or tuple deleted
+0x4000 - tuple was HOT-updated
+0x8000 - this is heap-only tuple
+0xE000 - visibility-related bits (so called "hint bits")
+</screen>
This large chunk of documentation is a duplicate of storage.sgml. If
that's really necessary, it looks adapted to me to have more detailed
comments at code level directly in heapfuncs.c.

The example of tuple_data_split in the docs would be more interesting
if embedded with a call to heap_page_items.

> Hope it will be the last one. :-)

Unfortunately not :) But this is definitely going in the right
direction thinking that this code is mainly targeted for educational
purposes.
-- 
Michael



Re: pageinspect patch, for showing tuple data

From
Nikolay Shaplov
Date:
В письме от 25 сентября 2015 20:59:29 пользователь Michael Paquier написал:

> > Here is final version with documentation.
>
> Thanks! I just had a short look at it:
> - I am not convinced that it is worth declaring 3 versions of
> tuple_data_split.
How which of them should we leave?

> - The patch does not respect the project code style,
> particularly one-line "if condition {foo;}" are not adapted, code line is
limited
> to 80 characters, etc. The list is basically here:
> http://www.postgresql.org/docs/current/static/source.html
I did my best. Results are attached.

> - Be aware of typos: s/whitch/which is one.
I've run spellchecker on all comments. Hope that I removed most of the
mistakes. But as I am not native speaker, I will not be able to eliminate them
all. I will need help here.

> +        <entry><structfield>t_infomask2</structfield></entry>
> +        <entry><type>integer</type></entry>
> +        <entry>stores number of attributes (11 bits of the word). The
> rest are used for flag bits:
> +<screen>
> +0x2000 - tuple was updated and key cols modified, or tuple deleted
> +0x4000 - tuple was HOT-updated
> +0x8000 - this is heap-only tuple
> +0xE000 - visibility-related bits (so called "hint bits")
> +</screen>
> This large chunk of documentation is a duplicate of storage.sgml. If
> that's really necessary, it looks adapted to me to have more detailed
> comments at code level directly in heapfuncs.c.
Hm... There is no explanation of t_infomask/t_infomask2 bits in storage.sgml.

If there is no source of information other then source code, then the
documentation is not good.

If there were information about t_infomask/t_infomask2 in storage.sgml, then I
would add "See storage.sgml for more info" into pageinspect doc, and thats
all. But since there is no such information there, I  think that the best
thing is to quote comments from source code there, so you can get all
information from documentation, not looking for it in the code.

So I would consider two options: Either move t_infomask/t_infomask2 details
into storage.sgml or leave as it is.

I am lazy, and does not feel confidence about touching main documentation, so I
would prefer to leave as it is.


> The example of tuple_data_split in the docs would be more interesting
> if embedded with a call to heap_page_items.

This example would be almost not readable. May be I should add one more praise
explaining where did we take arguments for tuple_data_split


--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachment

Re: pageinspect patch, for showing tuple data

From
Michael Paquier
Date:
On Sat, Sep 26, 2015 at 1:46 AM, Nikolay Shaplov wrote:
> В письме от 25 сентября 2015 20:59:29 пользователь Michael Paquier написал:
>> Thanks! I just had a short look at it:
>> - I am not convinced that it is worth declaring 3 versions of
>> tuple_data_split.
> How which of them should we leave?

The one with the most arguments. Now perhaps we could have as well two
of them: one which has the minimum number of arguments with default
values, and the second one with the maximum number of arguments.

>> - The patch does not respect the project code style,
>> particularly one-line "if condition {foo;}" are not adapted, code line is
> limited
>> to 80 characters, etc. The list is basically here:
>> http://www.postgresql.org/docs/current/static/source.html
> I did my best. Results are attached.

Thanks, it looks better.

>> - Be aware of typos: s/whitch/which is one.
> I've run spellchecker on all comments. Hope that I removed most of the
> mistakes. But as I am not native speaker, I will not be able to eliminate them
> all. I will need help here.

I have not spotted new mistakes regarding that.

>> +        <entry><structfield>t_infomask2</structfield></entry>
>> +        <entry><type>integer</type></entry>
>> +        <entry>stores number of attributes (11 bits of the word). The
>> rest are used for flag bits:
>> +<screen>
>> +0x2000 - tuple was updated and key cols modified, or tuple deleted
>> +0x4000 - tuple was HOT-updated
>> +0x8000 - this is heap-only tuple
>> +0xE000 - visibility-related bits (so called "hint bits")
>> +</screen>
>> This large chunk of documentation is a duplicate of storage.sgml. If
>> that's really necessary, it looks adapted to me to have more detailed
>> comments at code level directly in heapfuncs.c.
> Hm... There is no explanation of t_infomask/t_infomask2 bits in storage.sgml.
>
> If there is no source of information other then source code, then the
> documentation is not good.

pageinspect is already something that works at low-level. My guess is
that users of this module are going to have a look at the code either
way to understand how tuple data is manipulated within a page.

> So I would consider two options: Either move t_infomask/t_infomask2 details
> into storage.sgml or leave as it is.

The documentation redirects the reader to look at htup_details.h (the
documentation is actually incorrect, I sent a separate patch), and I
see no point in duplicating such low-level descriptions, that would be
double maintenance for the same result.

> I am lazy, and does not feel confidence about touching main documentation, so I
> would prefer to leave as it is.

Your patch is proving the contrary :) Honestly I would just rip out
the part you have added to describe all the fields related to
HeapTupleHeaderData, and have only a simple self-contained example to
show how to use the new function tuple_data_split.

>> The example of tuple_data_split in the docs would be more interesting
>> if embedded with a call to heap_page_items.
>
> This example would be almost not readable. May be I should add one more praise
> explaining where did we take arguments for tuple_data_split

I would as well just remove heap_page_item_attrs, an example in the
docs would be just enough IMO (note that I never mind being outvoted).

Btw, shouldn't the ereport messages in tuple_data_split use
error_level instead of ERROR?
Regards,
--
Michael



Re: pageinspect patch, for showing tuple data

From
Nikolay Shaplov
Date:
В письме от 26 сентября 2015 20:57:25 Вы написали:

> >> Thanks! I just had a short look at it:
> >> - I am not convinced that it is worth declaring 3 versions of
> >> tuple_data_split.
> >
> > How which of them should we leave?
>
> The one with the most arguments. Now perhaps we could have as well two
> of them: one which has the minimum number of arguments with default
> values, and the second one with the maximum number of arguments.

I've been thinking about this warning mode for a while...

I've added this argument to the heap_page_item_atts, because if somebody would
like to force it to parse fake data, it would be really difficult to provide
proper fake data if one heap to perfectly fit tuple descriptor of another heap.
So to do this you have to lover error level.

But since now we actually parse data with tuple_data_split, we can provide a
precisely formed  fake information, so you are not limited with how it is
actually stored in page. You just pass any arguments you want. So you does not
need warning mode anymore.

So may be I should get rid of "warning mode" now.

Concerning do_detoast argument. I do not have any clear opinion. I thing that
tuple_data_split is a kind of internal function. So it is ok if it is not so
convenient to use.

But in  heap_page_item_attrs should be optional. Because it is used from plsql
console, and in most cases user does not want deTOASing.

So if in one function do_detoast is optional, may be it would be good to have
it optional in other functions to keep similarity...

So summorising: if you want make things simpler, I would first get totally rid
of warning_mode first of all, then if you would insist, make do_detoast non-
optional.

>
> >> +        <entry><structfield>t_infomask2</structfield></entry>
> >> +        <entry><type>integer</type></entry>
> >> +        <entry>stores number of attributes (11 bits of the word). The
> >> rest are used for flag bits:
> >> +<screen>
> >> +0x2000 - tuple was updated and key cols modified, or tuple deleted
> >> +0x4000 - tuple was HOT-updated
> >> +0x8000 - this is heap-only tuple
> >> +0xE000 - visibility-related bits (so called "hint bits")
> >> +</screen>
> >> This large chunk of documentation is a duplicate of storage.sgml. If
> >> that's really necessary, it looks adapted to me to have more detailed
> >> comments at code level directly in heapfuncs.c.
> >
> > Hm... There is no explanation of t_infomask/t_infomask2 bits in
> > storage.sgml.
> >
> > If there is no source of information other then source code, then the
> > documentation is not good.
>
> pageinspect is already something that works at low-level. My guess is
> that users of this module are going to have a look at the code either
> way to understand how tuple data is manipulated within a page.


> > So I would consider two options: Either move t_infomask/t_infomask2
> > details
> > into storage.sgml or leave as it is.
>
> The documentation redirects the reader to look at htup_details.h (the
> documentation is actually incorrect, I sent a separate patch)
I do not see any redirect at
http://www.postgresql.org/docs/9.4/static/pageinspect.html

> , and I
> see no point in duplicating such low-level descriptions, that would be
> double maintenance for the same result.
>
> > I am lazy, and does not feel confidence about touching main documentation,
> > so I would prefer to leave as it is.
>
> Your patch is proving the contrary :) Honestly I would just rip out
> the part you have added to describe all the fields related to
> HeapTupleHeaderData, and have only a simple self-contained example to
> show how to use the new function tuple_data_split.
May be. I would think about it a little bit more, and discuss it with my
friends

> >> The example of tuple_data_split in the docs would be more interesting
> >> if embedded with a call to heap_page_items.
> >
> > This example would be almost not readable. May be I should add one more
> > praise explaining where did we take arguments for tuple_data_split
>
> I would as well just remove heap_page_item_attrs, an example in the
> docs would be just enough IMO (note that I never mind being outvoted).

This is function that pageinspect user is definitely will use this function.
And as a user I would like to see function output in documentation the way I
would see it, in order to be better prepared to see it in real life. There are
limits however, get_raw_page output should not be shown in documentation ;-)
heap_page_item_attrs is close to these limits, but did not reach it, in my
opinion... So I'd prefer to keep it there.

> Btw, shouldn't the ereport messages in tuple_data_split use
> error_level instead of ERROR?
These errors are in the part where I parse t_bits from text back to bit
representation.  Here there is not chance to get non-strict behavior, since
there is no way to guess what should we do if we met characters then '0' or
'1'. Or what to do if there is only 7 characters but we should write whole
byte.etc...
Moreover as I wrote above bay be we just do not need warning_mode at all for
current situation.

--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: pageinspect patch, for showing tuple data

From
Nikolay Shaplov
Date:
В письме от 26 сентября 2015 20:57:25 пользователь Michael Paquier написал:

> > So I would consider two options: Either move t_infomask/t_infomask2
> > details
> > into storage.sgml or leave as it is.
>
> The documentation redirects the reader to look at htup_details.h (the
> documentation is actually incorrect, I sent a separate patch), and I
> see no point in duplicating such low-level descriptions, that would be
> double maintenance for the same result.

What do you think about this? (I've changed only heap_tuple_items
documentation there)



--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachment

Re: pageinspect patch, for showing tuple data

From
Michael Paquier
Date:

On Tue, Sep 29, 2015 at 11:39 PM, Nikolay Shaplov wrote:
> But since now we actually parse data with tuple_data_split, we can provide a
> precisely formed  fake information, so you are not limited with how it is
> actually stored in page. You just pass any arguments you want. So you does not
> need warning mode anymore.

Yeah, I agree with you here, let's simplify it then. One could as well catch the error in a plpgsql wrapper function if that's really necessary and log the failed events at the same time in a custom way.

-                                 errmsg("input page too small (%d bytes)", raw_page_size)));
+                                        errmsg("input page too small (%d bytes)", raw_page_size)));
Please be careful of spurious diffs. Those just make the life of committers more difficult than it already is.

+     <para>
+     General idea about output columns: <function>lp_*</function> attributes
+     are about where tuple is located inside the page;
+     <function>t_xmin</function>, <function>t_xmax</function>,
+     <function>t_field3</function>, <function>t_ctid</function> are about
+     visibility of this tuplue inside or outside of the transaction;
+     <function>t_infomask2</function>, <function>t_infomask</function> has some
+     information about properties of attributes stored in tuple data.
+     <function>t_hoff</function> sais where tuple data begins and
+     <function>t_bits</function> sais which attributes are NULL and which are
+     not. Please notice that t_bits here is not an actual value that is stored
+     in tuple data, but it's text representation  with '0' and '1' charactrs.
+     </para>
I would remove that as well. htup_details.h contains all this information.

+     <para>
+     For more detailed information see documentation:
+     <xref linkend="storage-page-layout">, <xref linkend="ddl-system-columns">
+     and source code: <filename>src/include/storage/itemid.h</>,
+     <filename>src/include/access/htup_details.h</>.
+     </para>
This looks cool to me though.

+<screen>
+test=# select * from heap_page_item_attrs(get_raw_page('pg_range', 0),'pg_range'::regclass);
+[loooooong tuple data]
This example is too large in character per lines, I think that you should cut a major part of the fields, why not just keeping lp and t_attrs for example.

+      <tbody>
+       <row>
+        <entry><structfield>rel_oid</structfield></entry>
+        <entry><type>oid</type></entry>
+        <entry>OID of the relation, of the tuple we want to split</entry>
+       </row>
+
+       <row>
+        <entry><structfield>tuple_data</structfield></entry>
+        <entry><type>bytea</type></entry>
+        <entry>tuple raw data to split
+        </entry>
+       </row>
In the description of tuple_data_split, I am not sure it is worth listing all the argument of the function like that. IMO, we should just say: those are the fields returned by "heap_page_items". tuple_data should as well be renamed to t_data.

+      tuple attributes instead of one peace of raw tuple data. All other return
This is not that "peaceful" to me. It should be "piece" :)

+                       values[13] = PointerGetDatum(tuple_data_bytea);
+                       nulls[13] = false;
There is no point to set nulls[13] here.

+<screen>
+test=# select tuple_data_split('pg_range'::regclass, '\x400f00001700000000000000ba0700004a0f0000520f0000'::bytea, 2304, 6, null);
+                                   tuple_data_split
+---------------------------------------------------------------------------------------
+ {"\\x400f0000","\\x17000000","\\x00000000","\\xba070000","\\x4a0f0000","\\x520f0000"}
+(1 row)
This would be more demonstrative if combined with heap_page_items, like that for example:
SELECT tuple_data_split('pg_class'::regclass, t_data, t_infomask, t_infomask2, t_bits) FROM heap_page_items(get_raw_page('pg_class', 0));
And actually this query crashes.
--
Michael

Re: pageinspect patch, for showing tuple data

From
Nikolay Shaplov
Date:
В письме от 30 сентября 2015 13:49:00 пользователь Michael Paquier написал:



>
> -                                 errmsg("input page too small (%d bytes)",
> raw_page_size)));
> +                                        errmsg("input page too small (%d
> bytes)", raw_page_size)));
> Please be careful of spurious diffs. Those just make the life of committers
> more difficult than it already is.

Oh, that's not diff. That's I've fixed indent on the code I did not write :-)


> +     <para>
> +     General idea about output columns: <function>lp_*</function>
> attributes
> +     are about where tuple is located inside the page;
> +     <function>t_xmin</function>, <function>t_xmax</function>,
> +     <function>t_field3</function>, <function>t_ctid</function> are about
> +     visibility of this tuplue inside or outside of the transaction;
> +     <function>t_infomask2</function>, <function>t_infomask</function> has
> some
> +     information about properties of attributes stored in tuple data.
> +     <function>t_hoff</function> sais where tuple data begins and
> +     <function>t_bits</function> sais which attributes are NULL and which
> are
> +     not. Please notice that t_bits here is not an actual value that is
> stored
> +     in tuple data, but it's text representation  with '0' and '1'
> charactrs.
> +     </para>
> I would remove that as well. htup_details.h contains all this information.
Made it even more shorter. Just links and comments about representation of
t_bits.


> +<screen>
> +test=# select * from heap_page_item_attrs(get_raw_page('pg_range',
> 0),'pg_range'::regclass);
> +[loooooong tuple data]
> This example is too large in character per lines, I think that you should
> cut a major part of the fields, why not just keeping lp and t_attrs for
> example.
Did id.

>
> +      <tbody>
> +       <row>
> +        <entry><structfield>rel_oid</structfield></entry>
> +        <entry><type>oid</type></entry>
> +        <entry>OID of the relation, of the tuple we want to split</entry>
> +       </row>
> +
> +       <row>
> +        <entry><structfield>tuple_data</structfield></entry>
> +        <entry><type>bytea</type></entry>
> +        <entry>tuple raw data to split
> +        </entry>
> +       </row>
> In the description of tuple_data_split, I am not sure it is worth listing
> all the argument of the function like that. IMO, we should just say: those
> are the fields returned by "heap_page_items". tuple_data should as well be
> renamed to t_data.
Did it.

>
> +      tuple attributes instead of one peace of raw tuple data. All other
> return
> This is not that "peaceful" to me. It should be "piece" :)
Oops ;-)

> +                       values[13] = PointerGetDatum(tuple_data_bytea);
> +                       nulls[13] = false;
> There is no point to set nulls[13] here.
Oh, you are right!

>
> +<screen>
> +test=# select tuple_data_split('pg_range'::regclass,
> '\x400f00001700000000000000ba0700004a0f0000520f0000'::bytea, 2304, 6, null);
> +                                   tuple_data_split
> +---------------------------------------------------------------------------
> ------------ +
> {"\\x400f0000","\\x17000000","\\x00000000","\\xba070000","\\x4a0f0000","\\x5
> 20f0000"} +(1 row)
> This would be more demonstrative if combined with heap_page_items, like
> that for example:
> SELECT tuple_data_split('pg_class'::regclass, t_data, t_infomask,
> t_infomask2, t_bits) FROM heap_page_items(get_raw_page('pg_class', 0));
> And actually this query crashes.
Oh... It crached because I did not check that t_data can actually be NULL.

There also was a bug in original pageinspect, that did not get t_bits right
when there was OID in the tuple.  I've fixed it too.


Here is next patch in attachment.


--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachment

Re: pageinspect patch, for showing tuple data

From
Michael Paquier
Date:


On Thu, Oct 1, 2015 at 8:13 PM, Nikolay Shaplov wrote:
> В письме от 30 сентября 2015 13:49:00 пользователь Michael Paquier написал:
>>
>> -                                 errmsg("input page too small (%d bytes)",
>> raw_page_size)));
>> +                                        errmsg("input page too small (%d
>> bytes)", raw_page_size)));
>> Please be careful of spurious diffs. Those just make the life of committers
>> more difficult than it already is.
>
> Oh, that's not diff. That's I've fixed indent on the code I did not write :-)

pgindent would have taken care of that if needed. There is no need to add noise in the code for this patch.

>> +     <para>
>> +     General idea about output columns: <function>lp_*</function>
>> attributes
>> +     are about where tuple is located inside the page;
>> +     <function>t_xmin</function>, <function>t_xmax</function>,
>> +     <function>t_field3</function>, <function>t_ctid</function> are about
>> +     visibility of this tuplue inside or outside of the transaction;
>> +     <function>t_infomask2</function>, <function>t_infomask</function> has
>> some
>> +     information about properties of attributes stored in tuple data.
>> +     <function>t_hoff</function> sais where tuple data begins and
>> +     <function>t_bits</function> sais which attributes are NULL and which
>> are
>> +     not. Please notice that t_bits here is not an actual value that is
>> stored
>> +     in tuple data, but it's text representation  with '0' and '1'
>> charactrs.
>> +     </para>
>> I would remove that as well. htup_details.h contains all this information.
> Made it even more shorter. Just links and comments about representation of
> t_bits.

I would completely remove this part.

> There also was a bug in original pageinspect, that did not get t_bits right
> when there was OID in the tuple.  I've fixed it too.

Aha. Good catch! By looking at HeapTupleHeaderGetOid if the tuple has an OID it will be stored at the end of HeapTupleHeader and t_hoff includes it, so bits_len is definitely larger of 4 bytes should an OID be present.
               if (tuphdr->t_infomask & HEAP_HASNULL)
               {
-                       bits_len = tuphdr->t_hoff -
-                           offsetof(HeapTupleHeaderData, t_bits);
+                      int     bits_len =
+                          ((tuphdr->t_infomask2 & HEAP_NATTS_MASK)/8 + 1)*8;
 
                        values[11] = CStringGetTextDatum(
-                                           bits_to_text(tuphdr->t_bits, bits_len * 8));
+                                          bits_to_text(tuphdr->t_bits, bits_len));
                                }
And here is what you are referring to in your patch. I think that we should instead check for HEAP_HASOID and reduce bits_len by the size of Oid should one be present. As this is a bug that applies to all the existing versions of Postgres it would be good to extract it as a separate patch and then apply your own patch on top of it instead of including in your feature. Attached is a patch, this should be applied down to 9.0 I guess. Could you rebase your patch on top of it?

> Here is next patch in attachment.

Cool, thanks.

-test=# SELECT * FROM heap_page_items(get_raw_page('pg_class', 0));
+
+test=# select * from heap_page_items(get_raw_page('pg_range', 0));
This example in the docs is far too long in character length... We should get that reduced.

+      Please notice that <function>t_bits</function> in tuple header structure
+      is a binary bitmap, but <function>heap_page_items</function> returns
+      <function>t_bits</function> as human readable text representation of
+      original <function>t_bits</function> bitmap.
This had better remove the mention to "please notice". Still as this is already described in htup_details.h there is no real point to describe it again here: that's a bitmap of NULLs.
--
Michael
Attachment

Re: pageinspect patch, for showing tuple data

From
Nikolay Shaplov
Date:
В письме от 2 октября 2015 13:10:22 Вы написали:

> > There also was a bug in original pageinspect, that did not get t_bits
>
> right
>
> > when there was OID in the tuple.  I've fixed it too.
>
> Aha. Good catch! By looking at HeapTupleHeaderGetOid if the tuple has an
> OID it will be stored at the end of HeapTupleHeader and t_hoff includes it,
> so bits_len is definitely larger of 4 bytes should an OID be present.
>                if (tuphdr->t_infomask & HEAP_HASNULL)
>                {
> -                       bits_len = tuphdr->t_hoff -
> -                           offsetof(HeapTupleHeaderData, t_bits);
> +                      int     bits_len =
> +                          ((tuphdr->t_infomask2 & HEAP_NATTS_MASK)/8 +
> 1)*8;
>
>                         values[11] = CStringGetTextDatum(
> -                                           bits_to_text(tuphdr->t_bits,
> bits_len * 8));
> +                                          bits_to_text(tuphdr->t_bits,
> bits_len));
>                                 }
> And here is what you are referring to in your patch. I think that we should
> instead check for HEAP_HASOID and reduce bits_len by the size of Oid should
> one be present. As this is a bug that applies to all the existing versions
> of Postgres it would be good to extract it as a separate patch and then
> apply your own patch on top of it instead of including in your feature.
> Attached is a patch, this should be applied down to 9.0 I guess. Could you
> rebase your patch on top of it?
No we should not do it, because after t_bits there always goes padding bytes.
So OID or the top of tuple data is properly aligned. So we should not use
t_hoff for determinating t_bit's size at all.

Here is an example. I create a table with 10 columns and OID. (ten is greater
then 8, so there should be two bytes of t_bits data)

create table test3 (a1 int, a2 int, a3 int, a4 int,a5 int,a6 int, a7 int,a8 int, a9 int, a10 int) with oids;
insert into test3 VALUES
(1,2,3,4,5,6,7,8,null,10);

With your patch we get

test=# select lp, t_bits, t_data from heap_page_items(get_raw_page('test3', 0));lp |                  t_bits
     |                                   t_data                                    

----+------------------------------------------+----------------------------------------------------------------------------
1| 1111111101000000000000000000000000000000 |
\x01000000020000000300000004000000050000000600000007000000080000000a000000
(1 row)


here we get 40 bits of t_bits.

With my way to calculate t_bits length we get

test=# select lp, t_bits, t_data from heap_page_items(get_raw_page('test3', 0));lp |      t_bits      |
                 t_data                                    
----+------------------+---------------------------------------------------------------------------- 1 |
1111111101000000| \x01000000020000000300000004000000050000000600000007000000080000000a000000 
(1 row)

16 bits as expected.

So I would keep my version of bits_len calculation


--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: pageinspect patch, for showing tuple data

From
Nikolay Shaplov
Date:
В письме от 2 октября 2015 13:10:22 пользователь Michael Paquier написал:
> >> +     <para>
> >> +     General idea about output columns: <function>lp_*</function>
> >> attributes
> >> +     are about where tuple is located inside the page;
> >> +     <function>t_xmin</function>, <function>t_xmax</function>,
> >> +     <function>t_field3</function>, <function>t_ctid</function> are
>
> about
>
> >> +     visibility of this tuplue inside or outside of the transaction;
> >> +     <function>t_infomask2</function>, <function>t_infomask</function>
>
> has
>
> >> some
> >> +     information about properties of attributes stored in tuple data.
> >> +     <function>t_hoff</function> sais where tuple data begins and
> >> +     <function>t_bits</function> sais which attributes are NULL and
>
> which
>
> >> are
> >> +     not. Please notice that t_bits here is not an actual value that is
> >> stored
> >> +     in tuple data, but it's text representation  with '0' and '1'
> >> charactrs.
> >> +     </para>
> >> I would remove that as well. htup_details.h contains all this
>
> information.
>
> > Made it even more shorter. Just links and comments about representation of
> > t_bits.
>
> I would completely remove this part.

Michael my hand would not do it. I've been working as a lecturer for six
years. If I want to pass an information in a comfortable way to reader, there
should go some binding phrase. It may be very vague, but it should outline
(may be in a very brief way, but still outline) an information that would be
found if he follows the links.

If we just give links "knowledge flow" will be uncomfortable for person who
reads it.


> -test=# SELECT * FROM heap_page_items(get_raw_page('pg_class', 0));
> +
> +test=# select * from heap_page_items(get_raw_page('pg_range', 0));
> This example in the docs is far too long in character length... We should
> get that reduced.

May be should do

\x and limit 1 like this:

test=# select * from heap_page_items(get_raw_page('pg_range', 0)) limit 1;
-[ RECORD 1 ]---------------------------------------------------
lp          | 1
lp_off      | 8144
lp_flags    | 1
lp_len      | 48
t_xmin      | 1
t_xmax      | 0
t_field3    | 0
t_ctid      | (0,1)
t_infomask2 | 6
t_infomask  | 2304
t_hoff      | 24
t_bits      |
t_oid       |
t_data      | \x400f00001700000000000000ba0700004a0f0000520f0000

????


> +      Please notice that <function>t_bits</function> in tuple header
> structure
> +      is a binary bitmap, but <function>heap_page_items</function> returns
> +      <function>t_bits</function> as human readable text representation of
> +      original <function>t_bits</function> bitmap.
> This had better remove the mention to "please notice". Still as this is
> already described in htup_details.h there is no real point to describe it
> again here: that's a bitmap of NULLs.

heap_page_items returns text(!) as t_bits. This is unexpected. This is not
described in page layout documentation. We should tell about it here
explicitly.


--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: pageinspect patch, for showing tuple data

From
Nikolay Shaplov
Date:
So what's next?
We need something else to discuss?
We need somebody else's opinion to rule this out?
Or it's ready to commit, and just not marked this way?

I am going to make report based on this patch in Vienna. It would be
nice to have it committed until then :)

On 02.10.2015 07:10, Michael Paquier wrote:
> On Thu, Oct 1, 2015 at 8:13 PM, Nikolay Shaplov wrote:
>> В письме от 30 сентября 2015 13:49:00 пользователь Michael Paquier
> написал:
>>> -                                 errmsg("input page too small (%d
> bytes)",
>>> raw_page_size)));
>>> +                                        errmsg("input page too small (%d
>>> bytes)", raw_page_size)));
>>> Please be careful of spurious diffs. Those just make the life of
> committers
>>> more difficult than it already is.
>> Oh, that's not diff. That's I've fixed indent on the code I did not write
> :-)
>
> pgindent would have taken care of that if needed. There is no need to add
> noise in the code for this patch.
>
>>> +     <para>
>>> +     General idea about output columns: <function>lp_*</function>
>>> attributes
>>> +     are about where tuple is located inside the page;
>>> +     <function>t_xmin</function>, <function>t_xmax</function>,
>>> +     <function>t_field3</function>, <function>t_ctid</function> are
> about
>>> +     visibility of this tuplue inside or outside of the transaction;
>>> +     <function>t_infomask2</function>, <function>t_infomask</function>
> has
>>> some
>>> +     information about properties of attributes stored in tuple data.
>>> +     <function>t_hoff</function> sais where tuple data begins and
>>> +     <function>t_bits</function> sais which attributes are NULL and
> which
>>> are
>>> +     not. Please notice that t_bits here is not an actual value that is
>>> stored
>>> +     in tuple data, but it's text representation  with '0' and '1'
>>> charactrs.
>>> +     </para>
>>> I would remove that as well. htup_details.h contains all this
> information.
>> Made it even more shorter. Just links and comments about representation of
>> t_bits.
> I would completely remove this part.
>
>> There also was a bug in original pageinspect, that did not get t_bits
> right
>> when there was OID in the tuple.  I've fixed it too.
> Aha. Good catch! By looking at HeapTupleHeaderGetOid if the tuple has an
> OID it will be stored at the end of HeapTupleHeader and t_hoff includes it,
> so bits_len is definitely larger of 4 bytes should an OID be present.
>                if (tuphdr->t_infomask & HEAP_HASNULL)
>                {
> -                       bits_len = tuphdr->t_hoff -
> -                           offsetof(HeapTupleHeaderData, t_bits);
> +                      int     bits_len =
> +                          ((tuphdr->t_infomask2 & HEAP_NATTS_MASK)/8 +
> 1)*8;
>
>                         values[11] = CStringGetTextDatum(
> -                                           bits_to_text(tuphdr->t_bits,
> bits_len * 8));
> +                                          bits_to_text(tuphdr->t_bits,
> bits_len));
>                                 }
> And here is what you are referring to in your patch. I think that we should
> instead check for HEAP_HASOID and reduce bits_len by the size of Oid should
> one be present. As this is a bug that applies to all the existing versions
> of Postgres it would be good to extract it as a separate patch and then
> apply your own patch on top of it instead of including in your feature.
> Attached is a patch, this should be applied down to 9.0 I guess. Could you
> rebase your patch on top of it?
>
>> Here is next patch in attachment.
> Cool, thanks.
>
> -test=# SELECT * FROM heap_page_items(get_raw_page('pg_class', 0));
> +
> +test=# select * from heap_page_items(get_raw_page('pg_range', 0));
> This example in the docs is far too long in character length... We should
> get that reduced.
>
> +      Please notice that <function>t_bits</function> in tuple header
> structure
> +      is a binary bitmap, but <function>heap_page_items</function> returns
> +      <function>t_bits</function> as human readable text representation of
> +      original <function>t_bits</function> bitmap.
> This had better remove the mention to "please notice". Still as this is
> already described in htup_details.h there is no real point to describe it
> again here: that's a bitmap of NULLs.
>
>




Re: pageinspect patch, for showing tuple data

From
Michael Paquier
Date:
On Sat, Oct 17, 2015 at 5:15 AM, Nikolay Shaplov
<n.shaplov@postgrespro.ru> wrote:
> So what's next?

Wait and see a bit.

> We need something else to discuss?
> We need somebody else's opinion to rule this out?

The spec of the patch looks clear to me.

> Or it's ready to commit, and just not marked this way?

No, I don't think we have reached this state yet.

> I am going to make report based on this patch in Vienna. It would be
> nice to have it committed until then :)

This is registered in this November's CF and the current one is not
completely wrapped up, so I haven't rushed into looking at it.
Regards,
-- 
Michael



Re: pageinspect patch, for showing tuple data

From
Michael Paquier
Date:
On Sat, Oct 17, 2015 at 1:48 AM, Michael Paquier wrote:
> On Sat, Oct 17, 2015 at 5:15 AM, Nikolay Shaplov wrote:
>> Or it's ready to commit, and just not marked this way?
>
> No, I don't think we have reached this state yet.
>
>> I am going to make report based on this patch in Vienna. It would be
>> nice to have it committed until then :)
>
> This is registered in this November's CF and the current one is not
> completely wrapped up, so I haven't rushed into looking at it.

So, I have finally been able to look at this patch in more details,
resulting in the attached. I noticed a couple of things that should be
addressed, mainly:
- addition of a new routine text_to_bits to perform the reverse
operation of bits_to_text. This was previously part of
tuple_data_split, I think that it deserves its own function.
- split_tuple_data should be static
- t_bits_str should not be a text, rather a char* fetched using
text_to_cstring(PG_GETARG_TEXT_PP(4)). This way there is no need to
perform extra calculations with VARSIZE and VARHDRSZ
- split_tuple_data can directly use the relation OID instead of the
tuple descriptor of the relation
- t_bits was leaking memory. For correctness I think that it is better
to free it after calling split_tuple_data.
- PG_DETOAST_DATUM_COPY allocates some memory, this was leaking as
well in raw_attr actually. I refactored the code such as a bytea* is
used and always freed when allocated to avoid leaks. Removing raw_attr
actually simplified the code a bit.
- I simplified the docs, that was largely too verbose in my opinion.
- Instead of using VARATT_IS_1B_E and VARTAG_EXTERNAL, using
VARATT_IS_EXTERNAL and VARATT_IS_EXTERNAL_ONDISK seems more adapted to
me, those other ones are much more low-level and not really spread in
the backend code.
- Found some typos in the code, the docs and some comments. I reworked
the error messages as well.
- The code format was not really in line with the project guidelines.
I fixed that as well.
- I removed heap_page_item_attrs for now to get this patch in a
bare-bone state. Though I would not mind if this is re-added (I
personally don't think that's much necessary in the module
actually...).
- The calculation of the length of t_bits using HEAP_NATTS_MASK is
more correct as you mentioned earlier, so I let it in its state.
That's actually more accurate for error handling as well.
That's everything I recall I have. How does this look?
--
Michael

Attachment

Re: pageinspect patch, for showing tuple data

From
Nikolay Shaplov
Date:
В письме от 28 октября 2015 16:57:36 пользователь Michael Paquier написал:
> On Sat, Oct 17, 2015 at 1:48 AM, Michael Paquier wrote:
> > On Sat, Oct 17, 2015 at 5:15 AM, Nikolay Shaplov wrote:
> >> Or it's ready to commit, and just not marked this way?
> >
> > No, I don't think we have reached this state yet.
> >
> >> I am going to make report based on this patch in Vienna. It would be
> >> nice to have it committed until then :)
> >
> > This is registered in this November's CF and the current one is not
> > completely wrapped up, so I haven't rushed into looking at it.
>
> So, I have finally been able to look at this patch in more details,
> resulting in the attached. I noticed a couple of things that should be
> addressed, mainly:
> - addition of a new routine text_to_bits to perform the reverse
> operation of bits_to_text. This was previously part of
> tuple_data_split, I think that it deserves its own function.
> - split_tuple_data should be static
> - t_bits_str should not be a text, rather a char* fetched using
> text_to_cstring(PG_GETARG_TEXT_PP(4)). This way there is no need to
> perform extra calculations with VARSIZE and VARHDRSZ
> - split_tuple_data can directly use the relation OID instead of the
> tuple descriptor of the relation
> - t_bits was leaking memory. For correctness I think that it is better
> to free it after calling split_tuple_data.
> - PG_DETOAST_DATUM_COPY allocates some memory, this was leaking as
> well in raw_attr actually. I refactored the code such as a bytea* is
> used and always freed when allocated to avoid leaks. Removing raw_attr
> actually simplified the code a bit.
> - I simplified the docs, that was largely too verbose in my opinion.
> - Instead of using VARATT_IS_1B_E and VARTAG_EXTERNAL, using
> VARATT_IS_EXTERNAL and VARATT_IS_EXTERNAL_ONDISK seems more adapted to
> me, those other ones are much more low-level and not really spread in
> the backend code.
> - Found some typos in the code, the docs and some comments. I reworked
> the error messages as well.
> - The code format was not really in line with the project guidelines.
> I fixed that as well.
> - I removed heap_page_item_attrs for now to get this patch in a
> bare-bone state. Though I would not mind if this is re-added (I
> personally don't think that's much necessary in the module
> actually...).
> - The calculation of the length of t_bits using HEAP_NATTS_MASK is
> more correct as you mentioned earlier, so I let it in its state.
> That's actually more accurate for error handling as well.
> That's everything I recall I have. How does this look?
You've completely rewrite everything ;-)

Let everything be the way you wrote. This code is better than mine.

With one exception. I really  need heap_page_item_attrs function. I used it in
my Tuples Internals presentation
https://github.com/dhyannataraj/tuple-internals-presentation
and I am 100% sure that this function is needed for educational purposes, and
this function should be as simple as possible, so students cat use it without
extra efforts.

I still have an opinion that documentation should be more verbose, than your
version, but I can accept your version.

Who is going to add heap_page_item_attrs to your patch? me or you?

--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: pageinspect patch, for showing tuple data

From
Michael Paquier
Date:
On Thu, Nov 12, 2015 at 12:41 AM, Nikolay Shaplov
<n.shaplov@postgrespro.ru> wrote:
> В письме от 28 октября 2015 16:57:36 пользователь Michael Paquier написал:
>> On Sat, Oct 17, 2015 at 1:48 AM, Michael Paquier wrote:
>> > On Sat, Oct 17, 2015 at 5:15 AM, Nikolay Shaplov wrote:
>> >> Or it's ready to commit, and just not marked this way?
>> >
>> > No, I don't think we have reached this state yet.
>> >
>> >> I am going to make report based on this patch in Vienna. It would be
>> >> nice to have it committed until then :)
>> >
>> > This is registered in this November's CF and the current one is not
>> > completely wrapped up, so I haven't rushed into looking at it.
>>
>> So, I have finally been able to look at this patch in more details,
>> resulting in the attached. I noticed a couple of things that should be
>> addressed, mainly:
>> - addition of a new routine text_to_bits to perform the reverse
>> operation of bits_to_text. This was previously part of
>> tuple_data_split, I think that it deserves its own function.
>> - split_tuple_data should be static
>> - t_bits_str should not be a text, rather a char* fetched using
>> text_to_cstring(PG_GETARG_TEXT_PP(4)). This way there is no need to
>> perform extra calculations with VARSIZE and VARHDRSZ
>> - split_tuple_data can directly use the relation OID instead of the
>> tuple descriptor of the relation
>> - t_bits was leaking memory. For correctness I think that it is better
>> to free it after calling split_tuple_data.
>> - PG_DETOAST_DATUM_COPY allocates some memory, this was leaking as
>> well in raw_attr actually. I refactored the code such as a bytea* is
>> used and always freed when allocated to avoid leaks. Removing raw_attr
>> actually simplified the code a bit.
>> - I simplified the docs, that was largely too verbose in my opinion.
>> - Instead of using VARATT_IS_1B_E and VARTAG_EXTERNAL, using
>> VARATT_IS_EXTERNAL and VARATT_IS_EXTERNAL_ONDISK seems more adapted to
>> me, those other ones are much more low-level and not really spread in
>> the backend code.
>> - Found some typos in the code, the docs and some comments. I reworked
>> the error messages as well.
>> - The code format was not really in line with the project guidelines.
>> I fixed that as well.
>> - I removed heap_page_item_attrs for now to get this patch in a
>> bare-bone state. Though I would not mind if this is re-added (I
>> personally don't think that's much necessary in the module
>> actually...).
>> - The calculation of the length of t_bits using HEAP_NATTS_MASK is
>> more correct as you mentioned earlier, so I let it in its state.
>> That's actually more accurate for error handling as well.
>> That's everything I recall I have. How does this look?
> You've completely rewrite everything ;-)
>
> Let everything be the way you wrote. This code is better than mine.
>
> With one exception. I really  need heap_page_item_attrs function. I used it in
> my Tuples Internals presentation
> https://github.com/dhyannataraj/tuple-internals-presentation
> and I am 100% sure that this function is needed for educational purposes, and
> this function should be as simple as possible, so students can use it without
> extra efforts.

Fine. That's your patch after all.

> I still have an opinion that documentation should be more verbose, than your
> version, but I can accept your version.

I am not sure that's necessary, pageinspect is for developers.

> Who is going to add heap_page_item_attrs to your patch? me or you?

I recall some parts of the code I still did not like much :) I'll grab
some room to have an extra look at it.
--
Michael



Re: pageinspect patch, for showing tuple data

From
Michael Paquier
Date:
On Thu, Nov 12, 2015 at 9:04 AM, Michael Paquier wrote:
> On Thu, Nov 12, 2015 at 12:41 AM, Nikolay Shaplov wrote:
>> I still have an opinion that documentation should be more verbose, than your
>> version, but I can accept your version.
>
> I am not sure that's necessary, pageinspect is for developers.
>
>> Who is going to add heap_page_item_attrs to your patch? me or you?
>
> I recall some parts of the code I still did not like much :) I'll grab
> some room to have an extra look at it.

I have added back heap_page_item_attrs the patch attached, fixing at
the same time some typos found while looking again at the code. If you
are fine with this version, let's have a committer look at it.
--
Michael

Attachment

Re: pageinspect patch, for showing tuple data

From
Nikolay Shaplov
Date:
> >> I still have an opinion that documentation should be more verbose, than
> >> your version, but I can accept your version.
> >
> > I am not sure that's necessary, pageinspect is for developers.
> >
> >> Who is going to add heap_page_item_attrs to your patch? me or you?
> >
> > I recall some parts of the code I still did not like much :) I'll grab
> > some room to have an extra look at it.
>
> I have added back heap_page_item_attrs the patch attached, fixing at
> the same time some typos found while looking again at the code. If you
> are fine with this version, let's have a committer look at it.

Everything seems to be ok. I've changed only one thing in your version
of the patch. I've renamed split_tuple_data to
tuple_data_split_internal, because when there are split_tuple_data and
tuple_data_split in the same file, nobody will guess what the difference
between these function and why are they named in such a strange way.

If it is ok, set proper status, and let commiter do his job :-)



--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

Attachment

Re: pageinspect patch, for showing tuple data

From
Michael Paquier
Date:
On Wed, Nov 18, 2015 at 3:10 AM, Nikolay Shaplov wrote:
> Everything seems to be ok. I've changed only one thing in your version
> of the patch. I've renamed split_tuple_data to
> tuple_data_split_internal, because when there are split_tuple_data and
> tuple_data_split in the same file, nobody will guess what the difference
> between these function and why are they named in such a strange way.

Yep, that sounds better this way.

> If it is ok, set proper status, and let commiter do his job :-)

OK. I have switched the status of this patch to "Ready for committer"
(please, committer-san, double-check the area around
tuple_data_split_internal when fetching data for each attribute, I
think that we got that right but I may be missing something as well).
-- 
Michael



Re: pageinspect patch, for showing tuple data

From
Guillaume Lelarge
Date:
<p dir="ltr">Hi,<p dir="ltr">Le 12 nov. 2015 1:05 AM, "Michael Paquier" <<a
href="mailto:michael.paquier@gmail.com">michael.paquier@gmail.com</a>>a écrit :<br /> ><br /> > On Thu, Nov
12,2015 at 12:41 AM, Nikolay Shaplov<br /> > <<a
href="mailto:n.shaplov@postgrespro.ru">n.shaplov@postgrespro.ru</a>>wrote:<br /> > > В письме от 28 октября
201516:57:36 пользователь Michael Paquier написал:<br /> > >> On Sat, Oct 17, 2015 at 1:48 AM, Michael Paquier
wrote:<br/> > >> > On Sat, Oct 17, 2015 at 5:15 AM, Nikolay Shaplov wrote:<br /> > >> >> Or
it'sready to commit, and just not marked this way?<br /> > >> ><br /> > >> > No, I don't think
wehave reached this state yet.<br /> > >> ><br /> > >> >> I am going to make report based on
thispatch in Vienna. It would be<br /> > >> >> nice to have it committed until then :)<br /> >
>>><br /> > >> > This is registered in this November's CF and the current one is not<br /> >
>>> completely wrapped up, so I haven't rushed into looking at it.<br /> > >><br /> > >> So,
Ihave finally been able to look at this patch in more details,<br /> > >> resulting in the attached. I noticed
acouple of things that should be<br /> > >> addressed, mainly:<br /> > >> - addition of a new routine
text_to_bitsto perform the reverse<br /> > >> operation of bits_to_text. This was previously part of<br />
>>> tuple_data_split, I think that it deserves its own function.<br /> > >> - split_tuple_data should
bestatic<br /> > >> - t_bits_str should not be a text, rather a char* fetched using<br /> > >>
text_to_cstring(PG_GETARG_TEXT_PP(4)).This way there is no need to<br /> > >> perform extra calculations with
VARSIZEand VARHDRSZ<br /> > >> - split_tuple_data can directly use the relation OID instead of the<br /> >
>>tuple descriptor of the relation<br /> > >> - t_bits was leaking memory. For correctness I think that
itis better<br /> > >> to free it after calling split_tuple_data.<br /> > >> - PG_DETOAST_DATUM_COPY
allocatessome memory, this was leaking as<br /> > >> well in raw_attr actually. I refactored the code such as
abytea* is<br /> > >> used and always freed when allocated to avoid leaks. Removing raw_attr<br /> >
>>actually simplified the code a bit.<br /> > >> - I simplified the docs, that was largely too verbose
inmy opinion.<br /> > >> - Instead of using VARATT_IS_1B_E and VARTAG_EXTERNAL, using<br /> > >>
VARATT_IS_EXTERNALand VARATT_IS_EXTERNAL_ONDISK seems more adapted to<br /> > >> me, those other ones are much
morelow-level and not really spread in<br /> > >> the backend code.<br /> > >> - Found some typos in
thecode, the docs and some comments. I reworked<br /> > >> the error messages as well.<br /> > >> -
Thecode format was not really in line with the project guidelines.<br /> > >> I fixed that as well.<br /> >
>>- I removed heap_page_item_attrs for now to get this patch in a<br /> > >> bare-bone state. Though I
wouldnot mind if this is re-added (I<br /> > >> personally don't think that's much necessary in the module<br
/>> >> actually...).<br /> > >> - The calculation of the length of t_bits using HEAP_NATTS_MASK is<br
/>> >> more correct as you mentioned earlier, so I let it in its state.<br /> > >> That's actually
moreaccurate for error handling as well.<br /> > >> That's everything I recall I have. How does this look?<br
/>> > You've completely rewrite everything ;-)<br /> > ><br /> > > Let everything be the way you
wrote.This code is better than mine.<br /> > ><br /> > > With one exception. I really  need
heap_page_item_attrsfunction. I used it in<br /> > > my Tuples Internals presentation<br /> > > <a
href="https://github.com/dhyannataraj/tuple-internals-presentation">https://github.com/dhyannataraj/tuple-internals-presentation</a><br
/>> > and I am 100% sure that this function is needed for educational purposes, and<br /> > > this function
shouldbe as simple as possible, so students can use it without<br /> > > extra efforts.<br /> ><br /> >
Fine.That's your patch after all.<br /> ><br /> > > I still have an opinion that documentation should be more
verbose,than your<br /> > > version, but I can accept your version.<br /> ><br /> > I am not sure that's
necessary,pageinspect is for developers.<br /> ><p dir="ltr">FWIW, I agree that pageinspect is mostly for devs.
Still,as i said to Nikolay after his talk at <a href="http://pgconf.eu">pgconf.eu</a>, it's a nice tool for people who
liketo know what's going on deep inside PostgreSQL.<p dir="ltr">So +1 for that nice feature.<p dir="ltr">> > Who
isgoing to add heap_page_item_attrs to your patch? me or you?<br /> ><br /> > I recall some parts of the code I
stilldid not like much :) I'll grab<br /> > some room to have an extra look at it. 

Re: pageinspect patch, for showing tuple data

From
Teodor Sigaev
Date:
-                   bits_len = tuphdr->t_hoff -
-                       offsetof(HeapTupleHeaderData, t_bits);
+                   int bits_len =
+                           ((tuphdr->t_infomask2 & HEAP_NATTS_MASK) / 8 + 1) * 8;

As I understand offline comments of Nikolay, current version of page inspect 
contains an mistake here. Should we backpatch this?

> OK. I have switched the status of this patch to "Ready for committer"
> (please, committer-san, double-check the area around
> tuple_data_split_internal when fetching data for each attribute, I
> think that we got that right but I may be missing something as well).
Looks good for a first glance


-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 



Re: pageinspect patch, for showing tuple data

From
Michael Paquier
Date:


On Wed, Nov 25, 2015 at 10:06 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:
-                   bits_len = tuphdr->t_hoff -
-                       offsetof(HeapTupleHeaderData, t_bits);
+                   int bits_len =
+                           ((tuphdr->t_infomask2 & HEAP_NATTS_MASK) / 8 + 1) * 8;

As I understand offline comments of Nikolay, current version of page inspect contains an mistake here. Should we backpatch this?

As far as I understood from the code, the current code in pageinspect is overestimating the length of t_bits. So that's actually harmless. For the sake of this patch though this is needed to perform the sanity checks in place when scanning raw page entries.
--
Michael

Re: pageinspect patch, for showing tuple data

From
Nikolay Shaplov
Date:
В письме от 18 ноября 2015 15:52:54 пользователь Michael Paquier написал:

Teodor Sigaev asked a very good question: does it properly do upgrade from 1.3
to 1.4

I've rechecked and fixed

here is a patch.

> On Wed, Nov 18, 2015 at 3:10 AM, Nikolay Shaplov wrote:
> > Everything seems to be ok. I've changed only one thing in your version
> > of the patch. I've renamed split_tuple_data to
> > tuple_data_split_internal, because when there are split_tuple_data and
> > tuple_data_split in the same file, nobody will guess what the difference
> > between these function and why are they named in such a strange way.
>
> Yep, that sounds better this way.
>
> > If it is ok, set proper status, and let commiter do his job :-)
>
> OK. I have switched the status of this patch to "Ready for committer"
> (please, committer-san, double-check the area around
> tuple_data_split_internal when fetching data for each attribute, I
> think that we got that right but I may be missing something as well).

--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Attachment

Re: pageinspect patch, for showing tuple data

From
Michael Paquier
Date:
<div dir="ltr"><br /><div class="gmail_extra"><br /><div class="gmail_quote">On Wed, Nov 25, 2015 at 10:16 PM, Nikolay
Shaplov<span dir="ltr"><<a href="mailto:n.shaplov@postgrespro.ru"
target="_blank">n.shaplov@postgrespro.ru</a>></span>wrote:<br /><blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px#ccc solid;padding-left:1ex">В письме от 18 ноября 2015 15:52:54 пользователь Michael Paquier
написал:<br/> Teodor Sigaev asked a very good question: does it properly do upgrade from 1.3<br /> to 1.4<br /><br />
I'verechecked and fixed<br /></blockquote></div><br /></div><div class="gmail_extra">What was actually the problem? I
haveto admit that I forgot to test that directly, and did not spot anything obvious on the 1.3--1.4.sql file.<br />--
<br/><div class="gmail_signature">Michael<br /></div></div></div> 

Re: pageinspect patch, for showing tuple data

From
Teodor Sigaev
Date:
> Teodor Sigaev asked a very good question: does it properly do upgrade from 1.3
> to 1.4
>
> I've rechecked and fixed
>
> here is a patch.

Committed, thank you.

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 



Re: pageinspect patch, for showing tuple data

From
Nikolay Shaplov
Date:
В письме от 25 ноября 2015 22:27:57 пользователь Michael Paquier написал:
> On Wed, Nov 25, 2015 at 10:16 PM, Nikolay Shaplov <n.shaplov@postgrespro.ru>
> wrote:
> > В письме от 18 ноября 2015 15:52:54 пользователь Michael Paquier написал:
> > Teodor Sigaev asked a very good question: does it properly do upgrade from
> > 1.3
> > to 1.4
> >
> > I've rechecked and fixed
>
> What was actually the problem?

tuple_data_split should be defined before heap_page_item_attrs.

And there were also error in the first argument of tuple_data_split there. It
should be  "rel_oid oid" instead of "t_oid rel_oid"

> I have to admit that I forgot to test that
> directly, and did not spot anything obvious on the 1.3--1.4.sql file.
yes. but each of us added a non-obvious mistake there :-)

--
Nikolay Shaplov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company