Re: pageinspect patch, for showing tuple data - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: pageinspect patch, for showing tuple data
Date
Msg-id CAB7nPqQorL5EGbWXyrTRpR8Ys7O064zvTBWkPXELbwhAWWqzRA@mail.gmail.com
Whole thread Raw
In response to Re: pageinspect patch, for showing tuple data  (Nikolay Shaplov <n.shaplov@postgrespro.ru>)
Responses Re: pageinspect patch, for showing tuple data  (Nikolay Shaplov <n.shaplov@postgrespro.ru>)
Re: pageinspect patch, for showing tuple data  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On Wed, Sep 9, 2015 at 8:39 PM, Nikolay Shaplov
<n.shaplov@postgrespro.ru> wrote:
> В письме от 8 сентября 2015 11:53:24 Вы написали:
>> Hence, instead of all those complications, why not simply introducing two
>> functions that take as input the tuple data and the OID of the relation,
>> returning those bytea arrays? It seems to be that this would be a more
>> handy interface, and as this is for educational purposes I guess that the
>> addition of the overhead induced by the extra function call would be
>> acceptable.
>
> When I looked at this task I considered doing the same thing. Bun unfortunately it is
> not possible. (Or if be more correct it is possible, but if I do so, it would be hard to us
> e it)
> The thing is, that to properly split tuple data into attributes, we need some values from
> tuple header:
> t_infomask2: where postgres store actual number of stored attributes, that may differ
> from one in tuple data. That will allow to properly parse tuples after
> ALTER TABLE ADD COLUMN when it was done without SET DEFAULT option
> t_infomask: has bit that indicates that there is some null attributes in tuple.
> t_bits: has a bit mask that shows what attributes were set to null.
>
> So if move tuple data parsing into separate function, then we have to pass these
> values alongside the tuple data. This is not convenient at all.
> So I did it in a way you see in the patch.

Why is it not convenient at all? Yes, you have a point, we need those
fields to be able to parse the t_data properly. Still the possibility
to show individual fields of a tuple as a bytea array either with
toasted or detoasted values is a concept completely different from
simply showing the page items, which is what, it seems to me,
heap_page_items is aimed to only do. Hence, As t_infomask2, t_infomask
and t_bits are already available as return fields of heap_page_items,
we should simply add a function like that:
heap_page_item_parse(Oid relid, bytea data, t_infomask2 int,
t_infomask int, t_bits int, bool force_detoast, warning_mode bool)
returns bytea[]

Note that the data corruption checks apply only to this function as
far as I understand, so I think that things could be actually split
into two independent patches:
1) Upgrade heap_page_items to add the tuple data as bytea.
2) Add the new function able to parse those fields appropriately.

As this code, as you justly mentioned, is aimed mainly for educational
purposes to understand a page structure, we should definitely make it
as simple as possible at code level, and it seems to me that this
comes with a separate SQL interface to control tuple data parsing as a
bytea[]. We are doing no favor to our users to complicate the code of
pageinspect.c as this patch is doing in its current state, especially
if beginners want to have a look at it.

>> Actually not two functions, but just one, with an extra flag to be
>> able to enforce detoasting on the field values where this can be done.
>
> Yeah, I also thought about it. But did not come to any final conclusion. Should
> we add forth argument, that enables detoasting, instead of adding separate
> function?

This is definitely something you want to control with a switch.
--
Michael



pgsql-hackers by date:

Previous
From: Fabien COELHO
Date:
Subject: Re: checkpointer continuous flushing
Next
From: Michael Paquier
Date:
Subject: Re: [PATCH v2] GSSAPI encryption support