Thread: Heap page diagnostic/test functions (WIP)

Heap page diagnostic/test functions (WIP)

From
"Simon Riggs"
Date:
WIP patch for diagnostic/test functions for heap pages. (Linked to
discussion thread on -hackers "HOT - Whats Next?")

Specifically designed to allow test cases to be written that prove that
HOT works, as well as allowing diagnosis of general heap page content
errors.

Patch, plus additional file: /contrib/pgstattuple/pgstatheap.c

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com



Re: Heap page diagnostic/test functions (WIP)

From
Tom Lane
Date:
"Simon Riggs" <simon@2ndquadrant.com> writes:
> WIP patch for diagnostic/test functions for heap pages. (Linked to
> discussion thread on -hackers "HOT - Whats Next?")

--- no security checks; surely these must be superuser-only.

--- relation_open will succeed on things that don't have storage;
better use heap_open (and check it's not a view).

--- most of the validation functions are quite pointless as bufmgr will
refuse to load a page with bad header data.

> Specifically designed to allow test cases to be written that prove that
> HOT works,

Exactly what will these allow that you can't do with inspection of ctid
etc?  (I suspect your answer will be "can't see infomask", but I'd
rather expose that as a new system column than invent functions like
these.)  I'm pretty dubious of the premise anyway --- to get results
sufficiently constant that the current regression test comparison
mechanism works for them, I think you'll have to constrain the test
conditions so much that the test will prove little or nothing.

            regards, tom lane

Re: Heap page diagnostic/test functions (WIP)

From
"Simon Riggs"
Date:
On Mon, 2007-03-05 at 14:31 -0500, Tom Lane wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
> > WIP patch for diagnostic/test functions for heap pages. (Linked to
> > discussion thread on -hackers "HOT - Whats Next?")
>
> --- no security checks; surely these must be superuser-only.

OK thanks

> --- relation_open will succeed on things that don't have storage;
> better use heap_open (and check it's not a view).

and again

> --- most of the validation functions are quite pointless as bufmgr will
> refuse to load a page with bad header data.

and again

> > Specifically designed to allow test cases to be written that prove that
> > HOT works,
>
> Exactly what will these allow that you can't do with inspection of ctid
> etc?  (I suspect your answer will be "can't see infomask", but I'd
> rather expose that as a new system column than invent functions like
> these.)

Interesting idea, but aren't they keywords? How many system columns
would we need to represent each of the info flags?

The other thing was the ability to see headers of dead tuples as well so
as to understand what is on the page in total, not just the visible
portion of it. With HOT, recently dead tuples can still play an
important part in the data access path, so being able to see them might
explain many things. Is there a way to run a query in SnapshotAll?

I'll happily code it as functions or system cols or any other way, as
long as we can see everything there is to see.

> I'm pretty dubious of the premise anyway --- to get results
> sufficiently constant that the current regression test comparison
> mechanism works for them, I think you'll have to constrain the test
> conditions so much that the test will prove little or nothing.

Well, I agree these would be more basic tests. But people might still
break them. The thinking was really to provide a tutorial on how things
work.

Some more complex multi-session tests have already been written, based
upon analysis of all of the paths taken through HeapTupleSatisfiesX.
Those require the multi-session psql patch to be enabled.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com



Re: Heap page diagnostic/test functions (WIP)

From
Tom Lane
Date:
"Simon Riggs" <simon@2ndquadrant.com> writes:
> On Mon, 2007-03-05 at 14:31 -0500, Tom Lane wrote:
>> Exactly what will these allow that you can't do with inspection of ctid
>> etc?  (I suspect your answer will be "can't see infomask", but I'd
>> rather expose that as a new system column than invent functions like
>> these.)

> Interesting idea, but aren't they keywords? How many system columns
> would we need to represent each of the info flags?

Just one; I was imagining just returning the whole bitmask.  See
http://archives.postgresql.org/pgsql-hackers/2005-02/msg00636.php

> The other thing was the ability to see headers of dead tuples as well so
> as to understand what is on the page in total, not just the visible
> portion of it.

Ah, that's a good point.

            regards, tom lane

Re: Heap page diagnostic/test functions (WIP)

From
Gregory Stark
Date:
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Exactly what will these allow that you can't do with inspection of ctid
> etc?  (I suspect your answer will be "can't see infomask"

For testing the packed varlena stuff it would have been handy to be able to
see the length of tuples on disk. I made do with pg_column_size(foo.*), but
I don't think that's exactly the same thing.

And I could see that for debugging HOT and vacuum it would be helpful to see
the physical layout of the tuples in the page, i.e., the offsets of each
tuple (which Simon's function didn't actually output last I saw, but which
would be nice to have added).

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

Re: Heap page diagnostic/test functions (WIP)

From
"Pavan Deolasee"
Date:
Simon Riggs wrote:
> I'll happily code it as functions or system cols or any other way, as
> long as we can see everything there is to see.
>


With HOT, other useful information is about the line pointers. It would be
cool to be able to print the redirection info, details about LP_DELETEd
line pointers etc. Also, there are some flags in the t_infomask2 which HOT
uses, so that information would be useful also.

Thanks,
Pavan

EnterpriseDB       http://www.enterprisedb.com

Re: Heap page diagnostic/test functions (WIP)

From
"Simon Riggs"
Date:
On Tue, 2007-03-06 at 09:33 +0530, Pavan Deolasee wrote:
> Simon Riggs wrote:
> > I'll happily code it as functions or system cols or any other way, as
> > long as we can see everything there is to see.

> With HOT, other useful information is about the line pointers.

Done

> It would be
> cool to be able to print the redirection info, details about LP_DELETEd
> line pointers etc. Also, there are some flags in the t_infomask2 which HOT
> uses, so that information would be useful also.

I'll return the infomasks directly, for you to manipulate.

Not happy with that, but open to suggestions.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com



Re: Heap page diagnostic/test functions (WIP)

From
Gregory Stark
Date:
"Simon Riggs" <simon@2ndquadrant.com> writes:

> I'll return the infomasks directly, for you to manipulate.
>
> Not happy with that, but open to suggestions.

Well the alternative would be a long list of boolean columns which would make
the output kind of long.

Perhaps a function pg_decode_infomask(varbit) which returns a ROW of booleans
with appropriate names would be a good compromise. If you want it you could
use it in your query.

Or perhaps you could include a ROW of booleans in your output already,
something like:

postgres=# insert into tuple_info values (b'000', ROW(false,false,false));
INSERT 0 1

postgres=# select * from tuple_info;
 infomask_bits | infomask_flags
---------------+----------------
 000           | (f,f,f)
(1 row)

postgres=# select (infomask_flags).* from tuple_info;
 flag_a | flag_b | flag_c
--------+--------+--------
 f      | f      | f
(1 row)

That might be kind of tricky to cons up though. I had to create a table to do
it here.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
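
The pg_decode_infomask() idea above amounts to expanding a raw bitmask into
named booleans. A minimal sketch of that decoding step in plain C follows. It
is not taken from the patch: the struct and function names are hypothetical,
and the flag values are written from memory of PostgreSQL's heap tuple header
definitions, so treat them as assumptions and check them against the headers
of the version under test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed flag bits, for illustration only; verify against the
 * heap tuple header definitions of the PostgreSQL version in use. */
#define HEAP_HASNULL         0x0001
#define HEAP_XMIN_COMMITTED  0x0100
#define HEAP_XMAX_INVALID    0x0800

/* Hypothetical result type: one named boolean per flag of interest. */
typedef struct InfomaskFlags
{
    bool has_null;
    bool xmin_committed;
    bool xmax_invalid;
} InfomaskFlags;

/* Hypothetical helper: expand a raw 16-bit infomask into named booleans. */
void
decode_infomask(uint16_t infomask, InfomaskFlags *out)
{
    assert(out != NULL);

    out->has_null       = (infomask & HEAP_HASNULL) != 0;
    out->xmin_committed = (infomask & HEAP_XMIN_COMMITTED) != 0;
    out->xmax_invalid   = (infomask & HEAP_XMAX_INVALID) != 0;
}
```

A real SQL-callable version would return a composite row rather than fill a
C struct, but the per-flag test would be the same AND against the mask.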

Re: Heap page diagnostic/test functions (WIP)

From
Tom Lane
Date:
Gregory Stark <stark@enterprisedb.com> writes:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
>> I'll return the infomasks directly, for you to manipulate.
>>
>> Not happy with that, but open to suggestions.

> Well the alternative would be a long list of boolean columns which would make
> the output kind of long.

> Perhaps a function pg_decode_infomask(varbit) which returns a ROW of booleans
> with appropriate names would be a good compromise. If you want it you could
> use it in your query.

This is pointless --- the function is already intended only for
debugging considerations, and anyone who needs it can be assumed capable
of ANDing with a bitmask or whatever he needs to do to inspect the
values.  I don't see anyone asking for pretty display of cmin, say,
and yet that's certainly not that easy to interpret either.

As for masks plural, I'd be inclined to merge them into one 32-bit
result --- the distinction between flag bits in infomask and infomask2
is at this point entirely historical.

            regards, tom lane
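
The two points above (inspect flags by ANDing with a bitmask, and merge the
two masks into one 32-bit result) can be sketched in a few lines of C. This
is only an illustration under stated assumptions: placing infomask2 in the
upper 16 bits is an arbitrary choice made here, the helper names are
hypothetical, HEAP_XMIN_COMMITTED uses a value recalled from the PostgreSQL
headers, and EXAMPLE_INFOMASK2_FLAG is an invented bit standing in for
whatever infomask2 flag is of interest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of a real infomask flag; verify against the headers. */
#define HEAP_XMIN_COMMITTED     0x0100u
/* Invented infomask2 bit, used purely for illustration. */
#define EXAMPLE_INFOMASK2_FLAG  0x8000u

/* Shift an infomask2 flag to its position in the combined word
 * (assumption: infomask2 occupies the upper 16 bits). */
#define FROM_INFOMASK2(flag)    ((uint32_t) (flag) << 16)

/* Merge the two 16-bit masks into a single 32-bit result. */
uint32_t
combine_infomasks(uint16_t infomask, uint16_t infomask2)
{
    return ((uint32_t) infomask2 << 16) | (uint32_t) infomask;
}

/* Test one flag in the combined word by ANDing with its bitmask. */
bool
combined_flag_is_set(uint32_t combined, uint32_t flag)
{
    assert(flag != 0);
    return (combined & flag) != 0;
}
```

With a single merged word, a caller no longer needs to remember which of the
two masks a given flag lives in; it only needs the flag's combined position.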