Re: Displaying and dumping of table access methods - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: Displaying and dumping of table access methods
Date
Msg-id 20190108013013.GD2528@tamriel.snowman.net
Whole thread Raw
In response to Re: Displaying and dumping of table access methods  (Andres Freund <andres@anarazel.de>)
Responses Re: Displaying and dumping of table access methods  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Greetings,

* Andres Freund (andres@anarazel.de) wrote:
> On 2019-01-07 19:19:46 -0500, Stephen Frost wrote:
> > * Andres Freund (andres@anarazel.de) wrote:
> > > Over in [1] we're discussing the development of the pluggable storage
> > > patchset, which allows different ways of storing table data.
> > >
> > > One thing I'd like to discuss with a wider audience than the
> > > implementation details is psql and pg_dump handling of table access
> > > methods.
> > >
> > > Currently the patchset neither dumps nor displays table access
> > > methods . That's clearly not right.
> >
> > Agreed.
> >
> > > The reason for that however is not that it's hard to dump/display
> > > code-wise, but that to me the correct behaviour is not obvious.
> >
> > While it might be a lot of changes to the regression output results, I
> > tend to feel that showng the access method is a very important aspect of
> > the relation and therefore should be in \d output.
>
> I don't see how that'd be feasible.  Imo it is/was absolutely crucial
> for zheap development to be able to use the existing postgres tests.

I don't agree with the general assumption that "we did this for
development and therefore it should be done that way forever".

Instead, I would look at adding tests where there's a difference
between the two, or possibly some difference, and make sure that there
isn't, and make sure that the code paths are covered.

> > > The reason to make table storage pluggable is after all that the table
> > > access method can be changed, and part of developing new table access
> > > methods is being able to run the regression tests.
> >
> > We certainly want people to be able to run the regression tests, but it
> > feels like we will need more regression tests in the future as we wish
> > to cover both the built-in heap AM and the new zheap AM, so we should
> > really have those both run independently.  I don't think we'll be able
> > to have just one set of regression tests that cover everything
> > interesting for both, sadly.  Perhaps there's a way to have a set of
> > regression tests which are run for both, and another set that's run for
> > each, but building all of that logic is a fair bit of work and I'm not
> > sure how much it's really saving us.
>
> I don't think there's any sort of contradiction here. I don't think it's
> feasible to have tests tests for every feature duplicated for each
> supported AM, we have enough difficulty maintaining our current
> tests. But that doesn't mean it's a problem to have individual test
> [files] run with an explicitly assigned AM - the test can just do a SET
> default_table_access_method = zheap; or explicitly say USING zheap.

I don't mean to suggest that there's a contradiction.  I don't have any
problem with new tests being added which set the default AM to zheap, as
long as it's clear that such is happening for downstream tests.

> > > A patch at [2] adds display of a table's access method to \d+ - but that
> > > means that running the tests with a different default table access
> > > method (e.g. using PGOPTIONS='-c default_table_access_method=...)
> > > there'll be a significant number of test failures, even though the test
> > > results did not meaningfully differ.
> >
> > Yeah, I'm not really thrilled with this approach.
> >
> > > Similarly, if pg_dump starts to dump table access methods either
> > > unconditionally, or for all non-heap AMS, the pg_dump tests fail due to
> > > unimportant differences.
> >
> > In reality, pg_dump already depends on quite a few defaults to be in
> > place, so I don't see a particular issue with adding this into that set.
> > New tests would need to have new pg_dump checks, of course, but that's
> > generally the case as well.
>
> I am not sure what you mean here? Are you suggesting that there'd be a
> separate set of pg_dump test for zheap and every other possible future
> AM?

I'm suggesting that pg_dump would have additional tests for zheap, in
addition to the existing tests we already have.  No more, no less,
really.

> To me the approach you're suggesting is going to lead to an explosion of
> redundant tests, which are really hard to maintain, especially for
> out-of-tree AMs. Out of tree AMs with the setup I propose can just
> install the AM into the template database and set PGOPTIONS, and they
> have pretty good coverage.

I'm frankly much less interested in out-of-tree AMs as we aren't going
to have in-core regression tests for them anyway, because we can't as
they aren't in our tree and, ultimately, I don't find them to have
anywhere near the same value that in-core AMs have.

I don't have any problem with out-of-tree AMs hacking things up as they
see fit and then running whatever tests they want, but it is, and always
will be, a very different discussion and ultimately a different result
when we're talking about what will be incorporated and supported as part
of core.

Thanks!

Stephen

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: ALTER INDEX fails on partitioned index
Next
From: Amit Langote
Date:
Subject: Re: speeding up planning with partitions