Thread: Separate the attribute physical order from logical order

From: Julien Rouhaud

(Starting a new thread)

On Sun, Jun 26, 2022 at 10:48:24AM +0800, Julien Rouhaud wrote:
> On Thu, Jun 23, 2022 at 10:19:44AM -0400, Robert Haas wrote:
> > On Thu, Jun 23, 2022 at 6:13 AM Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> > > And should record_in / record_out use the logical position, as in:
> > > SELECT ab::text FROM ab / SELECT (a, b)::ab;
> > >
> > > I would think not, as relying on a possibly dynamic order could break things if
> > > you store the results somewhere, but YMMV.
> >
> > I think here the answer is yes again. I mean, consider that you could
> > also ALTER TABLE DROP COLUMN and then ALTER TABLE ADD COLUMN with the
> > same name. That is surely going to affect the meaning of such things.
> > I don't think we want to have one meaning if you reorder things that
> > way and a different meaning if you reorder things using whatever
> > commands we create for changing the display column positions.
>
> It indeed would, but ALTER TABLE DROP COLUMN is a destructive operation, and
> I'm assuming that anyone doing that is aware that it will have an impact on
> stored data and such.  I initially thought that changing the display order of
> columns shouldn't have the same impact with the stability of otherwise
> unchanged record definition, as it would make such reorder much more impacting.
> But I agree that having different behaviors seems worse.
>
> > > Then, what about joinrels expansion?  I learned that the column ordering rules
> > > are far from being obvious, and I didn't find those in the documentation (note
> > > that I don't know if that something actually described in the SQL standard).
> > > So for instance, if a join is using an explicit USING clause rather than an ON
> > > clause, the merged columns are expanded first, so:
> > >
> > > SELECT * FROM ab ab1 JOIN ab ab2 USING (b)
> > >
> > > should unexpectedly expand to (b, a, a).  Is this order a strict requirement?
> >
> > I dunno, but I can't see why it creates a problem for this patch to
> > maintain the current behavior. I mean, just use the logical column
> > position instead of the physical one here and forget about the details
> > of how it works beyond that.
>
> I'm not that familiar with this part of the code so I may have missed
> something, but I didn't see any place where I could just simply do that.
>
> To be clear, the approach I used is to change the expansion ordering but
> otherwise keep the current behavior, to try to minimize the changes.  This is
> done by keeping the attribute in the physical ordering pretty much everywhere,
> including in the nsitems, and just logically reorder them during the expansion.
> In other words all the code still knows that the 1st column is the first
> physical column and so on.
>
> So in that query, the ordering is supposed to happen when handling the "SELECT
> *", which makes it impossible to retain that order.
>
> I'm assuming that what you meant is to change the ordering when processing the
> JOIN and retain the old "SELECT *" behavior, which is to emit items in the
> order they're found.  But IIUC the only way to do that would be to change the
> order when building the nsitems themselves, and have the code believe that the
> attributes are physically stored in the logical order.  That's probably doable,
> but that looks like a way more impacting change.  Or did you mean to keep the
> approach I used, and just have some special case for "SELECT *" when referring
> to a joinrel and instead try to handle the logical expansion in the join?
> AFAICS it would require to add some extra info in the parsing structures, as it
> doesn't really really store any position, just relies on array offset / list
> position and maps things that way.

So, assuming that the current JOIN expansion order shouldn't be changed, I
implemented the last approach I mentioned.  As expected, it requires some extra
information in the parsing structures.  In the attached patch I added an array
in the ParseNamespaceItem struct (p_mappings) to map the logical / physical
positions, and iterate over that array when processing the JOIN in
transformFromClauseItem to emit the same tuples as if no logical order were
defined.  Also, expandNSItemAttrs() now needs to know when an RTE_JOIN is being
expanded, so that it keeps the original order.
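
A minimal sketch of the mapping idea described above, written as a Python model
rather than the backend's C code.  Only the name p_mappings comes from the
message; the column names, the sample order and the expand_star() helper are
assumptions made for illustration, and the join handling is deliberately
simplified.

```python
# Toy model: the namespace item keeps its columns in physical order and
# carries a separate array listing physical indexes in logical order.
physical_columns = ["b", "c", "a"]   # assumed physical (storage) order
p_mappings = [2, 0, 1]               # logical slot i -> physical index


def expand_star(columns, mappings, is_join=False):
    """Expand "SELECT *" for one namespace item.

    is_join models the RTE_JOIN case: the join code has already emitted its
    columns in the wanted order, so no reordering is applied here.
    """
    if is_join:
        return list(columns)
    return [columns[phys] for phys in mappings]


logical = expand_star(physical_columns, p_mappings)
join_order = expand_star(physical_columns, p_mappings, is_join=True)
print(logical)      # columns in logical order
print(join_order)   # columns exactly as stored in the item
```

Everything except the final expansion step keeps seeing the physical order,
which is the property the approach relies on.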

While at it I also fixed the column list that gets automatically generated when
deparsing a view if the original query didn't have any alias but some DDL is
later executed (like renaming one of the columns), making this column list
necessary.  This isn't problematic except in one case: functions returning
(setof) tables.  For this, I also need to save an array to map the physical /
logical positions, but as far as I can see I need to save it in the
RangeTblEntry, only for RTE_FUNCTION, which is serialized in pg_rewrite so that
the deparsing can emit the correct order even if the attribute positions
changed between the view creation and the deparsing.  This also works well but
feels really hackish.

With those changes, the create_view.sql test now entirely works (except some
error message referencing a physical position).  There are still a lot of other
tests that fail, and I didn't really dig into all of them to know if that's
something normal or just some other places that need to be fixed.

As I mentioned in my first email, I'm a bit doubtful about this approach in
general, so I'm looking for some feedback on it before investing too much
time implementing something that would never be close to committable.

>
> > > Another problem (that probably wouldn't be a problem for system catalogs) is
> > > that defaults are evaluated in the physical position.  This example from the
> > > regression test will clearly have a different behavior if the columns are in a
> > > different physical order:
> > >
> > >  CREATE TABLE INSERT_TBL (
> > >         x INT DEFAULT nextval('insert_seq'),
> > >         y TEXT DEFAULT '-NULL-',
> > >         z INT DEFAULT -1 * currval('insert_seq'),
> > >         CONSTRAINT INSERT_TBL_CON CHECK (x >= 3 AND y <> 'check failed' AND x < 8),
> > >         CHECK (x + z = 0));
> > >
> > > But changing the behavior to rely on the logical position seems quite
> > > dangerous.
> >
> > Why?
>
> It feels to me like a POLA violation, and probably people wouldn't expect it to
> behave this way (even if this is clearly some corner case problem).  Even if
> you argue that this is not simply a default display order but something more
> like real column order, the physical position being some implementation detail,
> it still doesn't really feels right.
>
> The main reason for having the possibility to change the logical position is to
> have "better looking", easier to work with, relations even if you have some
> requirements with the real physical order like trying to optimize things as
> much as possible (reordering columns to avoid padding space, put non-nullable
> columns first...).  The order in which defaults are evaluated looks like the
> same kind of requirements.  How useful would it be if you could chose a logical
> order, but not being able to chose the one you actually want because it would
> break your default values?
>
> Anyway, per the nearby discussions I don't see much interest, especially not in
> the context of varlena identifiers (I should have started a different thread,
> sorry about that), so I don't think it's worth investing more efforts into it.
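
The default-evaluation concern quoted above can be illustrated with a small
Python model of the INSERT_TBL example.  The Sequence class only imitates
nextval/currval and, like the insert_defaults() helper, is an assumption made
for illustration; it is not how the backend evaluates defaults.

```python
class Sequence:
    """Toy stand-in for a sequence offering nextval() and currval()."""

    def __init__(self):
        self.value = 0

    def nextval(self):
        self.value += 1
        return self.value

    def currval(self):
        return self.value


def insert_defaults(seq, evaluation_order):
    """Evaluate the three column defaults in the given column order."""
    defaults = {
        "x": lambda: seq.nextval(),
        "y": lambda: "-NULL-",
        "z": lambda: -1 * seq.currval(),
    }
    return {col: defaults[col]() for col in evaluation_order}


seq = Sequence()
seq.nextval()                                  # an earlier insert used the sequence
row_xyz = insert_defaults(seq, ["x", "y", "z"])

seq = Sequence()
seq.nextval()
row_zyx = insert_defaults(seq, ["z", "y", "x"])

print(row_xyz["x"] + row_xyz["z"])   # 0: CHECK (x + z = 0) holds
print(row_zyx["x"] + row_zyx["z"])   # non-zero: the CHECK would fail
```

In this model, evaluating z before x reads the value left by the previous
insert, so the row no longer satisfies x + z = 0.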

Re: Separate the attribute physical order from logical order

From: Alvaro Herrera

On 2022-Jun-28, Julien Rouhaud wrote:

> So, assuming that the current JOIN expansion order shouldn't be
> changed, I implemented the last approach I mentioned.

Yeah, I'm not sure that this is a good assumption.  I mean, if logical
order is the order in which users see the table columns, then why
shouldn't JOIN expand in the same way?  My feeling is that every aspect
of user interaction should show columns ordered in logical order.  When
I said that "only star expansion changes" upthread, what I meant is that
there was no need to support any additional functionality such as
letting the column order be changed or the server changing things
underneath to avoid alignment padding, etc.


Anyway, I think your 0001 is not a good first step.  I think a better
first step is a patch that adds two more columns to pg_attribute:
attphysnum and attlognum (or something like that.  These are the names I
used years ago, but if you want to choose different ones, that's okay.)  In
0001, these columns would always be identical, and there's no
functionality to handle the case where they differ (probably even add
some Assert that they are identical).  The idea behind these three
columns is: attnum is a column identity and it never changes from the
first value that is assigned to the column.  attphysnum represents the
physical position of the column in the table.  attlognum is the position
where the column appears for user interaction.

In a 0002 patch, you would introduce backend support for the case where
attlognum differs from the other two; but the other two are always the
same and it's okay if the server misbehaves or crashes if attphysnum is
different from attnum (best: keep the asserts that they are always the
same).  Doing it this way limits the number of cases that you have to
deal with, because there will be enough difficulty already.  You need to
change RTE expansion everywhere: *-expansion, COPY, JOIN, expansion of
SQL function results, etc ...  even psql \d ;-)  But, again: the
physical position is always column identity and there's no way to
reorder the columns physically for storage efficiency.
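
The three numbers and the invariants of the 0001 and 0002 steps described above
can be sketched as follows.  This is a Python model for illustration only: the
Attribute class and the helper names are hypothetical, and the column names
attphysnum and attlognum are only the ones proposed in this message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str
    attnum: int       # column identity, never changes
    attphysnum: int   # position in the stored tuple
    attlognum: int    # position shown to users


def storage_order(attrs):
    return [a.name for a in sorted(attrs, key=lambda a: a.attphysnum)]


def display_order(attrs):
    return [a.name for a in sorted(attrs, key=lambda a: a.attlognum)]


def satisfies_0001(attrs):
    """0001: all three numbers are always identical."""
    return all(a.attnum == a.attphysnum == a.attlognum for a in attrs)


def satisfies_0002(attrs):
    """0002: only attlognum may differ from the other two."""
    return all(a.attnum == a.attphysnum for a in attrs)


table = [
    Attribute("c", attnum=1, attphysnum=1, attlognum=3),
    Attribute("b", attnum=2, attphysnum=2, attlognum=2),
    Attribute("a", attnum=3, attphysnum=3, attlognum=1),
]
print(storage_order(table), display_order(table))
print(satisfies_0001(table), satisfies_0002(table))
```

The two predicates play the role of the suggested Asserts: a table with a
reordered display fails the 0001 check but still passes the 0002 one.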

You could put ALTER TABLE support for moving columns as 0003.  (So
testing for 0002 would just be some UPDATE statements or some hack that
lets you test various cases.)

In a 0004 patch, you would introduce backend support for attphysnum to
be different.  Probably no DDL support yet, since maybe we don't want
that, but instead we would like the server to figure out the best
possible packing based on alignment padding, nullability, varlenability.
So testing for this part is again just some UPDATEs.

I think 0001+0002 are already a submittable patchset.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Si quieres ser creativo, aprende el arte de perder el tiempo"



Re: Separate the attribute physical order from logical order

From: Julien Rouhaud

Hi,

On Tue, Jun 28, 2022 at 10:53:14AM +0200, Alvaro Herrera wrote:
> On 2022-Jun-28, Julien Rouhaud wrote:
>
> > So, assuming that the current JOIN expansion order shouldn't be
> > changed, I implemented the last approach I mentioned.
>
> Yeah, I'm not sure that this is a good assumption.  I mean, if logical
> order is the order in which users see the table columns, then why
> shouldn't JOIN expand in the same way?  My feeling is that every aspect
> of user interaction should show columns ordered in logical order.  When
> I said that "only star expansion changes" upthread, what I meant is that
> there was no need to support any additional functionality such as
> letting the column order be changed or the server changing things
> underneath to avoid alignment padding, etc.

I'm not entirely sure of what you meant.  Assuming tables a(a, z) and b(b, z),
what do you think those queries should return?

SELECT * FROM a JOIN b ON a.z = b.z
Currently it returns (a.a, a.z, b.b, b.z)

SELECT * FROM a JOIN b USING (z)
Currently it returns (a.z, a.a, b.b)

Should it now return (a.a, z, b.b) as long as the tables have that logical
order, regardless of whether the other positions (attnum / attphysnum) differ,
or should it stay the same as now?
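
For reference, the current expansion rule behind these two results can be
modelled in a few lines of Python.  The expand_join() helper is hypothetical
and only mirrors the behavior described in this message: merged USING columns
first, then the remaining columns of the left input, then those of the right.

```python
def expand_join(left, right, using=()):
    """Return the "SELECT *" column list of a two-table join.

    left and right are (alias, [column names]) pairs, with the columns given
    in whatever order the table exposes them.
    """
    left_alias, left_cols = left
    right_alias, right_cols = right
    merged = list(using)
    out = list(merged)                      # merged USING columns come first
    out += [f"{left_alias}.{c}" for c in left_cols if c not in merged]
    out += [f"{right_alias}.{c}" for c in right_cols if c not in merged]
    return out


a = ("a", ["a", "z"])
b = ("b", ["b", "z"])
print(expand_join(a, b))                 # JOIN ... ON a.z = b.z
print(expand_join(a, b, using=["z"]))    # JOIN ... USING (z)
```

In this model the open question is only which order the per-table column lists
are given in; the USING-first rule itself is left untouched.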

> Anyway, I think your 0001 is not a good first step.

FWIW this is just what you were previously suggesting at [1].

> I think a better
> first step is a patch that adds two more columns to pg_attribute:
> attphysnum and attlognum (or something like that.  This is the name I
> used years ago, but if you want to choose different, it's okay.)  In
> 0001, these columns would all be always identical, and there's no
> functionality to handle the case where they differ (probably even add
> some Assert that they are identical).  The idea behind these three
> columns is: attnum is a column identity and it never changes from the
> first value that is assigned to the column.  attphysnum represents the
> physical position of the table.  attlognum is the position where the
> column appears for user interaction.

I'm not following.  If we keep attnum as the official identity position and
use attlognum as the position that should be used in any interactive command,
wouldn't that risk breaking every single client?

Imagine you have some framework that automatically generates queries based on
the catalog, if it sees table abc with:
c: attnum 1, attphysnum 1, attlognum 3
b: attnum 2, attphysnum 2, attlognum 2
a: attnum 3, attphysnum 3, attlognum 1

and you ask that layer to generate an insert with something like {'a': 'a',
'b': 'b', 'c': 'c'}, what would prevent it from generating:

INSERT INTO abc VALUES ('c', 'b', 'a');

while attlognum says it should have been

INSERT INTO abc VALUES ('a', 'b', 'c');
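
A small Python sketch of such a framework, to make the failure mode concrete.
The catalog tuples and the positional_insert() helper are hypothetical, and the
quoting is deliberately naive since only the column order matters here.

```python
# (name, attnum, attphysnum, attlognum), as in the example above
catalog = [("c", 1, 1, 3), ("b", 2, 2, 2), ("a", 3, 3, 1)]
FIELD = {"attnum": 1, "attphysnum": 2, "attlognum": 3}


def positional_insert(table, catalog, values, order_by):
    """Build an INSERT without a column list, ordering values by one field."""
    ordered = sorted(catalog, key=lambda row: row[FIELD[order_by]])
    literals = ", ".join(f"'{values[row[0]]}'" for row in ordered)
    return f"INSERT INTO {table} VALUES ({literals});"


values = {"a": "a", "b": "b", "c": "c"}
by_attnum = positional_insert("abc", catalog, values, "attnum")
by_attlognum = positional_insert("abc", catalog, values, "attlognum")
print(by_attnum)
print(by_attlognum)
```

A tool that keeps sorting by attnum silently produces the first statement,
while the server would interpret positional values in attlognum order.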

> In a 0002 patch, you would introduce backend support for the case where
> attlognum differs from the other two; but the other two are always the
> same and it's okay if the server misbehaves or crashes if attphysnum is
> different from attnum (best: keep the asserts that they are always the
> same).  Doing it this way limits the number of cases that you have to
> deal with, because there will be enough difficulty already.  You need to
> change RTE expansion everywhere: *-expansion, COPY, JOIN, expansion of
> SQL function results, etc ...  even psql \d ;-)  But, again: the
> physical position is always column identity and there's no way to
> reorder the columns physically for storage efficiency.

Just to clarify my understanding, apart from the fact that I'm only using
attphysnum (for your attnum and attphysnum) and attnum (for your attlognum), is
there any difference in behavior between what I started to implement (if it
were finished, of course) and what you're describing here?

Also, about the default values evaluation (see [2]), should it be tied to your
attnum, attphysnum or attlognum?

> You could put ALTER TABLE support for moving columns as 0003.  (So
> testing for 0002 would just be some UPDATE sentences or some hack that
> lets you test various cases.)
>
> In a 0004 patch, you would introduce backend support for attphysnum to
> be different.  Probably no DDL support yet, since maybe we don't want
> that, but instead we would like the server to figure out the best
> possible packing based on alignment padding, nullability varlenability.
> So testing for this part is again just some UPDATEs.
>
> I think 0001+0002 are already a submittable patchset.

I think that supporting at least a way to specify the logical order during
table creation should be easy to implement (there shouldn't be any
question on whether it needs to invalidate any cache or what lock level to
use), and could also be added in the initial submission without much extra
effort, which could help with the testing.

[1] https://www.postgresql.org/message-id/202108181639.xjuovrpwgkr2@alvherre.pgsql
[2] https://www.postgresql.org/message-id/20220626024824.qnlpp6vikzjvuxs3%40jrouhaud



Re: Separate the attribute physical order from logical order

From: Isaac Morland

On Tue, 28 Jun 2022 at 05:32, Julien Rouhaud <rjuju123@gmail.com> wrote:
> I think that supporting at least a way to specify the logical order during the
> table creation should be easy to implement (there shouldn't be any
> question on whether it needs to invalidate any cache or what lock level to
> use), and could also be added in the initial submission without much extra
> efforts, which could help with the testing.

I think the meaning of “logical order” (well, the meaning it has for me, at least) implies that the logical order of a table after CREATE TABLE is the order in which the columns were given in the table creation statement.

If there needs to be a way of specifying the physical order separately, that is a different matter.

ALTER TABLE ADD … is another matter. Syntax there to be able to say BEFORE or AFTER an existing column would be nice to have. Presumably it would physically add the column at the end but set the logical position as specified.

Re: Separate the attribute physical order from logical order

From: Julien Rouhaud

Hi,

On Tue, Jun 28, 2022 at 09:00:05AM -0400, Isaac Morland wrote:
> On Tue, 28 Jun 2022 at 05:32, Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> > I think that supporting at least a way to specify the logical order during
> > the
> > table creation should be easy to implement (there shouldn't be any
> > question on whether it needs to invalidate any cache or what lock level to
> > use), and could also be added in the initial submission without much extra
> > efforts, which could help with the testing.
> >
>
> I think the meaning of “logical order” (well, the meaning it has for me, at
> least) implies that the logical order of a table after CREATE TABLE is the
> order in which the columns were given in the table creation statement.
>
> If there needs to be a way of specifying the physical order separately,
> that is a different matter.

Well, the way I see it is that the logical order is something that can be
changed, and therefore is the one that needs to be spelled out explicitly if
you want it to differ from the physical order.

But whether the physical or logical order is the one that needs explicit
additional syntax, it would still be nice to provide in a first iteration.  And
both versions would be the same to implement, difficulty-wise.

>
> ALTER TABLE ADD … is another matter. Syntax there to be able to say BEFORE
> or AFTER an existing column would be nice to have. Presumably it would
> physically add the column at the end but set the logical position as
> specified.

Yes, but it raises some questions about lock level, cache invalidation and such
so I chose to ignore that for the moment.



Re: Separate the attribute physical order from logical order

From: Justin Pryzby

On Tue, Jun 28, 2022 at 04:32:30PM +0800, Julien Rouhaud wrote:
> psql displays a table columns information using the logical order rather the
> physical order, and if verbose emits an addition "Physical order" footer if the
> logical layout is different from the physical one.

FYI: the footer would work really poorly for us, since we use hundreds of
columns and sometimes over 1000 (historically up to 1600).  I think it'd be
better to show the physical position as an additional column, or a \d option to
sort by physical attnum.  (I'm not sure if it'd be useful for our case to see
the extra columns, but at least it won't create a "footer" which is multiple
pages long.  Actually, I've sometimes wished for a "\d-" quiet mode which would
show everything *except* the list of column names, or perhaps only show those
columns which are referenced by the list of indexes/constraints/stats
objects/etc).

BTW, since 2 years ago, when rewriting partitions to promote a column type, we
recreate the parent table sorted by attlen, to minimize alignment overhead in
new children.  AFAICT your patch is about adding a logical column order, not
about updating tables with a new physical order.
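
To show why sorting by attlen helps, here is a rough Python model of alignment
padding for fixed-width columns.  It is only a sketch: the data_size() helper
is hypothetical, the tuple header, NULLs and variable-length types are ignored,
and the column mix is made up (bool is 1 byte, int4 is 4 bytes aligned on 4,
int8 is 8 bytes aligned on 8).

```python
def data_size(columns):
    """Bytes used by fixed-width columns stored in the given order.

    columns is a list of (length, alignment) pairs.  Each column starts at
    the next offset that is a multiple of its alignment.
    """
    offset = 0
    for length, align in columns:
        offset = (offset + align - 1) // align * align   # pad up to alignment
        offset += length
    return offset


BOOL, INT4, INT8 = (1, 1), (4, 4), (8, 8)

as_declared = [BOOL, INT8, BOOL, INT4]
by_attlen = sorted(as_declared, key=lambda col: col[0], reverse=True)

print(data_size(as_declared))   # 24
print(data_size(by_attlen))     # 14
```

With the widest columns first, no padding is needed between columns in this
example, which is the effect the reordering is after.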

-- 
Justin



Re: Separate the attribute physical order from logical order

From: Julien Rouhaud

Hi,

On Tue, Jun 28, 2022 at 08:38:56AM -0500, Justin Pryzby wrote:
> On Tue, Jun 28, 2022 at 04:32:30PM +0800, Julien Rouhaud wrote:
> > psql displays a table columns information using the logical order rather the
> > physical order, and if verbose emits an addition "Physical order" footer if the
> > logical layout is different from the physical one.
> 
> FYI: the footer would work really poorly for us, since we use hundreds of
> columns and sometimes over 1000 (historically up to 1600).

Yeah :)  As I mentioned originally at [1]: "I also changed psql to display the
column in logical position, and emit an extra line with the physical position
in the verbose mode, but that's a clearly poor design which would need a lot
more thoughts."

> I think it'd be
> better to show the physical position as an additional column, or a \d option to
> sort by physical attnum.  (I'm not sure if it'd be useful for our case to see
> the extra columns, but at least it won't create a "footer" which is multiple
> pages long.

Yes, I was also thinking something like that could work.  I just did it with
the extra footer for now because I needed a quick way to check in which order
my tables were supposed to be displayed / stored during development.  As soon
as I get a clearer picture of what approach should be used I will definitely
work on this, and all other things that still need some care.

> Actually, I've sometimes wished for a "\d-" quiet mode which would
> show everything *except* the list of column names, or perhaps only show those
> columns which are referenced by the list of indexes/constraints/stats
> objects/etc).

I never had to work on crazy wide relations like that myself but I can easily
imagine how annoying it can get.  No objection from me, although it would be
good to start a new thread to attract more attention and see what others
think.

> BTW, since 2 years ago, when rewriting partitions to promote a column type, we
> recreate the parent table sorted by attlen, to minimize alignment overhead in
> new children.  AFAICT your patch is about adding an logical column order, not
> about updating tables with a new physical order.

Indeed, the only thing it could do in such case is to allow you to create the
columns in an optimal order in the first place, without messing with the output.

But if the people who originally create the table don't think about alignment
and things like that, there's still nothing that can be done with this feature.

That being said, in theory if such a feature existed, and if we also had DDL
that allowed specifying a different logical order at creation time, it would be
easy to create a module that automatically reorders the columns before the
table is created to make sure that the columns are physically stored in an
optimal way.

[1] https://www.postgresql.org/message-id/20220623101155.3dljtwradu7eik6g@jrouhaud



Re: Separate the attribute physical order from logical order

From: Alvaro Herrera

On 2022-Jun-28, Julien Rouhaud wrote:


> On Tue, Jun 28, 2022 at 10:53:14AM +0200, Alvaro Herrera wrote:

> > My feeling is that every aspect of user interaction should show
> > columns ordered in logical order.
> 
> I'm not entirely sure of what you meant.  Assuming tables a(a, z) and b(b, z),
> what do you think those queries should return?
> 
> SELECT * FROM a JOIN b on a.z = b.z
> Currently it returns (a.a, a.z, b.b, b.z)
> 
> SELECT * FROM a JOIN b USING (z)
> Currently it returns a.z, a.a, b.b.
> 
> Should it now return (a.a, z, b.b) as long as the tables have that logical
> order, whether or not any other position (attnum / attphysnum) is different or
> stay the same as now?

For all user-visible intents and purposes, the column order is whatever
the logical order is (attlognum), regardless of attnum and attphysnum.
If the logical order is changed, then the order of the output columns of
a join will change to match.  The attnum and attphysnum are completely
irrelevant to all these purposes.  So, to answer your question, if the
join expands in this way at present, then it should continue to expand
that way if you define a table that has different attnum/attphysnum but
the same attlognum for those columns.


> I'm not following.  If we keep attnum as the official identity position and
> use attlognum as the position that should be used in any interactive command,
> wouldn't that risk to break every single client?

Yeah, it might break a lot of tools, but other things break tools too
and the world just moves on.

But if you don't want to break tools, I can think of two alternatives:

1. make the immutable column identity something like attidnum and
   keep attnum as the logical column order.
   This keeps tools happy, but if they try to match pg_attrdef by attnum
   bad things will happen.

2. in order to avoid possible silent breakage, remove attnum altogether
   and just have attidnum, attlognum, attphysnum; then every tool is
   forced to undergo an update.  Any cross-catalog relationships are now
   correct.

> Imagine you have some framework that automatically generates queries based on
> the catalog, if it sees table abc with:
> c: attnum 1, attphysnum 1, attlognum 3
> b: attnum 2, attphysnum 2, attlognum 2
> a: attnum 3, attphysnum 3, attlognum 1

Hopefully the framework will add a column list,
  INSERT INTO abc (c,b,a) VALUES ('c', 'b', 'a');
to avoid this problem.  But if it doesn't, then yeah it will misbehave,
and I don't think you should try to make it not misbehave.

> Also, about the default values evaluation (see [2]), should it be tied to your
> attnum, attphysnum or attlognum?

Default is tied to column identity.  If you change column order, the
defaults don't need to change at all.  Similarly, if the server decides
to repack the columns in a different way to save alignment padding, the
defaults don't need to change.

If you do not provide a column identity number or you use something else
(e.g. attlognum) to cross-reference attributes from other catalogs,
then you'll have to edit pg_attrdef when a column moves; and any other
reference to a column number will have to change.  Or think about
pg_depend.  You don't want that.  This is why you need three columns,
not two.
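
The argument can be checked with a toy Python model.  The dictionaries below
stand in for pg_attribute and pg_attrdef, and every name in the sketch is
hypothetical; the point is only to compare defaults keyed by an immutable
identity with defaults keyed by the logical position.

```python
# identity number -> column description
columns = {
    1: {"name": "x", "attlognum": 1},
    2: {"name": "y", "attlognum": 2},
    3: {"name": "z", "attlognum": 3},
}
defaults_by_identity = {
    1: "nextval('insert_seq')",
    2: "'-NULL-'",
    3: "-1 * currval('insert_seq')",
}
# Same numbers at creation time, but keyed by logical position instead.
defaults_by_lognum = dict(defaults_by_identity)


def swap_logical(cols, id_a, id_b):
    """Exchange the display positions of two columns."""
    cols[id_a]["attlognum"], cols[id_b]["attlognum"] = (
        cols[id_b]["attlognum"], cols[id_a]["attlognum"])


def default_via_identity(cols, name):
    ident = next(i for i, c in cols.items() if c["name"] == name)
    return defaults_by_identity[ident]


def default_via_lognum(cols, name):
    col = next(c for c in cols.values() if c["name"] == name)
    return defaults_by_lognum[col["attlognum"]]


swap_logical(columns, 1, 3)                  # show z first and x last
print(default_via_identity(columns, "x"))    # still x's own default
print(default_via_lognum(columns, "x"))      # now z's default: stale reference
```

Keyed by identity, nothing has to be rewritten when a column moves; keyed by
logical position, every referencing entry would need an update.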

> I think that supporting at least a way to specify the logical order
> during the table creation should be easy to implement

As long as it is really simple (just some stuff in CREATE TABLE, nothing
at all in ALTER TABLE) then that sounds good.  I just suggest not to
complicate things too much to avoid the risk of failing the project
altogether.

For testability, here's a crazy idea: have some test mode (maybe #ifdef
USE_ASSERT_CHECKING) that randomizes attlognum to start at some N >> 1,
and only attidnum starts at 1.  Then they never match and all tools need
to ensure they handle weird cases correctly.

> (there shouldn't be any question on whether it needs to invalidate any
> cache or what lock level to use), and could also be added in the
> initial submission without much extra efforts, which could help with
> the testing.

Famous last words :-)

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Nadie está tan esclavizado como el que se cree libre no siéndolo" (Goethe)



Re: Separate the attribute physical order from logical order

From: Tom Lane

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> If you do not provide a column identity number or you use something else
> (e.g. attlognum) to cross-references attributes from other catalogs,
> then you'll have to edit pg_attrdef when a column moves; and any other
> reference to a column number will have to change.  Or think about
> pg_depend.  You don't want that.  This is why you need three columns,
> not two.

In previous go-rounds on this topic (of which there have been many),
we understood the need for attidnum as being equivalent to the familiar
notion that tables should have an immutable primary key, with anything
that users might wish to change *not* being the primary key.  This
side-steps the need to propagate changes of the pkey into referencing
tables, which is essentially what Alvaro is pointing out you don't
want to have to deal with.

FWIW, I'd lean to the idea that using three new column names would
be a good thing, because it'll force you to look at every single
reference in the code and figure out which meaning is needed at that
spot.  There will still be a large number of wrong-meaning bugs, but
that disciplined step will hopefully result in "large" being "tolerable".

>> I think that supporting at least a way to specify the logical order
>> during the table creation should be easy to implement

> As long as it is really simple (just some stuff in CREATE TABLE, nothing
> at all in ALTER TABLE) then that sounds good.  I just suggest not to
> complicate things too much to avoid the risk of failing the project
> altogether.

I think that any user-reachable knobs for controlling this should be
designed and built later.  The initial split-up of attnum meanings
is already going to be a huge lift, and anything at all that you can
do to reduce the size of that first patch is advisable.  If you don't
realize what a large chance there is that you'll utterly fail on that
first step, then you have failed to learn anything from the history
of this topic.

Now you do need something that will make the three meanings different
in order to test that step.  But I'd suggest some bit of throwaway code
that just assigns randomly different logical and physical orders.

            regards, tom lane



Re: Separate the attribute physical order from logical order

From: Peter Geoghegan

On Tue, Jun 28, 2022 at 11:47 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Now you do need something that will make the three meanings different
> in order to test that step.  But I'd suggest some bit of throwaway code
> that just assigns randomly different logical and physical orders.

That seems like a good idea. Might also make sense to make the
behavior configurable via a developer-only GUC, to enable exhaustive
tests that use every possible permutation of physical/logical mappings
for a given table.

Perhaps the random behavior itself should work by selecting a value
for the GUC at various key points via a PRNG. During CREATE TABLE, for
example. This approach could make it easier to reproduce failures on the
buildfarm.
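
A rough Python sketch of both ideas: exhaustive enumeration of mappings for a
small table, and a random mapping that is reproducible from a seed.  The helper
names are hypothetical and the GUC itself is not modelled; the seed simply
stands in for whatever value a test run would record.

```python
import itertools
import random


def all_mappings(ncols):
    """Yield every possible logical order for ncols columns (1-based)."""
    return itertools.permutations(range(1, ncols + 1))


def random_mapping(ncols, seed):
    """Return one logical order, fully determined by the seed."""
    rng = random.Random(seed)          # private PRNG, no global state
    order = list(range(1, ncols + 1))
    rng.shuffle(order)
    return order


exhaustive = list(all_mappings(3))
first_run = random_mapping(5, seed=42)
second_run = random_mapping(5, seed=42)
print(len(exhaustive))            # 6 orders for a 3-column table
print(first_run == second_run)    # True: same seed, same mapping
```

Logging the seed next to a failure would be enough to regenerate the same
mapping later.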

--
Peter Geoghegan



Re: Separate the attribute physical order from logical order

From: Tom Lane

Peter Geoghegan <pg@bowt.ie> writes:
> On Tue, Jun 28, 2022 at 11:47 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Now you do need something that will make the three meanings different
>> in order to test that step.  But I'd suggest some bit of throwaway code
>> that just assigns randomly different logical and physical orders.

> That seems like a good idea. Might also make sense to make the
> behavior configurable via a developer-only GUC, to enable exhaustive
> tests that use every possible permutation of physical/logical mappings
> for a given table.
> Perhaps the random behavior itself should work by selecting a value
> for the GUC at various key points via a PRNG. During CREATE TABLE, for
> example. This approach could make it easier to reproduce failures on the
> buildfarm.

Yeah, it can't be *too* random or debugging failures will be a nightmare.
My point is just to not spend a lot of engineering on this part, because
it won't be a long-term user feature.

            regards, tom lane