Thread: tablecmds.c/MergeAttributes() cleanup
The MergeAttributes() and related code in and around tablecmds.c is huge and ancient, with many things bolted on over time, and difficult to deal with. I took some time to make careful incremental updates and refactorings to make the code easier to follow, more compact, and more modern in appearance. I also found several pieces of obsolete code along the way. This resulted in the attached long patch series. Each patch tries to make a single change and can be considered incrementally. At the end, the code is shorter, better factored, and I hope easier to understand. There shouldn't be any change in behavior.
Attachment
- 0001-Remove-obsolete-comment-about-OID-support.patch
- 0002-Remove-ancient-special-case-code-for-adding-oid-colu.patch
- 0003-Remove-ancient-special-case-code-for-dropping-oid-co.patch
- 0004-Make-more-use-of-makeColumnDef.patch
- 0005-Clean-up-MergeAttributesIntoExisting.patch
- 0006-Clean-up-MergeCheckConstraint.patch
- 0007-MergeAttributes-and-related-variable-renaming.patch
- 0008-Improve-some-catalog-documentation.patch
- 0009-Remove-useless-if-condition.patch
- 0010-Remove-useless-if-condition.patch
- 0011-Refactor-ATExecAddColumn-to-use-BuildDescForRelation.patch
- 0012-Push-attidentity-and-attgenerated-handling-into-Buil.patch
- 0013-Move-BuildDescForRelation-from-tupdesc.c-to-tablecmd.patch
- 0014-Push-attcompression-and-attstorage-handling-into-Bui.patch
- 0015-Add-TupleDescGetDefault.patch
- 0016-MergeAttributes-convert-pg_attribute-back-to-ColumnD.patch
- 0017-Add-some-const-decorations.patch
On 2023-Jun-28, Peter Eisentraut wrote: > The MergeAttributes() and related code in and around tablecmds.c is huge and > ancient, with many things bolted on over time, and difficult to deal with. > I took some time to make careful incremental updates and refactorings to > make the code easier to follow, more compact, and more modern in appearance. > I also found several pieces of obsolete code along the way. This resulted > in the attached long patch series. Each patch tries to make a single change > and can be considered incrementally. At the end, the code is shorter, > better factored, and I hope easier to understand. There shouldn't be any > change in behavior. I request to leave this alone for now. I have enough things to juggle with in the NOTNULLs patch; this patchset looks like it will cause me messy merge conflicts. 0004 for instance looks problematic, as does 0007 and 0016. FWIW for the most part that patch is working and I intend to re-submit shortly, but the relevant pg_upgrade code is really brittle, so it's taken me much more than I expected to get it in good shape for all cases. Thanks -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
On 2023-Jun-28, Peter Eisentraut wrote: > The MergeAttributes() and related code in and around tablecmds.c is huge and > ancient, with many things bolted on over time, and difficult to deal with. > I took some time to make careful incremental updates and refactorings to > make the code easier to follow, more compact, and more modern in appearance. > I also found several pieces of obsolete code along the way. This resulted > in the attached long patch series. Each patch tries to make a single change > and can be considered incrementally. At the end, the code is shorter, > better factored, and I hope easier to understand. There shouldn't be any > change in behavior. I spent a few minutes doing a test merge of this to my branch with NOT NULL changes. Here's a quick review. > Subject: [PATCH 01/17] Remove obsolete comment about OID support Obvious, trivial. +1 > Subject: [PATCH 02/17] Remove ancient special case code for adding oid columns LGTM; deletes dead code. > Subject: [PATCH 03/17] Remove ancient special case code for dropping oid > columns Hmm, interesting. Yay for more dead code removal. Didn't verify it. > Subject: [PATCH 04/17] Make more use of makeColumnDef() Good idea, +1. Some lines (almost all makeColumnDef callsites) end up too long. This is the first patch that actually conflicts with the NOT NULLs one, and the conflicts are easy to solve, so I won't complain if you want to get it pushed soon. > Subject: [PATCH 05/17] Clean up MergeAttributesIntoExisting() I don't disagree with this in principle, but this one has more conflicts than the previous ones. > Subject: [PATCH 06/17] Clean up MergeCheckConstraint() Looks a reasonable change as far as this patch goes. However, reading it I noticed that CookedConstraint->inhcount is int and is tested for wraparound, but pg_constraint.coninhcount is int16. This is probably bogus already. ColumnDef->inhcount is also int. These should be narrowed to match the catalog definitions. > Subject: [PATCH 07/17] MergeAttributes() and related variable renaming I think this makes sense, but there's a bunch of conflicts to NOT NULLs. I guess we can come back to this one later. > Subject: [PATCH 08/17] Improve some catalog documentation > > Point out that typstorage and attstorage are never '\0', even for > fixed-length types. This is different from attcompression. For this > reason, some of the handling of these columns in tablecmds.c etc. is > different. (catalogs.sgml already contained this information in an > indirect way.) I don't understand why we must point out that they're never '\0'. I mean, if we're doing that, why not say that they can never be \xFF? The valid values are already listed. The other parts of this patch look okay. > Subject: [PATCH 09/17] Remove useless if condition > > This is useless because these fields are not set anywhere before, so > we can assign them unconditionally. This also makes this more > consistent with ATExecAddColumn(). Makes sense. > Subject: [PATCH 10/17] Remove useless if condition > > We can call GetAttributeCompression() with a NULL argument. It > handles that internally already. This change makes all the callers of > GetAttributeCompression() uniform. I agree, +1. > From 2eda6bc9897d0995a5112e2851c51daf0c35656e Mon Sep 17 00:00:00 2001 > From: Peter Eisentraut <peter@eisentraut.org> > Date: Wed, 14 Jun 2023 17:51:31 +0200 > Subject: [PATCH 11/17] Refactor ATExecAddColumn() to use > BuildDescForRelation() > > BuildDescForRelation() has all the knowledge for converting a > ColumnDef into pg_attribute/tuple descriptor. ATExecAddColumn() can > make use of that, instead of duplicating all that logic. We just pass > a one-element list of ColumnDef and we get back exactly the data > structure we need. Note that we don't even need to touch > BuildDescForRelation() to make this work. Hmm, crazy. I'm not sure I like this, because it seems much too clever. The number of lines that are deleted is alluring, though. Maybe it'd be better to create a separate routine that takes a single ColumnDef and returns the Form_pg_attribute element for it; then use that in both BuildDescForRelation and ATExecAddColumn. > Subject: [PATCH 12/17] Push attidentity and attgenerated handling into > BuildDescForRelation() > > Previously, this was handled by the callers separately, but it can be > trivially moved into BuildDescForRelation() so that it is handled in a > central place. Looks reasonable. I think the last few patches are the more innovative (interesting, useful) of the bunch. Let's get the first few out of the way. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
On 11.07.23 20:17, Alvaro Herrera wrote: > I spent a few minutes doing a test merge of this to my branch with NOT > NULL changes. Here's a quick review. > >> Subject: [PATCH 01/17] Remove obsolete comment about OID support > > Obvious, trivial. +1 > >> Subject: [PATCH 02/17] Remove ancient special case code for adding oid columns > > LGTM; deletes dead code. > >> Subject: [PATCH 03/17] Remove ancient special case code for dropping oid >> columns > > Hmm, interesting. Yay for more dead code removal. Didn't verify it. I have committed these first three. I'll leave it at that for now. >> Subject: [PATCH 08/17] Improve some catalog documentation >> >> Point out that typstorage and attstorage are never '\0', even for >> fixed-length types. This is different from attcompression. For this >> reason, some of the handling of these columns in tablecmds.c etc. is >> different. (catalogs.sgml already contained this information in an >> indirect way.) > > I don't understand why we must point out that they're never '\0'. I > mean, if we're doing that, why not say that they can never be \xFF? > The valid values are already listed. The other parts of this patch look > okay. While working through the storage and compression handling, which look similar, I was momentarily puzzled by this. While attcompression can be 0 to mean, use default, this is not possible/allowed for attstorage, but it took looking around three corners to verify this. It could be more explicit, I thought.
On 12.07.23 16:29, Peter Eisentraut wrote: > On 11.07.23 20:17, Alvaro Herrera wrote: >> I spent a few minutes doing a test merge of this to my branch with NOT >> NULL changes. Here's a quick review. >> >>> Subject: [PATCH 01/17] Remove obsolete comment about OID support >> >> Obvious, trivial. +1 >> >>> Subject: [PATCH 02/17] Remove ancient special case code for adding >>> oid columns >> >> LGTM; deletes dead code. >> >>> Subject: [PATCH 03/17] Remove ancient special case code for dropping oid >>> columns >> >> Hmm, interesting. Yay for more dead code removal. Didn't verify it. > > I have committed these first three. I'll leave it at that for now. I have committed a few more patches from this series that were already agreed upon. The remaining ones are rebased and reordered a bit, attached. There was some doubt about the patch "Refactor ATExecAddColumn() to use BuildDescForRelation()" (now 0009), whether it's too clever to build a fake one-item tuple descriptor. I am working on another patch, which I hope to post this week, that proposes to replace the use of tuple descriptors there with a List of something. That would then allow maybe rewriting this in a less-clever way. That patch incidentally also wants to move BuildDescForRelation from tupdesc.c to tablecmds.c (patch 0007 here).
Attachment
- v2-0001-Clean-up-MergeAttributesIntoExisting.patch
- v2-0002-Clean-up-MergeCheckConstraint.patch
- v2-0003-MergeAttributes-and-related-variable-renaming.patch
- v2-0004-Add-TupleDescGetDefault.patch
- v2-0005-Improve-some-catalog-documentation.patch
- v2-0006-Push-attidentity-and-attgenerated-handling-into-B.patch
- v2-0007-Move-BuildDescForRelation-from-tupdesc.c-to-table.patch
- v2-0008-Push-attcompression-and-attstorage-handling-into-.patch
- v2-0009-Refactor-ATExecAddColumn-to-use-BuildDescForRelat.patch
- v2-0010-MergeAttributes-convert-pg_attribute-back-to-Colu.patch
On 2023-Aug-29, Peter Eisentraut wrote: Regarding this hunk in 0002, > @@ -3278,13 +3261,16 @@ MergeAttributes(List *schema, List *supers, char relpersistence, > * > * constraints is a list of CookedConstraint structs for previous constraints. > * > - * Returns true if merged (constraint is a duplicate), or false if it's > - * got a so-far-unique name, or throws error if conflict. > + * If the constraint is a duplicate, then the existing constraint's > + * inheritance count is updated. If the constraint doesn't match or conflict > + * with an existing one, a new constraint is appended to the list. If there > + * is a conflict (same name but different expression), throw an error. This wording confused me: "If the constraint doesn't match or conflict with an existing one, a new constraint is appended to the list." I first read it as "doesn't match or conflicts with ..." (i.e., the negation only applied to the first verb, not both) which would have been surprising (== broken) behavior. I think it's clearer if you say "doesn't match nor conflict", but I'm not sure if this is grammatically correct. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
On 2023-Aug-29, Peter Eisentraut wrote: > From 471fda80c41fae835ecbe63ae8505526a37487a9 Mon Sep 17 00:00:00 2001 > From: Peter Eisentraut <peter@eisentraut.org> > Date: Wed, 12 Jul 2023 16:12:35 +0200 > Subject: [PATCH v2 04/10] Add TupleDescGetDefault() > > This unifies some repetitive code. > > Note: I didn't push the "not found" error message into the new > function, even though all existing callers would be able to make use > of it. Using the existing error handling as-is would probably require > exposing the Relation type via tupdesc.h, which doesn't seem > desirable. Note that all errors here are elog(ERROR), not user-facing, so ISTM it would be OK to change them to report the relation OID rather than the name. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "In fact, the basic problem with Perl 5's subroutines is that they're not crufty enough, so the cruft leaks out into user-defined code instead, by the Conservation of Cruft Principle." (Larry Wall, Apocalypse 6)
On Tue, Aug 29, 2023 at 10:43:39AM +0200, Peter Eisentraut wrote: > I have committed a few more patches from this series that were already > agreed upon. The remaining ones are rebased and reordered a bit, attached. My compiler is complaining about 1fa9241b: ../postgresql/src/backend/commands/sequence.c: In function ‘DefineSequence’: ../postgresql/src/backend/commands/sequence.c:196:21: error: ‘coldef’ may be used uninitialized in this function [-Werror=maybe-uninitialized] 196 | stmt->tableElts = lappend(stmt->tableElts, coldef); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This went away when I added a default case that ERROR'd or initialized coldef to NULL. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On 2023-Aug-29, Nathan Bossart wrote: > On Tue, Aug 29, 2023 at 10:43:39AM +0200, Peter Eisentraut wrote: > > I have committed a few more patches from this series that were already > > agreed upon. The remaining ones are rebased and reordered a bit, attached. > > My compiler is complaining about 1fa9241b: > > ../postgresql/src/backend/commands/sequence.c: In function ‘DefineSequence’: > ../postgresql/src/backend/commands/sequence.c:196:21: error: ‘coldef’ may be used uninitialized in this function [-Werror=maybe-uninitialized] > 196 | stmt->tableElts = lappend(stmt->tableElts, coldef); > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > This went away when I added a default case that ERROR'd or initialized > coldef to NULL. Makes sense. However, maybe we should replace those ugly defines and their hardcoded values in DefineSequence with a proper array with their names and datatypes. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "Los trabajadores menos efectivos son sistematicamente llevados al lugar donde pueden hacer el menor daño posible: gerencia." (El principio Dilbert)
On Tue, Aug 29, 2023 at 08:44:02PM +0200, Alvaro Herrera wrote: > On 2023-Aug-29, Nathan Bossart wrote: >> My compiler is complaining about 1fa9241b: >> >> ../postgresql/src/backend/commands/sequence.c: In function ‘DefineSequence’: >> ../postgresql/src/backend/commands/sequence.c:196:21: error: ‘coldef’ may be used uninitialized in this function [-Werror=maybe-uninitialized] >> 196 | stmt->tableElts = lappend(stmt->tableElts, coldef); >> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> >> This went away when I added a default case that ERROR'd or initialized >> coldef to NULL. > > Makes sense. However, maybe we should replace those ugly defines and > their hardcoded values in DefineSequence with a proper array with their > names and datatypes. That might be an improvement, but IIUC you'd still need to enumerate all of the columns or data types to make sure you use the right get-Datum function. It doesn't help that last_value uses Int64GetDatumFast and log_cnt uses Int64GetDatum. I could be missing something, though. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On 29.08.23 19:45, Nathan Bossart wrote: > On Tue, Aug 29, 2023 at 10:43:39AM +0200, Peter Eisentraut wrote: >> I have committed a few more patches from this series that were already >> agreed upon. The remaining ones are rebased and reordered a bit, attached. > > My compiler is complaining about 1fa9241b: > > ../postgresql/src/backend/commands/sequence.c: In function ‘DefineSequence’: > ../postgresql/src/backend/commands/sequence.c:196:21: error: ‘coldef’ may be used uninitialized in this function [-Werror=maybe-uninitialized] > 196 | stmt->tableElts = lappend(stmt->tableElts, coldef); > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > This went away when I added a default case that ERROR'd or initialized > coldef to NULL. fixed
On 2023-Aug-29, Nathan Bossart wrote: > On Tue, Aug 29, 2023 at 08:44:02PM +0200, Alvaro Herrera wrote: > > Makes sense. However, maybe we should replace those ugly defines and > > their hardcoded values in DefineSequence with a proper array with their > > names and datatypes. > > That might be an improvement, but IIUC you'd still need to enumerate all of > the columns or data types to make sure you use the right get-Datum > function. It doesn't help that last_value uses Int64GetDatumFast and > log_cnt uses Int64GetDatum. I could be missing something, though. Well, for sure I meant to enumerate everything that was needed, including the initializer for the value. Like in the attached patch. However, now that I've actually written it, I don't find it so pretty anymore, but maybe that's just because I don't know how to write the array assignment as a single statement instead of a separate statement for each column. But this should silence the warnings, anyway. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
Attachment
On 29.08.23 13:20, Alvaro Herrera wrote: > On 2023-Aug-29, Peter Eisentraut wrote: >> @@ -3278,13 +3261,16 @@ MergeAttributes(List *schema, List *supers, char relpersistence, >> * >> * constraints is a list of CookedConstraint structs for previous constraints. >> * >> - * Returns true if merged (constraint is a duplicate), or false if it's >> - * got a so-far-unique name, or throws error if conflict. >> + * If the constraint is a duplicate, then the existing constraint's >> + * inheritance count is updated. If the constraint doesn't match or conflict >> + * with an existing one, a new constraint is appended to the list. If there >> + * is a conflict (same name but different expression), throw an error. > > This wording confused me: > > "If the constraint doesn't match or conflict with an existing one, a new > constraint is appended to the list." > > I first read it as "doesn't match or conflicts with ..." (i.e., the > negation only applied to the first verb, not both) which would have been > surprising (== broken) behavior. > > I think it's clearer if you say "doesn't match nor conflict", but I'm > not sure if this is grammatically correct. Here is an updated version of this patch set. I resolved some conflicts and addressed this comment of yours. I also dropped the one patch with some catalog header edits that people didn't seem to particularly like. The patches that are now 0001 through 0004 were previously reviewed and just held for the not-null constraint patches, I think, so I'll commit them in a few days if there are no objections. Patches 0005 through 0007 are also ready in my opinion, but they haven't really been reviewed, so this would be something for reviewers to focus on. (0005 and 0007 are trivial, but they go to together with 0006.) The remaining 0008 and 0009 were still under discussion and contemplation.
Attachment
- v3-0001-Clean-up-MergeAttributesIntoExisting.patch
- v3-0002-Clean-up-MergeCheckConstraint.patch
- v3-0003-MergeAttributes-and-related-variable-renaming.patch
- v3-0004-Add-TupleDescGetDefault.patch
- v3-0005-Push-attidentity-and-attgenerated-handling-into-B.patch
- v3-0006-Move-BuildDescForRelation-from-tupdesc.c-to-table.patch
- v3-0007-Push-attcompression-and-attstorage-handling-into-.patch
- v3-0008-Refactor-ATExecAddColumn-to-use-BuildDescForRelat.patch
- v3-0009-MergeAttributes-convert-pg_attribute-back-to-Colu.patch
On 19.09.23 15:11, Peter Eisentraut wrote: > Here is an updated version of this patch set. I resolved some conflicts > and addressed this comment of yours. I also dropped the one patch with > some catalog header edits that people didn't seem to particularly like. > > The patches that are now 0001 through 0004 were previously reviewed and > just held for the not-null constraint patches, I think, so I'll commit > them in a few days if there are no objections. > > Patches 0005 through 0007 are also ready in my opinion, but they haven't > really been reviewed, so this would be something for reviewers to focus > on. (0005 and 0007 are trivial, but they go to together with 0006.) > > The remaining 0008 and 0009 were still under discussion and contemplation. I have committed through 0007, and I'll now close this patch set as "Committed", and I will (probably) bring back the rest (especially 0008) as part of a different patch set soon.
On 05.10.23 17:49, Peter Eisentraut wrote: > On 19.09.23 15:11, Peter Eisentraut wrote: >> Here is an updated version of this patch set. I resolved some >> conflicts and addressed this comment of yours. I also dropped the one >> patch with some catalog header edits that people didn't seem to >> particularly like. >> >> The patches that are now 0001 through 0004 were previously reviewed >> and just held for the not-null constraint patches, I think, so I'll >> commit them in a few days if there are no objections. >> >> Patches 0005 through 0007 are also ready in my opinion, but they >> haven't really been reviewed, so this would be something for reviewers >> to focus on. (0005 and 0007 are trivial, but they go to together with >> 0006.) >> >> The remaining 0008 and 0009 were still under discussion and >> contemplation. > > I have committed through 0007, and I'll now close this patch set as > "Committed", and I will (probably) bring back the rest (especially 0008) > as part of a different patch set soon. After playing with this for, well, 2 months, and considering various other approaches, I would like to bring back the remaining two patches in unchanged form. Especially the (now) first patch "Refactor ATExecAddColumn() to use BuildDescForRelation()" would be very helpful for facilitating further refactoring in this area, because it avoids having two essentially duplicate pieces of code responsible for converting a ColumnDef node into internal form. One of your (Álvaro's) comments about this earlier was > Hmm, crazy. I'm not sure I like this, because it seems much too clever. > The number of lines that are deleted is alluring, though. > > Maybe it'd be better to create a separate routine that takes a single > ColumnDef and returns the Form_pg_attribute element for it; then use > that in both BuildDescForRelation and ATExecAddColumn. which was also my thought at the beginning. However, this wouldn't quite work that way, for several reasons. For instance, BuildDescForRelation() also needs to keep track of the has_not_null property across all fields, and so if you split that function up, you would have to somehow make that an output argument and have the caller keep track of it. Also, the output of BuildDescForRelation() in ATExecAddColumn() is passed into InsertPgAttributeTuples(), which requires a tuple descriptor anyway, so splitting this up into a per-attribute function would then require ATExecAddColumn() to re-assemble a tuple descriptor anyway, so this wouldn't save anything. Also note that ATExecAddColumn() could in theory be enhanced to add more than one column in one go, so having this code structure in place isn't inconsistent with that. The main hackish thing, I suppose, is that we have to fix up the attribute number after returning from BuildDescForRelation(). I suppose we could address that by passing in a starting attribute number (or alternatively maximum existing attribute number) into BuildDescForRelation(). I think that would be okay; it would probably be about a wash in terms of code added versus saved. The (now) second patch is also still of interest to me, but I have since noticed that I think [0] should be fixed first, but to make that fix simpler, I would like the first patch from here. [0]: https://www.postgresql.org/message-id/flat/24656cec-d6ef-4d15-8b5b-e8dfc9c833a7%40eisentraut.org
Attachment
On 2023-Dec-06, Peter Eisentraut wrote: > One of your (Álvaro's) comments about this earlier was > > > Hmm, crazy. I'm not sure I like this, because it seems much too clever. > > The number of lines that are deleted is alluring, though. > > > > Maybe it'd be better to create a separate routine that takes a single > > ColumnDef and returns the Form_pg_attribute element for it; then use > > that in both BuildDescForRelation and ATExecAddColumn. > > which was also my thought at the beginning. However, this wouldn't quite > work that way, for several reasons. For instance, BuildDescForRelation() > also needs to keep track of the has_not_null property across all fields, and > so if you split that function up, you would have to somehow make that an > output argument and have the caller keep track of it. Also, the output of > BuildDescForRelation() in ATExecAddColumn() is passed into > InsertPgAttributeTuples(), which requires a tuple descriptor anyway, so > splitting this up into a per-attribute function would then require > ATExecAddColumn() to re-assemble a tuple descriptor anyway, so this wouldn't > save anything. Hmm. Well, if this state of affairs is useful to you, then I withdraw my objection, because with this patch we're not really adding any new weirdness, just moving around already-existing weirdness. So let's press ahead with 0001. (I didn't look at 0002 this time, since apparently you'd like to process the other patch first and then come back here.) If you look closely at InsertPgAttributeTuples and accompanying code, it all looks a bit archaic. They seem to be treating TupleDesc as a glorified array of Form_pg_attribute elements in a convenient packaging. It's probably cleaner to change these APIs so that they deal with a Form_pg_attribute array, and not TupleDesc anymore. But we can hack on that some other day. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "Pensar que el espectro que vemos es ilusorio no lo despoja de espanto, sólo le suma el nuevo terror de la locura" (Perelandra, C.S. Lewis)
On 2024-Jan-11, Alvaro Herrera wrote: > If you look closely at InsertPgAttributeTuples and accompanying code, it > all looks a bit archaic. They seem to be treating TupleDesc as a > glorified array of Form_pg_attribute elements in a convenient packaging. > It's probably cleaner to change these APIs so that they deal with a > Form_pg_attribute array, and not TupleDesc anymore. But we can hack on > that some other day. In addition, it also occurs to me now that maybe it would make sense to change the TupleDesc implementation to use a List of Form_pg_attribute instead of an array, and do away with ->natts. This would let us change all "for ( ... natts ...)" loops into foreach_ptr() loops ... there are only five hundred of them or so --rolls eyes--. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "El sudor es la mejor cura para un pensamiento enfermo" (Bardia)
On Fri, Jan 12, 2024 at 5:32 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > On 2024-Jan-11, Alvaro Herrera wrote: > > If you look closely at InsertPgAttributeTuples and accompanying code, it > > all looks a bit archaic. They seem to be treating TupleDesc as a > > glorified array of Form_pg_attribute elements in a convenient packaging. > > It's probably cleaner to change these APIs so that they deal with a > > Form_pg_attribute array, and not TupleDesc anymore. But we can hack on > > that some other day. > > In addition, it also occurs to me now that maybe it would make sense to > change the TupleDesc implementation to use a List of Form_pg_attribute > instead of an array, and do away with ->natts. This would let us change > all "for ( ... natts ...)" loops into foreach_ptr() loops ... there are > only five hundred of them or so --rolls eyes--. Open-coding stuff like this can easily work out to a loss, and I personally think we're overly dependent on List. It's not a particularly good abstraction, IMHO, and if we do a lot of work to start using it everywhere because a list is really an array, then what happens when somebody decides that a list really ought to be a skip-list, or a hash table, or some other crazy thing? -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > On Fri, Jan 12, 2024 at 5:32 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: >> In addition, it also occurs to me now that maybe it would make sense to >> change the TupleDesc implementation to use a List of Form_pg_attribute >> instead of an array, and do away with ->natts. This would let us change >> all "for ( ... natts ...)" loops into foreach_ptr() loops ... there are >> only five hundred of them or so --rolls eyes--. > Open-coding stuff like this can easily work out to a loss, and I > personally think we're overly dependent on List. It's not a > particularly good abstraction, IMHO, and if we do a lot of work to > start using it everywhere because a list is really an array, then what > happens when somebody decides that a list really ought to be a > skip-list, or a hash table, or some other crazy thing? Without getting into opinions on whether List is a good abstraction, I'm -1 on this idea. It would be a large amount of code churn, with attendant back-patching pain, and I just don't see that there is much to be gained. regards, tom lane
On Fri, Jan 12, 2024 at 10:09 AM Robert Haas <robertmhaas@gmail.com> wrote: > Open-coding stuff like this can easily work out to a loss, and I > personally think we're overly dependent on List. It's not a > particularly good abstraction, IMHO, and if we do a lot of work to > start using it everywhere because a list is really an array, then what > happens when somebody decides that a list really ought to be a > skip-list, or a hash table, or some other crazy thing? This paragraph was a bit garbled. I meant to say that open-coding can be better than relying on a canned abstraction, but it came out the other way around. -- Robert Haas EDB: http://www.enterprisedb.com
On 06.12.23 09:23, Peter Eisentraut wrote: > The (now) second patch is also still of interest to me, but I have since > noticed that I think [0] should be fixed first, but to make that fix > simpler, I would like the first patch from here. > > [0]: > https://www.postgresql.org/message-id/flat/24656cec-d6ef-4d15-8b5b-e8dfc9c833a7%40eisentraut.org The remaining patch in this series needed a rebase and adjustment. The above precondition still applies.
Attachment
Hi Peter, On Mon, Jan 22, 2024 at 6:13 PM Peter Eisentraut <peter@eisentraut.org> wrote: > > On 06.12.23 09:23, Peter Eisentraut wrote: > > The (now) second patch is also still of interest to me, but I have since > > noticed that I think [0] should be fixed first, but to make that fix > > simpler, I would like the first patch from here. > > > > [0]: > > https://www.postgresql.org/message-id/flat/24656cec-d6ef-4d15-8b5b-e8dfc9c833a7%40eisentraut.org > > The remaining patch in this series needed a rebase and adjustment. > > The above precondition still applies. While working on identity support and now while looking at the compression problem you referred to, I found MergeAttribute() to be hard to read. It's hard to follow high level logic in that function since the function is not modular. I took a stab at modularising a part of it. Attached is the resulting patch series. 0001 is your patch as is 0002 is pgindent fix and also fixing what I think is a typo/thinko from 0001. If you are fine with the changes, 0002 should be merged into 0003. 0003 separates the part of code merging a child attribute to the corresponding inherited attribute into its own function. 0004 does the same for code merging inherited attributes incrementally. I have kept 0003 and 0004 separate in case we pick one and not the other. But they can be committed as a single commit. The two new functions have some common code and some differences. Common code is not surprising since merging attributes whether from child definition or from inheritance parents will have common rules. Differences are expected in cases when child attributes need to be treated differently. But the differences may point us to some yet-unknown bugs; compression being one of those differences. I think the next step should combine these functions into one so that all the logic to merge one attribute is at one place. I haven't attempted it; wanted to propose the idea first. I can see that this moduralization will lead to another and we will be able to reduce MergeAttribute() to a set of function calls reflecting its high level logic and push the detailed implementation into minion functions like this. -- Best Wishes, Ashutosh Bapat
Attachment
On 24.01.24 07:27, Ashutosh Bapat wrote: > While working on identity support and now while looking at the > compression problem you referred to, I found MergeAttribute() to be > hard to read. It's hard to follow high level logic in that function > since the function is not modular. I took a stab at modularising a > part of it. Attached is the resulting patch series. > > 0001 is your patch as is > 0002 is pgindent fix and also fixing what I think is a typo/thinko > from 0001. If you are fine with the changes, 0002 should be merged > into 0003. > 0003 separates the part of code merging a child attribute to the > corresponding inherited attribute into its own function. > 0004 does the same for code merging inherited attributes incrementally. > > I have kept 0003 and 0004 separate in case we pick one and not the > other. But they can be committed as a single commit. I have committed all this. These are great improvements. (One little change I made to your 0003 and 0004 patches is that I kept the check whether the new column matches an existing one by name in MergeAttributes(). I found that pushing that down made the logic in MergeAttributes() too hard to follow. But it's pretty much the same.)
Hello Peter, 26.01.2024 16:42, Peter Eisentraut wrote: > > I have committed all this. These are great improvements. > Please look at the segmentation fault triggered by the following query since 4d969b2f8: CREATE TABLE t1(a text COMPRESSION pglz); CREATE TABLE t2(a text); CREATE TABLE t3() INHERITS(t1, t2); NOTICE: merging multiple inherited definitions of column "a" server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. Core was generated by `postgres: law regression [local] CREATE TABLE '. Program terminated with signal SIGSEGV, Segmentation fault. (gdb) bt #0 __strcmp_avx2 () at ../sysdeps/x86_64/multiarch/strcmp-avx2.S:116 #1 0x00005606fbcc9d52 in MergeAttributes (columns=0x0, supers=supers@entry=0x5606fe293d30, relpersistence=112 'p', is_partition=false, supconstr=supconstr@entry=0x7fff4046d410, supnotnulls=supnotnulls@entry=0x7fff4046d418) at tablecmds.c:2811 #2 0x00005606fbccd764 in DefineRelation (stmt=stmt@entry=0x5606fe26a130, relkind=relkind@entry=114 'r', ownerId=10, ownerId@entry=0, typaddress=typaddress@entry=0x0, queryString=queryString@entry=0x5606fe2695c0 "CREATE TABLE t3() INHERITS(t1, t2);") at tablecmds.c:885 ... Best regards, Alexander
Hi Alexander, On Sun, Jan 28, 2024 at 1:30 PM Alexander Lakhin <exclusion@gmail.com> wrote: > > Hello Peter, > > 26.01.2024 16:42, Peter Eisentraut wrote: > > > > I have committed all this. These are great improvements. > > > > Please look at the segmentation fault triggered by the following query since > 4d969b2f8: > CREATE TABLE t1(a text COMPRESSION pglz); > CREATE TABLE t2(a text); > CREATE TABLE t3() INHERITS(t1, t2); > NOTICE: merging multiple inherited definitions of column "a" > server closed the connection unexpectedly > This probably means the server terminated abnormally > before or while processing the request. > > Core was generated by `postgres: law regression [local] CREATE TABLE '. > Program terminated with signal SIGSEGV, Segmentation fault. > > (gdb) bt > #0 __strcmp_avx2 () at ../sysdeps/x86_64/multiarch/strcmp-avx2.S:116 > #1 0x00005606fbcc9d52 in MergeAttributes (columns=0x0, supers=supers@entry=0x5606fe293d30, relpersistence=112 'p', > is_partition=false, supconstr=supconstr@entry=0x7fff4046d410, supnotnulls=supnotnulls@entry=0x7fff4046d418) > at tablecmds.c:2811 > #2 0x00005606fbccd764 in DefineRelation (stmt=stmt@entry=0x5606fe26a130, relkind=relkind@entry=114 'r', ownerId=10, > ownerId@entry=0, typaddress=typaddress@entry=0x0, > queryString=queryString@entry=0x5606fe2695c0 "CREATE TABLE t3() INHERITS(t1, t2);") at tablecmds.c:885 This bug existed even before the refactoring.Happens because strcmp() is called on NULL input (t2's compression is NULL). I already have a fix for this and will be posting it in [1]. [1] https://www.postgresql.org/message-id/flat/24656cec-d6ef-4d15-8b5b-e8dfc9c833a7%40eisentraut.org -- Best Wishes, Ashutosh Bapat
Hello, 30.01.2024 09:22, Ashutosh Bapat wrote: > >> Please look at the segmentation fault triggered by the following query since >> 4d969b2f8: >> CREATE TABLE t1(a text COMPRESSION pglz); >> CREATE TABLE t2(a text); >> CREATE TABLE t3() INHERITS(t1, t2); >> NOTICE: merging multiple inherited definitions of column "a" >> server closed the connection unexpectedly >> This probably means the server terminated abnormally >> before or while processing the request. >> >> Core was generated by `postgres: law regression [local] CREATE TABLE '. >> Program terminated with signal SIGSEGV, Segmentation fault. >> >> (gdb) bt >> #0 __strcmp_avx2 () at ../sysdeps/x86_64/multiarch/strcmp-avx2.S:116 >> #1 0x00005606fbcc9d52 in MergeAttributes (columns=0x0, supers=supers@entry=0x5606fe293d30, relpersistence=112 'p', >> is_partition=false, supconstr=supconstr@entry=0x7fff4046d410, supnotnulls=supnotnulls@entry=0x7fff4046d418) >> at tablecmds.c:2811 >> #2 0x00005606fbccd764 in DefineRelation (stmt=stmt@entry=0x5606fe26a130, relkind=relkind@entry=114 'r', ownerId=10, >> ownerId@entry=0, typaddress=typaddress@entry=0x0, >> queryString=queryString@entry=0x5606fe2695c0 "CREATE TABLE t3() INHERITS(t1, t2);") at tablecmds.c:885 > This bug existed even before the refactoring.Happens because strcmp() > is called on NULL input (t2's compression is NULL). I already have a > fix for this and will be posting it in [1]. > > [1] https://www.postgresql.org/message-id/flat/24656cec-d6ef-4d15-8b5b-e8dfc9c833a7%40eisentraut.org > Now that that fix is closed with RwF [1], shouldn't this crash issue be added to Open Items for v17? (I couldn't reproduce the crash on 4d969b2f8~1 nor on REL_16_STABLE.) https://commitfest.postgresql.org/47/4813/ Best regards, Alexander
On Sat, Apr 20, 2024 at 9:30 AM Alexander Lakhin <exclusion@gmail.com> wrote:
Hello,
30.01.2024 09:22, Ashutosh Bapat wrote:
>
>> Please look at the segmentation fault triggered by the following query since
>> 4d969b2f8:
>> CREATE TABLE t1(a text COMPRESSION pglz);
>> CREATE TABLE t2(a text);
>> CREATE TABLE t3() INHERITS(t1, t2);
>> NOTICE: merging multiple inherited definitions of column "a"
>> server closed the connection unexpectedly
>> This probably means the server terminated abnormally
>> before or while processing the request.
>>
>> Core was generated by `postgres: law regression [local] CREATE TABLE '.
>> Program terminated with signal SIGSEGV, Segmentation fault.
>>
>> (gdb) bt
>> #0 __strcmp_avx2 () at ../sysdeps/x86_64/multiarch/strcmp-avx2.S:116
>> #1 0x00005606fbcc9d52 in MergeAttributes (columns=0x0, supers=supers@entry=0x5606fe293d30, relpersistence=112 'p',
>> is_partition=false, supconstr=supconstr@entry=0x7fff4046d410, supnotnulls=supnotnulls@entry=0x7fff4046d418)
>> at tablecmds.c:2811
>> #2 0x00005606fbccd764 in DefineRelation (stmt=stmt@entry=0x5606fe26a130, relkind=relkind@entry=114 'r', ownerId=10,
>> ownerId@entry=0, typaddress=typaddress@entry=0x0,
>> queryString=queryString@entry=0x5606fe2695c0 "CREATE TABLE t3() INHERITS(t1, t2);") at tablecmds.c:885
> This bug existed even before the refactoring.Happens because strcmp()
> is called on NULL input (t2's compression is NULL). I already have a
> fix for this and will be posting it in [1].
>
> [1] https://www.postgresql.org/message-id/flat/24656cec-d6ef-4d15-8b5b-e8dfc9c833a7%40eisentraut.org
>
Now that that fix is closed with RwF [1], shouldn't this crash issue be
added to Open Items for v17?
(I couldn't reproduce the crash on 4d969b2f8~1 nor on REL_16_STABLE.)
https://commitfest.postgresql.org/47/4813/
Best Wishes,
Ashutosh Bapat
On Sat, Apr 20, 2024 at 12:17 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > Yes please. Probably this issue surfaced again after we reverted compression and storage fix? Please If that's the case,please add it to the open items. This is still on the open items list and I'm not clear who, if anyone, is working on fixing it. It would be good if someone fixed it. :-) -- Robert Haas EDB: http://www.enterprisedb.com
On Mon, Apr 29, 2024 at 6:46 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Sat, Apr 20, 2024 at 12:17 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
> Yes please. Probably this issue surfaced again after we reverted compression and storage fix? Please If that's the case, please add it to the open items.
This is still on the open items list and I'm not clear who, if anyone,
is working on fixing it.
It would be good if someone fixed it. :-)
Here's a patch fixing it.
I have added the reproducer provided by Alexander as a test. I thought of improving that test further to test the compression of the inherited table but did not implement it since we haven't documented the behaviour of compression with inheritance. Defining and implementing compression behaviour for inherited tables was the goal of 0413a556990ba628a3de8a0b58be020fd9a14ed0, which has been reverted.
Best Wishes,
Ashutosh Bapat
Attachment
On Tue, Apr 30, 2024 at 2:19 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > On Mon, Apr 29, 2024 at 6:46 PM Robert Haas <robertmhaas@gmail.com> wrote: >> On Sat, Apr 20, 2024 at 12:17 AM Ashutosh Bapat >> <ashutosh.bapat.oss@gmail.com> wrote: >> > Yes please. Probably this issue surfaced again after we reverted compression and storage fix? Please If that's thecase, please add it to the open items. >> >> This is still on the open items list and I'm not clear who, if anyone, >> is working on fixing it. >> >> It would be good if someone fixed it. :-) > > Here's a patch fixing it. > > I have added the reproducer provided by Alexander as a test. I thought of improving that test further to test the compressionof the inherited table but did not implement it since we haven't documented the behaviour of compression withinheritance. Defining and implementing compression behaviour for inherited tables was the goal of 0413a556990ba628a3de8a0b58be020fd9a14ed0,which has been reverted. I took a look at this patch. Currently this case crashes: CREATE TABLE cmdata(f1 text COMPRESSION pglz); CREATE TABLE cmdata3(f1 text); CREATE TABLE cminh() INHERITS (cmdata, cmdata3); The patch makes this succeed, but I was initially unclear why it didn't make it fail with an error instead: you can argue that cmdata has pglz and cmdata3 has default and those are different. It seems that prior precedent goes both ways -- we treat the absence of a STORAGE specification as STORAGE EXTENDED and it conflicts with an explicit storage specification on some other inheritance parent - but on the other hand, we treat the absence of a default as compatible with any explicit default, similar to what happens here. But I eventually realized that you're just putting back behavior that we had in previous releases: pre-v17, the code already works the way this patch makes it do, and MergeChildAttribute() is already coded similar to this. As Alexander Lakhin said upthread, 4d969b2f8 seems to have broken this. So now I think this is committable, but I can't do it now because I won't be around for the next few hours in case the buildfarm blows up. I can do it tomorrow, or perhaps Peter would like to handle it since it seems to have been his commit that introduced the issue. -- Robert Haas EDB: http://www.enterprisedb.com
On 30.04.24 21:48, Robert Haas wrote: > I took a look at this patch. Currently this case crashes: > > CREATE TABLE cmdata(f1 text COMPRESSION pglz); > CREATE TABLE cmdata3(f1 text); > CREATE TABLE cminh() INHERITS (cmdata, cmdata3); > > The patch makes this succeed, but I was initially unclear why it > didn't make it fail with an error instead: you can argue that cmdata > has pglz and cmdata3 has default and those are different. It seems > that prior precedent goes both ways -- we treat the absence of a > STORAGE specification as STORAGE EXTENDED and it conflicts with an > explicit storage specification on some other inheritance parent - but > on the other hand, we treat the absence of a default as compatible > with any explicit default, similar to what happens here. The actual behavior here is arguably not ideal. It was the purpose of the other thread mentioned upthread to improve that, but that was not successful for the time being. > So now I think this is committable, but I can't do it now because I > won't be around for the next few hours in case the buildfarm blows up. > I can do it tomorrow, or perhaps Peter would like to handle it since > it seems to have been his commit that introduced the issue. I have committed it now.
On Fri, May 3, 2024 at 2:47 PM Peter Eisentraut <peter@eisentraut.org> wrote:
On 30.04.24 21:48, Robert Haas wrote:
> I took a look at this patch. Currently this case crashes:
>
> CREATE TABLE cmdata(f1 text COMPRESSION pglz);
> CREATE TABLE cmdata3(f1 text);
> CREATE TABLE cminh() INHERITS (cmdata, cmdata3);
>
> The patch makes this succeed, but I was initially unclear why it
> didn't make it fail with an error instead: you can argue that cmdata
> has pglz and cmdata3 has default and those are different. It seems
> that prior precedent goes both ways -- we treat the absence of a
> STORAGE specification as STORAGE EXTENDED and it conflicts with an
> explicit storage specification on some other inheritance parent - but
> on the other hand, we treat the absence of a default as compatible
> with any explicit default, similar to what happens here.
The actual behavior here is arguably not ideal. It was the purpose of
the other thread mentioned upthread to improve that, but that was not
successful for the time being.
> So now I think this is committable, but I can't do it now because I
> won't be around for the next few hours in case the buildfarm blows up.
> I can do it tomorrow, or perhaps Peter would like to handle it since
> it seems to have been his commit that introduced the issue.
I have committed it now.
Best Wishes,
Ashutosh Bapat