Thread: Arrays of Complex Types
Folks, I'd like to take the TODO item that reads, "Add support for arrays of complex types," but before I start patching, I'd like to see whether what I'm about to do makes any sense: 1. In src/backend/commands/tablecmds.c, change DefineRelation as follows: * After the first call to heap_create_with_catalog, construct and do another call to for the array type. * Add an appropriate pg_depend entry. 2. Change RemoveRelation to reflect the above. 3. Change TypeRename appropriately, whatever that turns out to be. Does the above make sense? Have I missed anything critical? Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> writes: > 1. In src/backend/commands/tablecmds.c, change DefineRelation as > follows: > * After the first call to heap_create_with_catalog, construct and > do another call to for the array type. I'm still not happy about the idea of doing this for every relation (and doing it for sequences and indexes would be the height of wastefulness). How about we only do it for composite types? > * Add an appropriate pg_depend entry. > 2. Change RemoveRelation to reflect the above. You only need one of those two: either you drop by hand or you let the dependency machinery deal with it. Not both. > Does the above make sense? Have I missed anything critical? Ummm ... making it actually work? Possibly that just falls out, but I'm not sure. If it turns out that it does Just Work, you might take a stab at arrays of domains too. regards, tom lane
Tom Lane wrote: > David Fetter <david@fetter.org> writes: >> 1. In src/backend/commands/tablecmds.c, change DefineRelation as >> follows: >> * After the first call to heap_create_with_catalog, construct and >> do another call to for the array type. > > I'm still not happy about the idea of doing this for every relation > (and doing it for sequences and indexes would be the height of > wastefulness). How about we only do it for composite types? > I'm not happy about that. I agree that indexes and sequences should not be done, but can we please do plain table types? I would be OK if we skipped catalog tables, if that would make you happier. cheers andrew
On Fri, Mar 02, 2007 at 06:42:14PM -0600, Andrew Dunstan wrote: > > I'm still not happy about the idea of doing this for every relation > > (and doing it for sequences and indexes would be the height of > > wastefulness). How about we only do it for composite types? > > I'm not happy about that. I agree that indexes and sequences should not be > done, but can we please do plain table types? I would be OK if we skipped > catalog tables, if that would make you happier. Two thoughts: 1. Make the array types only when someone actually uses them (create a table with it or something). 2. Make a command: CREATE TYPE ARRAY OF "foo"; The latter has the benefit of not restricting it to an arbitrary choice of types, you could accept both domains and composite types here. I don't think it's unreasonable to require people to predeclare their usage of array-of-compostite-type. Perhaps change the word "CREATE" to "DECLARE". I'm thinking of the explicit declaration of shell types as precedent here. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Martijn van Oosterhout wrote: > On Fri, Mar 02, 2007 at 06:42:14PM -0600, Andrew Dunstan wrote: > >>> I'm still not happy about the idea of doing this for every relation >>> (and doing it for sequences and indexes would be the height of >>> wastefulness). How about we only do it for composite types? >>> >> I'm not happy about that. I agree that indexes and sequences should not be >> done, but can we please do plain table types? I would be OK if we skipped >> catalog tables, if that would make you happier. >> > > Two thoughts: > > 1. Make the array types only when someone actually uses them (create a > table with it or something). > > 2. Make a command: CREATE TYPE ARRAY OF "foo"; > > The latter has the benefit of not restricting it to an arbitrary choice > of types, you could accept both domains and composite types here. I > don't think it's unreasonable to require people to predeclare their > usage of array-of-compostite-type. > > Perhaps change the word "CREATE" to "DECLARE". I'm thinking of the > explicit declaration of shell types as precedent here. > > > #2 would be fine with me - not keen on #1. cheers andrew
On Sat, Mar 03, 2007 at 09:06:11AM -0500, Andrew Dunstan wrote: > > > Martijn van Oosterhout wrote: > >On Fri, Mar 02, 2007 at 06:42:14PM -0600, Andrew Dunstan wrote: > > > >>>I'm still not happy about the idea of doing this for every relation > >>>(and doing it for sequences and indexes would be the height of > >>>wastefulness). How about we only do it for composite types? > >>> > >>I'm not happy about that. I agree that indexes and sequences should not be > >>done, but can we please do plain table types? I would be OK if we skipped > >>catalog tables, if that would make you happier. > >> > > > >Two thoughts: > > > >1. Make the array types only when someone actually uses them (create a > >table with it or something). This doesn't sound so great. > >2. Make a command: CREATE TYPE ARRAY OF "foo"; > > > >The latter has the benefit of not restricting it to an arbitrary choice > >of types, you could accept both domains and composite types here. I'm thinking that DOMAINs over simple types should just automatically get corresponding array types. We don't yet support DOMAINs over complex types, but when we do, whatever we do with arrays of regular complex types should apply the same way. > >I don't think it's unreasonable to require people to predeclare > >their usage of array-of-compostite-type. > > > >Perhaps change the word "CREATE" to "DECLARE". I'm thinking of the > >explicit declaration of shell types as precedent here. > > > #2 would be fine with me - not keen on #1. Per your earlier suggestion in IRC, I'm picturing a polymorphic function like pg_catalog.array_for(typepoid OID) pg_catalog.array_for(typename NAME) pg_catalog.array_for(typenamespace NAME, typename NAME) I don't see a good reason to allow putting array types in a different schema from their base types. Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
"David Fetter" <david@fetter.org> writes: >> >2. Make a command: CREATE TYPE ARRAY OF "foo"; >> > >> >The latter has the benefit of not restricting it to an arbitrary choice >> >of types, you could accept both domains and composite types here. > > I'm thinking that DOMAINs over simple types should just automatically > get corresponding array types. We don't yet support DOMAINs over > complex types, but when we do, whatever we do with arrays of regular > complex types should apply the same way. Just a thought, but if we supported domains over complex types we could kill two birds with one stone and say when you create a domain the array type is created and that's how you have to refer to arrays over complex types. Probably doesn't make anything easier though. Just a thought. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
David Fetter wrote: > Per your earlier suggestion in IRC, I'm picturing a polymorphic > function like > > pg_catalog.array_for(typepoid OID) > pg_catalog.array_for(typename NAME) > pg_catalog.array_for(typenamespace NAME, typename NAME) > > I don't see a good reason to allow putting array types in a different > schema from their base types. > > Actually, I think I prefer Martijns simple SQL extension suggestion better. cheers andrew
On Fri, Mar 02, 2007 at 06:59:50PM -0500, Tom Lane wrote: > David Fetter <david@fetter.org> writes: > > 1. In src/backend/commands/tablecmds.c, change DefineRelation as > > follows: > > * After the first call to heap_create_with_catalog, construct and > > do another call to for the array type. > > I'm still not happy about the idea of doing this for every relation > (and doing it for sequences and indexes would be the height of > wastefulness). How about we only do it for composite types? How about doing it for user-defined tables, views and composite types, and skipping ? > > * Add an appropriate pg_depend entry. > > 2. Change RemoveRelation to reflect the above. > > You only need one of those two: either you drop by hand or you let the > dependency machinery deal with it. Not both. pg_depend it is, then :) > > Does the above make sense? Have I missed anything critical? > > Ummm ... making it actually work? Possibly that just falls out, but > I'm not sure. > > If it turns out that it does Just Work, you might take a stab at > arrays of domains too. OK. I noticed something in src/backend/commands/tablecmds.c which worries me, namely that it ignores functions and views. It should at least be checking that the typeoid isn't in pg_proc.prorettype or pg_proc.proargtypes, and if possible, the DECLARE section of pl/pgsql functions. Is there a way to do SQL at that place in the back-end, or is there some different kind of Magick(TM) needed to access these kinds of things at that level? Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> writes: > I noticed something in src/backend/commands/tablecmds.c which worries > me, namely that it ignores functions and views. What? regards, tom lane
On Tue, Mar 06, 2007 at 04:14:07PM -0500, Tom Lane wrote: > David Fetter <david@fetter.org> writes: > > I noticed something in src/backend/commands/tablecmds.c which > > worries me, namely that it ignores functions and views. > > What? The "it" in question is, find_composite_type_dependencies() Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> writes: > On Tue, Mar 06, 2007 at 04:14:07PM -0500, Tom Lane wrote: >> David Fetter <david@fetter.org> writes: >>> I noticed something in src/backend/commands/tablecmds.c which >>> worries me, namely that it ignores functions and views. >> >> What? > The "it" in question is, find_composite_type_dependencies() All that that's interested in is whether there are stored values of the datatype somewhere. Views don't have any storage, and a function definition doesn't either. regards, tom lane
On Tue, Mar 06, 2007 at 04:24:36PM -0500, Tom Lane wrote: > David Fetter <david@fetter.org> writes: > > On Tue, Mar 06, 2007 at 04:14:07PM -0500, Tom Lane wrote: > >> David Fetter <david@fetter.org> writes: > >>> I noticed something in src/backend/commands/tablecmds.c which > >>> worries me, namely that it ignores functions and views. > >> > >> What? > > > The "it" in question is, find_composite_type_dependencies() > > All that that's interested in is whether there are stored values of the > datatype somewhere. Views don't have any storage, and a function > definition doesn't either. I see. Perhaps I've misunderstood what this thing was for, then. What is it that checks whether it's OK to change a composite type, then? Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
I wrote: > I'd like to take the TODO item that reads, "Add support for arrays of > complex types," but before I start patching, I'd like to see whether > what I'm about to do makes any sense: I've written up a patch intended to implement this on the non-pg_catalog tables and VIEWs, but while it builds, it doesn't initdb. Enclosed are the patch and the error log. Any hints as to what I might look at? Cheers, David. > > 1. In src/backend/commands/tablecmds.c, change DefineRelation as > follows: > > * After the first call to heap_create_with_catalog, construct and > do another call to for the array type. > > * Add an appropriate pg_depend entry. > > 2. Change RemoveRelation to reflect the above. > > 3. Change TypeRename appropriately, whatever that turns out to be. > > Does the above make sense? Have I missed anything critical? > > Cheers, > D > -- > David Fetter <david@fetter.org> http://fetter.org/ > phone: +1 415 235 3778 AIM: dfetter666 > Skype: davidfetter > > Remember to vote! > Consider donating to PostgreSQL: http://www.postgresql.org/about/donate > > ---------------------------(end of broadcast)--------------------------- > TIP 2: Don't 'kill -9' the postmaster -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
Attachment
David Fetter <david@fetter.org> writes: > I've written up a patch intended to implement this on the > non-pg_catalog tables and VIEWs, but while it builds, it doesn't > initdb. Enclosed are the patch and the error log. > Any hints as to what I might look at? > creating template1 database in /var/lib/pgsql/pgsql/src/test/regress/./tmp_check/data/base/1 ... ok > initializing pg_authid ... ok > initializing dependencies ... ok > creating system views ... ok > loading system objects' descriptions ... FATAL: cache lookup failed for type 11096 > child process exited with exit code 1 That step of initdb creates a TEMP table ... maybe your patch doesn't work for temp tables? Anyway, you're certainly far enough along there that you could fire up the postmaster and reproduce the error in a normal debugging environment. regards, tom lane
On Sun, Mar 25, 2007 at 10:18:14PM -0400, Tom Lane wrote: > David Fetter <david@fetter.org> writes: > > I've written up a patch intended to implement this on the > > non-pg_catalog tables and VIEWs, but while it builds, it doesn't > > initdb. Enclosed are the patch and the error log. > > Any hints as to what I might look at? > > > creating template1 database in /var/lib/pgsql/pgsql/src/test/regress/./tmp_check/data/base/1 ... ok > > initializing pg_authid ... ok > > initializing dependencies ... ok > > creating system views ... ok > > loading system objects' descriptions ... FATAL: cache lookup failed for type 11096 > > child process exited with exit code 1 > > That step of initdb creates a TEMP table ... maybe your patch doesn't > work for temp tables? Anyway, you're certainly far enough along there > that you could fire up the postmaster and reproduce the error in a > normal debugging environment. I've done that, and thanks to Andrew of Supernews, I've got a slightly better patch, albeit one that bombs out at the same spot. In the patch attached, it appears that TypeCreate is not doing the right thing in pg_depend, either because I'm not invoking it right or because it needs more machinery. Any ideas? Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
Attachment
On Tue, Mar 27, 2007 at 11:08:44AM -0700, David Fetter wrote: > On Sun, Mar 25, 2007 at 10:18:14PM -0400, Tom Lane wrote: > > David Fetter <david@fetter.org> writes: > > > I've written up a patch intended to implement this on the > > > non-pg_catalog tables and VIEWs, but while it builds, it doesn't > > > initdb. Enclosed are the patch and the error log. > > > Any hints as to what I might look at? > > > > > creating template1 database in /var/lib/pgsql/pgsql/src/test/regress/./tmp_check/data/base/1 ... ok > > > initializing pg_authid ... ok > > > initializing dependencies ... ok > > > creating system views ... ok > > > loading system objects' descriptions ... FATAL: cache lookup failed for type 11096 > > > child process exited with exit code 1 > > > > That step of initdb creates a TEMP table ... maybe your patch > > doesn't work for temp tables? Anyway, you're certainly far enough > > along there that you could fire up the postmaster and reproduce > > the error in a normal debugging environment. > > I've done that, and thanks to Andrew of Supernews, I've got a > slightly better patch, albeit one that bombs out at the same spot. > In the patch attached, it appears that TypeCreate is not doing the > right thing in pg_depend, either because I'm not invoking it right > or because it needs more machinery. > > Any ideas? Pardon the self-follow-up. Per further discussion with Andrew of Supernews and Merlin Moncure, I've added a check for compound types and moved the creation of the array type from DefineRelation in backend/commands/tablecmds.c to heap_create_with_catalog in backend/catalog/heap.c. It now initdb's successfully, but fails on a lot of regression tests. Please find attached the new patch vs. CVS TIP and the regression test output. Am I on the right track here? Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
Attachment
David Fetter wrote: > > It now initdb's successfully, but fails on a lot of regression tests. > > Please find attached the new patch vs. CVS TIP and the regression test > output. > > The regression test output is useless without seeing the regression diffs. cheers andrew
On 2007-03-27, David Fetter <david@fetter.org> wrote: > Per further discussion with Andrew of Supernews and Merlin Moncure, > I've added a check for compound types and moved the creation of the > array type from DefineRelation in backend/commands/tablecmds.c to > heap_create_with_catalog in backend/catalog/heap.c. You've still got the usage of the relation OID and the relation _type_ OID reversed. The array element type that you pass to TypeCreate must be the _type_ OID. -- Andrew, Supernews http://www.supernews.com - individual and corporate NNTP services
On Wed, Mar 28, 2007 at 07:05:24AM -0000, Andrew - Supernews wrote: > On 2007-03-27, David Fetter <david@fetter.org> wrote: > > Per further discussion with Andrew of Supernews and Merlin > > Moncure, I've added a check for compound types and moved the > > creation of the array type from DefineRelation in > > backend/commands/tablecmds.c to heap_create_with_catalog in > > backend/catalog/heap.c. > > You've still got the usage of the relation OID and the relation > _type_ OID reversed. > > The array element type that you pass to TypeCreate must be the > _type_ OID. The attached patch takes it down to two regression test failures, also attached: The first is in type_sanity, which basically doesn't understand that complex types now have array types associated with them and thinks they're orphan array types, so it's actually the test that's not right. The second is in alter_table where ALTER TABLE ... SET SCHEMA doesn't pick up the array types associated with the tables. Andrew at Supernews also noticed that in general, the array type doesn't change schemas when its base type does. Is this the intended behavior? If not, should we change it globally? Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
Attachment
David Fetter wrote: > The first is in type_sanity, which basically doesn't understand that > complex types now have array types associated with them and thinks > they're orphan array types, so it's actually the test that's not > right. Hmm, I question the usefulness of automatically creating array types for all relation types that are created -- the catalog bloat seems a bit too much. An array of pg_autovacuum for example, does that make sense? I'm not sure what was the reaction to having an "CREATE TYPE foo ARRAY OF bar" command of some kind? I think this was discussed but not explicitely rejected, or was it? -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote: > David Fetter wrote: > > >> The first is in type_sanity, which basically doesn't understand that >> complex types now have array types associated with them and thinks >> they're orphan array types, so it's actually the test that's not >> right. >> > > Hmm, I question the usefulness of automatically creating array types for > all relation types that are created -- the catalog bloat seems a bit too > much. An array of pg_autovacuum for example, does that make sense? > > I'm not sure what was the reaction to having an "CREATE TYPE foo ARRAY > OF bar" command of some kind? I think this was discussed but not > explicitely rejected, or was it? > > It certainly seems rather inconsistent to have array types autocreated for some types but not others. But unless we create them for all types then I think we need a command such as you suggest. How much bloat will this really be? If it's not used it won't get into the type cache. I find it hard to believe there will be any very significant performance effect. cheers andrew
Alvaro Herrera <alvherre@commandprompt.com> writes: > Hmm, I question the usefulness of automatically creating array types for > all relation types that are created -- the catalog bloat seems a bit too > much. An array of pg_autovacuum for example, does that make sense? Not only that, it won't even work for pg_statistic, which has got a special kluge to allow anyarray in a place where it usually mustn't go. > I'm not sure what was the reaction to having an "CREATE TYPE foo ARRAY > OF bar" command of some kind? I think this was discussed but not > explicitely rejected, or was it? I think this is a much better idea than automatically creating a pile of usually-useless types. In the long run maybe we should even migrate to the assumption that array types aren't automatically created? If we think this way, it changes the ground rules for Andrew's question about whether an array type ought to be affected by ALTER TYPE SET SCHEMA on its base type --- it starts to look more like an independent entity than an auxiliary component. I'm not really sure which answer I like better. One point here is that currently the system depends on the naming convention "foo[] is named _foo" to be able to find the array type from the base type. The syntax you suggest would break that. We could fix it by adding More Stuff to pg_type, but I wonder whether it's worth it, compared to sayCREATE ARRAY TYPE FOR foo Also, at the moment ALTER TYPE SET SCHEMA is certainly broken because it destroys this naming convention ... we either abandon the convention or fix SET SCHEMA. regards, tom lane
On Wed, Mar 28, 2007 at 01:33:56PM -0400, Andrew Dunstan wrote: > Alvaro Herrera wrote: > >David Fetter wrote: > >>The first is in type_sanity, which basically doesn't understand > >>that complex types now have array types associated with them and > >>thinks they're orphan array types, so it's actually the test > >>that's not right. > > > >Hmm, I question the usefulness of automatically creating array > >types for all relation types that are created -- the catalog bloat > >seems a bit too much. An array of pg_autovacuum for example, does > >that make sense? > > > >I'm not sure what was the reaction to having an "CREATE TYPE foo > >ARRAY OF bar" command of some kind? I think this was discussed but > >not explicitely rejected, or was it? > > It certainly seems rather inconsistent to have array types > autocreated for some types but not others. This was my thought in the latest version of the patch. > But unless we create them for all types then I think we need a > command such as you suggest. > > How much bloat will this really be? If it's not used it won't get > into the type cache. I find it hard to believe there will be any > very significant performance effect. So do I, but how would we check this? Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > CREATE ARRAY TYPE FOR foo I also made a suggestion along the way that we never create array types automatically except for domains. Ie, we don't need a new command, we just document that what you do if you want to create an array of something is create a domain for it then use arrays of that domain. I'm not sure whether having to create a new command is cleaner or less clean than overloading an existing command with two purposes. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Gregory Stark <stark@enterprisedb.com> writes: > "Tom Lane" <tgl@sss.pgh.pa.us> writes: >> CREATE ARRAY TYPE FOR foo > I also made a suggestion along the way that we never create array types > automatically except for domains. That seems awfully strange, not to mention very non-backwards-compatible since it exactly reverses what happens now. I'd be willing to consider it if a domain were a zero-cost addition to the equation, but it is not --- every operation on a domain has to check to see if there are constraints to enforce. You shouldn't have to buy into that overhead to have an array. regards, tom lane
On Wed, Mar 28, 2007 at 03:24:26PM -0400, Tom Lane wrote: > Gregory Stark <stark@enterprisedb.com> writes: > > "Tom Lane" <tgl@sss.pgh.pa.us> writes: > >> CREATE ARRAY TYPE FOR foo > > > I also made a suggestion along the way that we never create array > > types automatically except for domains. > > That seems awfully strange, not to mention very > non-backwards-compatible since it exactly reverses what happens now. > > I'd be willing to consider it if a domain were a zero-cost addition > to the equation, but it is not --- every operation on a domain has > to check to see if there are constraints to enforce. You shouldn't > have to buy into that overhead to have an array. The way I see the big picture, complex types, arrays and domains should all compose without limit, as in arrays of domains of complex types, etc. The SQL standard even has something like our SETOF (which should probably be called BAGOF, but let's not go there just now ;) in the form of MULTISET, and that, too, should eventually be in the above mix. I'm not advocating the idea that people should *store* those compositions--if it were just up to me, I'd disallow it--but they're very handy for input and output :) Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
On Fri, Mar 02, 2007 at 03:40:16PM -0800, David Fetter wrote: > Folks, > > I'd like to take the TODO item that reads, "Add support for arrays of > complex types," but before I start patching, I'd like to see whether > what I'm about to do makes any sense: After several rounds of patches, it appears that it might be easier to create a new typtype entry, which I'll tentatively call 'a' because it seems a little fragile and a lot inelegant and hard to maintain to have typtype='c' and typrelid=InvalidOid mean, "this is an array of complex types." I'd like to see about making this new typtype available for arrays of DOMAINs eventually, but that's not a requirement right this instant. What parts of the code would need a once-over? Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> writes: > After several rounds of patches, it appears that it might be easier to > create a new typtype entry, which I'll tentatively call 'a' because it > seems a little fragile and a lot inelegant and hard to maintain to > have typtype='c' and typrelid=InvalidOid mean, "this is an array of > complex types." Uh, wouldn't it be typtype = 'c' and typelem != 0 ? > I'd like to see about making this new typtype > available for arrays of DOMAINs eventually, but that's not a > requirement right this instant. Hmm. It might not be a bad idea to switch to 'a' for arrays over regular scalar types too. Right now we have some klugy rules involving looking at typlen to decide whether an array type is a normal array. (There are also some special subscriptable types like name and point, which should continue to not use 'a' because they are not general purpose arrays. So the 'a' marker wouldn't be entirely redundant with typelem being nonzero, rather checking for 'a' would replace the places where we test both typelem and typlen.) OTOH this is a lot of hacking for something that I'm not convinced is really needed. Anyway, the point is that I dislike the idea of doing arrays for complex types differently from those for scalars --- either both should use a new typtype, or neither. If you try to do it differently then you'll have more complexity, not less, since there are a lot of places that shouldn't need to care. get_element_type() is an example. > What parts of the code would need a once-over? A lot :-( ... probably every place that touches typtype or typelem would need at least a look. It'd be a good idea to take the opportunity to start using macros for the values of typtype, as we do for relkind but for some reason never adopted for typtype. regards, tom lane
I wrote: > David Fetter <david@fetter.org> writes: >> What parts of the code would need a once-over? > A lot :-( ... probably every place that touches typtype or typelem would > need at least a look. It'd be a good idea to take the opportunity to > start using macros for the values of typtype, as we do for relkind but > for some reason never adopted for typtype. I just realized that I need to check every usage of typtype to be sure that the enums patch is sane. So, barring objection, I intend to take this opportunity to make the code stop referring directly to 'b', 'c' etc whereever possible. Any objections to these names? #define TYPTYPE_BASE 'b' #define TYPTYPE_COMPOSITE 'c' #define TYPTYPE_DOMAIN 'd' #define TYPTYPE_ENUM 'e' #define TYPTYPE_PSEUDO 'p' regards, tom lane
Tom Lane wrote: > I wrote: > > David Fetter <david@fetter.org> writes: > >> What parts of the code would need a once-over? > > > A lot :-( ... probably every place that touches typtype or typelem would > > need at least a look. It'd be a good idea to take the opportunity to > > start using macros for the values of typtype, as we do for relkind but > > for some reason never adopted for typtype. > > I just realized that I need to check every usage of typtype to be sure > that the enums patch is sane. So, barring objection, I intend to take > this opportunity to make the code stop referring directly to 'b', 'c' > etc whereever possible. Any objections to these names? > > #define TYPTYPE_BASE 'b' > #define TYPTYPE_COMPOSITE 'c' > #define TYPTYPE_DOMAIN 'd' > #define TYPTYPE_ENUM 'e' > #define TYPTYPE_PSEUDO 'p' I like macros. ;-) -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Sat, Mar 31, 2007 at 07:13:20PM -0400, Bruce Momjian wrote: > Tom Lane wrote: > > I wrote: > > > David Fetter <david@fetter.org> writes: > > >> What parts of the code would need a once-over? > > > > > A lot :-( ... probably every place that touches typtype or typelem would > > > need at least a look. It'd be a good idea to take the opportunity to > > > start using macros for the values of typtype, as we do for relkind but > > > for some reason never adopted for typtype. > > > > I just realized that I need to check every usage of typtype to be sure > > that the enums patch is sane. So, barring objection, I intend to take > > this opportunity to make the code stop referring directly to 'b', 'c' > > etc whereever possible. Any objections to these names? > > > > #define TYPTYPE_BASE 'b' > > #define TYPTYPE_COMPOSITE 'c' > > #define TYPTYPE_DOMAIN 'd' > > #define TYPTYPE_ENUM 'e' > > #define TYPTYPE_PSEUDO 'p' > > I like macros. ;-) Macros are great. :) What say we put one in pre-emptively for TYPTYPE_ARRAY? Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> writes: > What say we put one in pre-emptively for TYPTYPE_ARRAY? When and if the patch appears, you can add it ;-). I'm just intending a search-and-replace at the moment. regards, tom lane
On Sat, Mar 31, 2007 at 07:58:34PM -0400, Tom Lane wrote: > David Fetter <david@fetter.org> writes: > > What say we put one in pre-emptively for TYPTYPE_ARRAY? > > When and if the patch appears, you can add it ;-). I'm just intending a > search-and-replace at the moment. Like this? Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
Attachment
Bruce Momjian wrote: > > I just realized that I need to check every usage of typtype to be > > sure that the enums patch is sane. So, barring objection, I intend > > to take this opportunity to make the code stop referring directly > > to 'b', 'c' etc whereever possible. Any objections to these names? > > > > #define TYPTYPE_BASE 'b' > > #define TYPTYPE_COMPOSITE 'c' > > #define TYPTYPE_DOMAIN 'd' > > #define TYPTYPE_ENUM 'e' > > #define TYPTYPE_PSEUDO 'p' > > I like macros. ;-) Why not enums? ;-) -- Peter Eisentraut http://developer.postgresql.org/~petere/
Peter Eisentraut <peter_e@gmx.net> writes: >>> I just realized that I need to check every usage of typtype to be >>> sure that the enums patch is sane. So, barring objection, I intend >>> to take this opportunity to make the code stop referring directly >>> to 'b', 'c' etc whereever possible. Any objections to these names? > Why not enums? ;-) Well, macros is how we do it elsewhere for "char" system columns. I'm not sure we could rely on the behavior if we declared pg_type.typtype as an enum type ... and if we don't, there's not much point. regards, tom lane
Tom Lane wrote: > > Why not enums? ;-) > > Well, macros is how we do it elsewhere for "char" system columns. > I'm not sure we could rely on the behavior if we declared > pg_type.typtype as an enum type ... and if we don't, there's not > much point. I was thinking C enums: enum typtype_type {TYPTYPE_BASE = 'b',TYPTYPE_COMPOSITE = 'c',TYPTYPE_DOMAIN = 'd',TYPTYPE_ENUM = 'e',TYPTYPE_PSEUDO = 'p' }; I'm not sure if this is better. -- Peter Eisentraut http://developer.postgresql.org/~petere/
Peter Eisentraut <peter_e@gmx.net> writes: > Tom Lane wrote: >> I'm not sure we could rely on the behavior if we declared >> pg_type.typtype as an enum type ... and if we don't, there's not >> much point. > I was thinking C enums: > enum typtype_type { > TYPTYPE_BASE = 'b', > TYPTYPE_COMPOSITE = 'c', > TYPTYPE_DOMAIN = 'd', > TYPTYPE_ENUM = 'e', > TYPTYPE_PSEUDO = 'p' > }; > I'm not sure if this is better. What bothers me about that is I don't think the C spec mandates the representation width. If we could guarantee that enum typtype_type was 1 byte I'd be all for it. regards, tom lane
Tom Lane wrote: > What bothers me about that is I don't think the C spec mandates the > representation width. If we could guarantee that enum typtype_type > was 1 byte I'd be all for it. The width is 4 both for the macro and the enum case. Both #define TYPTYPE_BASE 'b' and enum ... {TYPTYPE_BASE = 'b', effectively generate int constants named TYPTYPE_BASE with decimal value 98. So there are no storage advantages either way. -- Peter Eisentraut http://developer.postgresql.org/~petere/
"Peter Eisentraut" <peter_e@gmx.net> writes: > Tom Lane wrote: >> What bothers me about that is I don't think the C spec mandates the >> representation width. If we could guarantee that enum typtype_type >> was 1 byte I'd be all for it. > > The width is 4 both for the macro and the enum case. Both > > #define TYPTYPE_BASE 'b' > > and > > enum ... { > TYPTYPE_BASE = 'b', > > effectively generate int constants named TYPTYPE_BASE with decimal value > 98. So there are no storage advantages either way. That's not accurate at all. The macro case gives you a constant you can only use to initialize integer variables and members that are explicitly declared with some integral type. If we consistently declare them "char" then they'll be predictably 1 byte long. The enum case does two things. It defines a syntactic meaning for the label, *and* it defines a thing "enum typtype" which can be used to define variables and members. If the latter is used then Tom is saying the standard doesn't specify what width the variable or member will be. We could use enum {} to define the labels and then make a rule that all actual variables should be declared using "char" rather than declaring them as "enum typtype". But I fear somebody would get that wrong some day. The worst thing is that it might work on their compiler or it might even work on all compilers and just silently be causing things to be aligned differently than they expect. On the other hand it I don't really think it would cause any problems if people stored their typtypes in integers. Except for the actual FormData_pg_* structures the precise alignment doesn't actually matter for anything does it? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Gregory Stark wrote: > > The width is 4 both for the macro and the enum case. Both > > > > #define TYPTYPE_BASE 'b' > > > > and > > > > enum ... { > > TYPTYPE_BASE = 'b', > > > > effectively generate int constants named TYPTYPE_BASE with decimal > > value 98. So there are no storage advantages either way. > > That's not accurate at all. How so? > The macro case gives you a constant you > can only use to initialize integer variables and members that are > explicitly declared with some integral type. If we consistently > declare them "char" then they'll be predictably 1 byte long. But character constants are actually ints, so when you do what you describe then the compiler has to generate code to copy a four-byte integer into a single byte. (Of course that can be optimized away, probably.) > The enum case does two things. It defines a syntactic meaning for the > label, *and* it defines a thing "enum typtype" which can be used to > define variables and members. If the latter is used then Tom is > saying the standard doesn't specify what width the variable or member > will be. The standard says that enums are the same as ints. So when you assign an enum label to a char variable, then compiler has to generate code to copy a four-byte integer into a single byte. (Of course that can be optimized away, probably.) The fact that you can also declare variables of the enum type is not under consideration here. QED -- Peter Eisentraut http://developer.postgresql.org/~petere/
Gregory Stark <stark@enterprisedb.com> writes: > We could use enum {} to define the labels and then make a rule that > all actual variables should be declared using "char" rather than > declaring them as "enum typtype". But I fear somebody would get that > wrong some day. Yeah, that seems to me to be just asking for trouble. > On the other hand it I don't really think it would cause any problems > if people stored their typtypes in integers. Except for the actual > FormData_pg_* structures the precise alignment doesn't actually matter > for anything does it? The layout of the FormData struct is exactly the sticking point. If the compiler makes the size or alignment of a struct field different from what the tuple packing/unpacking code does for the corresponding column type, we've got big trouble. As for Peter's claim that the storage of an enum field is always int, I think the C spec says otherwise. In 6.7.2.2 of C99 I see [#4] Each enumerated type shall be compatible with an integer type. The choice of type is implementation-defined,97) but shall be capable of representing the values of all the members of the enumeration. Theenumerated type is incomplete until after the } that terminates the list of enumerator declarations. 97) An implementation may delay the choice of which integer type until all enumeration constants have been seen. It seems clear to me that this authorizes, but *does not require*, the compiler to store an enum field in a byte or short instead of an int when all the declared values will fit. So if we tried to do this, we'd have the problem of needing compiler-specific data type information entered in pg_type. Perhaps all C compilers do this alike, but how would we know? Anyway the possible gain seems not worth the risk to me. regards, tom lane
Tom Lane wrote: > It seems clear to me that this authorizes, but *does not require*, > the compiler to store an enum field in a byte or short instead of > an int when all the declared values will fit. So if we tried to > do this, we'd have the problem of needing compiler-specific data > type information entered in pg_type. FWIW, I never meant to suggest using enums tuple structures. I did, however, stumble over a case that appears to be handled similar to what I had in mind: see enum CoercionCodes in primnodes.h. Again, it's not really important, but it's interesting to see that there is precedent. -- Peter Eisentraut http://developer.postgresql.org/~petere/
Peter Eisentraut <peter_e@gmx.net> writes: > Tom Lane wrote: >> It seems clear to me that this authorizes, but *does not require*, >> the compiler to store an enum field in a byte or short instead of >> an int when all the declared values will fit. > FWIW, I never meant to suggest using enums tuple structures. I did, > however, stumble over a case that appears to be handled similar to what > I had in mind: see enum CoercionCodes in primnodes.h. Again, it's not > really important, but it's interesting to see that there is precedent. AFAIK, we don't store CoercionCodes in any system catalog columns. But now that you mention it there is at least one place where we do it like that: pg_depend.deptype is a "char" but its values are defined by enum DependencyType. I have some recollection of doing it that way because I was concerned that most places that cared about dependency types should be switch statements that covered all the possible values (which is pretty much the only benefit you get from doing it that way). Having just gone over the typtype uses, there are only a couple of places where we'd win from having that sort of compile-time check for typtype. regards, tom lane
On Fri, Mar 30, 2007 at 05:08:42PM -0400, Tom Lane wrote: > David Fetter <david@fetter.org> writes: > > After several rounds of patches, it appears that it might be easier to > > create a new typtype entry, which I'll tentatively call 'a' because it > > seems a little fragile and a lot inelegant and hard to maintain to > > have typtype='c' and typrelid=InvalidOid mean, "this is an array of > > complex types." > > Uh, wouldn't it be typtype = 'c' and typelem != 0 ? Right. The attached patch passes the current regression tests and at least to a "smoke test" level does what it's supposed to do. I'd really like to help refactor the whole array system to use 'a', tho. Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
Attachment
Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it. --------------------------------------------------------------------------- David Fetter wrote: > On Fri, Mar 30, 2007 at 05:08:42PM -0400, Tom Lane wrote: > > David Fetter <david@fetter.org> writes: > > > After several rounds of patches, it appears that it might be easier to > > > create a new typtype entry, which I'll tentatively call 'a' because it > > > seems a little fragile and a lot inelegant and hard to maintain to > > > have typtype='c' and typrelid=InvalidOid mean, "this is an array of > > > complex types." > > > > Uh, wouldn't it be typtype = 'c' and typelem != 0 ? > > Right. The attached patch passes the current regression tests and at > least to a "smoke test" level does what it's supposed to do. I'd > really like to help refactor the whole array system to use 'a', tho. > > Cheers, > D > -- > David Fetter <david@fetter.org> http://fetter.org/ > phone: +1 415 235 3778 AIM: dfetter666 > Skype: davidfetter > > Remember to vote! > Consider donating to PostgreSQL: http://www.postgresql.org/about/donate [ Attachment, skipping... ] > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote: > > Your patch has been added to the PostgreSQL unapplied patches list at: > > http://momjian.postgresql.org/cgi-bin/pgpatches > > It will be applied as soon as one of the PostgreSQL committers reviews > and approves it. > So, hum, what happened to the idea of creating the array types only on demand? > > > David Fetter wrote: > > On Fri, Mar 30, 2007 at 05:08:42PM -0400, Tom Lane wrote: > > > David Fetter <david@fetter.org> writes: > > > > After several rounds of patches, it appears that it might be easier to > > > > create a new typtype entry, which I'll tentatively call 'a' because it > > > > seems a little fragile and a lot inelegant and hard to maintain to > > > > have typtype='c' and typrelid=InvalidOid mean, "this is an array of > > > > complex types." > > > > > > Uh, wouldn't it be typtype = 'c' and typelem != 0 ? > > > > Right. The attached patch passes the current regression tests and at > > least to a "smoke test" level does what it's supposed to do. I'd > > really like to help refactor the whole array system to use 'a', tho. > > > > Cheers, -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Mon, Apr 02, 2007 at 10:01:44PM -0400, Alvaro Herrera wrote: > Bruce Momjian wrote: > > > > Your patch has been added to the PostgreSQL unapplied patches list > > at: > > > > http://momjian.postgresql.org/cgi-bin/pgpatches > > > > It will be applied as soon as one of the PostgreSQL committers > > reviews and approves it. > > > > So, hum, what happened to the idea of creating the array types only > on demand? Scotched, as far as I could tell, partly due to nobody's having actually done work toward such a thing, and partly because the closest thing I've heard to an objection is pretty nebulous. :) It's a lot simpler to have them always, and it fits in with the larger picture of making arrays fully composable with other operations like DOMAIN, ENUM and TYPE. Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> writes: > On Mon, Apr 02, 2007 at 10:01:44PM -0400, Alvaro Herrera wrote: >> So, hum, what happened to the idea of creating the array types only >> on demand? > Scotched, as far as I could tell, More like "you submitted a patch that entirely ignores multiple people's opinion on what is needed". Bruce may have put this into the patch queue, but do not labor under the delusion that that means it'll get applied as-is. The queue is currently operating as a list of open issues. regards, tom lane
On Tue, Apr 03, 2007 at 02:30:07AM -0400, Tom Lane wrote: > David Fetter <david@fetter.org> writes: > > On Mon, Apr 02, 2007 at 10:01:44PM -0400, Alvaro Herrera wrote: > >> So, hum, what happened to the idea of creating the array types > >> only on demand? > > > Scotched, as far as I could tell, > > More like "you submitted a patch that entirely ignores multiple > people's opinion on what is needed". > > Bruce may have put this into the patch queue, but do not labor under > the delusion that that means it'll get applied as-is. I assure you I'm not. Two glaring things it's missing are regression tests and documentation. I should have those in this week. Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
Tom Lane wrote: > David Fetter <david@fetter.org> writes: > > On Mon, Apr 02, 2007 at 10:01:44PM -0400, Alvaro Herrera wrote: > >> So, hum, what happened to the idea of creating the array types only > >> on demand? > > > Scotched, as far as I could tell, > > More like "you submitted a patch that entirely ignores multiple people's > opinion on what is needed". > > Bruce may have put this into the patch queue, but do not labor under > the delusion that that means it'll get applied as-is. The queue is > currently operating as a list of open issues. Correct. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Tom Lane wrote: > David Fetter <david@fetter.org> writes: > >> On Mon, Apr 02, 2007 at 10:01:44PM -0400, Alvaro Herrera wrote: >> >>> So, hum, what happened to the idea of creating the array types only >>> on demand? >>> > > >> Scotched, as far as I could tell, >> > > More like "you submitted a patch that entirely ignores multiple people's > opinion on what is needed". > > Bruce may have put this into the patch queue, but do not labor under > the delusion that that means it'll get applied as-is. The queue is > currently operating as a list of open issues. > > One of the things that's been bothering me about this proposal is that it leaves untouched and indeed greatly expands the scope of the typename mangling we do. (i.e. adding an entry to pg_type with _ prepended). Up to now we've only used this gadget in a way that might matter a lot on user defined non-composite types, I think, and now we have expanded that to include enums, which are really a special case of user defined non-composites which don't require an extra C module. That's a comparatively small window, but this proposal will extend it to all composites, which is quite a large expansion in scope. And since _ is a perfectly legal initial char for an identifier, if type _foo exists then any attempt to create a table or view or composite called foo will fail. Is it possible to fix this, or am I trying to shut the stable door after the horse has well and truly bolted? If it can be fixed, I'd like to see it fixed before we fix the problem David is trying to address here. It's been suggested to me that this is an insignificant corner case. But I have often seen coding standards that actually require certain classes of identifier to being with _, so it's very far from a merely theoretical point. I'm slightly inclined to agree with David that the danger of catalog bloat isn't that great, and might not justify the extra work that some sort of explicit array creation would involve (e.g. changes in grammar, pg_dump), as long as we are agreed that we don't want array types ever to have their own user definable names or settable namespace. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > One of the things that's been bothering me about this proposal is that > it leaves untouched and indeed greatly expands the scope of the typename > mangling we do. (i.e. adding an entry to pg_type with _ prepended). Yeah, that's been bothering me too. One of the problems with the patch as-is is that it extends the 62-instead-of-63-char limit to table names as well as type names. I've been thinking of proposing that we add a column to pg_type that points from a type to its array type (if any), ie the reverse link from typelem. If we had that then the parser could follow that to determine which type is foo[], instead of relying on the _foo naming convention. I don't suggest that we stop using the naming convention, but it would no longer be a hard-and-fast rule, just a convention. In particular we could rejigger things around the edges to reduce the name conflict problem. For instance the rule for forming array type names could be "prepend _, truncate to less than 64 bytes if necessary, then substitute numbers at the end if needed to get something unique". This is not all that different from what we do now to get unique serial sequence names, for example. This would also open the door to supporting CREATE TYPE foo AS ARRAY OF bar without having to have any restrictions about the name of foo. I'd still much rather do things that way for arrays of composites than invent a ton of pg_type entries that are mostly going to go unused. regards, tom lane PS: Has anyone looked at what it will take to make the entries in an array-of-composite be something smaller than full tuples? It's not going to be anything but a toy unless you can get the per-entry overhead down to something sane. Perhaps the MinimalTuple representation would work.
On Sun, Apr 08, 2007 at 07:08:38PM -0400, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > One of the things that's been bothering me about this proposal is that > > it leaves untouched and indeed greatly expands the scope of the typename > > mangling we do. (i.e. adding an entry to pg_type with _ prepended). > > Yeah, that's been bothering me too. One of the problems with the patch > as-is is that it extends the 62-instead-of-63-char limit to table names > as well as type names. I did this by copying some code which already creates array names, so should that code change to do something different, the 62-instead-of-63-char thing would go away along with it. I agree that the prepended _s are far from optimal. > I've been thinking of proposing that we add a column to pg_type that > points from a type to its array type (if any), ie the reverse link > from typelem. If we had that then the parser could follow that to > determine which type is foo[], instead of relying on the _foo naming > convention. I don't suggest that we stop using the naming convention, > but it would no longer be a hard-and-fast rule, just a convention. That'd be neat :) > In particular we could rejigger things around the edges to reduce > the name conflict problem. For instance the rule for forming array type > names could be "prepend _, truncate to less than 64 bytes if necessary, > then substitute numbers at the end if needed to get something unique". > This is not all that different from what we do now to get unique > serial sequence names, for example. > > This would also open the door to supporting > > CREATE TYPE foo AS ARRAY OF bar I'm sorry to keep harping on this, but I really don't see a use case and do see foot guns both with making the array types optional and with decoupling their names from those of their respective compound types. When they're optional, we get all kinds of "stepping on a step that isn't there" issues, and when they're decoupled, operations like, "ALTER TABLE foo RENAME TO bar" have either surprising or undefined behavior, or both. > without having to have any restrictions about the name of foo. > I'd still much rather do things that way for arrays of composites > than invent a ton of pg_type entries that are mostly going to go > unused. I'm sure there's a better way than my first attempt. > PS: Has anyone looked at what it will take to make the entries in an > array-of-composite be something smaller than full tuples? It's not > going to be anything but a toy unless you can get the per-entry > overhead down to something sane. Perhaps the MinimalTuple > representation would work. Sounds neat, too :) Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
Tom Lane wrote: > > I've been thinking of proposing that we add a column to pg_type that > points from a type to its array type (if any), ie the reverse link > from typelem. If we had that then the parser could follow that to > determine which type is foo[], instead of relying on the _foo naming > convention. good. > I don't suggest that we stop using the naming convention, > but it would no longer be a hard-and-fast rule, just a convention. > In particular we could rejigger things around the edges to reduce > the name conflict problem. For instance the rule for forming array type > names could be "prepend _, truncate to less than 64 bytes if necessary, > then substitute numbers at the end if needed to get something unique". > This is not all that different from what we do now to get unique > serial sequence names, for example. > Sounds OK but I'd add something that might make it even more unlikely to generate a name clash. > This would also open the door to supporting > > CREATE TYPE foo AS ARRAY OF bar > > without having to have any restrictions about the name of foo. > I'd still much rather do things that way for arrays of composites > than invent a ton of pg_type entries that are mostly going to go > unused. > > > ISTM we should either do it all automatically or all manually. If you want user defined names for array types then we can forget name mangling for user defined types and do everything manually. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > Tom Lane wrote: >> I don't suggest that we stop using the naming convention, >> but it would no longer be a hard-and-fast rule, just a convention. >> In particular we could rejigger things around the edges to reduce >> the name conflict problem. For instance the rule for forming array type >> names could be "prepend _, truncate to less than 64 bytes if necessary, >> then substitute numbers at the end if needed to get something unique". >> This is not all that different from what we do now to get unique >> serial sequence names, for example. > Sounds OK but I'd add something that might make it even more unlikely to > generate a name clash. Like what? I don't want to stray far from _foo when we don't have to, because I'm sure there is user code out there that'll still rely on that naming convention; we shouldn't break it if we don't have to. regards, tom lane
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > >> Tom Lane wrote: >> >>> I don't suggest that we stop using the naming convention, >>> but it would no longer be a hard-and-fast rule, just a convention. >>> In particular we could rejigger things around the edges to reduce >>> the name conflict problem. For instance the rule for forming array type >>> names could be "prepend _, truncate to less than 64 bytes if necessary, >>> then substitute numbers at the end if needed to get something unique". >>> This is not all that different from what we do now to get unique >>> serial sequence names, for example. >>> > > >> Sounds OK but I'd add something that might make it even more unlikely to >> generate a name clash. >> > > Like what? I don't want to stray far from _foo when we don't have to, > because I'm sure there is user code out there that'll still rely on > that naming convention; we shouldn't break it if we don't have to. > > > Oh, in that case maybe we'd better live with it :-( I certainly think we should deprecate relying on it. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > I'm slightly inclined to agree with David that the danger of catalog > bloat isn't that great, and might not justify the extra work that some > sort of explicit array creation would involve (e.g. changes in grammar, > pg_dump), as long as we are agreed that we don't want array types ever > to have their own user definable names or settable namespace. I did some tests just now to determine the total number of catalog entries associated with a simple table definition. Assuming it has N user columns of built-in types (hence not requiring pg_depend entries for the datatypes), I count 1 pg_class entry for the table itself 1 pg_type entry for the rowtype N + 6 pg_attribute entries for the user and system columns 2 pg_depend entries (type -> table and table -> namespace) 2 pg_shdepend entries (ownership of table and type) Of course this goes up *fast* if you need a toast table, indexes, constraints, etc, but that's the irreducible minimum. Generating an array rowtype would add three more catalog entries to this (the array pg_type entry, a pg_depend arraytype->rowtype link, and another pg_shdepend entry), which isn't a huge percentage overhead. Obviously if we wanted to trim some fat here, getting rid of the redundant pg_attribute entries for system columns would be the first place to look. Based on this, I withdraw my efficiency concern about generating rowtypes for all user tables. I do, however, still object to generating them for system tables. In particular an array type for pg_statistic will actively Not Work and probably constitute a security hole, because of the "anyarray" hack we use there. BTW, I just noticed that we currently create array types with AUTO dependencies on their element type, meaning that you can drop them separately: regression=# create type fooey as enum ('a','b'); CREATE TYPE regression=# drop type _fooey; DROP TYPE Is this a bad idea? If we made the dependency INTERNAL then the system would refuse the drop above. I think we would have to do that if we wanted to add the base->array link I suggested, because otherwise this drop would leave a dangling pointer in pg_type. regards, tom lane
Tom Lane wrote: > > Based on this, I withdraw my efficiency concern about generating > rowtypes for all user tables. I do, however, still object to generating > them for system tables. In particular an array type for pg_statistic > will actively Not Work and probably constitute a security hole, because > of the "anyarray" hack we use there. > > How would we do that? Not create the array types in bootstrap mode? Or just special-case pg_statistic? cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > How would we do that? Not create the array types in bootstrap mode? Or > just special-case pg_statistic? Not generate them in bootstrap mode works for me. IIRC, there's code somewhere in there that allows anyarray to pass as a column type in bootstrap mode, so that seems to fit ... regards, tom lane
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > >> How would we do that? Not create the array types in bootstrap mode? Or >> just special-case pg_statistic? >> > > Not generate them in bootstrap mode works for me. IIRC, there's code > somewhere in there that allows anyarray to pass as a column type in > bootstrap mode, so that seems to fit ... > > > OK, summarising what looks to me like a consensus position, ISTM the plan could be: . fix makeArrayTypeName() or similar to make it try harder to generate a unique non-clashing name . remove the existing "62 instead of 63" name length restrictions . autogenerate array types for all explicitly or implicitly created composite types other than for system catalog objects. . defer for the present any consideration of a "CREATE TYPE foo AS ARRAY ..." command. Regarding catalog objects, we might have to try a little harder than just not generating in bootstrap mode - IIRC we generate system views (including pg_stats) in non-bootstrap mode. Maybe we just need to exempt anything in the pg_catalog namespace. What would happen if a user created a view over pg_statistic? Should the test be to avoid arrays for things that depend on the catalogs? Or maybe we should go to the heart of the problem and simply check for pseudo-types directly. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > Regarding catalog objects, we might have to try a little harder than > just not generating in bootstrap mode - IIRC we generate system views > (including pg_stats) in non-bootstrap mode. Maybe we just need to exempt > anything in the pg_catalog namespace. What would happen if a user > created a view over pg_statistic? Nothing: regression=# create view vvv as select * from pg_statistic; ERROR: column "stavalues1" has pseudo-type anyarray which means we do have an issue for the pg_stats view. Now that I look instead of guessing, the existing test in CheckAttributeType is not on bootstrap mode but standalone mode: /* Special hack for pg_statistic: allow ANYARRAY during initdb */ if (atttypid != ANYARRAYOID || IsUnderPostmaster) ereport(ERROR, (errcode(ERRCODE_INVALID_TABLE_DEFINITION), errmsg("column \"%s\" has pseudo-type %s", attname, format_type_be(atttypid)))); so for consistency we should use the same condition to suppress types for system catalogs. > Or maybe we should go to the heart > of the problem and simply check for pseudo-types directly. Actually we may have an issue already: regression=# create table zzz (f1 pg_statistic); CREATE TABLE I couldn't make it misbehave in a short amount of trying: regression=# insert into zzz values(row(0,0,0,0,0,0,0,0,0,0,0,0,0,array[1,2],null,null,null,array[12,13],null,null,null)); ERROR: ROW() column has type integer[] instead of type anyarray but I don't feel comfortable about this at all. Maybe CheckAttributeType should be made to recurse into composite columns. regards, tom lane
On Mon, Apr 09, 2007 at 10:40:49AM -0400, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > Regarding catalog objects, we might have to try a little harder than > > just not generating in bootstrap mode - IIRC we generate system views > > (including pg_stats) in non-bootstrap mode. Maybe we just need to exempt > > anything in the pg_catalog namespace. What would happen if a user > > created a view over pg_statistic? > > Nothing: > > regression=# create view vvv as select * from pg_statistic; > ERROR: column "stavalues1" has pseudo-type anyarray > > which means we do have an issue for the pg_stats view. Now that I look > instead of guessing, the existing test in CheckAttributeType is not on > bootstrap mode but standalone mode: > > /* Special hack for pg_statistic: allow ANYARRAY during initdb */ > if (atttypid != ANYARRAYOID || IsUnderPostmaster) > ereport(ERROR, > (errcode(ERRCODE_INVALID_TABLE_DEFINITION), > errmsg("column \"%s\" has pseudo-type %s", > attname, format_type_be(atttypid)))); > > so for consistency we should use the same condition to suppress types > for system catalogs. Groovy :) > > Or maybe we should go to the heart > > of the problem and simply check for pseudo-types directly. > > Actually we may have an issue already: > > regression=# create table zzz (f1 pg_statistic); > CREATE TABLE > > I couldn't make it misbehave in a short amount of trying: > > regression=# insert into zzz values(row(0,0,0,0,0,0,0,0,0,0,0,0,0,array[1,2],null,null,null,array[12,13],null,null,null)); > ERROR: ROW() column has type integer[] instead of type anyarray > > but I don't feel comfortable about this at all. Maybe > CheckAttributeType should be made to recurse into composite columns. That'd be great :) Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
On Mon, Apr 09, 2007 at 10:14:41AM -0400, Andrew Dunstan wrote: > . defer for the present any consideration of a "CREATE TYPE foo AS ARRAY > ..." command. What is the rationale for allowing people to name the array type. When I originally proposed the syntax I presumed that the array name would be kept internal and hidden from the user, just that it would exist after that command. What possible reason is there for allowing the user to give the array type a name? Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Martijn van Oosterhout wrote: > On Mon, Apr 09, 2007 at 10:14:41AM -0400, Andrew Dunstan wrote: > >> . defer for the present any consideration of a "CREATE TYPE foo AS ARRAY >> ..." command. >> > > What is the rationale for allowing people to name the array type. When > I originally proposed the syntax I presumed that the array name would > be kept internal and hidden from the user, just that it would exist > after that command. > > What possible reason is there for allowing the user to give the array > type a name? > > Have a nice day, > Some type systems have named array types, some don't. I can live happily with either. Are array types anonymous in the standard? At any rate, the point of the remark was to take this off the table for now. cheers andrew
On Mon, Apr 09, 2007 at 04:07:16PM -0400, Andrew Dunstan wrote: > Some type systems have named array types, some don't. I can live happily > with either. Are array types anonymous in the standard? Yes, they're anonymous in the standard. That doesn't mean we can't give them names if we wanted... > At any rate, the point of the remark was to take this off the table for now. Sure, once the array types are created automatically the command becomes completely redundant. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
I wrote: > > OK, summarising what looks to me like a consensus position, ISTM the > plan could be: > > . fix makeArrayTypeName() or similar to make it try harder to generate > a unique non-clashing name > . remove the existing "62 instead of 63" name length restrictions > . autogenerate array types for all explicitly or implicitly created > composite types other than for system catalog objects. > . defer for the present any consideration of a "CREATE TYPE foo AS > ARRAY ..." command. > > Regarding catalog objects, we might have to try a little harder than > just not generating in bootstrap mode - IIRC we generate system views > (including pg_stats) in non-bootstrap mode. Maybe we just need to > exempt anything in the pg_catalog namespace. What would happen if a > user created a view over pg_statistic? Should the test be to avoid > arrays for things that depend on the catalogs? Or maybe we should go > to the heart of the problem and simply check for pseudo-types directly. > > I've been working on David's patch and done the following: . inhibit creation of array types for composites during initdb . some bug fixes . have CheckAttributeType recurse into composite types, so you can no longer create a table/type with a composite field which contains a pseudo-type column (like pg_statistic) However, there are still some oddities. For example, a change to or removal of the base type affects the array type, but the array type can be directly operated on (e.g. alter type _aa set schema foo ). I'm inclined to say we should prevent direct operations on array types, and they should live or die by their parent types. Thoughts? cheers andrew
On Sun, May 06, 2007 at 01:33:47PM -0400, Andrew Dunstan wrote: > However, there are still some oddities. For example, a change to or > removal of the base type affects the array type, but the array type > can be directly operated on (e.g. alter type _aa set schema foo ). > I'm inclined to say we should prevent direct operations on array > types, and they should live or die by their parent types. > > Thoughts? +1 on binding the array types tightly to the parent types. Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! Consider donating to PostgreSQL: http://www.postgresql.org/about/donate
Awhile back I wrote: > I did some tests just now to determine the total number of catalog > entries associated with a simple table definition. Assuming it has > N user columns of built-in types (hence not requiring pg_depend entries > for the datatypes), I count > 1 pg_class entry for the table itself > 1 pg_type entry for the rowtype > N + 6 pg_attribute entries for the user and system columns > 2 pg_depend entries (type -> table and table -> namespace) > 2 pg_shdepend entries (ownership of table and type) > Of course this goes up *fast* if you need a toast table, indexes, > constraints, etc, but that's the irreducible minimum. > Generating an array rowtype would add three more catalog entries to this > (the array pg_type entry, a pg_depend arraytype->rowtype link, and > another pg_shdepend entry), which isn't a huge percentage overhead. > Obviously if we wanted to trim some fat here, getting rid of the > redundant pg_attribute entries for system columns would be the first > place to look. BTW, in the array patch as just committed, I was able to get rid of the pg_shdepend entries for a table rowtype (when it's not a free-standing composite type) and for an array type; instead they indirectly depend on the owner of the parent table or element type respectively. So the net increase from 8.2 is only one catalog entry (we save one existing pg_shdepend entry for the rowtype, and then add a pg_type entry for the array type and a pg_depend entry to link it to the rowtype). regards, tom lane
On 5/11/07, Tom Lane <tgl@sss.pgh.pa.us> wrote: > BTW, in the array patch as just committed, I was able to get rid of the I am testing this feature, no problem so far. It's fast, and works exactly as advertised! Great work! (aiui, no domain arrays for 8.3?) merlin