Thread: Container Types
The standard has several constructs for creating new types from other types. I don't mean anything like CREATE TYPE here, I mean things like this: - ROW(a, b, c), (<explicit row value constructor>) - ARRAY[a, b, c], (<array value constructor by enumeration>) - PERIOD(a, b), (<period predicand>) - MULTISET[a, b, c], (<multiset value constructor by enumeration>) - MDARRAY[x(1:3)][a, b, c], (<md-array value constructor by enumeration>) I am not sure what magic we use for the row value constructor. We handle ARRAY by creating an array type for every non-array type that is created. Periods are very similar to range types and we have to create new functions such as int4range(a,b) and int8range(a,b) instead of some kind of generic RANGE(a, b, '[)') and not worrying about what the type is as long as there is a btree opclass for it. Obviously there would have to be an actual type in order to store it in a table, but what I am most interested in here is being able to create them on the fly. I do not think it is feasible to create N new types for every type like we do for arrays on the off-chance you would want to put it in a PERIOD for example. For those who know the code much better than I do, what would be a plausible way forward to support these containers? -- Vik Fearing
On Tue, 2022-12-20 at 10:24 +0100, Vik Fearing wrote: > Obviously there would have to be an actual type in order to store it > in > a table, but what I am most interested in here is being able to > create > them on the fly. I do not think it is feasible to create N new types > for every type like we do for arrays on the off-chance you would want > to > put it in a PERIOD for example. By "on the fly" do you mean when creating real objects, like a table? In that case it might not be so hard, because we can just create an ordinary entry in pg_type. But for this to be a complete feature, I think we need the container types to be useful when constructed within a query, too. E.g. SELECT two_things(v1, v2) FROM foo; where the result of two_things is some new type two_things_int_text which is based on the types of v1 and v2 and has never been used before. I don't think it's reasonable to create a permanent pg_type entry on the fly to answer a read-only query. But we could introduce some notion of an ephemeral in-memory pg_type entry with its own OID, and create that on the fly. One way to do that might be to reserve some of the system OID space (e.g. 15000-16000) for OIDs for temporary catalog entries, and then have some in-memory structure that holds those temporary entries. Any lookups in that range would search the in-memory structure instead of the real catalog. All of this is easier said than done, but I think it could work. We'd also need to think about how to infer types through a container type, e.g. SELECT second_thing(two_things(v1,v2)) FROM foo; should infer that the return type of second_thing() is the type of v2. To do that, perhaps pg_proc entries can include some kind of type sublanguage to do this inference, e.g. "a, b -> b" for second_thing(), or "a, b -> a" for first_thing(). Regards, Jeff Davis
Hi, On 2023-10-25 15:03:04 -0700, Jeff Davis wrote: > On Tue, 2022-12-20 at 10:24 +0100, Vik Fearing wrote: > > Obviously there would have to be an actual type in order to store it > > in > > a table, but what I am most interested in here is being able to > > create > > them on the fly. I do not think it is feasible to create N new types > > for every type like we do for arrays on the off-chance you would want > > to > > put it in a PERIOD for example. > > By "on the fly" do you mean when creating real objects, like a table? > In that case it might not be so hard, because we can just create an > ordinary entry in pg_type. > > But for this to be a complete feature, I think we need the container > types to be useful when constructed within a query, too. E.g. > > SELECT two_things(v1, v2) FROM foo; > > where the result of two_things is some new type two_things_int_text > which is based on the types of v1 and v2 and has never been used > before. > > I don't think it's reasonable to create a permanent pg_type entry on > ethe fly to answer a read-only query. But we could introduce some notion > of an ephemeral in-memory pg_type entry with its own OID, and create > that on the fly. I don't think particularly like the idea of an in-memory pg_type entry. But I'm not sure we need that anyway - we already have this problem with record types. We support both named record types (tables and explicitly created composite types) and ad-hoc ones (created if you write ROW(foo, bar) or something like that). If a record's typmod is negative, it refers to an anonymous row type, if positive it's a named typmod. We even have support for sharing such ad-hoc rowtypes across backends for parallel query... I'd look whether you can generalize that infrastructure. Greetings, Andres Freund
On Wed, 2023-10-25 at 16:01 -0700, Andres Freund wrote: > I'd look whether you can generalize that infrastructure. I had briefly looked at using the record type mechanism before, and it seemed like a challenge because it doesn't really work when passing through a function call: CREATE TABLE t(a INT, b TEXT); INSERT INTO t VALUES(1, 'one'); CREATE FUNCTION id(RECORD) RETURNS RECORD LANGUAGE plpgsql AS $$ BEGIN RETURN $1; END; $$; SELECT t.a FROM t; -- 1 SELECT (id(t)).a FROM t; -- ERROR But now that I think about it, that's really a type inference limitation, and that needs to be solved regardless. After the type inference figures out what the right type is, then I think you're right that an OID is not required to track it, and however we do track it should be able to reuse some of the existing infrastructure for dealing with record types. Regards, Jeff Davis