Re: [HACKERS] [PATCH] Generic type subscripting - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: [HACKERS] [PATCH] Generic type subscripting |
Date | |
Msg-id | 3455666.1607474043@sss.pgh.pa.us |
In response to | Re: [HACKERS] [PATCH] Generic type subscripting (Andres Freund <andres@anarazel.de>) |
List | pgsql-hackers |
Andres Freund <andres@anarazel.de> writes:
> On 2020-12-07 17:25:41 -0500, Tom Lane wrote:
>> I can see that that should work for the two existing implementations
>> of EEO_CASE, but I wasn't sure if you wanted to wire in an assumption
>> that it'll always work.

> I don't think it's likely to be a problem, and if it ends up being one,
> we can still deduplicate the ops at that point...

Seems reasonable.

Here's a v38 that addresses the semantic loose ends I was worried about.
I decided that it's worth allowing subscripting functions to dictate
whether they should be considered strict or not, at least for the fetch
side (store is still assumed nonstrict always) and whether they should be
considered leakproof or not. That requires only a minimal amount of extra
code. While the planner does have to do extra catalog lookups to check
strictness and leakproofness, those are not common things to need to
check, so I don't think we're paying anything in performance for the
flexibility. I left out the option of "strict store" because that *would*
have required extra code (to generate a nullness test on the replacement
value) and the potential use-case seems too narrow to justify that.
I also left out any option to control volatility or parallel safety,
again on the grounds of lack of use-case; plus, planner checks for those
properties would have been in significantly hotter code paths.

I'm waiting on your earlier patch to rewrite the llvmjit_expr.c code,
but otherwise I think this is ready to go.

			regards, tom lane

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 2d44df19fe..ca2f9f3215 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -426,23 +426,28 @@ foreign_expr_walker(Node *node,
 					return false;
 
 				/*
-				 * Recurse to remaining subexpressions. Since the container
-				 * subscripts must yield (noncollatable) integers, they won't
-				 * affect the inner_cxt state.
+				 * Recurse into the remaining subexpressions. The container
+				 * subscripts will not affect collation of the SubscriptingRef
+				 * result, so do those first and reset inner_cxt afterwards.
 				 */
 				if (!foreign_expr_walker((Node *) sr->refupperindexpr,
 										 glob_cxt, &inner_cxt))
 					return false;
+				inner_cxt.collation = InvalidOid;
+				inner_cxt.state = FDW_COLLATE_NONE;
 				if (!foreign_expr_walker((Node *) sr->reflowerindexpr,
 										 glob_cxt, &inner_cxt))
 					return false;
+				inner_cxt.collation = InvalidOid;
+				inner_cxt.state = FDW_COLLATE_NONE;
 				if (!foreign_expr_walker((Node *) sr->refexpr,
 										 glob_cxt, &inner_cxt))
 					return false;
 
 				/*
-				 * Container subscripting should yield same collation as
-				 * input, but for safety use same logic as for function nodes.
+				 * Container subscripting typically yields same collation as
+				 * refexpr's, but in case it doesn't, use same logic as for
+				 * function nodes.
 				 */
 				collation = sr->refcollid;
 				if (collation == InvalidOid)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 79069ddfab..583a5ce3b9 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -8740,6 +8740,21 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
       </para></entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>typsubscript</structfield> <type>regproc</type>
+       (references <link linkend="catalog-pg-proc"><structname>pg_proc</structname></link>.<structfield>oid</structfield>)
+      </para>
+      <para>
+       Subscripting handler function's OID, or zero if this type doesn't
+       support subscripting. Types that are <quote>true</quote> array
+       types have <structfield>typsubscript</structfield>
+       = <function>array_subscript_handler</function>, but other types may
+       have other handler functions to implement specialized subscripting
+       behavior.
+      </para></entry>
+     </row>
+
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
        <structfield>typelem</structfield> <type>oid</type>
@@ -8747,19 +8762,12 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
       </para>
       <para>
        If <structfield>typelem</structfield> is not 0 then it
-       identifies another row in <structname>pg_type</structname>.
-       The current type can then be subscripted like an array yielding
-       values of type <structfield>typelem</structfield>. A
-       <quote>true</quote> array type is variable length
-       (<structfield>typlen</structfield> = -1),
-       but some fixed-length (<structfield>typlen</structfield> > 0) types
-       also have nonzero <structfield>typelem</structfield>, for example
-       <type>name</type> and <type>point</type>.
-       If a fixed-length type has a <structfield>typelem</structfield> then
-       its internal representation must be some number of values of the
-       <structfield>typelem</structfield> data type with no other data.
-       Variable-length array types have a header defined by the array
-       subroutines.
+       identifies another row in <structname>pg_type</structname>,
+       defining the type yielded by subscripting. This should be 0
+       if <structfield>typsubscript</structfield> is 0. However, it can
+       be 0 when <structfield>typsubscript</structfield> isn't 0, if the
+       handler doesn't need <structfield>typelem</structfield> to
+       determine the subscripting result type.
       </para></entry>
      </row>
 
diff --git a/doc/src/sgml/ref/create_type.sgml b/doc/src/sgml/ref/create_type.sgml
index 970b517db9..fc09282db7 100644
--- a/doc/src/sgml/ref/create_type.sgml
+++ b/doc/src/sgml/ref/create_type.sgml
@@ -43,6 +43,7 @@ CREATE TYPE <replaceable class="parameter">name</replaceable> (
     [ , TYPMOD_IN = <replaceable class="parameter">type_modifier_input_function</replaceable> ]
     [ , TYPMOD_OUT = <replaceable class="parameter">type_modifier_output_function</replaceable> ]
     [ , ANALYZE = <replaceable class="parameter">analyze_function</replaceable> ]
+    [ , SUBSCRIPT = <replaceable class="parameter">subscript_function</replaceable> ]
     [ , INTERNALLENGTH = { <replaceable class="parameter">internallength</replaceable> | VARIABLE } ]
     [ , PASSEDBYVALUE ]
     [ , ALIGNMENT = <replaceable class="parameter">alignment</replaceable> ]
@@ -196,8 +197,9 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
    <replaceable class="parameter">receive_function</replaceable>,
    <replaceable class="parameter">send_function</replaceable>,
    <replaceable class="parameter">type_modifier_input_function</replaceable>,
-   <replaceable class="parameter">type_modifier_output_function</replaceable> and
-   <replaceable class="parameter">analyze_function</replaceable>
+   <replaceable class="parameter">type_modifier_output_function</replaceable>,
+   <replaceable class="parameter">analyze_function</replaceable>, and
+   <replaceable class="parameter">subscript_function</replaceable>
    are optional. Generally these functions have to be coded in C
    or another low-level language.
   </para>
@@ -318,6 +320,26 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
    in <filename>src/include/commands/vacuum.h</filename>.
   </para>
 
+  <para>
+   The optional <replaceable class="parameter">subscript_function</replaceable>
+   allows the data type to be subscripted in SQL commands. Specifying this
+   function does not cause the type to be considered a <quote>true</quote>
+   array type; for example, it will not be a candidate for the result type
+   of <literal>ARRAY[]</literal> constructs. But if subscripting a value
+   of the type is a natural notation for extracting data from it, then
+   a <replaceable class="parameter">subscript_function</replaceable> can
+   be written to define what that means. The subscript function must be
+   declared to take a single argument of type <type>internal</type>, and
+   return an <type>internal</type> result, which is a pointer to a struct
+   of methods (functions) that implement subscripting.
+   The detailed API for subscript functions appears
+   in <filename>src/include/nodes/subscripting.h</filename>;
+   it may also be useful to read the array implementation
+   in <filename>src/backend/utils/adt/arraysubs.c</filename>.
+   Additional information appears in
+   <xref linkend="sql-createtype-array"/> below.
+  </para>
+
   <para>
    While the details of the new type's internal representation are only
    known to the I/O functions and other functions you create to work with
@@ -428,11 +450,12 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
   </para>
 
   <para>
-   To indicate that a type is an array, specify the type of the array
+   To indicate that a type is a fixed-length subscriptable type,
+   specify the type of the array
    elements using the <literal>ELEMENT</literal> key word. For example, to
    define an array of 4-byte integers (<type>int4</type>), specify
-   <literal>ELEMENT = int4</literal>. More details about array types
-   appear below.
+   <literal>ELEMENT = int4</literal>. For more details,
+   see <xref linkend="sql-createtype-array"/> below.
   </para>
 
   <para>
@@ -456,7 +479,7 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
   </para>
  </refsect2>
 
- <refsect2>
+ <refsect2 id="sql-createtype-array" xreflabel="Array Types">
   <title>Array Types</title>
 
   <para>
@@ -469,7 +492,9 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
    repeated until a non-colliding name is found.)
    This implicitly-created array type is variable length and uses the
    built-in input and output functions <literal>array_in</literal> and
-   <literal>array_out</literal>. The array type tracks any changes in its
+   <literal>array_out</literal>. Furthermore, this type is what the system
+   uses for constructs such as <literal>ARRAY[]</literal> over the
+   user-defined type. The array type tracks any changes in its
    element type's owner or schema, and is dropped if the element type is.
   </para>
 
@@ -485,13 +510,27 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
    using <literal>point[0]</literal> and <literal>point[1]</literal>.
    Note that
    this facility only works for fixed-length types whose internal form
-   is exactly a sequence of identical fixed-length fields. A subscriptable
-   variable-length type must have the generalized internal representation
-   used by <literal>array_in</literal> and <literal>array_out</literal>.
+   is exactly a sequence of identical fixed-length fields.
    For historical reasons (i.e., this is clearly wrong but it's far too
    late to change it), subscripting of fixed-length array types starts from
    zero, rather than from one as for variable-length arrays.
   </para>
+
+  <para>
+   Specifying the <option>SUBSCRIPT</option> option allows a data type to
+   be subscripted, even though the system does not otherwise regard it as
+   an array type. The behavior just described for fixed-length arrays is
+   actually implemented by the <option>SUBSCRIPT</option> handler
+   function <function>raw_array_subscript_handler</function>, which is
+   used automatically if you specify <option>ELEMENT</option> for a
+   fixed-length type without also writing <option>SUBSCRIPT</option>.
+   When specifying a custom <option>SUBSCRIPT</option> function, it is
+   not necessary to specify <option>ELEMENT</option> unless
+   the <option>SUBSCRIPT</option> handler function needs to
+   consult <structfield>typelem</structfield> to find out what to return,
+   or if you want an explicit dependency from the new type to the
+   subscripting output type.
+  </para>
  </refsect2>
 </refsect1>
 
@@ -654,6 +693,16 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
     </listitem>
    </varlistentry>
 
+   <varlistentry>
+    <term><replaceable class="parameter">subscript_function</replaceable></term>
+    <listitem>
+     <para>
+      The name of a function that defines what subscripting a value of the
+      data type does.
+     </para>
+    </listitem>
+   </varlistentry>
+
    <varlistentry>
     <term><replaceable class="parameter">internallength</replaceable></term>
     <listitem>
diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index c626161408..c4594b0b09 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -3114,7 +3114,7 @@ ExecGrant_Type(InternalGrant *istmt)
 
 		pg_type_tuple = (Form_pg_type) GETSTRUCT(tuple);
 
-		if (pg_type_tuple->typelem != 0 && pg_type_tuple->typlen == -1)
+		if (IsTrueArrayType(pg_type_tuple))
 			ereport(ERROR,
 					(errcode(ERRCODE_INVALID_GRANT_OPERATION),
 					 errmsg("cannot set privileges of array types"),
@@ -4392,7 +4392,7 @@ pg_type_aclmask(Oid type_oid, Oid roleid, AclMode mask, AclMaskHow how)
 	 * "True" array types don't manage permissions of their own; consult the
 	 * element type instead.
 	 */
-	if (OidIsValid(typeForm->typelem) && typeForm->typlen == -1)
+	if (IsTrueArrayType(typeForm))
 	{
 		Oid elttype_oid = typeForm->typelem;
 
diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index 245c2f4fc8..119006159b 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -2074,6 +2074,22 @@ find_expr_references_walker(Node *node,
 						   context->addrs);
 		/* fall through to examine arguments */
 	}
+	else if (IsA(node, SubscriptingRef))
+	{
+		SubscriptingRef *sbsref = (SubscriptingRef *) node;
+
+		/*
+		 * The refexpr should provide adequate dependency on refcontainertype,
+		 * and that type in turn depends on refelemtype. However, a custom
+		 * subscripting handler might set refrestype to something different
+		 * from either of those, in which case we'd better record it.
+		 */
+		if (sbsref->refrestype != sbsref->refcontainertype &&
+			sbsref->refrestype != sbsref->refelemtype)
+			add_object_address(OCLASS_TYPE, sbsref->refrestype, 0,
+							   context->addrs);
+		/* fall through to examine arguments */
+	}
 	else if (IsA(node, SubPlan))
 	{
 		/* Extra work needed here if we ever need this case */
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 4cd7d76938..51b5c4f7f6 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -1079,6 +1079,7 @@ AddNewRelationType(const char *typeName,
 				   InvalidOid, /* typmodin procedure - none */
 				   InvalidOid, /* typmodout procedure - none */
 				   InvalidOid, /* analyze procedure - default */
+				   InvalidOid, /* subscript procedure - none */
 				   InvalidOid, /* array element type - irrelevant */
 				   false, /* this is not an array type */
 				   new_array_type, /* array type if any */
@@ -1358,6 +1359,7 @@ heap_create_with_catalog(const char *relname,
 				   InvalidOid, /* typmodin procedure - none */
 				   InvalidOid, /* typmodout procedure - none */
 				   F_ARRAY_TYPANALYZE, /* array analyze procedure */
+				   F_ARRAY_SUBSCRIPT_HANDLER, /* array subscript procedure */
 				   new_type_oid, /* array element type - the rowtype */
 				   true, /* yes, this is an array type */
 				   InvalidOid, /* this has no array type */
diff --git a/src/backend/catalog/pg_type.c b/src/backend/catalog/pg_type.c
index aeb4a54f63..4252875ef5 100644
--- a/src/backend/catalog/pg_type.c
+++ b/src/backend/catalog/pg_type.c
@@ -103,6 +103,7 @@ TypeShellMake(const char *typeName, Oid typeNamespace, Oid ownerId)
 	values[Anum_pg_type_typisdefined - 1] = BoolGetDatum(false);
 	values[Anum_pg_type_typdelim - 1] = CharGetDatum(DEFAULT_TYPDELIM);
 	values[Anum_pg_type_typrelid - 1] = ObjectIdGetDatum(InvalidOid);
+	values[Anum_pg_type_typsubscript - 1] = ObjectIdGetDatum(InvalidOid);
 	values[Anum_pg_type_typelem - 1] = ObjectIdGetDatum(InvalidOid);
 	values[Anum_pg_type_typarray - 1] = ObjectIdGetDatum(InvalidOid);
 	values[Anum_pg_type_typinput - 1] = ObjectIdGetDatum(F_SHELL_IN);
@@ -208,6 +209,7 @@ TypeCreate(Oid newTypeOid,
 		   Oid typmodinProcedure,
 		   Oid typmodoutProcedure,
 		   Oid analyzeProcedure,
+		   Oid subscriptProcedure,
 		   Oid elementType,
 		   bool isImplicitArray,
 		   Oid arrayType,
@@ -357,6 +359,7 @@ TypeCreate(Oid newTypeOid,
 	values[Anum_pg_type_typisdefined - 1] = BoolGetDatum(true);
 	values[Anum_pg_type_typdelim - 1] = CharGetDatum(typDelim);
 	values[Anum_pg_type_typrelid - 1] = ObjectIdGetDatum(relationOid);
+	values[Anum_pg_type_typsubscript - 1] = ObjectIdGetDatum(subscriptProcedure);
 	values[Anum_pg_type_typelem - 1] = ObjectIdGetDatum(elementType);
 	values[Anum_pg_type_typarray - 1] = ObjectIdGetDatum(arrayType);
 	values[Anum_pg_type_typinput - 1] = ObjectIdGetDatum(inputProcedure);
@@ -667,7 +670,7 @@ GenerateTypeDependencies(HeapTuple typeTuple,
 		recordDependencyOnCurrentExtension(&myself, rebuild);
 	}
 
-	/* Normal dependencies on the I/O functions */
+	/* Normal dependencies on the I/O and support functions */
 	if (OidIsValid(typeForm->typinput))
 	{
 		ObjectAddressSet(referenced, ProcedureRelationId, typeForm->typinput);
@@ -710,6 +713,12 @@ GenerateTypeDependencies(HeapTuple typeTuple,
 		add_exact_object_address(&referenced, addrs_normal);
 	}
 
+	if (OidIsValid(typeForm->typsubscript))
+	{
+		ObjectAddressSet(referenced, ProcedureRelationId, typeForm->typsubscript);
+		add_exact_object_address(&referenced, addrs_normal);
+	}
+
 	/* Normal dependency from a domain to its base type. */
 	if (OidIsValid(typeForm->typbasetype))
 	{
diff --git a/src/backend/commands/typecmds.c b/src/backend/commands/typecmds.c
index 483bb65ddc..29fe52d2ce 100644
--- a/src/backend/commands/typecmds.c
+++ b/src/backend/commands/typecmds.c
@@ -115,6 +115,7 @@ static Oid findTypeSendFunction(List *procname, Oid typeOid);
 static Oid findTypeTypmodinFunction(List *procname);
 static Oid findTypeTypmodoutFunction(List *procname);
 static Oid findTypeAnalyzeFunction(List *procname, Oid typeOid);
+static Oid findTypeSubscriptingFunction(List *procname, Oid typeOid);
 static Oid findRangeSubOpclass(List *opcname, Oid subtype);
 static Oid findRangeCanonicalFunction(List *procname, Oid typeOid);
 static Oid findRangeSubtypeDiffFunction(List *procname, Oid subtype);
@@ -149,6 +150,7 @@ DefineType(ParseState *pstate, List *names, List *parameters)
 	List *typmodinName = NIL;
 	List *typmodoutName = NIL;
 	List *analyzeName = NIL;
+	List *subscriptName = NIL;
 	char category = TYPCATEGORY_USER;
 	bool preferred = false;
 	char delimiter = DEFAULT_TYPDELIM;
@@ -167,6 +169,7 @@ DefineType(ParseState *pstate, List *names, List *parameters)
 	DefElem *typmodinNameEl = NULL;
 	DefElem *typmodoutNameEl = NULL;
 	DefElem *analyzeNameEl = NULL;
+	DefElem *subscriptNameEl = NULL;
 	DefElem *categoryEl = NULL;
 	DefElem *preferredEl = NULL;
 	DefElem *delimiterEl = NULL;
@@ -183,6 +186,7 @@ DefineType(ParseState *pstate, List *names, List *parameters)
 	Oid typmodinOid = InvalidOid;
 	Oid typmodoutOid = InvalidOid;
 	Oid analyzeOid = InvalidOid;
+	Oid subscriptOid = InvalidOid;
 	char *array_type;
 	Oid array_oid;
 	Oid typoid;
@@ -288,6 +292,8 @@ DefineType(ParseState *pstate, List *names, List *parameters)
 		else if (strcmp(defel->defname, "analyze") == 0 ||
 				 strcmp(defel->defname, "analyse") == 0)
 			defelp = &analyzeNameEl;
+		else if (strcmp(defel->defname, "subscript") == 0)
+			defelp = &subscriptNameEl;
 		else if (strcmp(defel->defname, "category") == 0)
 			defelp = &categoryEl;
 		else if (strcmp(defel->defname, "preferred") == 0)
@@ -358,6 +364,8 @@ DefineType(ParseState *pstate, List *names, List *parameters)
 		typmodoutName = defGetQualifiedName(typmodoutNameEl);
 	if (analyzeNameEl)
 		analyzeName = defGetQualifiedName(analyzeNameEl);
+	if (subscriptNameEl)
+		subscriptName = defGetQualifiedName(subscriptNameEl);
 	if (categoryEl)
 	{
 		char *p = defGetString(categoryEl);
@@ -482,6 +490,24 @@ DefineType(ParseState *pstate, List *names, List *parameters)
 	if (analyzeName)
 		analyzeOid = findTypeAnalyzeFunction(analyzeName, typoid);
 
+	/*
+	 * Likewise look up the subscripting procedure if any. If it is not
+	 * specified, but a typelem is specified, allow that if
+	 * raw_array_subscript_handler can be used. (This is for backwards
+	 * compatibility; maybe someday we should throw an error instead.)
+	 */
+	if (subscriptName)
+		subscriptOid = findTypeSubscriptingFunction(subscriptName, typoid);
+	else if (OidIsValid(elemType))
+	{
+		if (internalLength > 0 && !byValue && get_typlen(elemType) > 0)
+			subscriptOid = F_RAW_ARRAY_SUBSCRIPT_HANDLER;
+		else
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("element type cannot be specified without a valid subscripting procedure")));
+	}
+
 	/*
 	 * Check permissions on functions. We choose to require the creator/owner
 	 * of a type to also own the underlying functions. Since creating a type
@@ -516,6 +542,9 @@ DefineType(ParseState *pstate, List *names, List *parameters)
 	if (analyzeOid && !pg_proc_ownercheck(analyzeOid, GetUserId()))
 		aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_FUNCTION,
 					   NameListToString(analyzeName));
+	if (subscriptOid && !pg_proc_ownercheck(subscriptOid, GetUserId()))
+		aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_FUNCTION,
+					   NameListToString(subscriptName));
 #endif
 
 	/*
@@ -551,8 +580,9 @@ DefineType(ParseState *pstate, List *names, List *parameters)
 				   typmodinOid, /* typmodin procedure */
 				   typmodoutOid, /* typmodout procedure */
 				   analyzeOid, /* analyze procedure */
+				   subscriptOid, /* subscript procedure */
 				   elemType, /* element type ID */
-				   false, /* this is not an array type */
+				   false, /* this is not an implicit array type */
 				   array_oid, /* array type we are about to create */
 				   InvalidOid, /* base type ID (only for domains) */
 				   defaultValue, /* default type value */
@@ -592,6 +622,7 @@ DefineType(ParseState *pstate, List *names, List *parameters)
 				   typmodinOid, /* typmodin procedure */
 				   typmodoutOid, /* typmodout procedure */
 				   F_ARRAY_TYPANALYZE, /* analyze procedure */
+				   F_ARRAY_SUBSCRIPT_HANDLER, /* array subscript procedure */
 				   typoid, /* element type ID */
 				   true, /* yes this is an array type */
 				   InvalidOid, /* no further array type */
@@ -800,6 +831,12 @@ DefineDomain(CreateDomainStmt *stmt)
 	/* Analysis function */
 	analyzeProcedure = baseType->typanalyze;
 
+	/*
+	 * Domains don't need a subscript procedure, since they are not
+	 * subscriptable on their own. If the base type is subscriptable, the
+	 * parser will reduce the type to the base type before subscripting.
+	 */
+
 	/* Inherited default value */
 	datum = SysCacheGetAttr(TYPEOID, typeTup,
 							Anum_pg_type_typdefault, &isnull);
@@ -993,6 +1030,7 @@ DefineDomain(CreateDomainStmt *stmt)
 				   InvalidOid, /* typmodin procedure - none */
 				   InvalidOid, /* typmodout procedure - none */
 				   analyzeProcedure, /* analyze procedure */
+				   InvalidOid, /* subscript procedure - none */
 				   InvalidOid, /* no array element type */
 				   false, /* this isn't an array */
 				   domainArrayOid, /* array type we are about to create */
@@ -1033,6 +1071,7 @@ DefineDomain(CreateDomainStmt *stmt)
 				   InvalidOid, /* typmodin procedure - none */
 				   InvalidOid, /* typmodout procedure - none */
 				   F_ARRAY_TYPANALYZE, /* analyze procedure */
+				   F_ARRAY_SUBSCRIPT_HANDLER, /* array subscript procedure */
 				   address.objectId, /* element type ID */
 				   true, /* yes this is an array type */
 				   InvalidOid, /* no further array type */
@@ -1148,6 +1187,7 @@ DefineEnum(CreateEnumStmt *stmt)
 				   InvalidOid, /* typmodin procedure - none */
 				   InvalidOid, /* typmodout procedure - none */
 				   InvalidOid, /* analyze procedure - default */
+				   InvalidOid, /* subscript procedure - none */
 				   InvalidOid, /* element type ID */
 				   false, /* this is not an array type */
 				   enumArrayOid, /* array type we are about to create */
@@ -1188,6 +1228,7 @@ DefineEnum(CreateEnumStmt *stmt)
 				   InvalidOid, /* typmodin procedure - none */
 				   InvalidOid, /* typmodout procedure - none */
 				   F_ARRAY_TYPANALYZE, /* analyze procedure */
+				   F_ARRAY_SUBSCRIPT_HANDLER, /* array subscript procedure */
 				   enumTypeAddr.objectId, /* element type ID */
 				   true, /* yes this is an array type */
 				   InvalidOid, /* no further array type */
@@ -1476,6 +1517,7 @@ DefineRange(CreateRangeStmt *stmt)
 				   InvalidOid, /* typmodin procedure - none */
 				   InvalidOid, /* typmodout procedure - none */
 				   F_RANGE_TYPANALYZE, /* analyze procedure */
+				   InvalidOid, /* subscript procedure - none */
 				   InvalidOid, /* element type ID - none */
 				   false, /* this is not an array type */
 				   rangeArrayOid, /* array type we are about to create */
@@ -1519,6 +1561,7 @@ DefineRange(CreateRangeStmt *stmt)
 				   InvalidOid, /* typmodin procedure - none */
 				   InvalidOid, /* typmodout procedure - none */
 				   F_ARRAY_TYPANALYZE, /* analyze procedure */
+				   F_ARRAY_SUBSCRIPT_HANDLER, /* array subscript procedure */
 				   typoid, /* element type ID */
 				   true, /* yes this is an array type */
 				   InvalidOid, /* no further array type */
@@ -1616,7 +1659,7 @@ makeRangeConstructors(const char *name, Oid namespace,
 
 
 /*
- * Find suitable I/O functions for a type.
+ * Find suitable I/O and other support functions for a type.
  *
  * typeOid is the type's OID (which will already exist, if only as a shell
  * type).
@@ -1904,6 +1947,45 @@ findTypeAnalyzeFunction(List *procname, Oid typeOid)
 	return procOid;
 }
 
+static Oid
+findTypeSubscriptingFunction(List *procname, Oid typeOid)
+{
+	Oid argList[1];
+	Oid procOid;
+
+	/*
+	 * Subscripting support functions always take one INTERNAL argument and
+	 * return INTERNAL. (The argument is not used, but we must have it to
+	 * maintain type safety.)
+	 */
+	argList[0] = INTERNALOID;
+
+	procOid = LookupFuncName(procname, 1, argList, true);
+	if (!OidIsValid(procOid))
+		ereport(ERROR,
+				(errcode(ERRCODE_UNDEFINED_FUNCTION),
+				 errmsg("function %s does not exist",
+						func_signature_string(procname, 1, NIL, argList))));
+
+	if (get_func_rettype(procOid) != INTERNALOID)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+				 errmsg("type subscripting function %s must return type %s",
+						NameListToString(procname), "internal")));
+
+	/*
+	 * We disallow array_subscript_handler() from being selected explicitly,
+	 * since that must only be applied to autogenerated array types.
+	 */
+	if (procOid == F_ARRAY_SUBSCRIPT_HANDLER)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
+				 errmsg("user-defined types cannot use subscripting function %s",
+						NameListToString(procname))));
+
+	return procOid;
+}
+
 /*
  * Find suitable support functions and opclasses for a range type.
  */
@@ -3221,8 +3303,7 @@ RenameType(RenameStmt *stmt)
 				 errhint("Use ALTER TABLE instead.")));
 
 	/* don't allow direct alteration of array types, either */
-	if (OidIsValid(typTup->typelem) &&
-		get_array_type(typTup->typelem) == typeOid)
+	if (IsTrueArrayType(typTup))
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot alter array type %s",
@@ -3303,8 +3384,7 @@ AlterTypeOwner(List *names, Oid newOwnerId, ObjectType objecttype)
 				 errhint("Use ALTER TABLE instead.")));
 
 	/* don't allow direct alteration of array types, either */
-	if (OidIsValid(typTup->typelem) &&
-		get_array_type(typTup->typelem) == typeOid)
+	if (IsTrueArrayType(typTup))
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("cannot alter array type %s",
@@ -3869,8 +3949,7 @@ AlterType(AlterTypeStmt *stmt)
 	/*
 	 * For the same reasons, don't allow direct alteration of array types.
 	 */
-	if (OidIsValid(typForm->typelem) &&
-		get_array_type(typForm->typelem) == typeOid)
+	if (IsTrueArrayType(typForm))
 		ereport(ERROR,
 				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 				 errmsg("%s is not a base type",
diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index 79b325c7cf..0134ecc261 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -40,6 +40,7 @@
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/subscripting.h"
 #include "optimizer/optimizer.h"
 #include "pgstat.h"
 #include "utils/acl.h"
@@ -2523,19 +2524,51 @@ ExecInitSubscriptingRef(ExprEvalStep *scratch, SubscriptingRef *sbsref,
 						ExprState *state, Datum *resv, bool *resnull)
 {
 	bool isAssignment = (sbsref->refassgnexpr != NULL);
-	SubscriptingRefState *sbsrefstate = palloc0(sizeof(SubscriptingRefState));
+	int nupper = list_length(sbsref->refupperindexpr);
+	int nlower = list_length(sbsref->reflowerindexpr);
+	const SubscriptRoutines *sbsroutines;
+	SubscriptingRefState *sbsrefstate;
+	SubscriptExecSteps methods;
+	char *ptr;
 	List *adjust_jumps = NIL;
 	ListCell *lc;
 	int i;
 
+	/* Look up the subscripting support methods */
+	sbsroutines = getSubscriptingRoutines(sbsref->refcontainertype, NULL);
+
+	/* Allocate sbsrefstate, with enough space for per-subscript arrays too */
+	sbsrefstate = palloc0(MAXALIGN(sizeof(SubscriptingRefState)) +
+						  (nupper + nlower) * (sizeof(Datum) +
+											   2 * sizeof(bool)));
+
 	/* Fill constant fields of SubscriptingRefState */
 	sbsrefstate->isassignment = isAssignment;
-	sbsrefstate->refelemtype = sbsref->refelemtype;
-	sbsrefstate->refattrlength = get_typlen(sbsref->refcontainertype);
-	get_typlenbyvalalign(sbsref->refelemtype,
-						 &sbsrefstate->refelemlength,
-						 &sbsrefstate->refelembyval,
-						 &sbsrefstate->refelemalign);
+	sbsrefstate->numupper = nupper;
+	sbsrefstate->numlower = nlower;
+	/* Set up per-subscript arrays */
+	ptr = ((char *) sbsrefstate) + MAXALIGN(sizeof(SubscriptingRefState));
+	sbsrefstate->upperindex = (Datum *) ptr;
+	ptr += nupper * sizeof(Datum);
+	sbsrefstate->lowerindex = (Datum *) ptr;
+	ptr += nlower * sizeof(Datum);
+	sbsrefstate->upperprovided = (bool *) ptr;
+	ptr += nupper * sizeof(bool);
+	sbsrefstate->lowerprovided = (bool *) ptr;
+	ptr += nlower * sizeof(bool);
+	sbsrefstate->upperindexnull = (bool *) ptr;
+	ptr += nupper * sizeof(bool);
+	sbsrefstate->lowerindexnull = (bool *) ptr;
+	/* ptr += nlower * sizeof(bool); */
+
+	/*
+	 * Let the container-type-specific code have a chance. It must fill the
+	 * "methods" struct with function pointers for us to possibly use in
+	 * execution steps below; and it can optionally set up some data pointed
+	 * to by the workspace field.
+	 */
+	memset(&methods, 0, sizeof(methods));
+	sbsroutines->exec_setup(sbsref, sbsrefstate, &methods);
 
 	/*
 	 * Evaluate array input. It's safe to do so into resv/resnull, because we
@@ -2546,11 +2579,11 @@ ExecInitSubscriptingRef(ExprEvalStep *scratch, SubscriptingRef *sbsref,
 	ExecInitExprRec(sbsref->refexpr, state, resv, resnull);
 
 	/*
-	 * If refexpr yields NULL, and it's a fetch, then result is NULL. We can
-	 * implement this with just JUMP_IF_NULL, since we evaluated the array
-	 * into the desired target location.
+	 * If refexpr yields NULL, and the operation should be strict, then result
+	 * is NULL. We can implement this with just JUMP_IF_NULL, since we
+	 * evaluated the array into the desired target location.
 	 */
-	if (!isAssignment)
+	if (!isAssignment && sbsroutines->fetch_strict)
 	{
 		scratch->opcode = EEOP_JUMP_IF_NULL;
 		scratch->d.jump.jumpdone = -1; /* adjust later */
@@ -2559,19 +2592,6 @@ ExecInitSubscriptingRef(ExprEvalStep *scratch, SubscriptingRef *sbsref,
 								   state->steps_len - 1);
 	}
 
-	/* Verify subscript list lengths are within limit */
-	if (list_length(sbsref->refupperindexpr) > MAXDIM)
-		ereport(ERROR,
-				(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
-				 errmsg("number of array dimensions (%d) exceeds the maximum allowed (%d)",
-						list_length(sbsref->refupperindexpr), MAXDIM)));
-
-	if (list_length(sbsref->reflowerindexpr) > MAXDIM)
-		ereport(ERROR,
-				(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
-				 errmsg("number of array dimensions (%d) exceeds the maximum allowed (%d)",
-						list_length(sbsref->reflowerindexpr), MAXDIM)));
-
 	/* Evaluate upper subscripts */
 	i = 0;
 	foreach(lc, sbsref->refupperindexpr)
@@ -2582,28 +2602,18 @@ ExecInitSubscriptingRef(ExprEvalStep *scratch, SubscriptingRef *sbsref,
 		if (!e)
 		{
 			sbsrefstate->upperprovided[i] = false;
-			i++;
-			continue;
+			sbsrefstate->upperindexnull[i] = true;
+		}
+		else
+		{
+			sbsrefstate->upperprovided[i] = true;
+			/* Each subscript is evaluated into appropriate array entry */
+			ExecInitExprRec(e, state,
+							&sbsrefstate->upperindex[i],
+							&sbsrefstate->upperindexnull[i]);
 		}
-
-		sbsrefstate->upperprovided[i] = true;
-
-		/* Each subscript is evaluated into subscriptvalue/subscriptnull */
-		ExecInitExprRec(e, state,
-						&sbsrefstate->subscriptvalue, &sbsrefstate->subscriptnull);
-
-		/* ... and then SBSREF_SUBSCRIPT saves it into step's workspace */
-		scratch->opcode = EEOP_SBSREF_SUBSCRIPT;
-		scratch->d.sbsref_subscript.state = sbsrefstate;
-		scratch->d.sbsref_subscript.off = i;
-		scratch->d.sbsref_subscript.isupper = true;
-		scratch->d.sbsref_subscript.jumpdone = -1; /* adjust later */
-		ExprEvalPushStep(state, scratch);
-		adjust_jumps = lappend_int(adjust_jumps,
-								   state->steps_len - 1);
 		i++;
 	}
-	sbsrefstate->numupper = i;
 
 	/* Evaluate lower subscripts similarly */
 	i = 0;
@@ -2615,39 +2625,43 @@ ExecInitSubscriptingRef(ExprEvalStep *scratch, SubscriptingRef *sbsref,
 		if (!e)
 		{
 			sbsrefstate->lowerprovided[i] = false;
-			i++;
-			continue;
+			sbsrefstate->lowerindexnull[i] = true;
 		}
+		else
+		{
+			sbsrefstate->lowerprovided[i] = true;
+			/* Each subscript is evaluated into appropriate array entry */
+			ExecInitExprRec(e, state,
+							&sbsrefstate->lowerindex[i],
+							&sbsrefstate->lowerindexnull[i]);
+		}
+		i++;
+	}
 
-		sbsrefstate->lowerprovided[i] = true;
-
-		/* Each subscript is evaluated into subscriptvalue/subscriptnull */
-		ExecInitExprRec(e, state,
-						&sbsrefstate->subscriptvalue, &sbsrefstate->subscriptnull);
-
-		/* ... and then SBSREF_SUBSCRIPT saves it into step's workspace */
-		scratch->opcode = EEOP_SBSREF_SUBSCRIPT;
+	/* SBSREF_SUBSCRIPTS checks and converts all the subscripts at once */
+	if (methods.sbs_check_subscripts)
+	{
+		scratch->opcode = EEOP_SBSREF_SUBSCRIPTS;
+		scratch->d.sbsref_subscript.subscriptfunc = methods.sbs_check_subscripts;
 		scratch->d.sbsref_subscript.state = sbsrefstate;
-		scratch->d.sbsref_subscript.off = i;
-		scratch->d.sbsref_subscript.isupper = false;
 		scratch->d.sbsref_subscript.jumpdone = -1; /* adjust later */
 		ExprEvalPushStep(state, scratch);
 		adjust_jumps = lappend_int(adjust_jumps,
 								   state->steps_len - 1);
-		i++;
 	}
-	sbsrefstate->numlower = i;
-
-	/* Should be impossible if parser is sane, but check anyway: */
-	if (sbsrefstate->numlower != 0 &&
-		sbsrefstate->numupper != sbsrefstate->numlower)
-		elog(ERROR, "upper and lower index lists are not same length");
 
 	if (isAssignment)
 	{
 		Datum *save_innermost_caseval;
 		bool *save_innermost_casenull;
 
+		/* Check for unimplemented methods */
+		if (!methods.sbs_assign)
+			ereport(ERROR,
+					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+					 errmsg("type %s does not support subscripted assignment",
+							format_type_be(sbsref->refcontainertype))));
+
 		/*
 		 * We might have a nested-assignment situation, in which the
 		 * refassgnexpr is itself a FieldStore or SubscriptingRef that needs
@@ -2664,7 +2678,13 @@ ExecInitSubscriptingRef(ExprEvalStep *scratch, SubscriptingRef *sbsref,
 		 */
 		if (isAssignmentIndirectionExpr(sbsref->refassgnexpr))
 		{
+			if (!methods.sbs_fetch_old)
+				ereport(ERROR,
+						(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+						 errmsg("type %s does not support subscripted assignment",
+								format_type_be(sbsref->refcontainertype))));
 			scratch->opcode = EEOP_SBSREF_OLD;
+			scratch->d.sbsref.subscriptfunc = methods.sbs_fetch_old;
 			scratch->d.sbsref.state = sbsrefstate;
 			ExprEvalPushStep(state, scratch);
 		}
@@ -2684,17 +2704,17 @@ ExecInitSubscriptingRef(ExprEvalStep *scratch, SubscriptingRef *sbsref,
 
 		/* and perform the assignment */
scratch->opcode = EEOP_SBSREF_ASSIGN; + scratch->d.sbsref.subscriptfunc = methods.sbs_assign; scratch->d.sbsref.state = sbsrefstate; ExprEvalPushStep(state, scratch); - } else { /* array fetch is much simpler */ scratch->opcode = EEOP_SBSREF_FETCH; + scratch->d.sbsref.subscriptfunc = methods.sbs_fetch; scratch->d.sbsref.state = sbsrefstate; ExprEvalPushStep(state, scratch); - } /* adjust jump targets */ @@ -2702,7 +2722,7 @@ ExecInitSubscriptingRef(ExprEvalStep *scratch, SubscriptingRef *sbsref, { ExprEvalStep *as = &state->steps[lfirst_int(lc)]; - if (as->opcode == EEOP_SBSREF_SUBSCRIPT) + if (as->opcode == EEOP_SBSREF_SUBSCRIPTS) { Assert(as->d.sbsref_subscript.jumpdone == -1); as->d.sbsref_subscript.jumpdone = state->steps_len; diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c index c09371ad58..6b9fc38134 100644 --- a/src/backend/executor/execExprInterp.c +++ b/src/backend/executor/execExprInterp.c @@ -417,7 +417,7 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull) &&CASE_EEOP_FIELDSELECT, &&CASE_EEOP_FIELDSTORE_DEFORM, &&CASE_EEOP_FIELDSTORE_FORM, - &&CASE_EEOP_SBSREF_SUBSCRIPT, + &&CASE_EEOP_SBSREF_SUBSCRIPTS, &&CASE_EEOP_SBSREF_OLD, &&CASE_EEOP_SBSREF_ASSIGN, &&CASE_EEOP_SBSREF_FETCH, @@ -1396,12 +1396,10 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull) EEO_NEXT(); } - EEO_CASE(EEOP_SBSREF_SUBSCRIPT) + EEO_CASE(EEOP_SBSREF_SUBSCRIPTS) { - /* Process an array subscript */ - - /* too complex for an inline implementation */ - if (ExecEvalSubscriptingRef(state, op)) + /* Precheck SubscriptingRef subscript(s) */ + if (op->d.sbsref_subscript.subscriptfunc(state, op, econtext)) { EEO_NEXT(); } @@ -1413,37 +1411,11 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull) } EEO_CASE(EEOP_SBSREF_OLD) + EEO_CASE(EEOP_SBSREF_ASSIGN) + EEO_CASE(EEOP_SBSREF_FETCH) { - /* - * Fetch the old value in an sbsref assignment, in case it's - * referenced (via a CaseTestExpr) 
inside the assignment - * expression. - */ - - /* too complex for an inline implementation */ - ExecEvalSubscriptingRefOld(state, op); - - EEO_NEXT(); - } - - /* - * Perform SubscriptingRef assignment - */ - EEO_CASE(EEOP_SBSREF_ASSIGN) - { - /* too complex for an inline implementation */ - ExecEvalSubscriptingRefAssign(state, op); - - EEO_NEXT(); - } - - /* - * Fetch subset of an array. - */ - EEO_CASE(EEOP_SBSREF_FETCH) - { - /* too complex for an inline implementation */ - ExecEvalSubscriptingRefFetch(state, op); + /* Perform a SubscriptingRef fetch or assignment */ + op->d.sbsref.subscriptfunc(state, op, econtext); EEO_NEXT(); } @@ -3122,200 +3094,6 @@ ExecEvalFieldStoreForm(ExprState *state, ExprEvalStep *op, ExprContext *econtext *op->resnull = false; } -/* - * Process a subscript in a SubscriptingRef expression. - * - * If subscript is NULL, throw error in assignment case, or in fetch case - * set result to NULL and return false (instructing caller to skip the rest - * of the SubscriptingRef sequence). - * - * Subscript expression result is in subscriptvalue/subscriptnull. - * On success, integer subscript value has been saved in upperindex[] or - * lowerindex[] for use later. 
- */ -bool -ExecEvalSubscriptingRef(ExprState *state, ExprEvalStep *op) -{ - SubscriptingRefState *sbsrefstate = op->d.sbsref_subscript.state; - int *indexes; - int off; - - /* If any index expr yields NULL, result is NULL or error */ - if (sbsrefstate->subscriptnull) - { - if (sbsrefstate->isassignment) - ereport(ERROR, - (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED), - errmsg("array subscript in assignment must not be null"))); - *op->resnull = true; - return false; - } - - /* Convert datum to int, save in appropriate place */ - if (op->d.sbsref_subscript.isupper) - indexes = sbsrefstate->upperindex; - else - indexes = sbsrefstate->lowerindex; - off = op->d.sbsref_subscript.off; - - indexes[off] = DatumGetInt32(sbsrefstate->subscriptvalue); - - return true; -} - -/* - * Evaluate SubscriptingRef fetch. - * - * Source container is in step's result variable. - */ -void -ExecEvalSubscriptingRefFetch(ExprState *state, ExprEvalStep *op) -{ - SubscriptingRefState *sbsrefstate = op->d.sbsref.state; - - /* Should not get here if source container (or any subscript) is null */ - Assert(!(*op->resnull)); - - if (sbsrefstate->numlower == 0) - { - /* Scalar case */ - *op->resvalue = array_get_element(*op->resvalue, - sbsrefstate->numupper, - sbsrefstate->upperindex, - sbsrefstate->refattrlength, - sbsrefstate->refelemlength, - sbsrefstate->refelembyval, - sbsrefstate->refelemalign, - op->resnull); - } - else - { - /* Slice case */ - *op->resvalue = array_get_slice(*op->resvalue, - sbsrefstate->numupper, - sbsrefstate->upperindex, - sbsrefstate->lowerindex, - sbsrefstate->upperprovided, - sbsrefstate->lowerprovided, - sbsrefstate->refattrlength, - sbsrefstate->refelemlength, - sbsrefstate->refelembyval, - sbsrefstate->refelemalign); - } -} - -/* - * Compute old container element/slice value for a SubscriptingRef assignment - * expression. Will only be generated if the new-value subexpression - * contains SubscriptingRef or FieldStore. 
The value is stored into the - * SubscriptingRefState's prevvalue/prevnull fields. - */ -void -ExecEvalSubscriptingRefOld(ExprState *state, ExprEvalStep *op) -{ - SubscriptingRefState *sbsrefstate = op->d.sbsref.state; - - if (*op->resnull) - { - /* whole array is null, so any element or slice is too */ - sbsrefstate->prevvalue = (Datum) 0; - sbsrefstate->prevnull = true; - } - else if (sbsrefstate->numlower == 0) - { - /* Scalar case */ - sbsrefstate->prevvalue = array_get_element(*op->resvalue, - sbsrefstate->numupper, - sbsrefstate->upperindex, - sbsrefstate->refattrlength, - sbsrefstate->refelemlength, - sbsrefstate->refelembyval, - sbsrefstate->refelemalign, - &sbsrefstate->prevnull); - } - else - { - /* Slice case */ - /* this is currently unreachable */ - sbsrefstate->prevvalue = array_get_slice(*op->resvalue, - sbsrefstate->numupper, - sbsrefstate->upperindex, - sbsrefstate->lowerindex, - sbsrefstate->upperprovided, - sbsrefstate->lowerprovided, - sbsrefstate->refattrlength, - sbsrefstate->refelemlength, - sbsrefstate->refelembyval, - sbsrefstate->refelemalign); - sbsrefstate->prevnull = false; - } -} - -/* - * Evaluate SubscriptingRef assignment. - * - * Input container (possibly null) is in result area, replacement value is in - * SubscriptingRefState's replacevalue/replacenull. - */ -void -ExecEvalSubscriptingRefAssign(ExprState *state, ExprEvalStep *op) -{ - SubscriptingRefState *sbsrefstate = op->d.sbsref_subscript.state; - - /* - * For an assignment to a fixed-length container type, both the original - * container and the value to be assigned into it must be non-NULL, else - * we punt and return the original container. - */ - if (sbsrefstate->refattrlength > 0) - { - if (*op->resnull || sbsrefstate->replacenull) - return; - } - - /* - * For assignment to varlena arrays, we handle a NULL original array by - * substituting an empty (zero-dimensional) array; insertion of the new - * element will result in a singleton array value. 
It does not matter - * whether the new element is NULL. - */ - if (*op->resnull) - { - *op->resvalue = PointerGetDatum(construct_empty_array(sbsrefstate->refelemtype)); - *op->resnull = false; - } - - if (sbsrefstate->numlower == 0) - { - /* Scalar case */ - *op->resvalue = array_set_element(*op->resvalue, - sbsrefstate->numupper, - sbsrefstate->upperindex, - sbsrefstate->replacevalue, - sbsrefstate->replacenull, - sbsrefstate->refattrlength, - sbsrefstate->refelemlength, - sbsrefstate->refelembyval, - sbsrefstate->refelemalign); - } - else - { - /* Slice case */ - *op->resvalue = array_set_slice(*op->resvalue, - sbsrefstate->numupper, - sbsrefstate->upperindex, - sbsrefstate->lowerindex, - sbsrefstate->upperprovided, - sbsrefstate->lowerprovided, - sbsrefstate->replacevalue, - sbsrefstate->replacenull, - sbsrefstate->refattrlength, - sbsrefstate->refelemlength, - sbsrefstate->refelembyval, - sbsrefstate->refelemalign); - } -} - /* * Evaluate a rowtype coercion operation. * This may require rearranging field positions. 
diff --git a/src/backend/jit/llvm/llvmjit_expr.c b/src/backend/jit/llvm/llvmjit_expr.c index da5e3a2c1d..9a6af90914 100644 --- a/src/backend/jit/llvm/llvmjit_expr.c +++ b/src/backend/jit/llvm/llvmjit_expr.c @@ -1113,23 +1113,72 @@ llvm_compile_expr(ExprState *state) break; } - case EEOP_SBSREF_OLD: - build_EvalXFunc(b, mod, "ExecEvalSubscriptingRefOld", - v_state, op); - LLVMBuildBr(b, opblocks[opno + 1]); - break; + case EEOP_SBSREF_SUBSCRIPTS: + { + int jumpdone = op->d.sbsref_subscript.jumpdone; + LLVMTypeRef param_types[3]; + LLVMValueRef v_params[3]; + LLVMTypeRef v_functype; + LLVMValueRef v_func; + LLVMValueRef v_ret; - case EEOP_SBSREF_ASSIGN: - build_EvalXFunc(b, mod, "ExecEvalSubscriptingRefAssign", - v_state, op); - LLVMBuildBr(b, opblocks[opno + 1]); - break; + param_types[0] = l_ptr(StructExprState); + param_types[1] = l_ptr(TypeSizeT); + param_types[2] = l_ptr(StructExprContext); + v_functype = LLVMFunctionType(TypeParamBool, + param_types, + lengthof(param_types), + false); + v_func = l_ptr_const(op->d.sbsref_subscript.subscriptfunc, + l_ptr(v_functype)); + + v_params[0] = v_state; + v_params[1] = l_ptr_const(op, l_ptr(TypeSizeT)); + v_params[2] = v_econtext; + v_ret = LLVMBuildCall(b, + v_func, + v_params, lengthof(v_params), ""); + v_ret = LLVMBuildZExt(b, v_ret, TypeStorageBool, ""); + + LLVMBuildCondBr(b, + LLVMBuildICmp(b, LLVMIntEQ, v_ret, + l_sbool_const(1), ""), + opblocks[opno + 1], + opblocks[jumpdone]); + break; + } + + case EEOP_SBSREF_OLD: + case EEOP_SBSREF_ASSIGN: case EEOP_SBSREF_FETCH: - build_EvalXFunc(b, mod, "ExecEvalSubscriptingRefFetch", - v_state, op); - LLVMBuildBr(b, opblocks[opno + 1]); - break; + { + LLVMTypeRef param_types[3]; + LLVMValueRef v_params[3]; + LLVMTypeRef v_functype; + LLVMValueRef v_func; + + param_types[0] = l_ptr(StructExprState); + param_types[1] = l_ptr(TypeSizeT); + param_types[2] = l_ptr(StructExprContext); + + v_functype = LLVMFunctionType(LLVMVoidType(), + param_types, + lengthof(param_types), + 
false); + v_func = l_ptr_const(op->d.sbsref.subscriptfunc, + l_ptr(v_functype)); + + v_params[0] = v_state; + v_params[1] = l_ptr_const(op, l_ptr(TypeSizeT)); + v_params[2] = v_econtext; + LLVMBuildCall(b, + v_func, + v_params, lengthof(v_params), ""); + + LLVMBuildBr(b, opblocks[opno + 1]); + break; + } case EEOP_CASE_TESTVAL: { @@ -1744,23 +1793,6 @@ llvm_compile_expr(ExprState *state) LLVMBuildBr(b, opblocks[opno + 1]); break; - case EEOP_SBSREF_SUBSCRIPT: - { - int jumpdone = op->d.sbsref_subscript.jumpdone; - LLVMValueRef v_ret; - - v_ret = build_EvalXFunc(b, mod, "ExecEvalSubscriptingRef", - v_state, op); - v_ret = LLVMBuildZExt(b, v_ret, TypeStorageBool, ""); - - LLVMBuildCondBr(b, - LLVMBuildICmp(b, LLVMIntEQ, v_ret, - l_sbool_const(1), ""), - opblocks[opno + 1], - opblocks[jumpdone]); - break; - } - case EEOP_DOMAIN_TESTVAL: { LLVMBasicBlockRef b_avail, diff --git a/src/backend/jit/llvm/llvmjit_types.c b/src/backend/jit/llvm/llvmjit_types.c index 1ed3cafa2f..ae3c88aad9 100644 --- a/src/backend/jit/llvm/llvmjit_types.c +++ b/src/backend/jit/llvm/llvmjit_types.c @@ -124,10 +124,6 @@ void *referenced_functions[] = ExecEvalSQLValueFunction, ExecEvalScalarArrayOp, ExecEvalSubPlan, - ExecEvalSubscriptingRef, - ExecEvalSubscriptingRefAssign, - ExecEvalSubscriptingRefFetch, - ExecEvalSubscriptingRefOld, ExecEvalSysVar, ExecEvalWholeRowVar, ExecEvalXmlExpr, diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c index 910906f639..70f8b718e0 100644 --- a/src/backend/nodes/copyfuncs.c +++ b/src/backend/nodes/copyfuncs.c @@ -1548,6 +1548,7 @@ _copySubscriptingRef(const SubscriptingRef *from) COPY_SCALAR_FIELD(refcontainertype); COPY_SCALAR_FIELD(refelemtype); + COPY_SCALAR_FIELD(refrestype); COPY_SCALAR_FIELD(reftypmod); COPY_SCALAR_FIELD(refcollid); COPY_NODE_FIELD(refupperindexpr); diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c index 687609f59e..541e0e6b48 100644 --- a/src/backend/nodes/equalfuncs.c +++ 
b/src/backend/nodes/equalfuncs.c @@ -276,6 +276,7 @@ _equalSubscriptingRef(const SubscriptingRef *a, const SubscriptingRef *b) { COMPARE_SCALAR_FIELD(refcontainertype); COMPARE_SCALAR_FIELD(refelemtype); + COMPARE_SCALAR_FIELD(refrestype); COMPARE_SCALAR_FIELD(reftypmod); COMPARE_SCALAR_FIELD(refcollid); COMPARE_NODE_FIELD(refupperindexpr); diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c index 1dc873ed25..963f71e99d 100644 --- a/src/backend/nodes/nodeFuncs.c +++ b/src/backend/nodes/nodeFuncs.c @@ -66,15 +66,7 @@ exprType(const Node *expr) type = ((const WindowFunc *) expr)->wintype; break; case T_SubscriptingRef: - { - const SubscriptingRef *sbsref = (const SubscriptingRef *) expr; - - /* slice and/or store operations yield the container type */ - if (sbsref->reflowerindexpr || sbsref->refassgnexpr) - type = sbsref->refcontainertype; - else - type = sbsref->refelemtype; - } + type = ((const SubscriptingRef *) expr)->refrestype; break; case T_FuncExpr: type = ((const FuncExpr *) expr)->funcresulttype; @@ -286,7 +278,6 @@ exprTypmod(const Node *expr) case T_Param: return ((const Param *) expr)->paramtypmod; case T_SubscriptingRef: - /* typmod is the same for container or element */ return ((const SubscriptingRef *) expr)->reftypmod; case T_FuncExpr: { diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index 8f5e4e71b2..d78b16ed1d 100644 --- a/src/backend/nodes/outfuncs.c +++ b/src/backend/nodes/outfuncs.c @@ -1194,6 +1194,7 @@ _outSubscriptingRef(StringInfo str, const SubscriptingRef *node) WRITE_OID_FIELD(refcontainertype); WRITE_OID_FIELD(refelemtype); + WRITE_OID_FIELD(refrestype); WRITE_INT_FIELD(reftypmod); WRITE_OID_FIELD(refcollid); WRITE_NODE_FIELD(refupperindexpr); diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c index 169d5581b9..0f6a77afc4 100644 --- a/src/backend/nodes/readfuncs.c +++ b/src/backend/nodes/readfuncs.c @@ -671,6 +671,7 @@ _readSubscriptingRef(void) 
READ_OID_FIELD(refcontainertype); READ_OID_FIELD(refelemtype); + READ_OID_FIELD(refrestype); READ_INT_FIELD(reftypmod); READ_OID_FIELD(refcollid); READ_NODE_FIELD(refupperindexpr); diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c index cb7fa66180..e3a81a7a02 100644 --- a/src/backend/optimizer/util/clauses.c +++ b/src/backend/optimizer/util/clauses.c @@ -32,6 +32,7 @@ #include "miscadmin.h" #include "nodes/makefuncs.h" #include "nodes/nodeFuncs.h" +#include "nodes/subscripting.h" #include "nodes/supportnodes.h" #include "optimizer/clauses.h" #include "optimizer/cost.h" @@ -839,13 +840,16 @@ contain_nonstrict_functions_walker(Node *node, void *context) } if (IsA(node, SubscriptingRef)) { - /* - * subscripting assignment is nonstrict, but subscripting itself is - * strict - */ - if (((SubscriptingRef *) node)->refassgnexpr != NULL) - return true; + SubscriptingRef *sbsref = (SubscriptingRef *) node; + const SubscriptRoutines *sbsroutines; + /* Subscripting assignment is always presumed nonstrict */ + if (sbsref->refassgnexpr != NULL) + return true; + /* Otherwise we must look up the subscripting support methods */ + sbsroutines = getSubscriptingRoutines(sbsref->refcontainertype, NULL); + if (!sbsroutines->fetch_strict) + return true; /* else fall through to check args */ } if (IsA(node, DistinctExpr)) @@ -1135,12 +1139,14 @@ contain_leaked_vars_walker(Node *node, void *context) case T_SubscriptingRef: { SubscriptingRef *sbsref = (SubscriptingRef *) node; - - /* - * subscripting assignment is leaky, but subscripted fetches - * are not - */ - if (sbsref->refassgnexpr != NULL) + const SubscriptRoutines *sbsroutines; + + /* Consult the subscripting support method info */ + sbsroutines = getSubscriptingRoutines(sbsref->refcontainertype, + NULL); + if (!(sbsref->refassgnexpr != NULL ? 
+ sbsroutines->store_leakproof : + sbsroutines->fetch_leakproof)) { /* Node is leaky, so reject if it contains Vars */ if (contain_var_clause(node)) @@ -2859,6 +2865,11 @@ eval_const_expressions_mutator(Node *node, * known to be immutable, and for which we need no smarts * beyond "simplify if all inputs are constants". * + * Treating SubscriptingRef this way assumes that subscripting + * fetch and assignment are both immutable. This constrains + * type-specific subscripting implementations; maybe we should + * relax it someday. + * * Treating MinMaxExpr this way amounts to assuming that the * btree comparison function it calls is immutable; see the * reasoning in contain_mutable_functions_walker. @@ -3122,10 +3133,10 @@ eval_const_expressions_mutator(Node *node, { /* * This case could be folded into the generic handling used - * for SubscriptingRef etc. But because the simplification - * logic is so trivial, applying evaluate_expr() to perform it - * would be a heavy overhead. BooleanTest is probably common - * enough to justify keeping this bespoke implementation. + * for ArrayExpr etc. But because the simplification logic is + * so trivial, applying evaluate_expr() to perform it would be + * a heavy overhead. BooleanTest is probably common enough to + * justify keeping this bespoke implementation. 
*/ BooleanTest *btest = (BooleanTest *) node; BooleanTest *newbtest; diff --git a/src/backend/parser/parse_coerce.c b/src/backend/parser/parse_coerce.c index a2924e3d1c..da6c3ae4b5 100644 --- a/src/backend/parser/parse_coerce.c +++ b/src/backend/parser/parse_coerce.c @@ -26,6 +26,7 @@ #include "parser/parse_type.h" #include "utils/builtins.h" #include "utils/datum.h" /* needed for datumIsEqual() */ +#include "utils/fmgroids.h" #include "utils/lsyscache.h" #include "utils/syscache.h" #include "utils/typcache.h" @@ -2854,8 +2855,8 @@ find_typmod_coercion_function(Oid typeId, targetType = typeidType(typeId); typeForm = (Form_pg_type) GETSTRUCT(targetType); - /* Check for a varlena array type */ - if (typeForm->typelem != InvalidOid && typeForm->typlen == -1) + /* Check for a "true" array type */ + if (IsTrueArrayType(typeForm)) { /* Yes, switch our attention to the element type */ typeId = typeForm->typelem; diff --git a/src/backend/parser/parse_collate.c b/src/backend/parser/parse_collate.c index bf800f5937..13e62a2015 100644 --- a/src/backend/parser/parse_collate.c +++ b/src/backend/parser/parse_collate.c @@ -667,6 +667,29 @@ assign_collations_walker(Node *node, assign_collations_context *context) &loccontext); } break; + case T_SubscriptingRef: + { + /* + * The subscripts are treated as independent + * expressions not contributing to the node's + * collation. Only the container, and the source + * expression if any, contribute. (This models + * the old behavior, in which the subscripts could + * be counted on to be integers and thus not + * contribute anything.) 
+ */ + SubscriptingRef *sbsref = (SubscriptingRef *) node; + + assign_expr_collations(context->pstate, + (Node *) sbsref->refupperindexpr); + assign_expr_collations(context->pstate, + (Node *) sbsref->reflowerindexpr); + (void) assign_collations_walker((Node *) sbsref->refexpr, + &loccontext); + (void) assign_collations_walker((Node *) sbsref->refassgnexpr, + &loccontext); + } + break; default: /* diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c index 1e62d31aca..ffc96e2a6f 100644 --- a/src/backend/parser/parse_expr.c +++ b/src/backend/parser/parse_expr.c @@ -406,10 +406,9 @@ transformIndirection(ParseState *pstate, A_Indirection *ind) result = (Node *) transformContainerSubscripts(pstate, result, exprType(result), - InvalidOid, exprTypmod(result), subscripts, - NULL); + false); subscripts = NIL; newresult = ParseFuncOrColumn(pstate, @@ -429,10 +428,9 @@ transformIndirection(ParseState *pstate, A_Indirection *ind) result = (Node *) transformContainerSubscripts(pstate, result, exprType(result), - InvalidOid, exprTypmod(result), subscripts, - NULL); + false); return result; } diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c index 6e98fe55fc..e90f6c9d01 100644 --- a/src/backend/parser/parse_node.c +++ b/src/backend/parser/parse_node.c @@ -20,6 +20,7 @@ #include "mb/pg_wchar.h" #include "nodes/makefuncs.h" #include "nodes/nodeFuncs.h" +#include "nodes/subscripting.h" #include "parser/parse_coerce.h" #include "parser/parse_expr.h" #include "parser/parse_relation.h" @@ -182,23 +183,16 @@ pcb_error_callback(void *arg) /* * transformContainerType() - * Identify the types involved in a subscripting operation for container + * Identify the actual container type for a subscripting operation. * - * - * On entry, containerType/containerTypmod identify the type of the input value - * to be subscripted (which could be a domain type). 
These are modified if - * necessary to identify the actual container type and typmod, and the - * container's element type is returned. An error is thrown if the input isn't - * an array type. + * containerType/containerTypmod are modified if necessary to identify + * the actual container type and typmod. This mainly involves smashing + * any domain to its base type, but there are some special considerations. + * Note that caller still needs to check if the result type is a container. */ -Oid +void transformContainerType(Oid *containerType, int32 *containerTypmod) { - Oid origContainerType = *containerType; - Oid elementType; - HeapTuple type_tuple_container; - Form_pg_type type_struct_container; - /* * If the input is a domain, smash to base type, and extract the actual * typmod to be applied to the base type. Subscripting a domain is an @@ -209,35 +203,16 @@ transformContainerType(Oid *containerType, int32 *containerTypmod) *containerType = getBaseTypeAndTypmod(*containerType, containerTypmod); /* - * Here is an array specific code. We treat int2vector and oidvector as - * though they were domains over int2[] and oid[]. This is needed because - * array slicing could create an array that doesn't satisfy the - * dimensionality constraints of the xxxvector type; so we want the result - * of a slice operation to be considered to be of the more general type. + * We treat int2vector and oidvector as though they were domains over + * int2[] and oid[]. This is needed because array slicing could create an + * array that doesn't satisfy the dimensionality constraints of the + * xxxvector type; so we want the result of a slice operation to be + * considered to be of the more general type. 
*/ if (*containerType == INT2VECTOROID) *containerType = INT2ARRAYOID; else if (*containerType == OIDVECTOROID) *containerType = OIDARRAYOID; - - /* Get the type tuple for the container */ - type_tuple_container = SearchSysCache1(TYPEOID, ObjectIdGetDatum(*containerType)); - if (!HeapTupleIsValid(type_tuple_container)) - elog(ERROR, "cache lookup failed for type %u", *containerType); - type_struct_container = (Form_pg_type) GETSTRUCT(type_tuple_container); - - /* needn't check typisdefined since this will fail anyway */ - - elementType = type_struct_container->typelem; - if (elementType == InvalidOid) - ereport(ERROR, - (errcode(ERRCODE_DATATYPE_MISMATCH), - errmsg("cannot subscript type %s because it is not an array", - format_type_be(origContainerType)))); - - ReleaseSysCache(type_tuple_container); - - return elementType; } /* @@ -249,13 +224,14 @@ transformContainerType(Oid *containerType, int32 *containerTypmod) * an expression that represents the result of extracting a single container * element or a container slice. * - * In a container assignment, we are given a destination container value plus a - * source value that is to be assigned to a single element or a slice of that - * container. We produce an expression that represents the new container value - * with the source data inserted into the right part of the container. + * Container assignments are treated basically the same as container fetches + * here. The caller will modify the result node to insert the source value + * that is to be assigned to the element or slice that a fetch would have + * retrieved. The execution result will be a new container value with + * the source value inserted into the right part of the container. 
* - * For both cases, if the source container is of a domain-over-array type, - * the result is of the base array type or its element type; essentially, + * For both cases, if the source is of a domain-over-container type, the + * result is the same as if it had been of the container type; essentially, * we must fold a domain to its base type before applying subscripting. * (Note that int2vector and oidvector are treated as domains here.) * @@ -264,48 +240,48 @@ transformContainerType(Oid *containerType, int32 *containerTypmod) * containerType OID of container's datatype (should match type of * containerBase, or be the base type of containerBase's * domain type) - * elementType OID of container's element type (fetch with - * transformContainerType, or pass InvalidOid to do it here) - * containerTypMod typmod for the container (which is also typmod for the - * elements) + * containerTypMod typmod for the container * indirection Untransformed list of subscripts (must not be NIL) - * assignFrom NULL for container fetch, else transformed expression for - * source. + * isAssignment True if this will become a container assignment. */ SubscriptingRef * transformContainerSubscripts(ParseState *pstate, Node *containerBase, Oid containerType, - Oid elementType, int32 containerTypMod, List *indirection, - Node *assignFrom) + bool isAssignment) { + SubscriptingRef *sbsref; + const SubscriptRoutines *sbsroutines; + Oid elementType; bool isSlice = false; - List *upperIndexpr = NIL; - List *lowerIndexpr = NIL; ListCell *idx; - SubscriptingRef *sbsref; /* - * Caller may or may not have bothered to determine elementType. Note - * that if the caller did do so, containerType/containerTypMod must be as - * modified by transformContainerType, ie, smash domain to base type. + * Determine the actual container type, smashing any domain. In the + * assignment case the caller already did this, since it also needs to + * know the actual container type. 
*/ - if (!OidIsValid(elementType)) - elementType = transformContainerType(&containerType, &containerTypMod); + if (!isAssignment) + transformContainerType(&containerType, &containerTypMod); /* + * Verify that the container type is subscriptable, and get its support + * functions and typelem. + */ + sbsroutines = getSubscriptingRoutines(containerType, &elementType); + + /* + * Detect whether any of the indirection items are slice specifiers. + * * A list containing only simple subscripts refers to a single container * element. If any of the items are slice specifiers (lower:upper), then - * the subscript expression means a container slice operation. In this - * case, we convert any non-slice items to slices by treating the single - * subscript as the upper bound and supplying an assumed lower bound of 1. - * We have to prescan the list to see if there are any slice items. + * the subscript expression means a container slice operation. */ foreach(idx, indirection) { - A_Indices *ai = (A_Indices *) lfirst(idx); + A_Indices *ai = lfirst_node(A_Indices, idx); if (ai->is_slice) { @@ -314,121 +290,36 @@ transformContainerSubscripts(ParseState *pstate, } } - /* - * Transform the subscript expressions. 
- */ - foreach(idx, indirection) - { - A_Indices *ai = lfirst_node(A_Indices, idx); - Node *subexpr; - - if (isSlice) - { - if (ai->lidx) - { - subexpr = transformExpr(pstate, ai->lidx, pstate->p_expr_kind); - /* If it's not int4 already, try to coerce */ - subexpr = coerce_to_target_type(pstate, - subexpr, exprType(subexpr), - INT4OID, -1, - COERCION_ASSIGNMENT, - COERCE_IMPLICIT_CAST, - -1); - if (subexpr == NULL) - ereport(ERROR, - (errcode(ERRCODE_DATATYPE_MISMATCH), - errmsg("array subscript must have type integer"), - parser_errposition(pstate, exprLocation(ai->lidx)))); - } - else if (!ai->is_slice) - { - /* Make a constant 1 */ - subexpr = (Node *) makeConst(INT4OID, - -1, - InvalidOid, - sizeof(int32), - Int32GetDatum(1), - false, - true); /* pass by value */ - } - else - { - /* Slice with omitted lower bound, put NULL into the list */ - subexpr = NULL; - } - lowerIndexpr = lappend(lowerIndexpr, subexpr); - } - else - Assert(ai->lidx == NULL && !ai->is_slice); - - if (ai->uidx) - { - subexpr = transformExpr(pstate, ai->uidx, pstate->p_expr_kind); - /* If it's not int4 already, try to coerce */ - subexpr = coerce_to_target_type(pstate, - subexpr, exprType(subexpr), - INT4OID, -1, - COERCION_ASSIGNMENT, - COERCE_IMPLICIT_CAST, - -1); - if (subexpr == NULL) - ereport(ERROR, - (errcode(ERRCODE_DATATYPE_MISMATCH), - errmsg("array subscript must have type integer"), - parser_errposition(pstate, exprLocation(ai->uidx)))); - } - else - { - /* Slice with omitted upper bound, put NULL into the list */ - Assert(isSlice && ai->is_slice); - subexpr = NULL; - } - upperIndexpr = lappend(upperIndexpr, subexpr); - } - - /* - * If doing an array store, coerce the source value to the right type. - * (This should agree with the coercion done by transformAssignedExpr.) - */ - if (assignFrom != NULL) - { - Oid typesource = exprType(assignFrom); - Oid typeneeded = isSlice ? 
containerType : elementType; - Node *newFrom; - - newFrom = coerce_to_target_type(pstate, - assignFrom, typesource, - typeneeded, containerTypMod, - COERCION_ASSIGNMENT, - COERCE_IMPLICIT_CAST, - -1); - if (newFrom == NULL) - ereport(ERROR, - (errcode(ERRCODE_DATATYPE_MISMATCH), - errmsg("array assignment requires type %s" - " but expression is of type %s", - format_type_be(typeneeded), - format_type_be(typesource)), - errhint("You will need to rewrite or cast the expression."), - parser_errposition(pstate, exprLocation(assignFrom)))); - assignFrom = newFrom; - } - /* * Ready to build the SubscriptingRef node. */ - sbsref = (SubscriptingRef *) makeNode(SubscriptingRef); - if (assignFrom != NULL) - sbsref->refassgnexpr = (Expr *) assignFrom; + sbsref = makeNode(SubscriptingRef); sbsref->refcontainertype = containerType; sbsref->refelemtype = elementType; + /* refrestype is to be set by container-specific logic */ sbsref->reftypmod = containerTypMod; /* refcollid will be set by parse_collate.c */ - sbsref->refupperindexpr = upperIndexpr; - sbsref->reflowerindexpr = lowerIndexpr; + /* refupperindexpr, reflowerindexpr are to be set by container logic */ sbsref->refexpr = (Expr *) containerBase; - sbsref->refassgnexpr = (Expr *) assignFrom; + sbsref->refassgnexpr = NULL; /* caller will fill if it's an assignment */ + + /* + * Call the container-type-specific logic to transform the subscripts and + * determine the subscripting result type. + */ + sbsroutines->transform(sbsref, indirection, pstate, + isSlice, isAssignment); + + /* + * Verify we got a valid type (this defends, for example, against someone + * using array_subscript_handler as typsubscript without setting typelem). 
+     */
+    if (!OidIsValid(sbsref->refrestype))
+        ereport(ERROR,
+                (errcode(ERRCODE_DATATYPE_MISMATCH),
+                 errmsg("cannot subscript type %s because it does not support subscripting",
+                        format_type_be(containerType))));
     return sbsref;
 }
diff --git a/src/backend/parser/parse_target.c b/src/backend/parser/parse_target.c
index ce68663cc2..3dda8e2847 100644
--- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -861,7 +861,7 @@ transformAssignmentIndirection(ParseState *pstate,
             if (targetIsSubscripting)
                 ereport(ERROR,
                         (errcode(ERRCODE_DATATYPE_MISMATCH),
-                         errmsg("array assignment to \"%s\" requires type %s"
+                         errmsg("subscripted assignment to \"%s\" requires type %s"
                                 " but expression is of type %s",
                                 targetName,
                                 format_type_be(targetTypeId),
@@ -901,26 +901,37 @@ transformAssignmentSubscripts(ParseState *pstate,
                               int location)
 {
     Node       *result;
+    SubscriptingRef *sbsref;
     Oid         containerType;
     int32       containerTypMod;
-    Oid         elementTypeId;
     Oid         typeNeeded;
+    int32       typmodNeeded;
     Oid         collationNeeded;
     Assert(subscripts != NIL);
-    /* Identify the actual array type and element type involved */
+    /* Identify the actual container type involved */
     containerType = targetTypeId;
     containerTypMod = targetTypMod;
-    elementTypeId = transformContainerType(&containerType, &containerTypMod);
+    transformContainerType(&containerType, &containerTypMod);
-    /* Identify type that RHS must provide */
-    typeNeeded = isSlice ? containerType : elementTypeId;
+    /* Process subscripts and identify required type for RHS */
+    sbsref = transformContainerSubscripts(pstate,
+                                          basenode,
+                                          containerType,
+                                          containerTypMod,
+                                          subscripts,
+                                          true);
+
+    typeNeeded = sbsref->refrestype;
+    typmodNeeded = sbsref->reftypmod;
     /*
-     * container normally has same collation as elements, but there's an
-     * exception: we might be subscripting a domain over a container type. In
-     * that case use collation of the base type.
+     * Container normally has same collation as its elements, but there's an
+     * exception: we might be subscripting a domain over a container type. In
+     * that case use collation of the base type. (This is shaky for arbitrary
+     * subscripting semantics, but it doesn't matter all that much since we
+     * only use this to label the collation of a possible CaseTestExpr.)
      */
     if (containerType == targetTypeId)
         collationNeeded = targetCollation;
@@ -933,21 +944,22 @@ transformAssignmentSubscripts(ParseState *pstate,
                                          targetName,
                                          true,
                                          typeNeeded,
-                                         containerTypMod,
+                                         typmodNeeded,
                                          collationNeeded,
                                          indirection,
                                          next_indirection,
                                          rhs,
                                          location);
-    /* process subscripts */
-    result = (Node *) transformContainerSubscripts(pstate,
-                                                   basenode,
-                                                   containerType,
-                                                   elementTypeId,
-                                                   containerTypMod,
-                                                   subscripts,
-                                                   rhs);
+    /*
+     * Insert the already-properly-coerced RHS into the SubscriptingRef. Then
+     * set refrestype and reftypmod back to the container type's values.
+     */
+    sbsref->refassgnexpr = (Expr *) rhs;
+    sbsref->refrestype = containerType;
+    sbsref->reftypmod = containerTypMod;
+
+    result = (Node *) sbsref;
     /* If target was a domain over container, need to coerce up to the domain */
     if (containerType != targetTypeId)
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index f6ec7b64cd..ce09ad7375 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -17,6 +17,7 @@ OBJS = \
     array_typanalyze.o \
     array_userfuncs.o \
     arrayfuncs.o \
+    arraysubs.o \
     arrayutils.o \
     ascii.o \
     bool.o \
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index a7ea7656c7..4c8a739bc4 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -2044,7 +2044,8 @@ array_get_element_expanded(Datum arraydatum,
  * array bound.
  *
  * NOTE: we assume it is OK to scribble on the provided subscript arrays
- * lowerIndx[] and upperIndx[]. These are generally just temporaries.
+ * lowerIndx[] and upperIndx[]; also, these arrays must be of size MAXDIM
+ * even when nSubscripts is less. These are generally just temporaries.
  */
 Datum
 array_get_slice(Datum arraydatum,
@@ -2772,7 +2773,8 @@ array_set_element_expanded(Datum arraydatum,
  * (XXX TODO: allow a corresponding behavior for multidimensional arrays)
  *
  * NOTE: we assume it is OK to scribble on the provided index arrays
- * lowerIndx[] and upperIndx[]. These are generally just temporaries.
+ * lowerIndx[] and upperIndx[]; also, these arrays must be of size MAXDIM
+ * even when nSubscripts is less. These are generally just temporaries.
  *
  * NOTE: For assignments, we throw an error for silly subscripts etc,
  * rather than returning a NULL or empty array as the fetch operations do.
diff --git a/src/backend/utils/adt/arraysubs.c b/src/backend/utils/adt/arraysubs.c
new file mode 100644
index 0000000000..a081288f42
--- /dev/null
+++ b/src/backend/utils/adt/arraysubs.c
@@ -0,0 +1,577 @@
+/*-------------------------------------------------------------------------
+ *
+ * arraysubs.c
+ *    Subscripting support functions for arrays.
+ *
+ * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *    src/backend/utils/adt/arraysubs.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "executor/execExpr.h"
+#include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/subscripting.h"
+#include "parser/parse_coerce.h"
+#include "parser/parse_expr.h"
+#include "utils/array.h"
+#include "utils/builtins.h"
+#include "utils/lsyscache.h"
+
+
+/* SubscriptingRefState.workspace for array subscripting execution */
+typedef struct ArraySubWorkspace
+{
+    /* Values determined during expression compilation */
+    Oid         refelemtype;    /* OID of the array element type */
+    int16       refattrlength;  /* typlen of array type */
+    int16       refelemlength;  /* typlen of the array element type */
+    bool        refelembyval;   /* is the element type pass-by-value? */
+    char        refelemalign;   /* typalign of the element type */
+
+    /*
+     * Subscript values converted to integers. Note that these arrays must be
+     * of length MAXDIM even when dealing with fewer subscripts, because
+     * array_get/set_slice may scribble on the extra entries.
+     */
+    int         upperindex[MAXDIM];
+    int         lowerindex[MAXDIM];
+} ArraySubWorkspace;
+
+
+/*
+ * Finish parse analysis of a SubscriptingRef expression for an array.
+ *
+ * Transform the subscript expressions, coerce them to integers,
+ * and determine the result type of the SubscriptingRef node.
+ */
+static void
+array_subscript_transform(SubscriptingRef *sbsref,
+                          List *indirection,
+                          ParseState *pstate,
+                          bool isSlice,
+                          bool isAssignment)
+{
+    List       *upperIndexpr = NIL;
+    List       *lowerIndexpr = NIL;
+    ListCell   *idx;
+
+    /*
+     * Transform the subscript expressions, and separate upper and lower
+     * bounds into two lists.
+     *
+     * If we have a container slice expression, we convert any non-slice
+     * indirection items to slices by treating the single subscript as the
+     * upper bound and supplying an assumed lower bound of 1.
+     */
+    foreach(idx, indirection)
+    {
+        A_Indices  *ai = lfirst_node(A_Indices, idx);
+        Node       *subexpr;
+
+        if (isSlice)
+        {
+            if (ai->lidx)
+            {
+                subexpr = transformExpr(pstate, ai->lidx, pstate->p_expr_kind);
+                /* If it's not int4 already, try to coerce */
+                subexpr = coerce_to_target_type(pstate,
+                                                subexpr, exprType(subexpr),
+                                                INT4OID, -1,
+                                                COERCION_ASSIGNMENT,
+                                                COERCE_IMPLICIT_CAST,
+                                                -1);
+                if (subexpr == NULL)
+                    ereport(ERROR,
+                            (errcode(ERRCODE_DATATYPE_MISMATCH),
+                             errmsg("array subscript must have type integer"),
+                             parser_errposition(pstate, exprLocation(ai->lidx))));
+            }
+            else if (!ai->is_slice)
+            {
+                /* Make a constant 1 */
+                subexpr = (Node *) makeConst(INT4OID,
+                                             -1,
+                                             InvalidOid,
+                                             sizeof(int32),
+                                             Int32GetDatum(1),
+                                             false,
+                                             true); /* pass by value */
+            }
+            else
+            {
+                /* Slice with omitted lower bound, put NULL into the list */
+                subexpr = NULL;
+            }
+            lowerIndexpr = lappend(lowerIndexpr, subexpr);
+        }
+        else
+            Assert(ai->lidx == NULL && !ai->is_slice);
+
+        if (ai->uidx)
+        {
+            subexpr = transformExpr(pstate, ai->uidx, pstate->p_expr_kind);
+            /* If it's not int4 already, try to coerce */
+            subexpr = coerce_to_target_type(pstate,
+                                            subexpr, exprType(subexpr),
+                                            INT4OID, -1,
+                                            COERCION_ASSIGNMENT,
+                                            COERCE_IMPLICIT_CAST,
+                                            -1);
+            if (subexpr == NULL)
+                ereport(ERROR,
+                        (errcode(ERRCODE_DATATYPE_MISMATCH),
+                         errmsg("array subscript must have type integer"),
+                         parser_errposition(pstate, exprLocation(ai->uidx))));
+        }
+        else
+        {
+            /* Slice with omitted upper bound, put NULL into the list */
+            Assert(isSlice && ai->is_slice);
+            subexpr = NULL;
+        }
+        upperIndexpr = lappend(upperIndexpr, subexpr);
+    }
+
+    /* ... and store the transformed lists into the SubscriptRef node */
+    sbsref->refupperindexpr = upperIndexpr;
+    sbsref->reflowerindexpr = lowerIndexpr;
+
+    /* Verify subscript list lengths are within implementation limit */
+    if (list_length(upperIndexpr) > MAXDIM)
+        ereport(ERROR,
+                (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+                 errmsg("number of array dimensions (%d) exceeds the maximum allowed (%d)",
+                        list_length(upperIndexpr), MAXDIM)));
+    /* We need not check lowerIndexpr separately */
+
+    /*
+     * Determine the result type of the subscripting operation. It's the same
+     * as the array type if we're slicing, else it's the element type. In
+     * either case, the typmod is the same as the array's, so we need not
+     * change reftypmod.
+     */
+    if (isSlice)
+        sbsref->refrestype = sbsref->refcontainertype;
+    else
+        sbsref->refrestype = sbsref->refelemtype;
+}
+
+/*
+ * During execution, process the subscripts in a SubscriptingRef expression.
+ *
+ * The subscript expressions are already evaluated in Datum form in the
+ * SubscriptingRefState's arrays. Check and convert them as necessary.
+ *
+ * If any subscript is NULL, we throw error in assignment cases, or in fetch
+ * cases set result to NULL and return false (instructing caller to skip the
+ * rest of the SubscriptingRef sequence).
+ *
+ * We convert all the subscripts to plain integers and save them in the
+ * sbsrefstate->workspace arrays.
+ */
+static bool
+array_subscript_check_subscripts(ExprState *state,
+                                 ExprEvalStep *op,
+                                 ExprContext *econtext)
+{
+    SubscriptingRefState *sbsrefstate = op->d.sbsref_subscript.state;
+    ArraySubWorkspace *workspace = (ArraySubWorkspace *) sbsrefstate->workspace;
+
+    /* Process upper subscripts */
+    for (int i = 0; i < sbsrefstate->numupper; i++)
+    {
+        if (sbsrefstate->upperprovided[i])
+        {
+            /* If any index expr yields NULL, result is NULL or error */
+            if (sbsrefstate->upperindexnull[i])
+            {
+                if (sbsrefstate->isassignment)
+                    ereport(ERROR,
+                            (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
+                             errmsg("array subscript in assignment must not be null")));
+                *op->resnull = true;
+                return false;
+            }
+            workspace->upperindex[i] = DatumGetInt32(sbsrefstate->upperindex[i]);
+        }
+    }
+
+    /* Likewise for lower subscripts */
+    for (int i = 0; i < sbsrefstate->numlower; i++)
+    {
+        if (sbsrefstate->lowerprovided[i])
+        {
+            /* If any index expr yields NULL, result is NULL or error */
+            if (sbsrefstate->lowerindexnull[i])
+            {
+                if (sbsrefstate->isassignment)
+                    ereport(ERROR,
+                            (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
+                             errmsg("array subscript in assignment must not be null")));
+                *op->resnull = true;
+                return false;
+            }
+            workspace->lowerindex[i] = DatumGetInt32(sbsrefstate->lowerindex[i]);
+        }
+    }
+
+    return true;
+}
+
+/*
+ * Evaluate SubscriptingRef fetch for an array element.
+ *
+ * Source container is in step's result variable (it's known not NULL, since
+ * we set fetch_strict to true), and indexes have already been evaluated into
+ * workspace array.
+ */
+static void
+array_subscript_fetch(ExprState *state,
+                      ExprEvalStep *op,
+                      ExprContext *econtext)
+{
+    SubscriptingRefState *sbsrefstate = op->d.sbsref.state;
+    ArraySubWorkspace *workspace = (ArraySubWorkspace *) sbsrefstate->workspace;
+
+    /* Should not get here if source array (or any subscript) is null */
+    Assert(!(*op->resnull));
+
+    *op->resvalue = array_get_element(*op->resvalue,
+                                      sbsrefstate->numupper,
+                                      workspace->upperindex,
+                                      workspace->refattrlength,
+                                      workspace->refelemlength,
+                                      workspace->refelembyval,
+                                      workspace->refelemalign,
+                                      op->resnull);
+}
+
+/*
+ * Evaluate SubscriptingRef fetch for an array slice.
+ *
+ * Source container is in step's result variable (it's known not NULL, since
+ * we set fetch_strict to true), and indexes have already been evaluated into
+ * workspace array.
+ */
+static void
+array_subscript_fetch_slice(ExprState *state,
+                            ExprEvalStep *op,
+                            ExprContext *econtext)
+{
+    SubscriptingRefState *sbsrefstate = op->d.sbsref.state;
+    ArraySubWorkspace *workspace = (ArraySubWorkspace *) sbsrefstate->workspace;
+
+    /* Should not get here if source array (or any subscript) is null */
+    Assert(!(*op->resnull));
+
+    *op->resvalue = array_get_slice(*op->resvalue,
+                                    sbsrefstate->numupper,
+                                    workspace->upperindex,
+                                    workspace->lowerindex,
+                                    sbsrefstate->upperprovided,
+                                    sbsrefstate->lowerprovided,
+                                    workspace->refattrlength,
+                                    workspace->refelemlength,
+                                    workspace->refelembyval,
+                                    workspace->refelemalign);
+    /* The slice is never NULL, so no need to change *op->resnull */
+}
+
+/*
+ * Evaluate SubscriptingRef assignment for an array element assignment.
+ *
+ * Input container (possibly null) is in result area, replacement value is in
+ * SubscriptingRefState's replacevalue/replacenull.
+ */
+static void
+array_subscript_assign(ExprState *state,
+                       ExprEvalStep *op,
+                       ExprContext *econtext)
+{
+    SubscriptingRefState *sbsrefstate = op->d.sbsref.state;
+    ArraySubWorkspace *workspace = (ArraySubWorkspace *) sbsrefstate->workspace;
+    Datum       arraySource = *op->resvalue;
+
+    /*
+     * For an assignment to a fixed-length array type, both the original array
+     * and the value to be assigned into it must be non-NULL, else we punt and
+     * return the original array.
+     */
+    if (workspace->refattrlength > 0)
+    {
+        if (*op->resnull || sbsrefstate->replacenull)
+            return;
+    }
+
+    /*
+     * For assignment to varlena arrays, we handle a NULL original array by
+     * substituting an empty (zero-dimensional) array; insertion of the new
+     * element will result in a singleton array value. It does not matter
+     * whether the new element is NULL.
+     */
+    if (*op->resnull)
+    {
+        arraySource = PointerGetDatum(construct_empty_array(workspace->refelemtype));
+        *op->resnull = false;
+    }
+
+    *op->resvalue = array_set_element(arraySource,
+                                      sbsrefstate->numupper,
+                                      workspace->upperindex,
+                                      sbsrefstate->replacevalue,
+                                      sbsrefstate->replacenull,
+                                      workspace->refattrlength,
+                                      workspace->refelemlength,
+                                      workspace->refelembyval,
+                                      workspace->refelemalign);
+    /* The result is never NULL, so no need to change *op->resnull */
+}
+
+/*
+ * Evaluate SubscriptingRef assignment for an array slice assignment.
+ *
+ * Input container (possibly null) is in result area, replacement value is in
+ * SubscriptingRefState's replacevalue/replacenull.
+ */
+static void
+array_subscript_assign_slice(ExprState *state,
+                             ExprEvalStep *op,
+                             ExprContext *econtext)
+{
+    SubscriptingRefState *sbsrefstate = op->d.sbsref.state;
+    ArraySubWorkspace *workspace = (ArraySubWorkspace *) sbsrefstate->workspace;
+    Datum       arraySource = *op->resvalue;
+
+    /*
+     * For an assignment to a fixed-length array type, both the original array
+     * and the value to be assigned into it must be non-NULL, else we punt and
+     * return the original array.
+     */
+    if (workspace->refattrlength > 0)
+    {
+        if (*op->resnull || sbsrefstate->replacenull)
+            return;
+    }
+
+    /*
+     * For assignment to varlena arrays, we handle a NULL original array by
+     * substituting an empty (zero-dimensional) array; insertion of the new
+     * element will result in a singleton array value. It does not matter
+     * whether the new element is NULL.
+     */
+    if (*op->resnull)
+    {
+        arraySource = PointerGetDatum(construct_empty_array(workspace->refelemtype));
+        *op->resnull = false;
+    }
+
+    *op->resvalue = array_set_slice(arraySource,
+                                    sbsrefstate->numupper,
+                                    workspace->upperindex,
+                                    workspace->lowerindex,
+                                    sbsrefstate->upperprovided,
+                                    sbsrefstate->lowerprovided,
+                                    sbsrefstate->replacevalue,
+                                    sbsrefstate->replacenull,
+                                    workspace->refattrlength,
+                                    workspace->refelemlength,
+                                    workspace->refelembyval,
+                                    workspace->refelemalign);
+    /* The result is never NULL, so no need to change *op->resnull */
+}
+
+/*
+ * Compute old array element value for a SubscriptingRef assignment
+ * expression. Will only be called if the new-value subexpression
+ * contains SubscriptingRef or FieldStore. This is the same as the
+ * regular fetch case, except that we have to handle a null array,
+ * and the value should be stored into the SubscriptingRefState's
+ * prevvalue/prevnull fields.
+ */
+static void
+array_subscript_fetch_old(ExprState *state,
+                          ExprEvalStep *op,
+                          ExprContext *econtext)
+{
+    SubscriptingRefState *sbsrefstate = op->d.sbsref.state;
+    ArraySubWorkspace *workspace = (ArraySubWorkspace *) sbsrefstate->workspace;
+
+    if (*op->resnull)
+    {
+        /* whole array is null, so any element is too */
+        sbsrefstate->prevvalue = (Datum) 0;
+        sbsrefstate->prevnull = true;
+    }
+    else
+        sbsrefstate->prevvalue = array_get_element(*op->resvalue,
+                                                   sbsrefstate->numupper,
+                                                   workspace->upperindex,
+                                                   workspace->refattrlength,
+                                                   workspace->refelemlength,
+                                                   workspace->refelembyval,
+                                                   workspace->refelemalign,
+                                                   &sbsrefstate->prevnull);
+}
+
+/*
+ * Compute old array slice value for a SubscriptingRef assignment
+ * expression. Will only be called if the new-value subexpression
+ * contains SubscriptingRef or FieldStore. This is the same as the
+ * regular fetch case, except that we have to handle a null array,
+ * and the value should be stored into the SubscriptingRefState's
+ * prevvalue/prevnull fields.
+ *
+ * Note: this is presently dead code, because the new value for a
+ * slice would have to be an array, so it couldn't directly contain a
+ * FieldStore; nor could it contain a SubscriptingRef assignment, since
+ * we consider adjacent subscripts to index one multidimensional array
+ * not nested array types. Future generalizations might make this
+ * reachable, however.
+ */
+static void
+array_subscript_fetch_old_slice(ExprState *state,
+                                ExprEvalStep *op,
+                                ExprContext *econtext)
+{
+    SubscriptingRefState *sbsrefstate = op->d.sbsref.state;
+    ArraySubWorkspace *workspace = (ArraySubWorkspace *) sbsrefstate->workspace;
+
+    if (*op->resnull)
+    {
+        /* whole array is null, so any slice is too */
+        sbsrefstate->prevvalue = (Datum) 0;
+        sbsrefstate->prevnull = true;
+    }
+    else
+    {
+        sbsrefstate->prevvalue = array_get_slice(*op->resvalue,
+                                                 sbsrefstate->numupper,
+                                                 workspace->upperindex,
+                                                 workspace->lowerindex,
+                                                 sbsrefstate->upperprovided,
+                                                 sbsrefstate->lowerprovided,
+                                                 workspace->refattrlength,
+                                                 workspace->refelemlength,
+                                                 workspace->refelembyval,
+                                                 workspace->refelemalign);
+        /* slices of non-null arrays are never null */
+        sbsrefstate->prevnull = false;
+    }
+}
+
+/*
+ * Set up execution state for an array subscript operation.
+ */
+static void
+array_exec_setup(const SubscriptingRef *sbsref,
+                 SubscriptingRefState *sbsrefstate,
+                 SubscriptExecSteps *methods)
+{
+    bool        is_slice = (sbsrefstate->numlower != 0);
+    ArraySubWorkspace *workspace;
+
+    /*
+     * Enforce the implementation limit on number of array subscripts. This
+     * check isn't entirely redundant with checking at parse time; conceivably
+     * the expression was stored by a backend with a different MAXDIM value.
+     */
+    if (sbsrefstate->numupper > MAXDIM)
+        ereport(ERROR,
+                (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+                 errmsg("number of array dimensions (%d) exceeds the maximum allowed (%d)",
+                        sbsrefstate->numupper, MAXDIM)));
+
+    /* Should be impossible if parser is sane, but check anyway: */
+    if (sbsrefstate->numlower != 0 &&
+        sbsrefstate->numupper != sbsrefstate->numlower)
+        elog(ERROR, "upper and lower index lists are not same length");
+
+    /*
+     * Allocate type-specific workspace.
+     */
+    workspace = (ArraySubWorkspace *) palloc(sizeof(ArraySubWorkspace));
+    sbsrefstate->workspace = workspace;
+
+    /*
+     * Collect datatype details we'll need at execution.
+     */
+    workspace->refelemtype = sbsref->refelemtype;
+    workspace->refattrlength = get_typlen(sbsref->refcontainertype);
+    get_typlenbyvalalign(sbsref->refelemtype,
+                         &workspace->refelemlength,
+                         &workspace->refelembyval,
+                         &workspace->refelemalign);
+
+    /*
+     * Pass back pointers to appropriate step execution functions.
+     */
+    methods->sbs_check_subscripts = array_subscript_check_subscripts;
+    if (is_slice)
+    {
+        methods->sbs_fetch = array_subscript_fetch_slice;
+        methods->sbs_assign = array_subscript_assign_slice;
+        methods->sbs_fetch_old = array_subscript_fetch_old_slice;
+    }
+    else
+    {
+        methods->sbs_fetch = array_subscript_fetch;
+        methods->sbs_assign = array_subscript_assign;
+        methods->sbs_fetch_old = array_subscript_fetch_old;
+    }
+}
+
+/*
+ * array_subscript_handler
+ *    Subscripting handler for standard varlena arrays.
+ *
+ * This should be used only for "true" array types, which have array headers
+ * as understood by the varlena array routines, and are referenced by the
+ * element type's pg_type.typarray field.
+ */
+Datum
+array_subscript_handler(PG_FUNCTION_ARGS)
+{
+    static const SubscriptRoutines sbsroutines = {
+        .transform = array_subscript_transform,
+        .exec_setup = array_exec_setup,
+        .fetch_strict = true,       /* fetch returns NULL for NULL inputs */
+        .fetch_leakproof = true,    /* fetch returns NULL for bad subscript */
+        .store_leakproof = false    /* ... but assignment throws error */
+    };
+
+    PG_RETURN_POINTER(&sbsroutines);
+}
+
+/*
+ * raw_array_subscript_handler
+ *    Subscripting handler for "raw" arrays.
+ *
+ * A "raw" array just contains N independent instances of the element type.
+ * Currently we require both the element type and the array type to be fixed
+ * length, but it wouldn't be too hard to relax that for the array type.
+ *
+ * As of now, all the support code is shared with standard varlena arrays.
+ * We may split those into separate code paths, but probably that would yield
+ * only marginal speedups. The main point of having a separate handler is
+ * so that pg_type.typsubscript clearly indicates the type's semantics.
+ */
+Datum
+raw_array_subscript_handler(PG_FUNCTION_ARGS)
+{
+    static const SubscriptRoutines sbsroutines = {
+        .transform = array_subscript_transform,
+        .exec_setup = array_exec_setup,
+        .fetch_strict = true,       /* fetch returns NULL for NULL inputs */
+        .fetch_leakproof = true,    /* fetch returns NULL for bad subscript */
+        .store_leakproof = false    /* ... but assignment throws error */
+    };
+
+    PG_RETURN_POINTER(&sbsroutines);
+}
diff --git a/src/backend/utils/adt/format_type.c b/src/backend/utils/adt/format_type.c
index f2816e4f37..013409aee7 100644
--- a/src/backend/utils/adt/format_type.c
+++ b/src/backend/utils/adt/format_type.c
@@ -22,6 +22,7 @@
 #include "catalog/pg_type.h"
 #include "mb/pg_wchar.h"
 #include "utils/builtins.h"
+#include "utils/fmgroids.h"
 #include "utils/lsyscache.h"
 #include "utils/numeric.h"
 #include "utils/syscache.h"
@@ -138,15 +139,14 @@ format_type_extended(Oid type_oid, int32 typemod, bits16 flags)
     typeform = (Form_pg_type) GETSTRUCT(tuple);
     /*
-     * Check if it's a regular (variable length) array type. Fixed-length
-     * array types such as "name" shouldn't get deconstructed. As of Postgres
-     * 8.1, rather than checking typlen we check the toast property, and don't
+     * Check if it's a "true" array type. Pseudo-array types such as "name"
+     * shouldn't get deconstructed. Also check the toast property, and don't
      * deconstruct "plain storage" array types --- this is because we don't
      * want to show oidvector as oid[].
      */
     array_base_type = typeform->typelem;
-    if (array_base_type != InvalidOid &&
+    if (IsTrueArrayType(typeform) &&
         typeform->typstorage != TYPSTORAGE_PLAIN)
     {
         /* Switch our attention to the array element type */
diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
index d370348a1c..12557ce3af 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -26,6 +26,7 @@
 #include "miscadmin.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
+#include "utils/fmgroids.h"
 #include "utils/hsearch.h"
 #include "utils/json.h"
 #include "utils/jsonb.h"
@@ -3011,7 +3012,7 @@ prepare_column_cache(ColumnIOData *column,
         column->io.composite.base_typmod = typmod;
         column->io.composite.domain_info = NULL;
     }
-    else if (type->typlen == -1 && OidIsValid(type->typelem))
+    else if (IsTrueArrayType(type))
     {
         column->typcat = TYPECAT_ARRAY;
         column->io.array.element_info = MemoryContextAllocZero(mcxt,
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index ae23299162..6e5c7379e2 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -2634,8 +2634,9 @@ get_typ_typrelid(Oid typid)
  *
  * Given the type OID, get the typelem (InvalidOid if not an array type).
  *
- * NB: this only considers varlena arrays to be true arrays; InvalidOid is
- * returned if the input is a fixed-length array type.
+ * NB: this only succeeds for "true" arrays having array_subscript_handler
+ * as typsubscript. For other types, InvalidOid is returned independently
+ * of whether they have typelem or typsubscript set.
  */
 Oid
 get_element_type(Oid typid)
@@ -2648,7 +2649,7 @@ get_element_type(Oid typid)
         Form_pg_type typtup = (Form_pg_type) GETSTRUCT(tp);
         Oid         result;
-        if (typtup->typlen == -1)
+        if (IsTrueArrayType(typtup))
             result = typtup->typelem;
         else
             result = InvalidOid;
@@ -2731,7 +2732,7 @@ get_base_element_type(Oid typid)
             Oid         result;
             /* This test must match get_element_type */
-            if (typTup->typlen == -1)
+            if (IsTrueArrayType(typTup))
                 result = typTup->typelem;
             else
                 result = InvalidOid;
@@ -2966,6 +2967,64 @@ type_is_collatable(Oid typid)
 }
+/*
+ * get_typsubscript
+ *
+ * Given the type OID, return the type's subscripting handler's OID,
+ * if it has one.
+ *
+ * If typelemp isn't NULL, we also store the type's typelem value there.
+ * This saves some callers an extra catalog lookup.
+ */
+RegProcedure
+get_typsubscript(Oid typid, Oid *typelemp)
+{
+    HeapTuple   tp;
+
+    tp = SearchSysCache1(TYPEOID, ObjectIdGetDatum(typid));
+    if (HeapTupleIsValid(tp))
+    {
+        Form_pg_type typform = (Form_pg_type) GETSTRUCT(tp);
+        RegProcedure handler = typform->typsubscript;
+
+        if (typelemp)
+            *typelemp = typform->typelem;
+        ReleaseSysCache(tp);
+        return handler;
+    }
+    else
+    {
+        if (typelemp)
+            *typelemp = InvalidOid;
+        return InvalidOid;
+    }
+}
+
+/*
+ * getSubscriptingRoutines
+ *
+ * Given the type OID, fetch the type's subscripting methods struct.
+ * Fail if type is not subscriptable.
+ *
+ * If typelemp isn't NULL, we also store the type's typelem value there.
+ * This saves some callers an extra catalog lookup.
+ */
+const struct SubscriptRoutines *
+getSubscriptingRoutines(Oid typid, Oid *typelemp)
+{
+    RegProcedure typsubscript = get_typsubscript(typid, typelemp);
+
+    if (!OidIsValid(typsubscript))
+        ereport(ERROR,
+                (errcode(ERRCODE_DATATYPE_MISMATCH),
+                 errmsg("cannot subscript type %s because it does not support subscripting",
+                        format_type_be(typid))));
+
+    return (const struct SubscriptRoutines *)
+        DatumGetPointer(OidFunctionCall0(typsubscript));
+}
+
+
 /* ---------- STATISTICS CACHE ---------- */
 /*
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index dca1d48e89..5883fde367 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -406,6 +406,7 @@ lookup_type_cache(Oid type_id, int flags)
         typentry->typstorage = typtup->typstorage;
         typentry->typtype = typtup->typtype;
         typentry->typrelid = typtup->typrelid;
+        typentry->typsubscript = typtup->typsubscript;
         typentry->typelem = typtup->typelem;
         typentry->typcollation = typtup->typcollation;
         typentry->flags |= TCFLAGS_HAVE_PG_TYPE_DATA;
@@ -450,6 +451,7 @@ lookup_type_cache(Oid type_id, int flags)
         typentry->typstorage = typtup->typstorage;
         typentry->typtype = typtup->typtype;
         typentry->typrelid = typtup->typrelid;
+        typentry->typsubscript = typtup->typsubscript;
         typentry->typelem = typtup->typelem;
         typentry->typcollation = typtup->typcollation;
         typentry->flags |= TCFLAGS_HAVE_PG_TYPE_DATA;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 3b36335aa6..673a670347 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -10794,11 +10794,13 @@ dumpBaseType(Archive *fout, TypeInfo *tyinfo)
     char       *typmodin;
     char       *typmodout;
     char       *typanalyze;
+    char       *typsubscript;
     Oid         typreceiveoid;
     Oid         typsendoid;
     Oid         typmodinoid;
     Oid         typmodoutoid;
     Oid         typanalyzeoid;
+    Oid         typsubscriptoid;
     char       *typcategory;
     char       *typispreferred;
     char       *typdelim;
@@ -10840,6 +10842,14 @@ dumpBaseType(Archive *fout, TypeInfo *tyinfo)
     else
         appendPQExpBufferStr(query, "false AS typcollatable, ");
+    if (fout->remoteVersion >= 140000)
+        appendPQExpBufferStr(query,
+                             "typsubscript, "
+                             "typsubscript::pg_catalog.oid AS typsubscriptoid, ");
+    else
+        appendPQExpBufferStr(query,
+                             "'-' AS typsubscript, 0 AS typsubscriptoid, ");
+
     /* Before 8.4, pg_get_expr does not allow 0 for its second arg */
     if (fout->remoteVersion >= 80400)
         appendPQExpBufferStr(query,
@@ -10862,11 +10872,13 @@ dumpBaseType(Archive *fout, TypeInfo *tyinfo)
     typmodin = PQgetvalue(res, 0, PQfnumber(res, "typmodin"));
     typmodout = PQgetvalue(res, 0, PQfnumber(res, "typmodout"));
     typanalyze = PQgetvalue(res, 0, PQfnumber(res, "typanalyze"));
+    typsubscript = PQgetvalue(res, 0, PQfnumber(res, "typsubscript"));
     typreceiveoid = atooid(PQgetvalue(res, 0, PQfnumber(res, "typreceiveoid")));
     typsendoid = atooid(PQgetvalue(res, 0, PQfnumber(res, "typsendoid")));
     typmodinoid = atooid(PQgetvalue(res, 0, PQfnumber(res, "typmodinoid")));
     typmodoutoid = atooid(PQgetvalue(res, 0, PQfnumber(res, "typmodoutoid")));
     typanalyzeoid = atooid(PQgetvalue(res, 0, PQfnumber(res, "typanalyzeoid")));
+    typsubscriptoid = atooid(PQgetvalue(res, 0, PQfnumber(res, "typsubscriptoid")));
     typcategory = PQgetvalue(res, 0, PQfnumber(res, "typcategory"));
     typispreferred = PQgetvalue(res, 0, PQfnumber(res, "typispreferred"));
     typdelim = PQgetvalue(res, 0, PQfnumber(res, "typdelim"));
@@ -10935,6 +10947,9 @@ dumpBaseType(Archive *fout, TypeInfo *tyinfo)
         appendPQExpBufferStr(q, typdefault);
     }
+    if (OidIsValid(typsubscriptoid))
+        appendPQExpBuffer(q, ",\n SUBSCRIPT = %s", typsubscript);
+
     if (OidIsValid(tyinfo->typelem))
     {
         char       *elemType;
diff --git a/src/include/c.h b/src/include/c.h
index b21e4074dd..12ea056a35 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -592,13 +592,9 @@ typedef uint32 CommandId;
 #define InvalidCommandId (~(CommandId)0)
 /*
- * Array indexing support
+ * Maximum number of array subscripts, for regular varlena arrays
  */
 #define MAXDIM 6
-typedef struct
-{
-    int         indx[MAXDIM];
-} IntArray;
 /* ----------------
  * Variable-length datatypes all share the 'struct varlena' header.
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index fc2202b843..e6c7b070f6 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -10936,6 +10936,14 @@
   proargnames => '{max_data_alignment,database_block_size,blocks_per_segment,wal_block_size,bytes_per_wal_segment,max_identifier_length,max_index_columns,max_toast_chunk_size,large_object_chunk_size,float8_pass_by_value,data_page_checksum_version}',
   prosrc => 'pg_control_init' },
+# subscripting support for built-in types
+{ oid => '9255', descr => 'standard array subscripting support',
+  proname => 'array_subscript_handler', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'array_subscript_handler' },
+{ oid => '9256', descr => 'raw array subscripting support',
+  proname => 'raw_array_subscript_handler', prorettype => 'internal',
+  proargtypes => 'internal', prosrc => 'raw_array_subscript_handler' },
+
 # collation management functions
 { oid => '3445', descr => 'import collations from operating system',
   proname => 'pg_import_system_collations', procost => '100',
diff --git a/src/include/catalog/pg_type.dat b/src/include/catalog/pg_type.dat
index 21a467a7a7..28240bdce3 100644
--- a/src/include/catalog/pg_type.dat
+++ b/src/include/catalog/pg_type.dat
@@ -48,9 +48,10 @@
 { oid => '19', array_type_oid => '1003',
   descr => '63-byte type for storing system identifiers',
   typname => 'name', typlen => 'NAMEDATALEN', typbyval => 'f',
-  typcategory => 'S', typelem => 'char', typinput => 'namein',
-  typoutput => 'nameout', typreceive => 'namerecv', typsend => 'namesend',
-  typalign => 'c', typcollation => 'C' },
+  typcategory => 'S', typsubscript => 'raw_array_subscript_handler',
+  typelem => 'char', typinput => 'namein', typoutput => 'nameout',
+  typreceive => 'namerecv', typsend => 'namesend', typalign => 'c',
+  typcollation => 'C' },
 { oid => '20', array_type_oid => '1016',
   descr => '~18 digit integer, 8-byte storage',
   typname => 'int8', typlen => '8', typbyval => 'FLOAT8PASSBYVAL',
@@ -64,7 +65,8 @@
 { oid => '22', array_type_oid => '1006',
   descr => 'array of int2, used in system tables',
   typname => 'int2vector', typlen => '-1', typbyval => 'f', typcategory => 'A',
-  typelem => 'int2', typinput => 'int2vectorin', typoutput => 'int2vectorout',
+  typsubscript => 'array_subscript_handler', typelem => 'int2',
+  typinput => 'int2vectorin', typoutput => 'int2vectorout',
   typreceive => 'int2vectorrecv', typsend => 'int2vectorsend',
   typalign => 'i' },
 { oid => '23', array_type_oid => '1007',
@@ -104,7 +106,8 @@
 { oid => '30', array_type_oid => '1013',
   descr => 'array of oids, used in system tables',
   typname => 'oidvector', typlen => '-1', typbyval => 'f', typcategory => 'A',
-  typelem => 'oid', typinput => 'oidvectorin', typoutput => 'oidvectorout',
+  typsubscript => 'array_subscript_handler', typelem => 'oid',
+  typinput => 'oidvectorin', typoutput => 'oidvectorout',
   typreceive => 'oidvectorrecv', typsend => 'oidvectorsend', typalign => 'i' },
 # hand-built rowtype entries for bootstrapped catalogs
@@ -178,13 +181,15 @@
 { oid => '600', array_type_oid => '1017',
   descr => 'geometric point \'(x, y)\'',
   typname => 'point', typlen => '16', typbyval => 'f', typcategory => 'G',
-  typelem => 'float8', typinput => 'point_in', typoutput => 'point_out',
-  typreceive => 'point_recv', typsend => 'point_send', typalign => 'd' },
+  typsubscript => 'raw_array_subscript_handler', typelem => 'float8',
+  typinput => 'point_in', typoutput => 'point_out', typreceive => 'point_recv',
+  typsend => 'point_send', typalign => 'd' },
 { oid => '601', array_type_oid => '1018',
   descr => 'geometric line segment \'(pt1,pt2)\'',
   typname => 'lseg', typlen => '32', typbyval => 'f', typcategory => 'G',
-  typelem => 'point', typinput => 'lseg_in', typoutput => 'lseg_out',
-  typreceive => 'lseg_recv', typsend => 'lseg_send', typalign =>
'd' }, + typsubscript => 'raw_array_subscript_handler', typelem => 'point', + typinput => 'lseg_in', typoutput => 'lseg_out', typreceive => 'lseg_recv', + typsend => 'lseg_send', typalign => 'd' }, { oid => '602', array_type_oid => '1019', descr => 'geometric path \'(pt1,...)\'', typname => 'path', typlen => '-1', typbyval => 'f', typcategory => 'G', @@ -193,9 +198,9 @@ { oid => '603', array_type_oid => '1020', descr => 'geometric box \'(lower left,upper right)\'', typname => 'box', typlen => '32', typbyval => 'f', typcategory => 'G', - typdelim => ';', typelem => 'point', typinput => 'box_in', - typoutput => 'box_out', typreceive => 'box_recv', typsend => 'box_send', - typalign => 'd' }, + typdelim => ';', typsubscript => 'raw_array_subscript_handler', + typelem => 'point', typinput => 'box_in', typoutput => 'box_out', + typreceive => 'box_recv', typsend => 'box_send', typalign => 'd' }, { oid => '604', array_type_oid => '1027', descr => 'geometric polygon \'(pt1,...)\'', typname => 'polygon', typlen => '-1', typbyval => 'f', typcategory => 'G', @@ -203,8 +208,9 @@ typsend => 'poly_send', typalign => 'd', typstorage => 'x' }, { oid => '628', array_type_oid => '629', descr => 'geometric line', typname => 'line', typlen => '24', typbyval => 'f', typcategory => 'G', - typelem => 'float8', typinput => 'line_in', typoutput => 'line_out', - typreceive => 'line_recv', typsend => 'line_send', typalign => 'd' }, + typsubscript => 'raw_array_subscript_handler', typelem => 'float8', + typinput => 'line_in', typoutput => 'line_out', typreceive => 'line_recv', + typsend => 'line_send', typalign => 'd' }, # OIDS 700 - 799 @@ -507,8 +513,9 @@ # Arrays of records have typcategory P, so they can't be autogenerated. 
{ oid => '2287', typname => '_record', typlen => '-1', typbyval => 'f', typtype => 'p', - typcategory => 'P', typelem => 'record', typinput => 'array_in', - typoutput => 'array_out', typreceive => 'array_recv', typsend => 'array_send', + typcategory => 'P', typsubscript => 'array_subscript_handler', + typelem => 'record', typinput => 'array_in', typoutput => 'array_out', + typreceive => 'array_recv', typsend => 'array_send', typanalyze => 'array_typanalyze', typalign => 'd', typstorage => 'x' }, { oid => '2275', array_type_oid => '1263', descr => 'C-style string', typname => 'cstring', typlen => '-2', typbyval => 'f', typtype => 'p', diff --git a/src/include/catalog/pg_type.h b/src/include/catalog/pg_type.h index 6099e5f57c..15f2514a14 100644 --- a/src/include/catalog/pg_type.h +++ b/src/include/catalog/pg_type.h @@ -101,15 +101,18 @@ CATALOG(pg_type,1247,TypeRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(71,TypeRelati Oid typrelid BKI_DEFAULT(0) BKI_ARRAY_DEFAULT(0) BKI_LOOKUP(pg_class); /* - * If typelem is not 0 then it identifies another row in pg_type. The - * current type can then be subscripted like an array yielding values of - * type typelem. A non-zero typelem does not guarantee this type to be a - * "real" array type; some ordinary fixed-length types can also be - * subscripted (e.g., name, point). Variable-length types can *not* be - * turned into pseudo-arrays like that. Hence, the way to determine - * whether a type is a "true" array type is if: - * - * typelem != 0 and typlen == -1. + * Type-specific subscripting handler. If typsubscript is 0, it means + * that this type doesn't support subscripting. Note that various parts + * of the system deem types to be "true" array types only if their + * typsubscript is array_subscript_handler. 
+ */ + regproc typsubscript BKI_DEFAULT(-) BKI_ARRAY_DEFAULT(array_subscript_handler) BKI_LOOKUP(pg_proc); + + /* + * If typelem is not 0 then it identifies another row in pg_type, defining + * the type yielded by subscripting. This should be 0 if typsubscript is + * 0. However, it can be 0 when typsubscript isn't 0, if the handler + * doesn't need typelem to determine the subscripting result type. */ Oid typelem BKI_DEFAULT(0) BKI_LOOKUP(pg_type); @@ -319,6 +322,11 @@ DECLARE_UNIQUE_INDEX(pg_type_typname_nsp_index, 2704, on pg_type using btree(typ (typid) == ANYCOMPATIBLENONARRAYOID || \ (typid) == ANYCOMPATIBLERANGEOID) +/* Is this a "true" array type? (Requires fmgroids.h) */ +#define IsTrueArrayType(typeForm) \ + (OidIsValid((typeForm)->typelem) && \ + (typeForm)->typsubscript == F_ARRAY_SUBSCRIPT_HANDLER) + /* * Backwards compatibility for ancient random spellings of pg_type OID macros. * Don't use these names in new code. @@ -351,6 +359,7 @@ extern ObjectAddress TypeCreate(Oid newTypeOid, Oid typmodinProcedure, Oid typmodoutProcedure, Oid analyzeProcedure, + Oid subscriptProcedure, Oid elementType, bool isImplicitArray, Oid arrayType, diff --git a/src/include/executor/execExpr.h b/src/include/executor/execExpr.h index abb489e206..b4e0a9b7d3 100644 --- a/src/include/executor/execExpr.h +++ b/src/include/executor/execExpr.h @@ -32,6 +32,11 @@ typedef void (*ExecEvalSubroutine) (ExprState *state, struct ExprEvalStep *op, ExprContext *econtext); +/* API for out-of-line evaluation subroutines returning bool */ +typedef bool (*ExecEvalBoolSubroutine) (ExprState *state, + struct ExprEvalStep *op, + ExprContext *econtext); + /* * Discriminator for ExprEvalSteps. 
* @@ -185,8 +190,8 @@ typedef enum ExprEvalOp */ EEOP_FIELDSTORE_FORM, - /* Process a container subscript; short-circuit expression to NULL if NULL */ - EEOP_SBSREF_SUBSCRIPT, + /* Process container subscripts; possibly short-circuit result to NULL */ + EEOP_SBSREF_SUBSCRIPTS, /* * Compute old container element/slice when a SubscriptingRef assignment @@ -494,19 +499,19 @@ typedef struct ExprEvalStep int ncolumns; } fieldstore; - /* for EEOP_SBSREF_SUBSCRIPT */ + /* for EEOP_SBSREF_SUBSCRIPTS */ struct { + ExecEvalBoolSubroutine subscriptfunc; /* evaluation subroutine */ /* too big to have inline */ struct SubscriptingRefState *state; - int off; /* 0-based index of this subscript */ - bool isupper; /* is it upper or lower subscript? */ int jumpdone; /* jump here on null */ } sbsref_subscript; /* for EEOP_SBSREF_OLD / ASSIGN / FETCH */ struct { + ExecEvalSubroutine subscriptfunc; /* evaluation subroutine */ /* too big to have inline */ struct SubscriptingRefState *state; } sbsref; @@ -640,36 +645,41 @@ typedef struct SubscriptingRefState { bool isassignment; /* is it assignment, or just fetch? */ - Oid refelemtype; /* OID of the container element type */ - int16 refattrlength; /* typlen of container type */ - int16 refelemlength; /* typlen of the container element type */ - bool refelembyval; /* is the element type pass-by-value? 
*/ - char refelemalign; /* typalign of the element type */ + /* workspace for type-specific subscripting code */ + void *workspace; - /* numupper and upperprovided[] are filled at compile time */ - /* at runtime, extracted subscript datums get stored in upperindex[] */ + /* numupper and upperprovided[] are filled at expression compile time */ + /* at runtime, subscripts are computed in upperindex[]/upperindexnull[] */ int numupper; - bool upperprovided[MAXDIM]; - int upperindex[MAXDIM]; + bool *upperprovided; /* indicates if this position is supplied */ + Datum *upperindex; + bool *upperindexnull; /* similarly for lower indexes, if any */ int numlower; - bool lowerprovided[MAXDIM]; - int lowerindex[MAXDIM]; - - /* subscript expressions get evaluated into here */ - Datum subscriptvalue; - bool subscriptnull; + bool *lowerprovided; + Datum *lowerindex; + bool *lowerindexnull; /* for assignment, new value to assign is evaluated into here */ Datum replacevalue; bool replacenull; - /* if we have a nested assignment, SBSREF_OLD puts old value here */ + /* if we have a nested assignment, sbs_fetch_old puts old value here */ Datum prevvalue; bool prevnull; } SubscriptingRefState; +/* Execution step methods used for SubscriptingRef */ +typedef struct SubscriptExecSteps +{ + /* See nodes/subscripting.h for more detail about these */ + ExecEvalBoolSubroutine sbs_check_subscripts; /* process subscripts */ + ExecEvalSubroutine sbs_fetch; /* fetch an element */ + ExecEvalSubroutine sbs_assign; /* assign to an element */ + ExecEvalSubroutine sbs_fetch_old; /* fetch old value for assignment */ +} SubscriptExecSteps; + /* functions in execExpr.c */ extern void ExprEvalPushStep(ExprState *es, const ExprEvalStep *s); @@ -712,10 +722,6 @@ extern void ExecEvalFieldStoreDeForm(ExprState *state, ExprEvalStep *op, ExprContext *econtext); extern void ExecEvalFieldStoreForm(ExprState *state, ExprEvalStep *op, ExprContext *econtext); -extern bool ExecEvalSubscriptingRef(ExprState *state, 
ExprEvalStep *op); -extern void ExecEvalSubscriptingRefFetch(ExprState *state, ExprEvalStep *op); -extern void ExecEvalSubscriptingRefOld(ExprState *state, ExprEvalStep *op); -extern void ExecEvalSubscriptingRefAssign(ExprState *state, ExprEvalStep *op); extern void ExecEvalConvertRowtype(ExprState *state, ExprEvalStep *op, ExprContext *econtext); extern void ExecEvalScalarArrayOp(ExprState *state, ExprEvalStep *op); diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h index cdbe781c73..dd85908fe2 100644 --- a/src/include/nodes/primnodes.h +++ b/src/include/nodes/primnodes.h @@ -390,14 +390,14 @@ typedef struct WindowFunc int location; /* token location, or -1 if unknown */ } WindowFunc; -/* ---------------- - * SubscriptingRef: describes a subscripting operation over a container - * (array, etc). +/* + * SubscriptingRef: describes a subscripting operation over a container + * (array, etc). * * A SubscriptingRef can describe fetching a single element from a container, - * fetching a part of container (e.g. array slice), storing a single element into - * a container, or storing a slice. The "store" cases work with an - * initial container value and a source value that is inserted into the + * fetching a part of a container (e.g. an array slice), storing a single + * element into a container, or storing a slice. The "store" cases work with + * an initial container value and a source value that is inserted into the * appropriate part of the container; the result of the operation is an * entire new modified container value. * @@ -410,23 +410,32 @@ typedef struct WindowFunc * * In the slice case, individual expressions in the subscript lists can be * NULL, meaning "substitute the array's current lower or upper bound". - * - * Note: the result datatype is the element type when fetching a single - * element; but it is the array type when doing subarray fetch or either - * type of store. + * (Non-array containers may or may not support this.) 
+ * + * refcontainertype is the actual container type that determines the + * subscripting semantics. (This will generally be either the exposed type of + * refexpr, or the base type if that is a domain.) refelemtype is the type of + * the container's elements; this is saved for the use of the subscripting + * functions, but is not used by the core code. refrestype, reftypmod, and + * refcollid describe the type of the SubscriptingRef's result. In a store + * expression, refrestype will always match refcontainertype; in a fetch, + * it could be refelemtype for an element fetch, or refcontainertype for a + * slice fetch, or possibly something else as determined by type-specific + * subscripting logic. Likewise, reftypmod and refcollid will match the + * container's properties in a store, but could be different in a fetch. * * Note: for the cases where a container is returned, if refexpr yields a R/W - * expanded container, then the implementation is allowed to modify that object - * in-place and return the same object.) - * ---------------- + * expanded container, then the implementation is allowed to modify that + * object in-place and return the same object. 
*/ typedef struct SubscriptingRef { Expr xpr; Oid refcontainertype; /* type of the container proper */ - Oid refelemtype; /* type of the container elements */ - int32 reftypmod; /* typmod of the container (and elements too) */ - Oid refcollid; /* OID of collation, or InvalidOid if none */ + Oid refelemtype; /* the container type's pg_type.typelem */ + Oid refrestype; /* type of the SubscriptingRef's result */ + int32 reftypmod; /* typmod of the result */ + Oid refcollid; /* collation of result, or InvalidOid if none */ List *refupperindexpr; /* expressions that evaluate to upper * container indexes */ List *reflowerindexpr; /* expressions that evaluate to lower @@ -434,7 +443,6 @@ typedef struct SubscriptingRef * container element */ Expr *refexpr; /* the expression that evaluates to a * container value */ - Expr *refassgnexpr; /* expression for the source value, or NULL if * fetch */ } SubscriptingRef; diff --git a/src/include/nodes/subscripting.h b/src/include/nodes/subscripting.h new file mode 100644 index 0000000000..3b0a60773d --- /dev/null +++ b/src/include/nodes/subscripting.h @@ -0,0 +1,167 @@ +/*------------------------------------------------------------------------- + * + * subscripting.h + * API for generic type subscripting + * + * Portions Copyright (c) 1996-2020, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * + * src/include/nodes/subscripting.h + * + *------------------------------------------------------------------------- + */ +#ifndef SUBSCRIPTING_H +#define SUBSCRIPTING_H + +#include "nodes/primnodes.h" + +/* Forward declarations, to avoid including other headers */ +struct ParseState; +struct SubscriptingRefState; +struct SubscriptExecSteps; + +/* + * The SQL-visible function that defines a subscripting method is declared + * subscripting_function(internal) returns internal + * but it actually is not passed any parameter. 
It must return a pointer + * to a "struct SubscriptRoutines" that provides pointers to the individual + * subscript parsing and execution methods. Typically the pointer will point + * to a "static const" variable, but at need it can point to palloc'd space. + * The type (after domain-flattening) of the head variable or expression + * of a subscripting construct determines which subscripting function is + * called for that construct. + * + * In addition to the method pointers, struct SubscriptRoutines includes + * several bool flags that specify properties of the subscripting actions + * this data type can perform: + * + * fetch_strict indicates that a fetch SubscriptRef is strict, i.e., returns + * NULL if any input (either the container or any subscript) is NULL. + * + * fetch_leakproof indicates that a fetch SubscriptRef is leakproof, i.e., + * will not throw any data-value-dependent errors. Typically this requires + * silently returning NULL for invalid subscripts. + * + * store_leakproof similarly indicates whether an assignment SubscriptRef is + * leakproof. (It is common to prefer throwing errors for invalid subscripts + * in assignments; that's fine, but it makes the operation not leakproof. + * In current usage there is no advantage in making assignments leakproof.) + * + * There is no store_strict flag. Such behavior would generally be + * undesirable, since for example a null subscript in an assignment would + * cause the entire container to become NULL. + * + * Regardless of these flags, all SubscriptRefs are expected to be immutable, + * that is they must always give the same results for the same inputs. + * They are expected to always be parallel-safe, as well. + */ + +/* + * The transform method is called during parse analysis of a subscripting + * construct. The SubscriptingRef node has been constructed, but some of + * its fields still need to be filled in, and the subscript expression(s) + * are still in raw form. 
The transform method is responsible for doing + * parse analysis of each subscript expression (using transformExpr), + * coercing the subscripts to whatever type it needs, and building the + * refupperindexpr and reflowerindexpr lists from those results. The + * reflowerindexpr list must be empty for an element operation, or the + * same length as refupperindexpr for a slice operation. Insert NULLs + * (that is, an empty parse tree, not a null Const node) for any omitted + * subscripts in a slice operation. (Of course, if the transform method + * does not care to support slicing, it can just throw an error if isSlice.) + * See array_subscript_transform() for sample code. + * + * The transform method is also responsible for identifying the result type + * of the subscripting operation. At call, refcontainertype and reftypmod + * describe the container type (this will be a base type not a domain), and + * refelemtype is set to the container type's pg_type.typelem value. The + * transform method must set refrestype and reftypmod to describe the result + * of subscripting. For arrays, refrestype is set to refelemtype for an + * element operation or refcontainertype for a slice, while reftypmod stays + * the same in either case; but other types might use other rules. The + * transform method should ignore refcollid, as that's determined later on + * during parsing. + * + * At call, refassgnexpr has not been filled in, so the SubscriptingRef node + * always looks like a fetch; refrestype should be set as though for a + * fetch, too. (The isAssignment parameter is typically only useful if the + * transform method wishes to throw an error for not supporting assignment.) + * To complete processing of an assignment, the core parser will coerce the + * element/slice source expression to the returned refrestype and reftypmod + * before putting it into refassgnexpr. 
It will then set refrestype and + * reftypmod to again describe the container type, since that's what an + * assignment must return. + */ +typedef void (*SubscriptTransform) (SubscriptingRef *sbsref, + List *indirection, + struct ParseState *pstate, + bool isSlice, + bool isAssignment); + +/* + * The exec_setup method is called during executor-startup compilation of a + * SubscriptingRef node in an expression. It must fill *methods with pointers + * to functions that can be called for execution of the node. Optionally, + * exec_setup can initialize sbsrefstate->workspace to point to some palloc'd + * workspace for execution. (Typically, such workspace is used to hold + * looked-up catalog data and/or provide space for the check_subscripts step + * to pass data forward to the other step functions.) See executor/execExpr.h + * for the definitions of these structs and other ones used in expression + * execution. + * + * The methods to be provided are: + * + * sbs_check_subscripts: examine the just-computed subscript values available + * in sbsrefstate's arrays, and possibly convert them into another form + * (stored in sbsrefstate->workspace). Return TRUE to continue with + * evaluation of the subscripting construct, or FALSE to skip it and return an + * overall NULL result. If this is a fetch and the data type's fetch_strict + * flag is true, then sbs_check_subscripts must return FALSE if there are any + * NULL subscripts. Otherwise it can choose to throw an error, or return + * FALSE, or let sbs_fetch or sbs_assign deal with the null subscripts. + * + * sbs_fetch: perform a subscripting fetch, using the container value in + * *op->resvalue and the subscripts from sbs_check_subscripts. If + * fetch_strict is true then all these inputs can be assumed non-NULL, + * otherwise sbs_fetch must check for null inputs. Place the result in + * *op->resvalue / *op->resnull. 
+ * + * sbs_assign: perform a subscripting assignment, using the original + * container value in *op->resvalue / *op->resnull, the subscripts from + * sbs_check_subscripts, and the new element/slice value in + * sbsrefstate->replacevalue/replacenull. Any of these inputs might be NULL + * (unless sbs_check_subscripts rejected null subscripts). Place the result + * (an entire new container value) in *op->resvalue / *op->resnull. + * + * sbs_fetch_old: this is only used in cases where an element or slice + * assignment involves an assignment to a sub-field or sub-element + * (i.e., nested containers are involved). It must fetch the existing + * value of the target element or slice. This is exactly the same as + * sbs_fetch except that (a) it must cope with a NULL container, and + * with NULL subscripts if sbs_check_subscripts allows them (typically, + * returning NULL is good enough); and (b) the result must be placed in + * sbsrefstate->prevvalue/prevnull, without overwriting *op->resvalue. + * + * Subscripting implementations that do not support assignment need not + * provide sbs_assign or sbs_fetch_old methods. It might be reasonable + * to also omit sbs_check_subscripts, in which case the sbs_fetch method must + * combine the functionality of sbs_check_subscripts and sbs_fetch. (The + * main reason to have a separate sbs_check_subscripts method is so that + * sbs_fetch_old and sbs_assign need not duplicate subscript processing.) + * Set the relevant pointers to NULL for any omitted methods. + */ +typedef void (*SubscriptExecSetup) (const SubscriptingRef *sbsref, + struct SubscriptingRefState *sbsrefstate, + struct SubscriptExecSteps *methods); + +/* Struct returned by the SQL-visible subscript handler function */ +typedef struct SubscriptRoutines +{ + SubscriptTransform transform; /* parse analysis function */ + SubscriptExecSetup exec_setup; /* expression compilation function */ + bool fetch_strict; /* is fetch SubscriptRef strict? 
*/ + bool fetch_leakproof; /* is fetch SubscriptRef leakproof? */ + bool store_leakproof; /* is assignment SubscriptRef leakproof? */ +} SubscriptRoutines; + +#endif /* SUBSCRIPTING_H */ diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h index d25819aa28..beb56fec87 100644 --- a/src/include/parser/parse_node.h +++ b/src/include/parser/parse_node.h @@ -313,15 +313,15 @@ extern void setup_parser_errposition_callback(ParseCallbackState *pcbstate, ParseState *pstate, int location); extern void cancel_parser_errposition_callback(ParseCallbackState *pcbstate); -extern Oid transformContainerType(Oid *containerType, int32 *containerTypmod); +extern void transformContainerType(Oid *containerType, int32 *containerTypmod); extern SubscriptingRef *transformContainerSubscripts(ParseState *pstate, Node *containerBase, Oid containerType, - Oid elementType, int32 containerTypMod, List *indirection, - Node *assignFrom); + bool isAssignment); + extern Const *make_const(ParseState *pstate, Value *value, int location); #endif /* PARSE_NODE_H */ diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h index fecfe1f4f6..475b842b09 100644 --- a/src/include/utils/lsyscache.h +++ b/src/include/utils/lsyscache.h @@ -17,6 +17,9 @@ #include "access/htup.h" #include "nodes/pg_list.h" +/* avoid including subscripting.h here */ +struct SubscriptRoutines; + /* Result list element for get_op_btree_interpretation */ typedef struct OpBtreeInterpretation { @@ -172,6 +175,9 @@ extern void getTypeBinaryOutputInfo(Oid type, Oid *typSend, bool *typIsVarlena); extern Oid get_typmodin(Oid typid); extern Oid get_typcollation(Oid typid); extern bool type_is_collatable(Oid typid); +extern RegProcedure get_typsubscript(Oid typid, Oid *typelemp); +extern const struct SubscriptRoutines *getSubscriptingRoutines(Oid typid, + Oid *typelemp); extern Oid getBaseType(Oid typid); extern Oid getBaseTypeAndTypmod(Oid typid, int32 *typmod); extern int32 get_typavgwidth(Oid 
typid, int32 typmod); diff --git a/src/include/utils/typcache.h b/src/include/utils/typcache.h index cdd20e56d7..38c8fe0192 100644 --- a/src/include/utils/typcache.h +++ b/src/include/utils/typcache.h @@ -42,6 +42,7 @@ typedef struct TypeCacheEntry char typstorage; char typtype; Oid typrelid; + Oid typsubscript; Oid typelem; Oid typcollation; diff --git a/src/pl/plperl/plperl.c b/src/pl/plperl/plperl.c index 7844c500ee..4de756455d 100644 --- a/src/pl/plperl/plperl.c +++ b/src/pl/plperl/plperl.c @@ -2853,9 +2853,7 @@ compile_plperl_function(Oid fn_oid, bool is_trigger, bool is_event_trigger) prodesc->result_oid = rettype; prodesc->fn_retisset = procStruct->proretset; prodesc->fn_retistuple = type_is_rowtype(rettype); - - prodesc->fn_retisarray = - (typeStruct->typlen == -1 && typeStruct->typelem); + prodesc->fn_retisarray = IsTrueArrayType(typeStruct); fmgr_info_cxt(typeStruct->typinput, &(prodesc->result_in_func), @@ -2901,7 +2899,7 @@ compile_plperl_function(Oid fn_oid, bool is_trigger, bool is_event_trigger) } /* Identify array-type arguments */ - if (typeStruct->typelem != 0 && typeStruct->typlen == -1) + if (IsTrueArrayType(typeStruct)) prodesc->arg_arraytype[i] = argtype; else prodesc->arg_arraytype[i] = InvalidOid; diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c index 6df8e14629..b610b28d70 100644 --- a/src/pl/plpgsql/src/pl_comp.c +++ b/src/pl/plpgsql/src/pl_comp.c @@ -26,6 +26,7 @@ #include "parser/parse_type.h" #include "plpgsql.h" #include "utils/builtins.h" +#include "utils/fmgroids.h" #include "utils/guc.h" #include "utils/lsyscache.h" #include "utils/memutils.h" @@ -2144,8 +2145,7 @@ build_datatype(HeapTuple typeTup, int32 typmod, * This test should include what get_element_type() checks. We also * disallow non-toastable array types (i.e. oidvector and int2vector). 
*/ - typ->typisarray = (typeStruct->typlen == -1 && - OidIsValid(typeStruct->typelem) && + typ->typisarray = (IsTrueArrayType(typeStruct) && typeStruct->typstorage != TYPSTORAGE_PLAIN); } else if (typeStruct->typtype == TYPTYPE_DOMAIN) diff --git a/src/pl/plpython/plpy_typeio.c b/src/pl/plpython/plpy_typeio.c index b4aeb7fd59..5e807b139f 100644 --- a/src/pl/plpython/plpy_typeio.c +++ b/src/pl/plpython/plpy_typeio.c @@ -352,9 +352,9 @@ PLy_output_setup_func(PLyObToDatum *arg, MemoryContext arg_mcxt, proc); } else if (typentry && - OidIsValid(typentry->typelem) && typentry->typlen == -1) + IsTrueArrayType(typentry)) { - /* Standard varlena array (cf. get_element_type) */ + /* Standard array */ arg->func = PLySequence_ToArray; /* Get base type OID to insert into constructed array */ /* (note this might not be the same as the immediate child type) */ @@ -470,9 +470,9 @@ PLy_input_setup_func(PLyDatumToOb *arg, MemoryContext arg_mcxt, proc); } else if (typentry && - OidIsValid(typentry->typelem) && typentry->typlen == -1) + IsTrueArrayType(typentry)) { - /* Standard varlena array (cf. 
get_element_type) */ + /* Standard array */ arg->func = PLyList_FromArray; /* Recursively set up conversion info for the element type */ arg->u.array.elm = (PLyDatumToOb *) diff --git a/src/test/regress/expected/arrays.out b/src/test/regress/expected/arrays.out index c03ac65ff8..448b3ee526 100644 --- a/src/test/regress/expected/arrays.out +++ b/src/test/regress/expected/arrays.out @@ -27,12 +27,12 @@ INSERT INTO arrtest (a, b[1:2][1:2], c, d, e, f, g) INSERT INTO arrtest (a, b[1:2], c, d[1:2]) VALUES ('{}', '{3,4}', '{foo,bar}', '{bar,foo}'); INSERT INTO arrtest (b[2]) VALUES(now()); -- error, type mismatch -ERROR: array assignment to "b" requires type integer but expression is of type timestamp with time zone +ERROR: subscripted assignment to "b" requires type integer but expression is of type timestamp with time zone LINE 1: INSERT INTO arrtest (b[2]) VALUES(now()); ^ HINT: You will need to rewrite or cast the expression. INSERT INTO arrtest (b[1:2]) VALUES(now()); -- error, type mismatch -ERROR: array assignment to "b" requires type integer[] but expression is of type timestamp with time zone +ERROR: subscripted assignment to "b" requires type integer[] but expression is of type timestamp with time zone LINE 1: INSERT INTO arrtest (b[1:2]) VALUES(now()); ^ HINT: You will need to rewrite or cast the expression. 
@@ -237,7 +237,7 @@ UPDATE arrtest
 ERROR:  array subscript in assignment must not be null
 -- Un-subscriptable type
 SELECT (now())[1];
-ERROR:  cannot subscript type timestamp with time zone because it is not an array
+ERROR:  cannot subscript type timestamp with time zone because it does not support subscripting
 -- test slices with empty lower and/or upper index
 CREATE TEMP TABLE arrtest_s (
   a       int2[],
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index 3b39137400..507b474b1b 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -31,7 +31,8 @@ begin
     if $2 = 'pg_catalog.any'::pg_catalog.regtype then return true; end if;
     if $2 = 'pg_catalog.anyarray'::pg_catalog.regtype then
         if EXISTS(select 1 from pg_catalog.pg_type where
-                  oid = $1 and typelem != 0 and typlen = -1)
+                  oid = $1 and typelem != 0 and
+                  typsubscript = 'pg_catalog.array_subscript_handler'::pg_catalog.regproc)
         then return true; end if;
     end if;
     if $2 = 'pg_catalog.anyrange'::pg_catalog.regtype then
@@ -55,7 +56,8 @@ begin
     if $2 = 'pg_catalog.any'::pg_catalog.regtype then return true; end if;
     if $2 = 'pg_catalog.anyarray'::pg_catalog.regtype then
         if EXISTS(select 1 from pg_catalog.pg_type where
-                  oid = $1 and typelem != 0 and typlen = -1)
+                  oid = $1 and typelem != 0 and
+                  typsubscript = 'pg_catalog.array_subscript_handler'::pg_catalog.regproc)
         then return true; end if;
     end if;
     if $2 = 'pg_catalog.anyrange'::pg_catalog.regtype then
diff --git a/src/test/regress/expected/type_sanity.out b/src/test/regress/expected/type_sanity.out
index ec1cd47623..13567ddf84 100644
--- a/src/test/regress/expected/type_sanity.out
+++ b/src/test/regress/expected/type_sanity.out
@@ -75,14 +75,15 @@ ORDER BY p1.oid;
  5017 | pg_mcv_list
 (4 rows)
 
--- Make sure typarray points to a varlena array type of our own base
+-- Make sure typarray points to a "true" array type of our own base
 SELECT p1.oid, p1.typname as basetype, p2.typname as arraytype,
-       p2.typelem, p2.typlen
+       p2.typsubscript
 FROM   pg_type p1 LEFT JOIN pg_type p2 ON (p1.typarray = p2.oid)
 WHERE  p1.typarray <> 0 AND
-       (p2.oid IS NULL OR p2.typelem <> p1.oid OR p2.typlen <> -1);
- oid | basetype | arraytype | typelem | typlen
------+----------+-----------+---------+--------
+       (p2.oid IS NULL OR
+        p2.typsubscript <> 'array_subscript_handler'::regproc);
+ oid | basetype | arraytype | typsubscript
+-----+----------+-----------+--------------
 (0 rows)
 
 -- Look for range types that do not have a pg_range entry
@@ -448,6 +449,33 @@ WHERE p1.typarray = p2.oid AND
 -----+---------+----------+---------+----------
 (0 rows)
 
+-- Check for typelem set without a handler
+SELECT p1.oid, p1.typname, p1.typelem
+FROM pg_type AS p1
+WHERE p1.typelem != 0 AND p1.typsubscript = 0;
+ oid | typname | typelem
+-----+---------+---------
+(0 rows)
+
+-- Check for misuse of standard subscript handlers
+SELECT p1.oid, p1.typname,
+       p1.typelem, p1.typlen, p1.typbyval
+FROM pg_type AS p1
+WHERE p1.typsubscript = 'array_subscript_handler'::regproc AND NOT
+    (p1.typelem != 0 AND p1.typlen = -1 AND NOT p1.typbyval);
+ oid | typname | typelem | typlen | typbyval
+-----+---------+---------+--------+----------
+(0 rows)
+
+SELECT p1.oid, p1.typname,
+       p1.typelem, p1.typlen, p1.typbyval
+FROM pg_type AS p1
+WHERE p1.typsubscript = 'raw_array_subscript_handler'::regproc AND NOT
+    (p1.typelem != 0 AND p1.typlen > 0 AND NOT p1.typbyval);
+ oid | typname | typelem | typlen | typbyval
+-----+---------+---------+--------+----------
+(0 rows)
+
 -- Check for bogus typanalyze routines
 SELECT p1.oid, p1.typname, p2.oid, p2.proname
 FROM pg_type AS p1, pg_proc AS p2
@@ -485,7 +513,7 @@ SELECT t.oid, t.typname, t.typanalyze
 FROM pg_type t
 WHERE t.typbasetype = 0 AND
     (t.typanalyze = 'array_typanalyze'::regproc) !=
-    (typelem != 0 AND typlen < 0)
+    (t.typsubscript = 'array_subscript_handler'::regproc)
 ORDER BY 1;
  oid | typname | typanalyze
 -----+------------+------------
@@ -608,7 +636,8 @@ WHERE o.opcmethod != 403 OR
     ((o.opcintype != p1.rngsubtype) AND NOT
      (o.opcintype = 'pg_catalog.anyarray'::regtype AND
       EXISTS(select 1 from pg_catalog.pg_type where
-             oid = p1.rngsubtype and typelem != 0 and typlen = -1)));
+             oid = p1.rngsubtype and typelem != 0 and
+             typsubscript = 'array_subscript_handler'::regproc)));
  rngtypid | rngsubtype | opcmethod | opcname
 ----------+------------+-----------+---------
 (0 rows)
diff --git a/src/test/regress/sql/opr_sanity.sql b/src/test/regress/sql/opr_sanity.sql
index 307aab1deb..4189a5a4e0 100644
--- a/src/test/regress/sql/opr_sanity.sql
+++ b/src/test/regress/sql/opr_sanity.sql
@@ -34,7 +34,8 @@ begin
     if $2 = 'pg_catalog.any'::pg_catalog.regtype then return true; end if;
     if $2 = 'pg_catalog.anyarray'::pg_catalog.regtype then
         if EXISTS(select 1 from pg_catalog.pg_type where
-                  oid = $1 and typelem != 0 and typlen = -1)
+                  oid = $1 and typelem != 0 and
+                  typsubscript = 'pg_catalog.array_subscript_handler'::pg_catalog.regproc)
         then return true; end if;
     end if;
     if $2 = 'pg_catalog.anyrange'::pg_catalog.regtype then
@@ -59,7 +60,8 @@ begin
     if $2 = 'pg_catalog.any'::pg_catalog.regtype then return true; end if;
     if $2 = 'pg_catalog.anyarray'::pg_catalog.regtype then
         if EXISTS(select 1 from pg_catalog.pg_type where
-                  oid = $1 and typelem != 0 and typlen = -1)
+                  oid = $1 and typelem != 0 and
+                  typsubscript = 'pg_catalog.array_subscript_handler'::pg_catalog.regproc)
         then return true; end if;
     end if;
     if $2 = 'pg_catalog.anyrange'::pg_catalog.regtype then
diff --git a/src/test/regress/sql/type_sanity.sql b/src/test/regress/sql/type_sanity.sql
index 5e433388cd..8c6e614f20 100644
--- a/src/test/regress/sql/type_sanity.sql
+++ b/src/test/regress/sql/type_sanity.sql
@@ -63,12 +63,13 @@ WHERE p1.typtype not in ('p') AND p1.typname NOT LIKE E'\\_%'
      p2.typelem = p1.oid and p1.typarray = p2.oid)
 ORDER BY p1.oid;
 
--- Make sure typarray points to a varlena array type of our own base
+-- Make sure typarray points to a "true" array type of our own base
 SELECT p1.oid, p1.typname as basetype, p2.typname as arraytype,
-       p2.typelem, p2.typlen
+       p2.typsubscript
 FROM   pg_type p1 LEFT JOIN pg_type p2 ON (p1.typarray = p2.oid)
 WHERE  p1.typarray <> 0 AND
-       (p2.oid IS NULL OR p2.typelem <> p1.oid OR p2.typlen <> -1);
+       (p2.oid IS NULL OR
+        p2.typsubscript <> 'array_subscript_handler'::regproc);
 
 -- Look for range types that do not have a pg_range entry
 SELECT p1.oid, p1.typname
@@ -323,6 +324,26 @@ WHERE p1.typarray = p2.oid AND
     p2.typalign != (CASE WHEN p1.typalign = 'd' THEN 'd'::"char"
                          ELSE 'i'::"char" END);
 
+-- Check for typelem set without a handler
+
+SELECT p1.oid, p1.typname, p1.typelem
+FROM pg_type AS p1
+WHERE p1.typelem != 0 AND p1.typsubscript = 0;
+
+-- Check for misuse of standard subscript handlers
+
+SELECT p1.oid, p1.typname,
+       p1.typelem, p1.typlen, p1.typbyval
+FROM pg_type AS p1
+WHERE p1.typsubscript = 'array_subscript_handler'::regproc AND NOT
+    (p1.typelem != 0 AND p1.typlen = -1 AND NOT p1.typbyval);
+
+SELECT p1.oid, p1.typname,
+       p1.typelem, p1.typlen, p1.typbyval
+FROM pg_type AS p1
+WHERE p1.typsubscript = 'raw_array_subscript_handler'::regproc AND NOT
+    (p1.typelem != 0 AND p1.typlen > 0 AND NOT p1.typbyval);
+
 -- Check for bogus typanalyze routines
 
 SELECT p1.oid, p1.typname, p2.oid, p2.proname
@@ -356,7 +377,7 @@ SELECT t.oid, t.typname, t.typanalyze
 FROM pg_type t
 WHERE t.typbasetype = 0 AND
     (t.typanalyze = 'array_typanalyze'::regproc) !=
-    (typelem != 0 AND typlen < 0)
+    (t.typsubscript = 'array_subscript_handler'::regproc)
 ORDER BY 1;
 
 -- **************** pg_class ****************
@@ -452,7 +473,8 @@ WHERE o.opcmethod != 403 OR
     ((o.opcintype != p1.rngsubtype) AND NOT
      (o.opcintype = 'pg_catalog.anyarray'::regtype AND
       EXISTS(select 1 from pg_catalog.pg_type where
-             oid = p1.rngsubtype and typelem != 0 and typlen = -1)));
+             oid = p1.rngsubtype and typelem != 0 and
+             typsubscript = 'array_subscript_handler'::regproc)));
 
 -- canonical function, if any, had better match the range type