Re: SQL99 ARRAY support proposal - Mailing list pgsql-hackers

From Joe Conway
Subject Re: SQL99 ARRAY support proposal
Date
Msg-id 3E6F8D6F.70108@joeconway.com
Whole thread Raw
In response to Re: SQL99 ARRAY support proposal  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: SQL99 ARRAY support proposal  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: SQL99 ARRAY support proposal  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-hackers
Tom Lane wrote:
> But I think I like better the notion of extending my bound-together-
> ANYARRAY-and-ANYELEMENT proposal,
> http://archives.postgresql.org/pgsql-hackers/2003-03/msg00319.php
>
> Suppose that we do that, and then further say that ANYARRAY or
> ANYELEMENT appearing as the return type implies that the return type
> is actually the common element or array type.  Then we have such
> useful behaviors as:
>
>     array_push(anyarray, anyelement) returns anyarray
>     array_pop(anyarray) returns anyelement
>     array_subscript(anyarray, int) yields anyelement
>     singleton_array(anyelement) yields anyarray

Before I get too far along, I'd like to get some feedback. The attached
patch implements Tom's bound-together-ANYARRAY-and-ANYELEMENT proposal
(and includes ANY as well, per earlier discussion). With it, the
following works:

CREATE OR REPLACE FUNCTION array_push (anyarray, anyelement)
RETURNS anyarray
AS '$libdir/plr','array_push'
LANGUAGE 'C';

regression=# select f1[2] from (select array_push('{1,2}'::integer[],3)
as f1) as t;
  f1
----
   2
(1 row)

Does it make sense to commit this now, or should it wait for the other
changes described below?

The following are my proposed next phases in array support changes.
Please let me know now if you think any of these won't fly (conceptually):

1) Implement SQL99/200x ARRAY[] changes proposed here:
    http://archives.postgresql.org/pgsql-hackers/2003-03/msg00297.php
    as modified by
    http://archives.postgresql.org/pgsql-hackers/2003-03/msg00315.php

2) Implement the following new builtin functions
      array_push(anyarray, anyelement) returns anyarray
      array_pop(anyarray) returns anyelement
      array_subscript(anyarray, int) yields anyelement
      singleton_array(anyelement) returns anyarray
         - any reason not to call this one simply "array"?
      split(text, text) returns text[]
         - split string into array on delimiter
      implode(text[], text) returns text
         - join array elements into a string using given string delimiter

3) Modify contrib/array functions as needed and move to the backend. Or
    possibly write equivalent functions from scratch -- I just noticed
    this in contrib/array:

    * This software is distributed under the GNU General Public License
    * either version 2, or (at your option) any later version.

    Is anyone still in contact with Massimo Dal Zotto? Any chance he
    would change the license to BSD?

4) Update "User's Guide"->"Data Types"->"Arrays" documentation and
    create a new section: "User's Guide"->
                          "Functions and Operators"->
                          "Array Functions and Operators"

Thoughts, comments, objections all welcomed.

Thanks,

Joe
Index: src/backend/parser/parse_coerce.c
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/backend/parser/parse_coerce.c,v
retrieving revision 2.93
diff -c -r2.93 parse_coerce.c
*** src/backend/parser/parse_coerce.c    9 Feb 2003 06:56:28 -0000    2.93
--- src/backend/parser/parse_coerce.c    12 Mar 2003 16:17:39 -0000
***************
*** 188,194 ****

          ReleaseSysCache(targetType);
      }
!     else if (targetTypeId == ANYOID ||
               targetTypeId == ANYARRAYOID)
      {
          /* assume can_coerce_type verified that implicit coercion is okay */
--- 188,194 ----

          ReleaseSysCache(targetType);
      }
!     else if (targetTypeId == ANYOID || targetTypeId == ANYELEMENTOID ||
               targetTypeId == ANYARRAYOID)
      {
          /* assume can_coerce_type verified that implicit coercion is okay */
***************
*** 325,332 ****
              continue;
          }

!         /* accept if target is ANY */
!         if (targetTypeId == ANYOID)
              continue;

          /*
--- 325,332 ----
              continue;
          }

!         /* accept if target is ANY or ANYELEMENT */
!         if (targetTypeId == ANYOID || targetTypeId == ANYELEMENTOID)
              continue;

          /*
Index: src/backend/parser/parse_func.c
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/backend/parser/parse_func.c,v
retrieving revision 1.144
diff -c -r1.144 parse_func.c
*** src/backend/parser/parse_func.c    9 Feb 2003 06:56:28 -0000    1.144
--- src/backend/parser/parse_func.c    12 Mar 2003 17:46:44 -0000
***************
*** 41,46 ****
--- 41,50 ----
                 List *fargs,
                 Oid *input_typeids,
                 Oid *function_typeids);
+ static Oid enforce_generic_type_consistency(Oid *oid_array,
+                                             Oid *true_oid_array,
+                                             int nargs,
+                                             Oid rettype);
  static int match_argtypes(int nargs,
                 Oid *input_typeids,
                 FuncCandidateList function_typeids,
***************
*** 309,314 ****
--- 313,324 ----
                     "\n\tYou may need to add explicit typecasts");
      }

+     /*
+      * enforce consistency with ANY, ANYARRAY, and ANYELEMENT argument
+      * and return types, possibly modifying return type along the way
+      */
+     rettype = enforce_generic_type_consistency(oid_array, true_oid_array, nargs, rettype);
+
      /* perform the necessary typecasting of arguments */
      make_arguments(nargs, fargs, oid_array, true_oid_array);

***************
*** 347,352 ****
--- 357,491 ----
      return retval;
  }

+ /*
+  * If ANY, ANYARRAY, or ANYELEMENT is used for a function's arguments or
+  * return type, make sure the runtime types are consistent with
+  * each other. The argument consistency rules are like so:
+  *
+  * 1) All arguments declared ANY should have matching datatypes.
+  * 2) All arguments declared ANYARRAY should have matching datatypes.
+  * 3) All arguments declared ANYELEMENT should have matching datatypes.
+  * 4) If there are arguments of both ANYELEMENT and ANYARRAY, make sure
+  *    the runtime scalar argument type is in fact the element type for
+  *    the runtime array argument type.
+  *
+  * Rules are applied to the function's return type (possibly altering it)
+  * if it is declared ANY, ANYARRAY, or ANYELEMENT:
+  *
+  * 1) If return type is ANY, and any argument is ANY, use the
+  *    arguments runtime type as the function's return type.
+  * 2) If return type is ANY but no argument is ANY, leave the return type as
+  *    is.
+  *    XXX should this case be rejected at the point of function creation?
+  * 3) If return type is ANYARRAY, and any argument is ANYARRAY, use the
+  *    arguments runtime type as the function's return type.
+  * 4) If return type is ANYARRAY, no argument is ANYARRAY, but any argument
+  *    is ANYELEMENT, use the runtime type of the argument to determine
+  *    the function's return type, i.e. the element type's corresponding
+  *    array type.
+  * 5) If return type is ANYARRAY, no argument is ANYARRAY or ANYELEMENT,
+  *    leave the return type as is.
+  *    XXX should this case be rejected at the point of function creation?
+  * 6) If return type is ANYELEMENT, and any argument is ANYELEMENT, use the
+  *    arguments runtime type as the function's return type.
+  * 7) If return type is ANYELEMENT, no argument is ANYELEMENT, but any argument
+  *    is ANYARRAY, use the runtime type of the argument to determine
+  *    the function's return type, i.e. the array type's corresponding
+  *    element type.
+  * 8) If return type is ANYELEMENT, no argument is ANYARRAY or ANYELEMENT
+  *    leave the return type as is.
+  *    XXX should this case be rejected at the point of function creation?
+  */
+ static Oid
+ enforce_generic_type_consistency(Oid *oid_array, Oid *true_oid_array, int nargs, Oid rettype)
+ {
+     int            j;
+     Oid            any_typeid = InvalidOid;
+     Oid            elem_typeid = InvalidOid;
+     Oid            array_typeid = InvalidOid;
+     Oid            array_typelem = InvalidOid;
+
+     /*
+      * Loop through the arguments to see if we have any that are
+      * ANYARRAY or ANYELEMENT. If so, require the runtime types to be
+      * self-consistent
+      */
+     for (j = 0; j < nargs; j++)
+     {
+         if (true_oid_array[j] == ANYOID)
+         {
+             if (OidIsValid(any_typeid) && oid_array[j] != any_typeid)
+                 elog(ERROR, "Inconsistent use of arguments declared ANY");
+             any_typeid = oid_array[j];
+         }
+
+         if (true_oid_array[j] == ANYELEMENTOID)
+         {
+             if (OidIsValid(elem_typeid) && oid_array[j] != elem_typeid)
+                 elog(ERROR, "Inconsistent use of arguments declared ANYELEMENT");
+             elem_typeid = oid_array[j];
+         }
+
+         if (true_oid_array[j] == ANYARRAYOID)
+         {
+             if (OidIsValid(array_typeid) && oid_array[j] != array_typeid)
+                 elog(ERROR, "Inconsistent use of arguments declared ANYARRAY");
+             array_typeid = oid_array[j];
+         }
+     }
+
+     /*
+      * Fast Track: if none of the arguments are ANY, ANYARRAY, or ANYELEMENT,
+      * return the original rettype now
+      */
+     if (!OidIsValid(any_typeid) &&
+         !OidIsValid(array_typeid) &&
+         !OidIsValid(elem_typeid))
+         return rettype;
+
+     /* if we return ANYOID check consistency with any like arguments */
+     if (rettype == ANYOID)
+     {
+         if (OidIsValid(any_typeid))
+             return any_typeid;
+         else
+             return rettype;
+     }
+
+     /* get the element type based on the array type, if we have one */
+     if (OidIsValid(array_typeid))
+     {
+         array_typelem = get_typelem(array_typeid);
+
+         if (!OidIsValid(elem_typeid))
+         {
+             /* if we don't have an element type yet, use the one we just got */
+             elem_typeid = array_typelem;
+         }
+         else if (array_typelem != elem_typeid)
+         {
+             /* otherwise, they better match */
+             elog(ERROR, "Argument declared ANYARRAY not consistent with " \
+                         "argument declared ANYELEMENT");
+         }
+     }
+
+     /* if we return ANYARRAYOID enforce consistency with any like arguments */
+     if (rettype == ANYARRAYOID)
+     {
+         if (OidIsValid(array_typeid))
+             return array_typeid;
+         else
+             return get_arraytype(elem_typeid);
+     }
+
+     /* if we return ANYELEMENTOID check consistency with any like arguments */
+     if (rettype == ANYELEMENTOID)
+         return elem_typeid;
+
+     /* we don't return a generic type; send back the original return type */
+     return rettype;
+ }

  /* match_argtypes()
   *
Index: src/backend/utils/cache/lsyscache.c
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/backend/utils/cache/lsyscache.c,v
retrieving revision 1.90
diff -c -r1.90 lsyscache.c
*** src/backend/utils/cache/lsyscache.c    3 Feb 2003 21:15:44 -0000    1.90
--- src/backend/utils/cache/lsyscache.c    12 Mar 2003 16:17:39 -0000
***************
*** 1040,1045 ****
--- 1040,1108 ----
  }

  /*
+  * get_typelem
+  *
+  *        Given the type OID, return the typelem field (element type OID
+  *        for array types)
+  */
+ Oid
+ get_typelem(Oid typid)
+ {
+     HeapTuple    tp;
+
+     tp = SearchSysCache(TYPEOID,
+                         ObjectIdGetDatum(typid),
+                         0, 0, 0);
+     if (HeapTupleIsValid(tp))
+     {
+         Form_pg_type typtup = (Form_pg_type) GETSTRUCT(tp);
+         Oid        result;
+
+         result = typtup->typelem;
+         ReleaseSysCache(tp);
+         return result;
+     }
+     else
+         return InvalidOid;
+ }
+
+ /*
+  * get_arraytype
+  *
+  *        Given an element type OID, return the OID the corresponding
+  *        array type
+  */
+ Oid
+ get_arraytype(Oid elem_typeid)
+ {
+     HeapTuple    tp;
+
+     tp = SearchSysCache(TYPEOID,
+                         ObjectIdGetDatum(elem_typeid),
+                         0, 0, 0);
+     if (HeapTupleIsValid(tp))
+     {
+         Form_pg_type typtup = (Form_pg_type) GETSTRUCT(tp);
+         char   *elem_typename;
+         Oid        elem_namespaceId;
+         char   *array_typename;
+
+         elem_typename = pstrdup(NameStr(typtup->typname));
+         elem_namespaceId = typtup->typnamespace;
+         ReleaseSysCache(tp);
+
+         array_typename = makeArrayTypeName(elem_typename);
+
+         return GetSysCacheOid(TYPENAMENSP,
+                                  PointerGetDatum(array_typename),
+                                  ObjectIdGetDatum(elem_namespaceId),
+                                  0, 0);
+     }
+     else
+         return InvalidOid;
+ }
+
+ /*
   * get_typdefault
   *      Given a type OID, return the type's default value, if any.
   *
Index: src/include/catalog/pg_type.h
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/include/catalog/pg_type.h,v
retrieving revision 1.139
diff -c -r1.139 pg_type.h
*** src/include/catalog/pg_type.h    23 Jan 2003 23:39:06 -0000    1.139
--- src/include/catalog/pg_type.h    12 Mar 2003 16:17:39 -0000
***************
*** 523,528 ****
--- 523,530 ----
  #define ANYOID            2276
  DATA(insert OID = 2277 ( anyarray        PGNSP PGUID -1 f p t \054 0 0 anyarray_in anyarray_out    i x f 0 -1 0
_null__null_ )); 
  #define ANYARRAYOID        2277
+ DATA(insert OID = 2283 ( anyelement        PGNSP PGUID  4 t p t \054 0 0 any_in any_out    i p f 0 -1 0 _null_ _null_
));
+ #define ANYELEMENTOID    2283
  DATA(insert OID = 2278 ( void            PGNSP PGUID  4 t p t \054 0 0 void_in void_out    i p f 0 -1 0 _null_ _null_
));
  #define VOIDOID            2278
  DATA(insert OID = 2279 ( trigger        PGNSP PGUID  4 t p t \054 0 0 trigger_in trigger_out    i p f 0 -1 0 _null_
_null_)); 
Index: src/include/utils/lsyscache.h
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/include/utils/lsyscache.h,v
retrieving revision 1.67
diff -c -r1.67 lsyscache.h
*** src/include/utils/lsyscache.h    3 Feb 2003 21:15:45 -0000    1.67
--- src/include/utils/lsyscache.h    12 Mar 2003 16:17:39 -0000
***************
*** 55,60 ****
--- 55,62 ----
                       char *typalign);
  extern char get_typstorage(Oid typid);
  extern int32 get_typtypmod(Oid typid);
+ extern Oid get_typelem(Oid typid);
+ extern Oid get_arraytype(Oid elem_typeid);
  extern Node *get_typdefault(Oid typid);
  extern char get_typtype(Oid typid);
  extern Oid    get_typ_typrelid(Oid typid);
Index: src/test/regress/expected/type_sanity.out
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/test/regress/expected/type_sanity.out,v
retrieving revision 1.17
diff -c -r1.17 type_sanity.out
*** src/test/regress/expected/type_sanity.out    15 Jan 2003 19:35:48 -0000    1.17
--- src/test/regress/expected/type_sanity.out    12 Mar 2003 18:59:47 -0000
***************
*** 101,119 ****
  -----+---------+-----+---------
  (0 rows)

! -- As of 7.3, this check finds SET and refcursor, which are borrowing
! -- other types' I/O routines
  SELECT p1.oid, p1.typname, p2.oid, p2.proname
  FROM pg_type AS p1, pg_proc AS p2
  WHERE p1.typinput = p2.oid AND p1.typtype in ('b', 'p') AND NOT
      (p1.typelem != 0 AND p1.typlen < 0) AND NOT
      (p2.prorettype = p1.oid AND NOT p2.proretset)
  ORDER BY 1;
!  oid  |  typname  | oid |  proname
! ------+-----------+-----+-----------
!    32 | SET       | 109 | unknownin
!  1790 | refcursor |  46 | textin
! (2 rows)

  -- Varlena array types will point to array_in
  SELECT p1.oid, p1.typname, p2.oid, p2.proname
--- 101,120 ----
  -----+---------+-----+---------
  (0 rows)

! -- As of 7.4, this check finds SET, refcursor, and anyelement, which are
! -- borrowing other types' I/O routines
  SELECT p1.oid, p1.typname, p2.oid, p2.proname
  FROM pg_type AS p1, pg_proc AS p2
  WHERE p1.typinput = p2.oid AND p1.typtype in ('b', 'p') AND NOT
      (p1.typelem != 0 AND p1.typlen < 0) AND NOT
      (p2.prorettype = p1.oid AND NOT p2.proretset)
  ORDER BY 1;
!  oid  |  typname   | oid  |  proname
! ------+------------+------+-----------
!    32 | SET        |  109 | unknownin
!  1790 | refcursor  |   46 | textin
!  2283 | anyelement | 2294 | any_in
! (3 rows)

  -- Varlena array types will point to array_in
  SELECT p1.oid, p1.typname, p2.oid, p2.proname
***************
*** 126,133 ****
  (0 rows)

  -- Check for bogus typoutput routines
! -- As of 7.3, this check finds SET and refcursor, which are borrowing
! -- other types' I/O routines
  SELECT p1.oid, p1.typname, p2.oid, p2.proname
  FROM pg_type AS p1, pg_proc AS p2
  WHERE p1.typoutput = p2.oid AND p1.typtype in ('b', 'p') AND NOT
--- 127,134 ----
  (0 rows)

  -- Check for bogus typoutput routines
! -- As of 7.4, this check finds SET, refcursor, and anyelement, which are
! -- borrowing other types' I/O routines
  SELECT p1.oid, p1.typname, p2.oid, p2.proname
  FROM pg_type AS p1, pg_proc AS p2
  WHERE p1.typoutput = p2.oid AND p1.typtype in ('b', 'p') AND NOT
***************
*** 135,145 ****
       (p2.oid = 'array_out'::regproc AND
        p1.typelem != 0 AND p1.typlen = -1))
  ORDER BY 1;
!  oid  |  typname  | oid |  proname
! ------+-----------+-----+------------
!    32 | SET       | 110 | unknownout
!  1790 | refcursor |  47 | textout
! (2 rows)

  SELECT p1.oid, p1.typname, p2.oid, p2.proname
  FROM pg_type AS p1, pg_proc AS p2
--- 136,147 ----
       (p2.oid = 'array_out'::regproc AND
        p1.typelem != 0 AND p1.typlen = -1))
  ORDER BY 1;
!  oid  |  typname   | oid  |  proname
! ------+------------+------+------------
!    32 | SET        |  110 | unknownout
!  1790 | refcursor  |   47 | textout
!  2283 | anyelement | 2295 | any_out
! (3 rows)

  SELECT p1.oid, p1.typname, p2.oid, p2.proname
  FROM pg_type AS p1, pg_proc AS p2
Index: src/test/regress/sql/type_sanity.sql
===================================================================
RCS file: /opt/src/cvs/pgsql-server/src/test/regress/sql/type_sanity.sql,v
retrieving revision 1.17
diff -c -r1.17 type_sanity.sql
*** src/test/regress/sql/type_sanity.sql    15 Jan 2003 19:35:48 -0000    1.17
--- src/test/regress/sql/type_sanity.sql    12 Mar 2003 18:59:42 -0000
***************
*** 84,91 ****
        p2.proargtypes[1] = 'oid'::regtype AND
        p2.proargtypes[2] = 'int4'::regtype));

! -- As of 7.3, this check finds SET and refcursor, which are borrowing
! -- other types' I/O routines
  SELECT p1.oid, p1.typname, p2.oid, p2.proname
  FROM pg_type AS p1, pg_proc AS p2
  WHERE p1.typinput = p2.oid AND p1.typtype in ('b', 'p') AND NOT
--- 84,91 ----
        p2.proargtypes[1] = 'oid'::regtype AND
        p2.proargtypes[2] = 'int4'::regtype));

! -- As of 7.4, this check finds SET, refcursor, and anyelement, which are
! -- borrowing other types' I/O routines
  SELECT p1.oid, p1.typname, p2.oid, p2.proname
  FROM pg_type AS p1, pg_proc AS p2
  WHERE p1.typinput = p2.oid AND p1.typtype in ('b', 'p') AND NOT
***************
*** 102,109 ****

  -- Check for bogus typoutput routines

! -- As of 7.3, this check finds SET and refcursor, which are borrowing
! -- other types' I/O routines
  SELECT p1.oid, p1.typname, p2.oid, p2.proname
  FROM pg_type AS p1, pg_proc AS p2
  WHERE p1.typoutput = p2.oid AND p1.typtype in ('b', 'p') AND NOT
--- 102,109 ----

  -- Check for bogus typoutput routines

! -- As of 7.4, this check finds SET, refcursor, and anyelement, which are
! -- borrowing other types' I/O routines
  SELECT p1.oid, p1.typname, p2.oid, p2.proname
  FROM pg_type AS p1, pg_proc AS p2
  WHERE p1.typoutput = p2.oid AND p1.typtype in ('b', 'p') AND NOT

pgsql-hackers by date:

Previous
From: Rod Taylor
Date:
Subject: Re: Case insensitivity, and option?
Next
From: Tom Lane
Date:
Subject: Re: SQL99 ARRAY support proposal