Re: [HACKERS] Behavior of GENERATED columns per SQL2003 - Mailing list pgsql-patches

From Bruce Momjian
Subject Re: [HACKERS] Behavior of GENERATED columns per SQL2003
Date
Msg-id 200705150236.l4F2aEX09902@momjian.us
Whole thread Raw
In response to Re: [HACKERS] Behavior of GENERATED columns per SQL2003  (Zoltan Boszormenyi <zb@cybertec.at>)
List pgsql-patches
URL added to TODO item.

---------------------------------------------------------------------------

Zoltan Boszormenyi wrote:
> Forwarded to -patches because of the attachment.
>
> -------- Eredeti ?zenet --------
> T?rgy:     Re: [HACKERS] Behavior of GENERATED columns per SQL2003
> D?tum:     Tue, 08 May 2007 12:38:32 +0200
> Felad?:     Zoltan Boszormenyi <zb@cybertec.at>
> C?mzett:     Tom Lane <tgl@sss.pgh.pa.us>
> CC:     pgsql-hackers@postgreSQL.org
> Hivatkoz?sok:     <18812.1178572575@sss.pgh.pa.us>
>
>
>
> Tom Lane ?rta:
> > I've been studying the SQL spec in a bit more detail and I'm suddenly
> > thinking that we've got the behavior all wrong in the current
> > GENERATED/IDENTITY patch.  In particular, it looks to me like we've
> > been implementing GENERATED ALWAYS AS (expr) according to the rules
> > that the spec in fact lays down only for GENERATED ALWAYS AS IDENTITY.
> > You'd think the two constructs would be pretty closely related but
> > the behaviors specified by the spec are light-years apart.  If you
> > look closely, a "generated column" in the meaning of section 4.14.8
> > is one that has GENERATED ALWAYS AS (expr), and identity columns are
> > *not* in this class.
> >
>
> True.
>
> > It looks to me like the behavior the spec intends for a generated column
> > is actually that it can be implemented as a "virtual column" occupying
> > no space on disk and instead computed on-the-fly when retrieved.
> >
>
> I think the keyword here is 'can be'. But having it stored gives a nice
> speedup in SELECTs vs. no speedup if you generate it on the fly.
>
> > Identity columns can have their values overridden by the
> > user (it's a little harder if GENERATED ALWAYS, but still possible),
> > and they don't change during an UPDATE unless specifically forced to.
> >
>
> Yes, I implemented it this way.
>
> > In contrast, generated columns cannot be overridden by
> > assignment, and are recomputed from their base columns during updates.
> >
>
> I see, I incorrectly made OVERRIDING SYSTEM VALUE
> to have an effect on generated columns. Now I don't need
> to pass around the list of the modified fields from
> rewriteTargetList(), I simply have to blindly update all of them
> both in INSERT and UPDATE. Fixed. And now I don't have to
> invent something to discover what fields were modified by
> BEFORE triggers.
>
> > This realization also explains the following, otherwise rather strange,
> > facts:
> >
> > * There is no GENERATED BY DEFAULT AS (expr) in the spec.
> >
>
> Yes, and because it already exists and called DEFAULT.
>
> > * GENERATED expressions are specifically disallowed from containing
> >   subselects, calling functions that access any SQL-data, or being
> >   nondeterministic; hence their values depend solely on the regular
> >   columns in the same row.
> >
>
> The sanity checks for the DEFAULT expression already
> handle subselect in PostgreSQL:
>
> db=# create table t1 (id float generated always as identity, t text,
> g text generated always as (case when t is null then '' else t end ||
> (select max(id) from t1));
> NOTICE:  CREATE TABLE will create implicit sequence "t1_id_seq" for
> serial column "t1.id"
> ERROR:  cannot use subquery in default expression
>
> But how do you check a function in general?
> Especially when it's not written in plpgsql?
> E.g. a C function can use SPI and SELECTs.
>
> Also, SQL:2003 doesn't allow any functions for the DEFAULT clause
> besides timestamp functions, e.g. NOW().
> But PostgreSQL already allows non-IMMUTABLE functions used in
> DEFAULT clauses. Would you want to restrict it?
>
> > * While identity columns are updated (if needed) before execution of
> >   BEFORE triggers, generated columns are updated after BEFORE triggers;
> >   hence a BEFORE trigger can override the value in one case and not the
> >   other.  (The current patch gets this wrong, btw.)
> >
>
> Where do you see that? Which version were you looking at?
> Identity columns are generated in rewriteTargetList(), way before any
> triggers.
> Generated column are computed in ExecInsert() and ExecUpdate(),
> certainly after applying BEFORE triggers in both cases and
> before CheckConstraint(). There was one bug in the UPDATE case, though,
> as UPDATE loops if it couldn't do its job in one go because of
> serialization.
> I fixed this.
>
> > * Generated columns are forcibly updated when their base columns change
> >   as a result of FK constraints (such as ON UPDATE CASCADE).
> >
>
> Isn't ExecUpdate() called on the referring table's row during such FK event?
> It seems yes to me:
>
> db=# create table t1 (id serial generated always as identity primary key,
> t text generated always as (id || '_1'));
> NOTICE:  CREATE TABLE will create implicit sequence "t1_id_seq" for
> serial column "t1.id"
> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "t1_pkey"
> for table "t1"
> CREATE TABLE
> db=# create table t2 (id serial generated always as identity primary key,
> id_t1 integer not null references t1(id) on update cascade,
> g text generated always as (id_t1 || '_1'));
> NOTICE:  CREATE TABLE will create implicit sequence "t2_id_seq" for
> serial column "t2.id"
> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "t2_pkey"
> for table "t2"
> CREATE TABLE
> db=# insert into t1 values (default, 'a1');
> INSERT 0 1
> db=# insert into t1 values (default, 'b2');
> INSERT 0 1
> db=# insert into t1 values (default, 'c3');
> INSERT 0 1
> db=# insert into t1 values (default, 'd4');
> INSERT 0 1
> db=# select * from t1;
>  id |  t
> ----+-----
>   1 | 1_1
>   2 | 2_1
>   3 | 3_1
>   4 | 4_1
> (4 rows)
>
> db=# insert into t2 values (default, 2);
> INSERT 0 1
> db=# select * from t2;
>  id | id_t1 |  g
> ----+-------+-----
>   1 |     2 | 2_1
> (1 row)
>
> db=# update t1 set id = 77 where id = 2;
> UPDATE 1
> db=# select * from t1;
>  id |  t
> ----+------
>   1 | 1_1
>   3 | 3_1
>   4 | 4_1
>  77 | 77_1    (<--- This was buggy, the generated value didn't change here,
>                                  as an oversight during the rewrite to
> make the generation behaviour
>                                  a property of the DEFAULT clause. Now
> fixed.)
> (4 rows)
>
> zozo=# select * from t2;
>  id | id_t1 |  g
> ----+-------+------
>   1 |    77 | 77_1 (<---- It worked this way before, too.)
> (1 row)
>
> Anyway, a new patch is necessary it seems,
> so it should apply cleanly to new CVS and have the
> above bugs fixed. Attached.
>
> > It looks to me like a BEFORE trigger is actually the only place that can
> > (transiently) see values of a generated column that are different from
> > the result of applying the generation expression on the rest of the row.
> > It's unclear whether that's intentional or an oversight.
> >
>
> I thought it was intentional. BEFORE triggers can change the base columns
> before storing them, but the generated columns should be consistent
> with their generation expressions and their base columns after executing
> the BEFORE triggers. If the triggers should see generated values
> then another recomputation is necessary to refresh the generated
> columns AND to make them unchangable by the triggers.
> It seems to be a waste of cycles to me. It should be documented, though.
>
> > Is anyone familiar with a database that implements SQL-spec generated
> > columns?  Do they actually store the columns?
> >
>
> David Fuhry answered this question with the advertised intention.
>
> >             regards, tom lane
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 3: Have you checked our extensive FAQ?
> >
> >                http://www.postgresql.org/docs/faq
> >
> >
>
> Best regards,
> Zolt?n B?sz?rm?nyi
>
> --
> ----------------------------------
> Zolt?n B?sz?rm?nyi
> Cybertec Geschwinde & Sch?nig GmbH
> http://www.postgresql.at/
>
>
>
>
> --
> ----------------------------------
> Zolt?n B?sz?rm?nyi
> Cybertec Geschwinde & Sch?nig GmbH
> http://www.postgresql.at/
>

[ application/x-tar is not supported, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

pgsql-patches by date:

Previous
From: Mark Kirkwood
Date:
Subject: Re: [DOCS] OS/X startup scripts
Next
From: Alvaro Herrera
Date:
Subject: Re: [DOCS] OS/X startup scripts