Thread: Delaying insertion of default values

Delaying insertion of default values

From
Tom Lane
Date:
If you have a default value for a column, say
create table t (a int4, b int4 default 12345);

and you write a command that depends on the default, say
insert into t (a) values (1);

the way that the default is currently accounted for is that the parser
rewrites the command into
insert into t (a,b) values (1, 12345);

(this happens in transformInsertStmt()).  It strikes me that doing this
in the parser is too early, and it needs to be done later, like after
the rewriter.  Why?  Because the rule mechanism stores rules as
parsetrees.  If the above INSERT is part of a rule, then the stored form
of the rule will look like the rewritten command, with the default
already attached.  This is bad: if I later alter the default for t.b,
the rule won't get updated.

(I can't currently change the default with ALTER TABLE, I think, but
sooner or later ALTER TABLE will be fixed.  I *can* alter t.b's default
by dumping the database, changing the CREATE TABLE command for t, and
reloading --- but the rule still won't be updated, because what's dumped
out for it will look like "insert into t values (1, 12345);" !  Try it
and see...)

I am inclined to think that attachment of default values should happen
in the planner, at the same time that the targetlist is reordered to
match the physical column order and dummy NULLs are inserted for missing
columns (ie, expand_targetlist()).  Certainly it must happen after the
rule mechanism.  Unless I hear objections, I will do that while I am
cleaning up INSERT processing for the INSERT ... SELECT ... GROUP BY bug.


More generally, I wonder whether it is such a good idea for rules to be
stored as parsetrees.  For example, I can't drop and recreate a table
mentioned in a rule attached to a different table, because the compiled
rule includes the OIDs of the tables it references.  So the compiled
rule will start failing if I do that.  (Right now, this causes a core
dump :-( ... apparently someone is assuming that the OID in an RTE will
never be bad ...)

With rules stored as parsetrees, we need to be very careful about how
much semantic knowledge gets factored into the parsetree before it is
frozen as a rule.  (This is another reason for pushing "optimization"
transformations out of the parser and into modules downstream of the
rule rewriter, BTW.)

Comments?  Storing rules as plain text would be too slow, perhaps,
but would it help any to store rules as "raw" parsetrees that haven't
yet gone through analyze.c?
        regards, tom lane


Re: [HACKERS] Delaying insertion of default values

From
Bruce Momjian
Date:
> More generally, I wonder whether it is such a good idea for rules to be
> stored as parsetrees.  For example, I can't drop and recreate a table
> mentioned in a rule attached to a different table, because the compiled
> rule includes the OIDs of the tables it references.  So the compiled
> rule will start failing if I do that.  (Right now, this causes a core
> dump :-( ... apparently someone is assuming that the OID in an RTE will
> never be bad ...)
> 
> With rules stored as parsetrees, we need to be very careful about how
> much semantic knowledge gets factored into the parsetree before it is
> frozen as a rule.  (This is another reason for pushing "optimization"
> transformations out of the parser and into modules downstream of the
> rule rewriter, BTW.)
> 
> Comments?  Storing rules as plain text would be too slow, perhaps,
> but would it help any to store rules as "raw" parsetrees that haven't
> yet gone through analyze.c?

All this sounds good, though we have so many TODO items, it seems a
little of a reach to be going after this.  Seems like a good thing to do
as you add that extra phase of query processing.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Delaying insertion of default values

From
Vadim Mikheev
Date:
Tom Lane wrote:
> 
> (this happens in transformInsertStmt()).  It strikes me that doing this
> in the parser is too early, and it needs to be done later, like after
> the rewriter.  Why?  Because the rule mechanism stores rules as
> parsetrees.  If the above INSERT is part of a rule, then the stored form
> of the rule will look like the rewritten command, with the default
> already attached.  This is bad: if I later alter the default for t.b,
> the rule won't get updated.
> 
> (I can't currently change the default with ALTER TABLE, I think, but
> sooner or later ALTER TABLE will be fixed.  I *can* alter t.b's default

ALTER TABLE could (or should?) re-compile table' rules...

> by dumping the database, changing the CREATE TABLE command for t, and
> reloading --- but the rule still won't be updated, because what's dumped
> out for it will look like "insert into t values (1, 12345);" !  Try it
> and see...)
> 
> I am inclined to think that attachment of default values should happen
> in the planner, at the same time that the targetlist is reordered to
> match the physical column order and dummy NULLs are inserted for missing
> columns (ie, expand_targetlist()).  Certainly it must happen after the

Why not? Not bad way, imho.

> rule mechanism.  Unless I hear objections, I will do that while I am
> cleaning up INSERT processing for the INSERT ... SELECT ... GROUP BY bug.

No objections -:).

Vadim


Re: [HACKERS] Delaying insertion of default values

From
wieck@debis.com (Jan Wieck)
Date:
Vadim wrote:

> ALTER TABLE could (or should?) re-compile table' rules...

    Rules should be recompilable for various reasons. DROP/CREATE
    of objects (relations, functions etc.)  referenced  in  rules
    changes their OID and needs recompilation too.

    Thus  we  need  to  store  the original rule text and a cross
    reference listing all the objects used in the rules  actions.
    That's two new system catalogs for me.

    Another  problem  with  rules coming up every so often is the
    rule plan string too big error. I'm actually  thinking  about
    arbitrary tuple sizes and will open another discussion thread
    on that, but I'm not sure how far we'll get this for v6.6 and
    if  the  solution  would  be  good  enough  to  handle system
    catalogs and syscache entries as well. To get  rules  out  of
    the  way  here  and beeing free to add this technique to user
    tables only I'll go ahead then and implement  rule  qual  and
    action splitting handled by the rule system itself anyway.

> > rule mechanism.  Unless I hear objections, I will do that while I am
> > cleaning up INSERT processing for the INSERT ... SELECT ... GROUP BY bug.
>
> No objections -:).

    This  would  be  obsolete when having the above recompilation
    implemented.  I'll add a support function that takes  an  OID
    which      should      be      called     at     any     DROP
    TABLE/VIEW/FUNCTION/OPERATOR  etc.  which  will  cause   rule
    recompilation on the next usage of the relation.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #

Re: [HACKERS] Delaying insertion of default values

From
Vadim Mikheev
Date:
Jan Wieck wrote:
> 
> Vadim wrote:
> 
> > ALTER TABLE could (or should?) re-compile table' rules...
> 
>     Rules should be recompilable for various reasons. DROP/CREATE
>     of objects (relations, functions etc.)  referenced  in  rules
>     changes their OID and needs recompilation too.

Yes. And the same is true for stored procedures when we'll
get them.

> > > rule mechanism.  Unless I hear objections, I will do that while I am
> > > cleaning up INSERT processing for the INSERT ... SELECT ... GROUP BY bug.
> >
> > No objections -:).
> 
>     This  would  be  obsolete when having the above recompilation
>     implemented.  I'll add a support function that takes  an  OID
>     which      should      be      called     at     any     DROP
>     TABLE/VIEW/FUNCTION/OPERATOR  etc.  which  will  cause   rule
>     recompilation on the next usage of the relation.

Agreed. I didn't object but of course I more like general solution
- a way to invalidate stored rules/procedures/etc and re-compilate
them when need.

BTW, what's your plan for RI constraints, Jan?
Did you see my letter about statement level triggers?
If I'll get WAL implemented then it could be used for RI.
In any case I believe that statement level triggers
are very nice thing and they are better for RI than
rules.

Vadim



Re: [HACKERS] Delaying insertion of default values

From
wieck@debis.com (Jan Wieck)
Date:
Vadim wrote:

>
> Jan Wieck wrote:
> >
> > Vadim wrote:
> >
> > > ALTER TABLE could (or should?) re-compile table' rules...
> >
> >     Rules should be recompilable for various reasons. DROP/CREATE
> >     of objects (relations, functions etc.)  referenced  in  rules
> >     changes their OID and needs recompilation too.
>
> Yes. And the same is true for stored procedures when we'll
> get them.

    Don't  we  have  some  kind  of  them  already  with  the  PL
    functions?  They get compiled on each first use per  backend,
    and  I  think  that for a database under development (usually
    not a live system) it isn't too bad to need a reconnect after
    schema changes.

> BTW, what's your plan for RI constraints, Jan?
> Did you see my letter about statement level triggers?
> If I'll get WAL implemented then it could be used for RI.
> In any case I believe that statement level triggers
> are very nice thing and they are better for RI than
> rules.

    What's WAL?

    Let's  think  about  a  foreign  key  constraint that must be
    checked  at  transaction  commit  (deferred  constraint),  so
    someone can do

        BEGIN;
        SET CONSTRAINT reftab_check_refkey DEFERRED;
        UPDATE reftab SET refkey = 4711,
            prodname = 'New product'
            WHERE prodname = 'Temp product';
        INSERT INTO keytab (keyval, prodname)
            VALUES (4711, 'New product');
        COMMIT;

    The  statement  level trigger should not check all 25 million
    rows of reftab against keytab. It should only  check  the  10
    rows  that  got  updated  because  they matched. How does the
    statement level trigger get access to  the  qualification  of
    the  query  that  fired it. And how does it find out which of
    them WHERE meant because it will not be  able  to  find  them
    again with the same qual.

    Currently  rules  cannot do this job too. I planned to change
    the handling of snapshot as  discussed  and  to  implement  a
    deferred  querytree  list  ran  at  appropriate  times  (like
    COMMIT). Plus a new RAISE command that's internally most of a
    SELECT  but throwing an elog if it finds some rows. Such a CI
    rule would then look like:

        CREATE RULE reftab_check_refkey AS ON UPDATE TO reftab DO
            RAISE 'foreign key % not present', new.refkey
            WHERE NOT EXISTS
            (SELECT keyval FROM keytab WHERE keyval = new.refkey);

    This rule will get expanded by the rewriter to do a scan with
    the  snapshot when the UPDATE ran against reftab and with the
    qual expanded to match the updated old tuples only,  but  the
    subselect  will  have  the snapshot at commit time which will
    find the newly inserted keytab row. I don't see how statement
    level triggers can do it.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #

Re: [HACKERS] Delaying insertion of default values

From
Vadim Mikheev
Date:
Jan Wieck wrote:
> 
> > BTW, what's your plan for RI constraints, Jan?
> > Did you see my letter about statement level triggers?
> > If I'll get WAL implemented then it could be used for RI.
> > In any case I believe that statement level triggers
> > are very nice thing and they are better for RI than
> > rules.
> 
>     What's WAL?

Write Ahead Log. We could backward scan WAL to get tid of
changed primary/unique/foreign table rows and check constraints.
More of that, we could write to WAL RI infos only for rows with
updated _keys_ to avoid check for cases when there was no key
update.

...
>     Currently  rules  cannot do this job too. I planned to change
>     the handling of snapshot as  discussed  and  to  implement  a
>     deferred  querytree  list  ran  at  appropriate  times  (like
>     COMMIT). Plus a new RAISE command that's internally most of a
>     SELECT  but throwing an elog if it finds some rows. Such a CI
>     rule would then look like:
> 
>         CREATE RULE reftab_check_refkey AS ON UPDATE TO reftab DO
>             RAISE 'foreign key % not present', new.refkey
>             WHERE NOT EXISTS
>             (SELECT keyval FROM keytab WHERE keyval = new.refkey);
> 
>     This rule will get expanded by the rewriter to do a scan with
>     the  snapshot when the UPDATE ran against reftab and with the
>     qual expanded to match the updated old tuples only,  but  the
>     subselect  will  have  the snapshot at commit time which will
>     find the newly inserted keytab row. I don't see how statement
>     level triggers can do it.

As far as I understand what is statement level trigger (SLT),
one is able to use NEW/OLD in queries of SLT just like as
NEW/OLD are used in rules. I would say that SLT-s are
rules powered by PL, and nothing more. You would just rewrite
each query of SLT with NEW/OLD in normal fashion. Using power
of PL _ANY_ constraints (not just simple RI ones) could be
implemented. 

Comments?

Vadim


Re: [HACKERS] Delaying insertion of default values

From
wieck@debis.com (Jan Wieck)
Date:
Vadim wrote:

> Jan Wieck wrote:
> >
> >     What's WAL?
>
> Write Ahead Log. We could backward scan WAL to get tid of
> changed primary/unique/foreign table rows and check constraints.
> More of that, we could write to WAL RI infos only for rows with
> updated _keys_ to avoid check for cases when there was no key
> update.

    Sounds reasonable.

>
> As far as I understand what is statement level trigger (SLT),
> one is able to use NEW/OLD in queries of SLT just like as
> NEW/OLD are used in rules. I would say that SLT-s are
> rules powered by PL, and nothing more. You would just rewrite
> each query of SLT with NEW/OLD in normal fashion. Using power
> of PL _ANY_ constraints (not just simple RI ones) could be
> implemented.

    Ah  - in contrast to what I thought SLT's would be. I thought
    an SLT would only be called once per statement, not once  per
    tuple (...  FOR EACH STATEMENT EXECUTE PROCEDURE ...).

    In my understanding an SLT couldn't have worked for something
    like

        UPDATE t1 SET b = t2.b WHERE t1.a = t2.a;

    Isn't this all still an AFTER trigger on ROW level that could
    be executed deferred?

    I like the aproach to give constraints the power of PL.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #

Re: [HACKERS] Delaying insertion of default values

From
Tom Lane
Date:
wieck@debis.com (Jan Wieck) writes:
>>>> rule mechanism.  Unless I hear objections, I will do that while I am
>>>> cleaning up INSERT processing for the INSERT ... SELECT ... GROUP BY bug.
>> 
>> No objections -:).

>     This  would  be  obsolete when having the above recompilation
>     implemented.

I plan to do it anyway, since I need to restructure analyze.c's handling
of INSERT and I believe that it would be cleaner to do the
default-adding work in expand_targetlist.  But recompiling rules is
needed to solve other problems, so that has to happen too.
        regards, tom lane


Re: [HACKERS] Delaying insertion of default values

From
Bruce Momjian
Date:
> Vadim wrote:
> 
> > ALTER TABLE could (or should?) re-compile table' rules...
> 
>     Rules should be recompilable for various reasons. DROP/CREATE
>     of objects (relations, functions etc.)  referenced  in  rules
>     changes their OID and needs recompilation too.

Added to TODO:
* Allow RULE recomplation

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [HACKERS] Delaying insertion of default values

From
Vadim Mikheev
Date:
Jan Wieck wrote:
> 
> >
> > Write Ahead Log. We could backward scan WAL to get tid of
> > changed primary/unique/foreign table rows and check constraints.
> > More of that, we could write to WAL RI infos only for rows with
> > updated _keys_ to avoid check for cases when there was no key
> > update.
> 
>     Sounds reasonable.

Unfortunately, additional WAL reads are not goot for overall
system performance, but it's way.

> > As far as I understand what is statement level trigger (SLT),
> > one is able to use NEW/OLD in queries of SLT just like as
> > NEW/OLD are used in rules. I would say that SLT-s are
> > rules powered by PL, and nothing more. You would just rewrite
> > each query of SLT with NEW/OLD in normal fashion. Using power
> > of PL _ANY_ constraints (not just simple RI ones) could be
> > implemented.
> 
>     Ah  - in contrast to what I thought SLT's would be. I thought
>     an SLT would only be called once per statement, not once  per
>     tuple (...  FOR EACH STATEMENT EXECUTE PROCEDURE ...).

Yes, SLT is called once per statement, but queries in SLT are
able to see _all_ old/new tuples affected by statement, just
like rule action queries are able to do it.

For the case of checking existance of primary key trigger
over referencing table could execute

SELECT count(*) FROM new
WHERE NOT EXISTS
(SELECT keyval FROM keytab WHERE keyval = new.refkey);

and abort if count returned is > 0.
The query above will be just rewritten by rewrite system.
SLTes are rule actions + all these nice IF, FOR etc
PL statements -:)

Actually, query above must be modified to deal with
concurrent updates. Some other xaction can delete keyval
and query will not notice this. To see concurrent update/delete
query must be able to read dirty data and wait for other
xactions. It's not easy to do. I need in more time to
think about this issue.

Vadim


Re: [HACKERS] Delaying insertion of default values

From
Vadim Mikheev
Date:
Jan Wieck wrote:
> 
> >
> > Write Ahead Log. We could backward scan WAL to get tid of
> > changed primary/unique/foreign table rows and check constraints.
> > More of that, we could write to WAL RI infos only for rows with
> > updated _keys_ to avoid check for cases when there was no key
> > update.
> 
>     Sounds reasonable.

Unfortunately, additional WAL reads are not goot for overall
system performance, but it's way.

> > As far as I understand what is statement level trigger (SLT),
> > one is able to use NEW/OLD in queries of SLT just like as
> > NEW/OLD are used in rules. I would say that SLT-s are
> > rules powered by PL, and nothing more. You would just rewrite
> > each query of SLT with NEW/OLD in normal fashion. Using power
> > of PL _ANY_ constraints (not just simple RI ones) could be
> > implemented.
> 
>     Ah  - in contrast to what I thought SLT's would be. I thought
>     an SLT would only be called once per statement, not once  per
>     tuple (...  FOR EACH STATEMENT EXECUTE PROCEDURE ...).

Yes, SLT is called once per statement, but queries in SLT are
able to see _all_ old/new tuples affected by statement, just
like rule action queries are able to do it.

For the case of checking existance of primary key trigger
over referencing table could execute

SELECT count(*) FROM new
WHERE NOT EXISTS
(SELECT keyval FROM keytab WHERE keyval = new.refkey);

and abort if count returned is > 0.
The query above will be just rewritten by rewrite system.
SLTes are rule actions + all these nice IF, FOR etc
PL statements -:)

Actually, query above must be modified to deal with
concurrent updates. Some other xaction can delete keyval
and query will not notice this. To see concurrent update/delete
query must be able to read dirty data and wait for other
xactions. It's not easy to do. I need in more time to
think about this issue.

Vadim


Re: [HACKERS] Delaying insertion of default values

From
Vadim Mikheev
Date:
Jan Wieck wrote:
> 
> Vadim wrote:
> 
> > ALTER TABLE could (or should?) re-compile table' rules...
> 
>     Rules should be recompilable for various reasons. DROP/CREATE
>     of objects (relations, functions etc.)  referenced  in  rules
>     changes their OID and needs recompilation too.

Yes. And the same is true for stored procedures when we'll
get them.

> > > rule mechanism.  Unless I hear objections, I will do that while I am
> > > cleaning up INSERT processing for the INSERT ... SELECT ... GROUP BY bug.
> >
> > No objections -:).
> 
>     This  would  be  obsolete when having the above recompilation
>     implemented.  I'll add a support function that takes  an  OID
>     which      should      be      called     at     any     DROP
>     TABLE/VIEW/FUNCTION/OPERATOR  etc.  which  will  cause   rule
>     recompilation on the next usage of the relation.

Agreed. I didn't object but of course I more like general solution
- a way to invalidate stored rules/procedures/etc and re-compilate
them when need.

BTW, what's your plan for RI constraints, Jan?
Did you see my letter about statement level triggers?
If I'll get WAL implemented then it could be used for RI.
In any case I believe that statement level triggers
are very nice thing and they are better for RI than
rules.

Vadim