Re: [HACKERS] TIME QUALIFICATION - Mailing list pgsql-hackers

From Vadim Mikheev
Subject Re: [HACKERS] TIME QUALIFICATION
Date
Msg-id 36C02550.FA19B0EB@krs.ru
In response to Re: [HACKERS] TIME QUALIFICATION  (jwieck@debis.com (Jan Wieck))
Responses Re: [HACKERS] TIME QUALIFICATION
List pgsql-hackers
Jan Wieck wrote:
> 
> Vadim wrote:
> 
> > It seems too complex to me. I again propose to use refcount
> > inside snapshot itself to prevent free-ing of snapshots.
> > Benefits: no copying in Executor, no QueryId --> Snapshot
> > lookup. Just add pointer to RTE. Parser will put NULL there:
> > as flag that current snapshot has to be used. ExecutorStart
                 ^^^^^^^^^^^^^^^^
Note: "current" here is "when actual execution will start", not
"when query was parsed/rewritten". ExecutorStart will substitute
QuerySnapshot for NULL snapshot pointers.

> > and deferred rules will increment refcount of current snapshot.
                                                  ^^^^^^^^^^^^^^^^
I.e. - QuerySnapshot - as it was when rewriting/execution starts.

Sorry, I think that my explanation was bad, hope to fix this -:)
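
To make the refcount proposal concrete, here is a minimal sketch in Python (not PostgreSQL source). All names in it (Snapshot, RangeTableEntry, snapshot_acquire, snapshot_release, executor_start) are hypothetical and chosen only for illustration; the sketch assumes a snapshot can be reduced to a scan command id plus a reference count, and that None stands in for the NULL pointer the parser leaves in the RTE.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Snapshot:
    """Hypothetical stand-in for a visibility snapshot."""
    command_id: int       # scan command id the snapshot was taken at
    refcount: int = 0     # number of holders keeping the snapshot alive
    freed: bool = False   # set when the last holder lets go


def snapshot_acquire(snap: Snapshot) -> Snapshot:
    """Register one more holder and hand the same snapshot back (no copy)."""
    snap.refcount += 1
    return snap


def snapshot_release(snap: Snapshot) -> None:
    """Drop one holder; the snapshot is freed only when nobody holds it."""
    assert snap.refcount > 0
    snap.refcount -= 1
    if snap.refcount == 0:
        snap.freed = True   # stands in for actually freeing the memory


@dataclass
class RangeTableEntry:
    relname: str
    snapshot: Optional[Snapshot] = None   # None plays the role of NULL


def executor_start(rtable: List[RangeTableEntry],
                   query_snapshot: Snapshot) -> None:
    """Substitute the snapshot current at execution start for NULL pointers."""
    for rte in rtable:
        if rte.snapshot is None:
            rte.snapshot = snapshot_acquire(query_snapshot)
```

The point of the design is that an RTE holds a pointer to a shared snapshot rather than a copy, so nothing has to be copied in the executor and no QueryId lookup is needed.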

> > Deferred rules will also set snapshot pointers of appropriate
> > RTEs (to the current snapshot).
> 
>     Yes,  the  parser  could  always  put  NULL  into  the  RTE's
>     snapshot name field (or the name later for named  snapshots).
>     But  it's  the  rewrite  system  that has to tell for unnamed
>     snapshots, which ones have to be used on which RTE's.

Of course!

>     Let's have two simple tables with a rule (and assume  in  the
>     following that snapshot includes scan command Id):
> 
>         create table t1 (a int4);
>         create table t2 (b int4);
> 
>         create rule r1 as on delete to t1
>             do delete from t2 where b = old.a;
> 
>     We execute the following commands:
> 
>         begin;
>         delete from t1 where a = 5;
>         insert into t2 values (5);
>         commit;
> 
>     If 5 is in t2 after commit depends on if the rule is deferred
>     or not.  If it isn't deferred, 5 should be  there,  otherwise
>     not.
> 
>     The rule will create a parsetree like this:
> 
>         delete from t2 where t1.a = 5 and b = t1.a;
> 
>     So the tree has a rangetable containing t2 and t1 (along with
>     some other unused entries). But only the rule  system  knows,
>     that  the  RTE  for t2 came from the rule and must be scanned
>     with the visibility of commit time while  t1  came  from  the
>     original  query  and must be scanned with the visibility that
>     was when the original delete from t1 was executed  (they  are
>     already deleted, but the rule actions scan must find em).

And so for deferred rules the rewrite system will:

1. set t2's RTE snapshot pointer to NULL - this will guarantee
   that the snapshot of execution time (commit or SET IMMEDIATE
   time) will be used;

2. set t1's RTE snapshot pointer to the current QuerySnapshot
   (and increment its refcount).
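
The two steps above could be sketched roughly as follows. This is an illustration in Python, not the actual rewriter: the RTE class, the from_rule_action flag (how the rewriter remembers which entries the rule action introduced) and mark_deferred_rule_rtes are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Snapshot:
    command_id: int
    refcount: int = 1   # one reference held by whoever created the snapshot


@dataclass
class RTE:
    relname: str
    from_rule_action: bool   # hypothetical: True if the rule action added this entry
    snapshot: Optional[Snapshot] = None


def mark_deferred_rule_rtes(rtable: List[RTE],
                            query_snapshot: Snapshot) -> None:
    """Assign snapshot pointers for a deferred rule's parsetree."""
    for rte in rtable:
        if rte.from_rule_action:
            # Step 1: leave NULL, so the snapshot of execution time is used.
            rte.snapshot = None
        else:
            # Step 2: pin the snapshot the original command runs with.
            query_snapshot.refcount += 1
            rte.snapshot = query_snapshot


# Jan's example: t2 comes from the rule action, t1 from the original query.
qs = Snapshot(command_id=1)
rtable = [RTE("t2", from_rule_action=True), RTE("t1", from_rule_action=False)]
mark_deferred_rule_rtes(rtable, qs)
```

After the call, the t1 entry points at qs and keeps it alive through the extra reference, while the t2 entry stays empty until execution starts.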

>     And  there  could  also be rules fired on t2. This results in
>     recursive rewriting and it's not that  easy  to  foresee  the
>     order  in  which  all  these commands will then get executed.
>     During recursion there is no  difference  between  a  command
>     coming  from  the  user  and one that is already generated by
>     another rule.
> 
>     The problem here is, that the RTE's in a rule generated query
>     resulting  from the former command (that fired them) must get
>     scanned against the snapshot of  the  time  when  the  former
>     command  gets  executed.  But the RTE's coming from the rule
      ^^^^^^^^^^^^^^^^^^^^^^^
So - you use QuerySnapshot as it was at that time.

>     action itself must get the snapshot when the rules command is

Set the RTE's snapshot pointer to NULL.

>     executed.  Only this way the quals added to the rule from the
>     former command will see what the former command saw.
> 
>     The executor cannot know where all  the  RTE's  were  coming
>     from. Except we have a QueryId and associate the QueryId with
>     a snapshot at the time of execution. And I think we  must  do
>     this lookup, because the order commands are executed will not
>     be the same as they got created.  The executor  only  has  to
>     override  the RTE's snapshot if the RTE's snapshot name isn't
>     NULL.

+ set NULL snapshot pointers to QuerySnapshot.
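
As a sanity check of the whole scheme, Jan's t1/t2 example can be replayed in a toy model. The sketch below is Python and rests on loud assumptions: everything runs inside one transaction, a snapshot is just a scan command id, and a row is visible when it was inserted before that command id and not deleted before it. Row, visible and run are hypothetical names, and the per_rte_snapshots switch exists only to show what goes wrong if t1 is scanned with the execution-time snapshot as well.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Row:
    value: int
    cmin: int                    # command id that inserted the row
    cmax: Optional[int] = None   # command id that deleted it, if any


def visible(row: Row, curcid: int) -> bool:
    """Toy rule: inserted before curcid and not deleted before curcid."""
    return row.cmin < curcid and (row.cmax is None or row.cmax >= curcid)


def rule_action(t1: List[Row], t2: List[Row],
                t1_cid: int, t2_cid: int) -> None:
    """delete from t2 where t1.a = 5 and b = t1.a, with one snapshot per RTE."""
    keys = {r.value for r in t1 if visible(r, t1_cid) and r.value == 5}
    for r in t2:
        if visible(r, t2_cid) and r.value in keys:
            r.cmax = t2_cid


def run(deferred: bool, per_rte_snapshots: bool = True) -> List[int]:
    """Replay begin; delete from t1; insert into t2; commit. Return t2's rows."""
    t1 = [Row(5, cmin=0)]
    t2: List[Row] = []
    cid = 1

    # delete from t1 where a = 5
    original_cid = cid           # snapshot the original command ran with
    for r in t1:
        if visible(r, cid) and r.value == 5:
            r.cmax = cid
    cid += 1

    def fire() -> None:
        # t2's RTE had a NULL pointer, so it gets the execution-time snapshot;
        # t1's RTE keeps the pinned snapshot unless the switch is turned off.
        t1_cid = original_cid if per_rte_snapshots else cid
        rule_action(t1, t2, t1_cid=t1_cid, t2_cid=cid)

    if not deferred:
        fire()
        cid += 1

    # insert into t2 values (5)
    t2.append(Row(5, cmin=cid))
    cid += 1

    if deferred:                 # commit time
        fire()
        cid += 1

    return [r.value for r in t2 if visible(r, cid)]
```

Under these assumptions, run(deferred=False) leaves 5 in t2 and run(deferred=True) removes it, as Jan describes. With per_rte_snapshots=False the deferred action no longer sees the already deleted t1 row, so it deletes nothing.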

Vadim

