Thread: copyObject() ? again

copyObject() ? again

From
"Hiroshi Inoue"
Date:
Hello all,

"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes
> > "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> > > AFAIC the relation between objects is not copied correctly
> > > by copyObject() (i.e the same pointers to an object are copied
> > > to different pointers by copyObject()).
> >
> > True, but it seems irrelevant to me --- as Jan Wieck was just pointing
> > out, no code should ever depend on pointer-equality in parse trees or
> > plan trees anyway.
> >
> > > There is a way to maintain the list of (old,new) pairs during
> > > copyObject() operations.
> >
> > I think we'd be better off fixing any places that mistakenly assume
> > pointer compare is sufficient.  You didn't say which version you were
> > testing,
>
> My environment is v6.4.2.
> OK,I would test my cases again after the release of 6.5-BETA(v6.4.3?).
>

This time,I tested the following 2 cases under CURRENT(99/02/23)
environment and the result was same.

Those bugs are caused by copyObject().
The removal of pointer comparison solves those bugs ?
I don't think so.
I think copyObject() should be more reliable.

I made a patch on trial (see attached file) .
After applying it,both return proper results.
This patch changes the implementaion of copyObject() entirely.

Comments ?


[Case -1]
create function subsel() returns int4
as
'
declare
k       int4;
begin
SELECT 1 AS one into k WHERE 1 IN (SELECT 1);
return k;
end;
'
language 'plpgsql';
select subsel();

pqReadData() -- backend closed the channel unexpectedly.       This probably means the backend terminated abnormally
beforeor
 
while pr
ocessing the request.
We have lost the connection to the backend, so further processing is
impossible. Terminating.

[Case-2]

create table grp (key int4);
insert into grp values(2);
insert into grp values(2);
create function grup() returns int4
as
'
declare
k       int4;
begin
select key into k from grp
group by key
having count(*) > 1;
return k;
end;
'
language 'plpgsql';
select grup();

The result is

grup
----

(1 row)


Thanks in advance.

Hiroshi Inoue
Inoue@tpf.co.jp

Re: [HACKERS] copyObject() ? again

From
Tom Lane
Date:
I'm still not real happy about Hiroshi's proposed reimplementation of
copyObject.  It would make copyObject much slower and more memory-
hungry, which I think is a bad thing -- especially since we may use
copyObject more than now if we redesign memory allocation as was
discussed a few weeks ago.

I looked at the specific problem he's reporting (coredump involving
a SELECT inside a plpgsql function), and I believe I understand it;
it's a pretty straightforward order-of-operations bug in copyfuncs.c.
Specifically, CopyPlanFields() tries to construct the new plan node's
list of subplans, but you can't do that until the node-type-specific
fields have all been set, since they may contain references to subplans.
As it stands, a copied plan will have a subPlan list that is missing
some of its subplans.

A simple localized fix would be to rearrange operations inside
copyfuncs.c so that the SS_pull_subplan operation is not done until
the end of the node-type-specific routines for copying plan nodes.

More generally, however, I think that the very existence of the subPlan
list is a bug.  It violates the notion that "pointer equality shouldn't
matter" because the subPlan list *must* contain pointers to the actual
member objects of a plan node's other substructure.

Moreover the subPlan list isn't even buying us much --- it wouldn't be
that expensive to scan the substructure directly for plan-type nodes
when necessary, exactly as SS_pull_subplan does.  Arguably this would
take less time than is expended on memory allocation to build the
explicit subPlan list.  So I suggest getting rid of the subPlan list
entirely, rather than hacking on copyfuncs.

Plans also have an "initPlan" list that seems to have the same kind of
problem.  I don't really understand the difference between initPlans
and subPlans --- plannodes.h says   List       *initPlan;        /* Init Plan nodes (un-correlated expr
              * subselects) */   List       *subPlan;        /* Other SubPlan nodes */
 
which doesn't convey a whole lot to my mind.  Does anyone understand
why this distinction is made?  Is it really necessary?
        regards, tom lane


Re: [HACKERS] copyObject() ? again

From
Vadim Mikheev
Date:
Tom Lane wrote:
> 
> More generally, however, I think that the very existence of the subPlan
> list is a bug.  It violates the notion that "pointer equality shouldn't
> matter" because the subPlan list *must* contain pointers to the actual
> member objects of a plan node's other substructure.
> 
> Moreover the subPlan list isn't even buying us much --- it wouldn't be
> that expensive to scan the substructure directly for plan-type nodes
> when necessary, exactly as SS_pull_subplan does.  Arguably this would
> take less time than is expended on memory allocation to build the
> explicit subPlan list.  So I suggest getting rid of the subPlan list
> entirely, rather than hacking on copyfuncs.

First, I apologize for any inconveniences in my implementation
of subselect. I agreed that current handling of subPlan lists
is bad, but parent plan need in this list for some initializations
and SUBPLAN_EXPR expressions inside parent plan also need in 
corresponding subPlan nodes (to execute them).

Well, this could be good solution:

1. Implement TopPlan node (upmost plan, we told about this and  seems that Jan wanted implement it).
2. Construct list of _all_ subPlan inside planner and  put this list into TopPlan node.
3. Put into Plan->subPlan list of _indices_ of subPlans used  in this Plan (indices in TopPlan->subPlan list).
4. Use indices in SUBPLAN_EXPR, istead of subPlan pointers.
5. Add to EState and ExprContext pointer to TopPlan->subPlan  list (to enable index --> node conversion in
ExecSubPlan()& ExecInitSubPlan()).  
 
Unfortunately, I missed this solution year ago -:(
And more of that, I have no time to implement this now, sorry.

> Plans also have an "initPlan" list that seems to have the same kind of
> problem.  I don't really understand the difference between initPlans

No problems here. As you see this list is just copied in copyfuncs
and this is ok.

> and subPlans --- plannodes.h says
>     List       *initPlan;        /* Init Plan nodes (un-correlated expr
>                                  * subselects) */
>     List       *subPlan;        /* Other SubPlan nodes */
> which doesn't convey a whole lot to my mind.  Does anyone understand
> why this distinction is made?  Is it really necessary?

Uncorrelated expression subqueries are processed differently
from all other subqueries.

Vadim


RE: [HACKERS] copyObject() ? again

From
"Hiroshi Inoue"
Date:
Hello all,

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Wednesday, March 03, 1999 1:52 AM
> To: Hiroshi Inoue
> Cc: pgsql-hackers
> Subject: Re: [HACKERS] copyObject() ? again 
> 
> 
> I'm still not real happy about Hiroshi's proposed reimplementation of
> copyObject.  It would make copyObject much slower and more memory-
> hungry, which I think is a bad thing -- 

Yes,it's a big defect of my implementation.
But isn't it simple and reliable ?

Currently,limited persons could implement copyObject() or 
could rearrange structures which cause copyObject() bugs.
And it seems that we should use copyObject() carefully.
I don't know whether we are allowed to use copyObject() in 
various phase (parser,optimizer etc) without limitation.

OK,there's a possibility to reduce the number of (old,new) pairs 
maintained during copyObject() operations. 
In fact,if only SubPlan and Aggref type nodes are maintained,
the cases I reported return proper results.
It may be risky but would improve the performance of my 
implementation.

Comments ? 

As to the 2nd case I reported,it was probably introduced by my 
patch. The patch was made to solve other Aggregate bugs.
get_agg_tlist_references() is used to reconstruct aggs member 
node of Agg type nodes as SS_pull_subPlan() does in CopyPlan
Fields(). (The name of function was set_agg_tlist_references() 
originally. Probably Bruce changed it).
We may be able to patch by following related code closely.
But it seems strange that copying objects requires such 
a complicated procedure. 

Thanks in advance.

Hiroshi Inoue
Inoue@tpf.co.jp


Re: [HACKERS] copyObject() ? again

From
Bruce Momjian
Date:
Just sending this over to you in case it is helpful in the future.

> Tom Lane wrote:
> > 
> > More generally, however, I think that the very existence of the subPlan
> > list is a bug.  It violates the notion that "pointer equality shouldn't
> > matter" because the subPlan list *must* contain pointers to the actual
> > member objects of a plan node's other substructure.
> > 
> > Moreover the subPlan list isn't even buying us much --- it wouldn't be
> > that expensive to scan the substructure directly for plan-type nodes
> > when necessary, exactly as SS_pull_subplan does.  Arguably this would
> > take less time than is expended on memory allocation to build the
> > explicit subPlan list.  So I suggest getting rid of the subPlan list
> > entirely, rather than hacking on copyfuncs.
> 
> First, I apologize for any inconveniences in my implementation
> of subselect. I agreed that current handling of subPlan lists
> is bad, but parent plan need in this list for some initializations
> and SUBPLAN_EXPR expressions inside parent plan also need in 
> corresponding subPlan nodes (to execute them).
> 
> Well, this could be good solution:
> 
> 1. Implement TopPlan node (upmost plan, we told about this and
>    seems that Jan wanted implement it).
> 2. Construct list of _all_ subPlan inside planner and
>    put this list into TopPlan node.
> 3. Put into Plan->subPlan list of _indices_ of subPlans used
>    in this Plan (indices in TopPlan->subPlan list).
> 4. Use indices in SUBPLAN_EXPR, istead of subPlan pointers.
> 5. Add to EState and ExprContext pointer to TopPlan->subPlan
>    list (to enable index --> node conversion in
>    ExecSubPlan() & ExecInitSubPlan()).
>    
> Unfortunately, I missed this solution year ago -:(
> And more of that, I have no time to implement this now, sorry.
> 
> > Plans also have an "initPlan" list that seems to have the same kind of
> > problem.  I don't really understand the difference between initPlans
> 
> No problems here. As you see this list is just copied in copyfuncs
> and this is ok.
> 
> > and subPlans --- plannodes.h says
> >     List       *initPlan;        /* Init Plan nodes (un-correlated expr
> >                                  * subselects) */
> >     List       *subPlan;        /* Other SubPlan nodes */
> > which doesn't convey a whole lot to my mind.  Does anyone understand
> > why this distinction is made?  Is it really necessary?
> 
> Uncorrelated expression subqueries are processed differently
> from all other subqueries.
> 
> Vadim
> 
> 


--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026