Thread: search_plan_tree(): handling of non-leaf CustomScanState nodes causes segfault

search_plan_tree(): handling of non-leaf CustomScanState nodes causes segfault

From
David Geier
Date:
Hi hackers,

While working with cursors that reference plans with CustomScanStates 
nodes, I encountered a segfault which originates from 
search_plan_tree(). The query plan is the result of a simple SELECT 
statement into which I inject a Custom Scan node at the root to do some 
post-processing before returning rows. This plan is referenced by a 
second plan with a Tid Scan which originates from a query of the form 
DELETE FROM foo WHERE CURRENT OF my_cursor;

search_plan_tree() assumes that 
CustomScanState::ScanState::ss_currentRelation is never NULL. In my 
understanding that only holds for CustomScanState nodes which are at the 
bottom of the plan and actually read from a relation. CustomScanState 
nodes which are not at the bottom don't have ss_currentRelation set. I 
believe for such nodes, instead search_plan_tree() should recurse into 
CustomScanState::custom_ps.

I attached a patch. Any thoughts?

Best regards,
David
Swarm64


Attachment

Re: search_plan_tree(): handling of non-leaf CustomScanState nodes causes segfault

From
Ashutosh Bapat
Date:
On Mon, Jan 18, 2021 at 4:13 PM David Geier <david@swarm64.com> wrote:
>
> Hi hackers,
>
> While working with cursors that reference plans with CustomScanStates
> nodes, I encountered a segfault which originates from
> search_plan_tree(). The query plan is the result of a simple SELECT
> statement into which I inject a Custom Scan node at the root to do some
> post-processing before returning rows. This plan is referenced by a
> second plan with a Tid Scan which originates from a query of the form
> DELETE FROM foo WHERE CURRENT OF my_cursor;
>
> search_plan_tree() assumes that
> CustomScanState::ScanState::ss_currentRelation is never NULL. In my
> understanding that only holds for CustomScanState nodes which are at the
> bottom of the plan and actually read from a relation. CustomScanState
> nodes which are not at the bottom don't have ss_currentRelation set. I
> believe for such nodes, instead search_plan_tree() should recurse into
> CustomScanState::custom_ps.
>
> I attached a patch. Any thoughts?

I don't have any comments about your patch as such, but ForeignScan is
similar to CustomScan. ForeignScan also can leave ss_currentRelation
NULL if it represents join between two foreign tables. So either
ForeignScan has the same problem as CustomScan (it's just above
CustomScan case in search_plan_tree()) or it's handling it in some
other way. In the first case we may want to fix that too in the same
manner (not necessarily in the same patch) and the in the later case
CustomScan can handle it the same way.

Said that, I didn't notice any field in ForeignScan which is parallel
to custom_ps, so what you are proposing is still needed.

-- 
Best Wishes,
Ashutosh Bapat



Hi,

+            * Custom scan nodes can be leaf nodes or inner nodes and therfore need special treatment.

The special treatment applies to inner nodes. The above should be better phrased to clarify.

Cheers

On Mon, Jan 18, 2021 at 2:43 AM David Geier <david@swarm64.com> wrote:
Hi hackers,

While working with cursors that reference plans with CustomScanStates
nodes, I encountered a segfault which originates from
search_plan_tree(). The query plan is the result of a simple SELECT
statement into which I inject a Custom Scan node at the root to do some
post-processing before returning rows. This plan is referenced by a
second plan with a Tid Scan which originates from a query of the form
DELETE FROM foo WHERE CURRENT OF my_cursor;

search_plan_tree() assumes that
CustomScanState::ScanState::ss_currentRelation is never NULL. In my
understanding that only holds for CustomScanState nodes which are at the
bottom of the plan and actually read from a relation. CustomScanState
nodes which are not at the bottom don't have ss_currentRelation set. I
believe for such nodes, instead search_plan_tree() should recurse into
CustomScanState::custom_ps.

I attached a patch. Any thoughts?

Best regards,
David
Swarm64

David Geier <david@swarm64.com> writes:
> search_plan_tree() assumes that 
> CustomScanState::ScanState::ss_currentRelation is never NULL. In my 
> understanding that only holds for CustomScanState nodes which are at the 
> bottom of the plan and actually read from a relation. CustomScanState 
> nodes which are not at the bottom don't have ss_currentRelation set. I 
> believe for such nodes, instead search_plan_tree() should recurse into 
> CustomScanState::custom_ps.

Hm.  I agree that we shouldn't simply assume that ss_currentRelation
isn't null.  However, we cannot make search_plan_tree() descend
through non-leaf CustomScan nodes, because we don't know what processing
is involved there.  We need to find a scan that is guaranteed to return
rows that are one-to-one with the cursor output.  This is why the function
doesn't descend through join or aggregation nodes, and I see no argument
by which we should assume we know more about what a customscan node will
do than we know about those.

So I'm inclined to think a suitable fix is just

-               if (RelationGetRelid(sstate->ss_currentRelation) == table_oid)
+               if (sstate->ss_currentRelation &&
+                   RelationGetRelid(sstate->ss_currentRelation) == table_oid)
                    result = sstate;

            regards, tom lane



Hi,
It seems sstate->ss_currentRelation being null can only occur for T_ForeignScanState and T_CustomScanState.

What about the following change ?

Cheers

diff --git a/src/backend/executor/execCurrent.c b/src/backend/executor/execCurrent.c
index 0852bb9cec..56e31951d1 100644
--- a/src/backend/executor/execCurrent.c
+++ b/src/backend/executor/execCurrent.c
@@ -325,12 +325,21 @@ search_plan_tree(PlanState *node, Oid table_oid,
         case T_IndexOnlyScanState:
         case T_BitmapHeapScanState:
         case T_TidScanState:
+            {
+                ScanState  *sstate = (ScanState *) node;
+
+                if (RelationGetRelid(sstate->ss_currentRelation) == table_oid)
+                    result = sstate;
+                break;
+            }
+
         case T_ForeignScanState:
         case T_CustomScanState:
             {
                 ScanState  *sstate = (ScanState *) node;

-                if (RelationGetRelid(sstate->ss_currentRelation) == table_oid)
+                if (sstate->ss_currentRelation &&
+                    RelationGetRelid(sstate->ss_currentRelation) == table_oid)
                     result = sstate;
                 break;
             }

On Mon, Jan 18, 2021 at 10:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Geier <david@swarm64.com> writes:
> search_plan_tree() assumes that
> CustomScanState::ScanState::ss_currentRelation is never NULL. In my
> understanding that only holds for CustomScanState nodes which are at the
> bottom of the plan and actually read from a relation. CustomScanState
> nodes which are not at the bottom don't have ss_currentRelation set. I
> believe for such nodes, instead search_plan_tree() should recurse into
> CustomScanState::custom_ps.

Hm.  I agree that we shouldn't simply assume that ss_currentRelation
isn't null.  However, we cannot make search_plan_tree() descend
through non-leaf CustomScan nodes, because we don't know what processing
is involved there.  We need to find a scan that is guaranteed to return
rows that are one-to-one with the cursor output.  This is why the function
doesn't descend through join or aggregation nodes, and I see no argument
by which we should assume we know more about what a customscan node will
do than we know about those.

So I'm inclined to think a suitable fix is just

-               if (RelationGetRelid(sstate->ss_currentRelation) == table_oid)
+               if (sstate->ss_currentRelation &&
+                   RelationGetRelid(sstate->ss_currentRelation) == table_oid)
                    result = sstate;

                        regards, tom lane


Zhihong Yu <zyu@yugabyte.com> writes:
> It seems sstate->ss_currentRelation being null can only
> occur for T_ForeignScanState and T_CustomScanState.
> What about the following change ?

Seems like more code for no very good reason.

            regards, tom lane



Hi, Tom:
I was thinking that, if sstate->ss_currentRelation is null for the other cases, that would be a bug.
An assertion can be added for the cases ending with T_TidScanState.
Though, the null sstate->ss_currentRelation would surface immediately (apart from assertion). So I omitted the assertion in the diff.

Cheers

On Mon, Jan 18, 2021 at 12:16 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Zhihong Yu <zyu@yugabyte.com> writes:
> It seems sstate->ss_currentRelation being null can only
> occur for T_ForeignScanState and T_CustomScanState.
> What about the following change ?

Seems like more code for no very good reason.

                        regards, tom lane
Zhihong Yu <zyu@yugabyte.com> writes:
> I was thinking that, if sstate->ss_currentRelation is null for the other
> cases, that would be a bug.
> An assertion can be added for the cases ending with T_TidScanState.

Maybe, but there are surely a lot of other places that would crash
in such a case --- places far more often traversed than search_plan_tree.
I do not see any value in complicating search_plan_tree for that.

            regards, tom lane



Hi,

On 18.01.21 19:46, Tom Lane wrote:
> David Geier <david@swarm64.com> writes:
>> search_plan_tree() assumes that
>> CustomScanState::ScanState::ss_currentRelation is never NULL. In my
>> understanding that only holds for CustomScanState nodes which are at the
>> bottom of the plan and actually read from a relation. CustomScanState
>> nodes which are not at the bottom don't have ss_currentRelation set. I
>> believe for such nodes, instead search_plan_tree() should recurse into
>> CustomScanState::custom_ps.
> Hm.  I agree that we shouldn't simply assume that ss_currentRelation
> isn't null.  However, we cannot make search_plan_tree() descend
> through non-leaf CustomScan nodes, because we don't know what processing
> is involved there.  We need to find a scan that is guaranteed to return
> rows that are one-to-one with the cursor output.  This is why the function
> doesn't descend through join or aggregation nodes, and I see no argument
> by which we should assume we know more about what a customscan node will
> do than we know about those.
That makes sense. Thanks for the explanation.
>
> So I'm inclined to think a suitable fix is just
>
> -               if (RelationGetRelid(sstate->ss_currentRelation) == table_oid)
> +               if (sstate->ss_currentRelation &&
> +                   RelationGetRelid(sstate->ss_currentRelation) == table_oid)
>                      result = sstate;
>
>             regards, tom lane
>
>
I updated the patch to match your proposal.

Best regards,
David
Swarm64

Attachment
David Geier <david@swarm64.com> writes:
> On 18.01.21 19:46, Tom Lane wrote:
>> Hm.  I agree that we shouldn't simply assume that ss_currentRelation
>> isn't null.  However, we cannot make search_plan_tree() descend
>> through non-leaf CustomScan nodes, because we don't know what processing
>> is involved there.  We need to find a scan that is guaranteed to return
>> rows that are one-to-one with the cursor output.  This is why the function
>> doesn't descend through join or aggregation nodes, and I see no argument
>> by which we should assume we know more about what a customscan node will
>> do than we know about those.

> That makes sense. Thanks for the explanation.

OK, cool.  I was afraid you'd argue that you really needed your CustomScan
node to be transparent in such cases.  We could imagine inventing an
additional custom-scan-provider callback to embed the necessary knowledge,
but I'd rather not add the complexity until someone has a use-case.

> I updated the patch to match your proposal.

WFM, will push in a bit.

            regards, tom lane



Hi,

On 18.01.21 23:42, Tom Lane wrote:
> David Geier<david@swarm64.com>  writes:
>> On 18.01.21 19:46, Tom Lane wrote:
>>> Hm.  I agree that we shouldn't simply assume that ss_currentRelation
>>> isn't null.  However, we cannot make search_plan_tree() descend
>>> through non-leaf CustomScan nodes, because we don't know what processing
>>> is involved there.  We need to find a scan that is guaranteed to return
>>> rows that are one-to-one with the cursor output.  This is why the function
>>> doesn't descend through join or aggregation nodes, and I see no argument
>>> by which we should assume we know more about what a customscan node will
>>> do than we know about those.
>> That makes sense. Thanks for the explanation.
> OK, cool.  I was afraid you'd argue that you really needed your CustomScan
> node to be transparent in such cases.  We could imagine inventing an
> additional custom-scan-provider callback to embed the necessary knowledge,
> but I'd rather not add the complexity until someone has a use-case.

I was thinking about that. Generally, having such possibility would be 
very useful. Unfortunately, I believe that in my specific case even 
having such functionality wouldn't allow for the query to work with 
CURRENT OF, because my CSP behaves similarly to a materialize node.

My understanding is only such plans are supported which consist of nodes 
that guarantee that the tuple returned by the plan is the unmodified 
tuple a scan leaf node is currently positioned on?

Still, if there's interest I would be happy to draft a patch. Instead of 
a separate CSP callback, we could also provide an additional flag like 
CUSTOMPATH_SUPPORT_CURRENT_OF. The advantage of the callback would be 
that we could delay the decision until execution time where potentially 
more information is available.
>> I updated the patch to match your proposal.
> WFM, will push in a bit.
>
>             regards, tom lane
Best regards,
David
Swarm64



David Geier <david@swarm64.com> writes:
> On 18.01.21 23:42, Tom Lane wrote:
>> OK, cool.  I was afraid you'd argue that you really needed your CustomScan
>> node to be transparent in such cases.  We could imagine inventing an
>> additional custom-scan-provider callback to embed the necessary knowledge,
>> but I'd rather not add the complexity until someone has a use-case.

> I was thinking about that. Generally, having such possibility would be 
> very useful. Unfortunately, I believe that in my specific case even 
> having such functionality wouldn't allow for the query to work with 
> CURRENT OF, because my CSP behaves similarly to a materialize node.
> My understanding is only such plans are supported which consist of nodes 
> that guarantee that the tuple returned by the plan is the unmodified 
> tuple a scan leaf node is currently positioned on?

Doesn't have to be *unmodified* --- a projection is fine, for example.
But we have to be sure that the current output tuple of the plan tree
is based on the current output tuple of the bottom-level table scan
node.  As an example of the hazards here, it's currently safe for
search_plan_tree to descend through a Limit node, but it did not use to
be, because the old implementation of Limit was such that it could return
a different tuple from the one the underlying scan node thinks it is
positioned on.

As another example, descending through Append is OK because only one
of the child scans will be positioned-on-a-tuple at all; the rest
will be at EOF or not yet started, so they can't produce a match
to whatever tuple ID the WHERE CURRENT OF is asking about.

Now that I look at this, I strongly wonder whether whoever added
MergeAppend support here understood what they were doing.  That
looks broken, because child nodes will typically be positioned on
tuples, whether or not the current top-level output came from them.
So I fear we could get a false-positive confirmation that some
tuple matches WHERE CURRENT OF.

Anyway, it seems clearly possible that some nonleaf CustomScans
would operate in a manner that would allow descending through them
while others wouldn't.  But I don't really want to write the docs
explaining what a callback for this should do ;-)

            regards, tom lane



I wrote:
> Now that I look at this, I strongly wonder whether whoever added
> MergeAppend support here understood what they were doing.  That
> looks broken, because child nodes will typically be positioned on
> tuples, whether or not the current top-level output came from them.
> So I fear we could get a false-positive confirmation that some
> tuple matches WHERE CURRENT OF.

Urgh, indeed it's buggy.  With the attached test script I get

...
BEGIN
DECLARE CURSOR
 f1 | f2  
----+-----
  1 | one
(1 row)

UPDATE 1
UPDATE 1
UPDATE 1
COMMIT
 f1 |     f2      
----+-------------
  1 | one updated
(1 row)

 f1 |     f2      
----+-------------
  2 | two updated
(1 row)

 f1 |      f2       
----+---------------
  3 | three updated
(1 row)

where clearly only the row with f1=1 should have updated
(and if you leave off ORDER BY, so as to get a Merge not
MergeAppend plan, indeed only that row updates).

I shall go fix this, and try to improve the evidently-inadequate
comments in search_plan_tree.

            regards, tom lane

drop table if exists t1,t2,t3;

create table t1 (f1 int unique, f2 text);
insert into t1 values (1, 'one');
create table t2 (f1 int unique, f2 text);
insert into t2 values (2, 'two');
create table t3 (f1 int unique, f2 text);
insert into t3 values (3, 'three');

explain
select * from t1 union all select * from t2 union all select * from t3
order by 1;

begin;

declare c cursor for
select * from t1 union all select * from t2 union all select * from t3
order by 1;

fetch 1 from c;

update t1 set f2 = f2 || ' updated' where current of c;
update t2 set f2 = f2 || ' updated' where current of c;
update t3 set f2 = f2 || ' updated' where current of c;

commit;

table t1;
table t2;
table t3;