Thread: mark/restore failures on unsorted merge joins

mark/restore failures on unsorted merge joins

From
Andrew Gierth
Date:
From a report by user "kes" on irc, I constructed this test case:

create table marktst (id integer, val numrange, exclude using gist (val with &&));
insert into marktst select i, numrange(i, i+1, '[)') from generate_series(1,1000) i;
vacuum marktst;
analyze marktst;

select * from (select val from marktst where val && numrange(10,11)) s1 full join (values (1)) v(a) on true;
ERROR:  function ammarkpos is not defined for index marktst_val_excl

for which the query plan is:
                                         QUERY PLAN                                         
--------------------------------------------------------------------------------------------
 Merge Full Join  (cost=0.14..4.21 rows=2 width=20)
   ->  Result  (cost=0.00..0.01 rows=1 width=4)
   ->  Index Only Scan using marktst_val_excl on marktst  (cost=0.14..4.18 rows=2 width=16)
         Index Cond: (val && '[10,11)'::numrange)
(4 rows)

The problem is that the planner calls ExecSupportsMarkRestore to find
out whether a Materialize node is needed, and that function looks no
further than the Path type of T_Index[Only]Path in order to return true,
even though in this case it's a GiST index which does not support
mark/restore.

(Usually this can't be a problem because the merge join would need
sorted input, thus the index scan would be a btree; but a merge join
that doesn't actually have any sort keys could take unsorted input from
any index type.)

Going forward, this looks like IndexOptInfo needs another am* boolean
field, but that's probably not appropriate for the back branches; maybe
as a workaround, ExecSupportsMarkRestore should just check for btree?

-- 
Andrew (irc:RhodiumToad)



Re: mark/restore failures on unsorted merge joins

From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> The problem is that the planner calls ExecSupportsMarkRestore to find
> out whether a Materialize node is needed, and that function looks no
> further than the Path type of T_Index[Only]Path in order to return true,
> even though in this case it's a GiST index which does not support
> mark/restore.

> (Usually this can't be a problem because the merge join would need
> sorted input, thus the index scan would be a btree; but a merge join
> that doesn't actually have any sort keys could take unsorted input from
> any index type.)

Sounds like the right analysis.

> Going forward, this looks like IndexOptInfo needs another am* boolean
> field, but that's probably not appropriate for the back branches; maybe
> as a workaround, ExecSupportsMarkRestore should just check for btree?

Uh, why would you not just look to see if the ammarkpos/amrestrpos fields
are non-null?

            regards, tom lane



Re: mark/restore failures on unsorted merge joins

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 >> The problem is that the planner calls ExecSupportsMarkRestore to
 >> find out whether a Materialize node is needed, and that function
 >> looks no further than the Path type of T_Index[Only]Path in order to
 >> return true, even though in this case it's a GiST index which does
 >> not support mark/restore.

 >> (Usually this can't be a problem because the merge join would need
 >> sorted input, thus the index scan would be a btree; but a merge join
 >> that doesn't actually have any sort keys could take unsorted input
 >> from any index type.)

 Tom> Sounds like the right analysis.

 >> Going forward, this looks like IndexOptInfo needs another am*
 >> boolean field, but that's probably not appropriate for the back
 >> branches; maybe as a workaround, ExecSupportsMarkRestore should just
 >> check for btree?

 Tom> Uh, why would you not just look to see if the ammarkpos/amrestrpos
 Tom> fields are non-null?

We don't (in the back branches) seem to have a pointer to the
IndexAmRoutine handy, only the oid? Obviously we can look it up from the
oid, but is that more overhead than we want in a join cost function,
given that this will be called for all potential mergejoins considered,
not just JOIN_FULL? Or is the overhead not worth bothering about?

-- 
Andrew (irc:RhodiumToad)



Re: mark/restore failures on unsorted merge joins

From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>  Tom> Uh, why would you not just look to see if the ammarkpos/amrestrpos
>  Tom> fields are non-null?

> We don't (in the back branches) seem to have a pointer to the
> IndexAmRoutine handy, only the oid?

Oh, sorry, I misread your comment to be that you wanted to add a field
to IndexAmRoutine.  You're right, the real issue here is that
ExecSupportsMarkRestore lacks any convenient access to the needed info,
and we need to add a bool to IndexOptInfo to fix that.

I don't see any compelling reason why you couldn't add the field at
the end in the back branches; that's what we usually do to avoid
ABI breaks.  Although actually (counts fields...) it looks like
there's at least one pad byte after amcanparallel, so you could
add a bool there without any ABI consequence, resulting in a
reasonably natural field order in all branches.

            regards, tom lane



Re: mark/restore failures on unsorted merge joins

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 Tom> Oh, sorry, I misread your comment to be that you wanted to add a
 Tom> field to IndexAmRoutine. You're right, the real issue here is that
 Tom> ExecSupportsMarkRestore lacks any convenient access to the needed
 Tom> info, and we need to add a bool to IndexOptInfo to fix that.

 Tom> I don't see any compelling reason why you couldn't add the field
 Tom> at the end in the back branches; that's what we usually do to
 Tom> avoid ABI breaks. Although actually (counts fields...) it looks
 Tom> like there's at least one pad byte after amcanparallel, so you
 Tom> could add a bool there without any ABI consequence, resulting in a
 Tom> reasonably natural field order in all branches.

I guess that's close enough; this should suffice then.

-- 
Andrew (irc:RhodiumToad)

diff --git a/src/backend/executor/execAmi.c b/src/backend/executor/execAmi.c
index e2154ba86a..0c10f1d35c 100644
--- a/src/backend/executor/execAmi.c
+++ b/src/backend/executor/execAmi.c
@@ -417,6 +417,11 @@ ExecSupportsMarkRestore(Path *pathnode)
     {
         case T_IndexScan:
         case T_IndexOnlyScan:
+            /*
+             * Not all index types support mark/restore.
+             */
+            return castNode(IndexPath, pathnode)->indexinfo->amcanmarkpos;
+
         case T_Material:
         case T_Sort:
             return true;
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 52c01eb86b..3e94256d34 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -284,6 +284,8 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
             info->amhasgettuple = (amroutine->amgettuple != NULL);
             info->amhasgetbitmap = amroutine->amgetbitmap != NULL &&
                 relation->rd_tableam->scan_bitmap_next_block != NULL;
+            info->amcanmarkpos = (amroutine->ammarkpos != NULL &&
+                                  amroutine->amrestrpos != NULL);
             info->amcostestimate = amroutine->amcostestimate;
             Assert(info->amcostestimate != NULL);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index abe6f570e3..5a10c1855d 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -864,6 +864,7 @@ struct IndexOptInfo
     bool        amhasgettuple;    /* does AM have amgettuple interface? */
     bool        amhasgetbitmap; /* does AM have amgetbitmap interface? */
     bool        amcanparallel;    /* does AM support parallel scan? */
+    bool        amcanmarkpos;    /* does AM support mark/restore? */
     /* Rather than include amapi.h here, we declare amcostestimate like this */
     void        (*amcostestimate) ();    /* AM's cost estimator */
 };

Re: mark/restore failures on unsorted merge joins

From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> I guess that's close enough; this should suffice then.

Looks about right.  Not sure if we need to bother with a regression test
case; once that's in, it'd be hard to break it.

            regards, tom lane



Re: mark/restore failures on unsorted merge joins

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 >> I guess that's close enough; this should suffice then.

 Tom> Looks about right. Not sure if we need to bother with a regression
 Tom> test case; once that's in, it'd be hard to break it.

We could check the EXPLAIN output (since the Materialize node would show
up), but it's not easy to get stable plans since the choice of which
path to put on the outside is not fixed. Based on what I found when
actually testing the code, it probably wouldn't be worth the effort.

-- 
Andrew (irc:RhodiumToad)



Re: mark/restore failures on unsorted merge joins

From
Tom Lane
Date:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>  Tom> Looks about right. Not sure if we need to bother with a regression
>  Tom> test case; once that's in, it'd be hard to break it.

> We could check the EXPLAIN output (since the Materialize node would show
> up), but it's not easy to get stable plans since the choice of which
> path to put on the outside is not fixed. Based on what I found when
> actually testing the code, it probably wouldn't be worth the effort.

If it's not easy to test, I agree it's not worth it.

(Given how long it took anyone to notice this, the difficulty of
making a stable test case is unsurprising, perhaps.)

            regards, tom lane