Thread: Incremental sort for access method with ordered scan support (amcanorderbyop)

Incremental sort for access method with ordered scan support (amcanorderbyop)

From
Miroslav Bendik
Date:
Current version of postgresql don't support incremental sort using
ordered scan on access method.

Example database:

CREATE TABLE t (id serial, p point, PRIMARY KEY(id));
INSERT INTO t (SELECT generate_series(1, 10000000), point(random(), random()));
CREATE INDEX p_idx ON t USING gist(p);
ANALYZE;

Now i want closest points to center:

SELECT id, p <-> point(0.5, 0.5) dist FROM t ORDER BY dist LIMIT 10;

Everything works good (Execution Time: 0.276 ms).

Now i want predictable sorting for points with same distance:

SELECT id, p <-> point(0.5, 0.5) dist FROM t ORDER BY dist, id LIMIT 10;

Execution time is now 1 000 x slower (589.486 ms) and postgresql uses
full sort istead of incremental:

Sort (cost=205818.51..216235.18 rows=4166667 width=12)

Postgres allows incremental sort only for ordered indexes. Function
build_index_paths dont build partial order paths for access methods
with order support. My patch adds support for incremental ordering
with access method. Results with patch:

Incremental Sort (cost=5522.10..1241841.02 rows=10000000 width=12)
(actual time=0.404..0.405 rows=10 loops=1)
  Sort Key: ((p <-> '(0.5,0.5)'::point)), id
  Presorted Key: ((p <-> '(0.5,0.5)'::point))

Execution Time: 0.437 ms

Attachment

On Sun, Apr 16, 2023 at 1:20 AM Miroslav Bendik <miroslav.bendik@gmail.com> wrote:
Postgres allows incremental sort only for ordered indexes. Function
build_index_paths dont build partial order paths for access methods
with order support. My patch adds support for incremental ordering
with access method. 

I think this is a meaningful optimization.  I reviewed the patch and
here are the comments from me.

* I understand the new param 'match_pathkeys_length_p' is used to tell
how many presorted keys are useful.  I think list_length(orderbyclauses)
will do the same.  So there is no need to add the new param, thus we can
reduce the code diffs.

* Now that match_pathkeys_to_index() returns a prefix of the pathkeys
rather than returns NIL immediately when there is a failure to match, it
seems the two local variables 'orderby_clauses' and 'clause_columns' are
not necessary any more.  I think we can instead lappend the matched
'expr' and 'indexcol' to '*orderby_clauses_p' and '*clause_columns_p'
directly.  In this way we can still call 'return' when we come to a
failure to match.

* In build_index_paths(), I think the diff can be reduced to

-    if (orderbyclauses)
-        useful_pathkeys = root->query_pathkeys;
-    else
-        useful_pathkeys = NIL;
+    useful_pathkeys = list_truncate(list_copy(root->query_pathkeys),
+                                    list_length(orderbyclauses));

* Several comments in match_pathkeys_to_index() are out of date.  We
need to revise them to cope with the change.

* I think it's better to provide a test case.

Thanks
Richard

Re: Incremental sort for access method with ordered scan support (amcanorderbyop)

From
Miroslav Bendik
Date:
po 17. 4. 2023 o 15:26 Richard Guo <guofenglinux@gmail.com> napísal(a):
>
>
> On Sun, Apr 16, 2023 at 1:20 AM Miroslav Bendik <miroslav.bendik@gmail.com> wrote:
>>
>> Postgres allows incremental sort only for ordered indexes. Function
>> build_index_paths dont build partial order paths for access methods
>> with order support. My patch adds support for incremental ordering
>> with access method.
>
>
> I think this is a meaningful optimization.  I reviewed the patch and
> here are the comments from me.
>
> * I understand the new param 'match_pathkeys_length_p' is used to tell
> how many presorted keys are useful.  I think list_length(orderbyclauses)
> will do the same.  So there is no need to add the new param, thus we can
> reduce the code diffs.
>
> * Now that match_pathkeys_to_index() returns a prefix of the pathkeys
> rather than returns NIL immediately when there is a failure to match, it
> seems the two local variables 'orderby_clauses' and 'clause_columns' are
> not necessary any more.  I think we can instead lappend the matched
> 'expr' and 'indexcol' to '*orderby_clauses_p' and '*clause_columns_p'
> directly.  In this way we can still call 'return' when we come to a
> failure to match.
>
> * In build_index_paths(), I think the diff can be reduced to
>
> -    if (orderbyclauses)
> -        useful_pathkeys = root->query_pathkeys;
> -    else
> -        useful_pathkeys = NIL;
> +    useful_pathkeys = list_truncate(list_copy(root->query_pathkeys),
> +                                    list_length(orderbyclauses));
>
> * Several comments in match_pathkeys_to_index() are out of date.  We
> need to revise them to cope with the change.
>
> * I think it's better to provide a test case.
>
> Thanks
> Richard

Thank you for advice,
here is an updated patch with proposed changes.

--
Best regards
Miroslav

Attachment
On Tue, 18 Apr 2023 at 19:29, Miroslav Bendik <miroslav.bendik@gmail.com> wrote:
> here is an updated patch with proposed changes.

Here's a quick review:

1. I don't think this is required. match_pathkeys_to_index() sets
these to NIL and they're set accordingly by the other code paths.

- List    *orderbyclauses;
- List    *orderbyclausecols;
+ List    *orderbyclauses = NIL;
+ List    *orderbyclausecols = NIL;

2. You can use list_copy_head(root->query_pathkeys,
list_length(orderbyclauses)); instead of:

+ useful_pathkeys = list_truncate(list_copy(root->query_pathkeys),
+     list_length(orderbyclauses));

3. The following 2 changes don't seem to be needed:

@@ -3104,11 +3100,11 @@ match_pathkeys_to_index(IndexOptInfo *index,
List *pathkeys,
  /* Pathkey must request default sort order for the target opfamily */
  if (pathkey->pk_strategy != BTLessStrategyNumber ||
      pathkey->pk_nulls_first)
-     return;
+     break;

  /* If eclass is volatile, no hope of using an indexscan */
  if (pathkey->pk_eclass->ec_has_volatile)
-     return;
+     break;

There's no code after the loop you're breaking out of, so it seems to
me that return is the same as break and there's no reason to change
it.

David



Re: Incremental sort for access method with ordered scan support (amcanorderbyop)

From
Miroslav Bendik
Date:
Thanks for feedback

> 2. You can use list_copy_head(root->query_pathkeys,
> list_length(orderbyclauses)); instead of:
>
> + useful_pathkeys = list_truncate(list_copy(root->query_pathkeys),
> +     list_length(orderbyclauses));

This code will crash if query_pathkeys is NIL. I need either modify
list_copy_head (v3.1) or add checks before call (v3.2).

I don't know if it's a good idea to modify list_copy_head. It will add
additional overhead to every call.

-- 
Best regards
Miroslav

Attachment

Re: Incremental sort for access method with ordered scan support (amcanorderbyop)

From
Miroslav Bendik
Date:
Sorry for spamming, but I found a more elegant way to check if
query_paths is NIL without modified list_copy_head.

Here is a third iteration of this patch.

-- 
Miroslav

Attachment
On Wed, 19 Apr 2023 at 16:53, Miroslav Bendik <miroslav.bendik@gmail.com> wrote:
> > 2. You can use list_copy_head(root->query_pathkeys,
> > list_length(orderbyclauses)); instead of:
> >
> > + useful_pathkeys = list_truncate(list_copy(root->query_pathkeys),
> > +     list_length(orderbyclauses));
>
> This code will crash if query_pathkeys is NIL. I need either modify
> list_copy_head (v3.1) or add checks before call (v3.2).
>
> I don't know if it's a good idea to modify list_copy_head. It will add
> additional overhead to every call.

That's a bug in list_copy_head().  Since NIL is how we represent empty
Lists, crashing on some valid representation of a List is not how it
should work.

That function is pretty new and was exactly added so we didn't have to
write list_truncate(list_copy(...), n) anymore.  That gets pretty
wasteful when the input List is long and we only need a small portion
of it.

I've just pushed a fix to master for this. See [1].  If you base your
patch atop of that you should be able to list list_copy_head() without
any issues.

David

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=e35ded29566f679e52888a8d34468bb51bc78bed




On Thu, Apr 20, 2023 at 6:38 AM David Rowley <dgrowleyml@gmail.com> wrote:
That function is pretty new and was exactly added so we didn't have to
write list_truncate(list_copy(...), n) anymore.  That gets pretty
wasteful when the input List is long and we only need a small portion
of it.

I searched the codes and found some other places where the manipulation
of lists can be improved in a similar way.

* lappend(list_copy(list), datum) as in get_required_extension().
This is not very efficient as after list_copy it would need to enlarge
the list immediately.  It can be improved by inventing a new function,
maybe called list_append_copy, that do the copy and append all together.

* lcons(datum, list_copy(list)) as in get_query_def().
This is also not efficient.  Immediately after list_copy, we'd need to
enlarge the list and move all the entries.  It can also be improved by
doing all these things all together in one function.

* lcons(datum, list_delete_nth_cell(list_copy(list), n)) as in
sort_inner_and_outer.
It'd need to copy all the elements, and then delete the n'th entry which
would cause all following entries be moved, and then move all the
remaining entries for lcons.  Maybe we can invent a new function for it?

So is it worthwhile to improve these places?

Besides, I found one place that can be improved the same way as what we
did in 9d299a49.

--- a/src/backend/rewrite/rewriteSearchCycle.c
+++ b/src/backend/rewrite/rewriteSearchCycle.c
@@ -523,7 +523,7 @@ rewriteSearchAndCycle(CommonTableExpr *cte)

            fexpr = makeFuncExpr(F_INT8INC, INT8OID, list_make1(fs), InvalidOid, InvalidOid, COERCE_EXPLICIT_CALL);

-           lfirst(list_head(search_col_rowexpr->args)) = fexpr;
+           linitial(search_col_rowexpr->args) = fexpr;


Also, in applyparallelworker.c we have the usage as

    TransactionId xid_tmp = lfirst_xid(list_nth_cell(subxactlist, i));

I wonder if we can invent function list_nth_xid to do it, to keep
consistent with list_nth/list_nth_int/list_nth_oid.

Thanks
Richard

Re: Incremental sort for access method with ordered scan support (amcanorderbyop)

From
Miroslav Bendik
Date:
> I've just pushed a fix to master for this. See [1].  If you base your
> patch atop of that you should be able to list list_copy_head() without
> any issues.

Thanks for this fix. Now the version
am_orderbyop_incremental_sort_v3.1.patch [1] works without issues
using the master branch.

[1] https://www.postgresql.org/message-id/attachment/146450/am_orderbyop_incremental_sort_v3.1.patch



>I searched the codes and found some other places where the manipulation
>of lists can be improved in a similar way.

>* lappend(list_copy(list), datum) as in get_required_extension().
>This is not very efficient as after list_copy it would need to enlarge
>the list immediately. It can be improved by inventing a new function,
>maybe called list_append_copy, that do the copy and append all together.

>* lcons(datum, list_copy(list)) as in get_query_def().
>This is also not efficient. Immediately after list_copy, we'd need to
>enlarge the list and move all the entries. It can also be improved by
>doing all these things all together in one function.

>* lcons(datum, list_delete_nth_cell(list_copy(list), n)) as in
>sort_inner_and_outer.
>It'd need to copy all the elements, and then delete the n'th entry which
>would cause all following entries be moved, and then move all the
>remaining entries for lcons. Maybe we can invent a new function for it?

>So is it worthwhile to improve these places?

I think yes. It's very inefficient coping and moving, unnecessarily.

Perhaps, like the attached patch?

lcons_copy_delete needs a careful review.


>I wonder if we can invent function list_nth_xid to do it, to keep
>consistent with list_nth/list_nth_int/list_nth_oid.

Perhaps list_nth_xid(const List *list, int n)?

regards,

Ranier Vilela

Attachment
On Thu, 20 Apr 2023 at 18:46, Richard Guo <guofenglinux@gmail.com> wrote:
>
>
> On Thu, Apr 20, 2023 at 6:38 AM David Rowley <dgrowleyml@gmail.com> wrote:
>>
>> That function is pretty new and was exactly added so we didn't have to
>> write list_truncate(list_copy(...), n) anymore.  That gets pretty
>> wasteful when the input List is long and we only need a small portion
>> of it.
>
> I searched the codes and found some other places where the manipulation
> of lists can be improved in a similar way.

I'd be happy to discuss our thought about List inefficiencies, but I
think to be fair to Miroslav, we should do that somewhere else. The
list_copy_head() discussion was directly related to his patch due to
the list of list_truncate(list_copy(..), ..).  The other things you've
mentioned are not.  Feel free to start a thread and copy me in.

David




On Fri, Apr 21, 2023 at 5:43 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Thu, 20 Apr 2023 at 18:46, Richard Guo <guofenglinux@gmail.com> wrote:
> I searched the codes and found some other places where the manipulation
> of lists can be improved in a similar way.
I'd be happy to discuss our thought about List inefficiencies, but I
think to be fair to Miroslav, we should do that somewhere else. The
list_copy_head() discussion was directly related to his patch due to
the list of list_truncate(list_copy(..), ..).  The other things you've
mentioned are not.  Feel free to start a thread and copy me in.

Yeah, that's right.  Thank you for the suggestion.  I started a new
thread here:

https://www.postgresql.org/message-id/flat/CAMbWs49dJnpezDQDDxCPKq7%2B%3D_3NyqLqGqnhqCjd%2BdYe4MS15w%40mail.gmail.com

Thanks
Richard

On Thu, Apr 20, 2023 at 9:37 PM Miroslav Bendik <miroslav.bendik@gmail.com> wrote:
Thanks for this fix. Now the version
am_orderbyop_incremental_sort_v3.1.patch [1] works without issues
using the master branch.

The v3.1 patch looks good to me, except that the comments around
match_pathkeys_to_index still need some polish.

1. For comment "On success, the result list is ordered by pathkeys.", I
think it'd be more accurate if we say something like "On success, the
result list is ordered by pathkeys or a prefix list of pathkeys."
considering the possibility of incremental sort.

2. The comment below is not true anymore.

   /*
    * Note: for any failure to match, we just return NIL immediately.
    * There is no value in matching just some of the pathkeys.
    */

We should either remove it or change it to emphasize that we may return
a prefix of the pathkeys for incremental sort.

BTW, would you please add the patch to the CF to not lose track of it?

Thanks
Richard
 
Thanks, for suggestions.

On Sun 02. 07. 2023 at 10:18 Richard Guo <guofenglinux@gmail.com> wrote:
> 1. For comment "On success, the result list is ordered by pathkeys.", I
> think it'd be more accurate if we say something like "On success, the
> result list is ordered by pathkeys or a prefix list of pathkeys."
> considering the possibility of incremental sort.
>
> 2. The comment below is not true anymore.
>
>    /*
>     * Note: for any failure to match, we just return NIL immediately.
>     * There is no value in matching just some of the pathkeys.
>     */
> We should either remove it or change it to emphasize that we may return
> a prefix of the pathkeys for incremental sort.

Comments are updated now.

> BTW, would you please add the patch to the CF to not lose track of it?

Submitted <https://commitfest.postgresql.org/43/4433/>

-- 
Best regards
Miroslav

Attachment

On Sun, Jul 2, 2023 at 12:02 PM Miroslav Bendik <miroslav.bendik@gmail.com> wrote:
Thanks, for suggestions.

On Sun 02. 07. 2023 at 10:18 Richard Guo <guofenglinux@gmail.com> wrote:
> 1. For comment "On success, the result list is ordered by pathkeys.", I
> think it'd be more accurate if we say something like "On success, the
> result list is ordered by pathkeys or a prefix list of pathkeys."
> considering the possibility of incremental sort.
>
> 2. The comment below is not true anymore.
>
>    /*
>     * Note: for any failure to match, we just return NIL immediately.
>     * There is no value in matching just some of the pathkeys.
>     */
> We should either remove it or change it to emphasize that we may return
> a prefix of the pathkeys for incremental sort.

Comments are updated now.

> BTW, would you please add the patch to the CF to not lose track of it?

Submitted <https://commitfest.postgresql.org/43/4433/>

The v4 patch looks good to me (maybe some cosmetic tweaks are still
needed for the comments).  I think it's now 'Ready for Committer'.

Thanks
Richard
On Tue, 4 Jul 2023 at 20:12, Richard Guo <guofenglinux@gmail.com> wrote:
> The v4 patch looks good to me (maybe some cosmetic tweaks are still
> needed for the comments).  I think it's now 'Ready for Committer'.

I agree. I went and hit the comments with a large hammer and while
there also adjusted the regression tests. I didn't think having "t" as
a table name was a good idea as it seems like a name with a high risk
of conflicting with a concurrently running test. Also, there didn't
seem to be much need to insert data into that table as the tests
didn't query any of it.

The only other small tweak I made was to not call list_copy_head()
when the list does not need to be shortened. It's likely not that
important, but if the majority of cases are not partial matches, then
we'd otherwise be needlessly making copies of the list.

I pushed the adjusted patch.

David




On Tue, Jul 4, 2023 at 7:15 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Tue, 4 Jul 2023 at 20:12, Richard Guo <guofenglinux@gmail.com> wrote:
> The v4 patch looks good to me (maybe some cosmetic tweaks are still
> needed for the comments).  I think it's now 'Ready for Committer'.

I agree. I went and hit the comments with a large hammer and while
there also adjusted the regression tests. I didn't think having "t" as
a table name was a good idea as it seems like a name with a high risk
of conflicting with a concurrently running test. Also, there didn't
seem to be much need to insert data into that table as the tests
didn't query any of it.

The only other small tweak I made was to not call list_copy_head()
when the list does not need to be shortened. It's likely not that
important, but if the majority of cases are not partial matches, then
we'd otherwise be needlessly making copies of the list.

I pushed the adjusted patch.

The adjustments improve the patch a lot.  Thanks for adjusting and
pushing the patch.

Thanks
Richard

Re: Incremental sort for access method with ordered scan support (amcanorderbyop)

From
"Jonathan S. Katz"
Date:
On 7/5/23 2:15 AM, Richard Guo wrote:
> 
> On Tue, Jul 4, 2023 at 7:15 PM David Rowley <dgrowleyml@gmail.com 
> <mailto:dgrowleyml@gmail.com>> wrote:
> 
>     On Tue, 4 Jul 2023 at 20:12, Richard Guo <guofenglinux@gmail.com
>     <mailto:guofenglinux@gmail.com>> wrote:
>      > The v4 patch looks good to me (maybe some cosmetic tweaks are still
>      > needed for the comments).  I think it's now 'Ready for Committer'.
> 
>     I agree. I went and hit the comments with a large hammer and while
>     there also adjusted the regression tests. I didn't think having "t" as
>     a table name was a good idea as it seems like a name with a high risk
>     of conflicting with a concurrently running test. Also, there didn't
>     seem to be much need to insert data into that table as the tests
>     didn't query any of it.
> 
>     The only other small tweak I made was to not call list_copy_head()
>     when the list does not need to be shortened. It's likely not that
>     important, but if the majority of cases are not partial matches, then
>     we'd otherwise be needlessly making copies of the list.
> 
>     I pushed the adjusted patch.
> 
> 
> The adjustments improve the patch a lot.  Thanks for adjusting and
> pushing the patch.

Thanks for working on this! While it allows the planner to consider 
choosing an incremental sort for indexes that implement 
"amcanorderbyop", it also has a positive side-effect that the planner 
will also consider choosing a plan for spawning parallel workers!

Because of that, I'd like to open the discussion that we consider 
backpatching this. Currently, extensions that implement index access 
methods (e.g. pgvector[1]) that are built primarily around 
"amcanorderbyop" are unable to get the planner to consider choosing a 
parallel scan, i.e. at this point in "create_order_paths"[2]:

/*
* If cheapest partial path doesn't need a sort, this is redundant
* with what's already been tried.
*/
if (!pathkeys_contained_in(root->sort_pathkeys,
                            cheapest_partial_path->pathkeys))

However, 625d5b3c does unlock this path for these types of indexes to 
allow for a parallel index scan to be chosen, which would allow 
extensions that implement a "amcanorderbyop" scan to use it. I would 
argue that this is a bug, given we offer the ability for index access 
methods to implement parallel index scans.

That said, I do think they may still need to be one planner tweak to 
properly support parallel index scan in this case, as I have yet to see 
costs generated where the parallel index scan is cheaper. However, I 
have not yet narrowed what/where that is.

Thanks,

Jonathan

[1] https://github.com/pgvector/pgvector
[2] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/optimizer/plan/planner.c;#l5188

Attachment