Re: enable_incremental_sort changes query behavior - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: enable_incremental_sort changes query behavior |
Date | |
Msg-id | 295524.1606246314@sss.pgh.pa.us Whole thread Raw |
In response to | Re: enable_incremental_sort changes query behavior (James Coleman <jtc331@gmail.com>) |
Responses |
Re: enable_incremental_sort changes query behavior
|
List | pgsql-hackers |
James Coleman <jtc331@gmail.com> writes: > On Mon, Nov 23, 2020 at 2:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> 1. James was wondering, far upthread, why we would do projections >> pre-sort or post-sort. I think the answers can be found by studying >> planner.c's make_sort_input_target(), which separates out what we want >> to do pre-sort and post-sort. > Does it imply we need to intentionally avoid SRFs also? It's sort of a wart that we allow SRFs in ORDER BY at all, but my expectation is that make_sort_input_target would prevent lower levels of the planner from needing to think about that. We don't allow SRFs in index expressions, nor in WHERE clauses (so they'll never come up as mergejoin sort keys). So really there's no way that scan/join processing should need to consider such cases. Such a sort would only ever be implemented via a final Sort atop ProjectSet. >> 2. Robert is correct that under normal circumstances, the targetlists of >> both baserels and join rels contain only Vars. > Is that primarily a project policy? Or a limitation of our current > planner (just can't push things down/carry the results back up as much > as we'd like)? Or something else? In particular, it seems plausible > there are cases where pushing down the sort on a non-Var expression to > the base rel could improve plans, so I'm wondering if there's a reason > to intentionally avoid that in the long or short run (or both). I think you've just rediscovered Joe Hellerstein's thesis topic [1]. We ripped out the remnants of that code ages ago (search very early git states for "JMH" if you're interested), because the complexity vs. benefit ratio seemed pretty bad. Maybe there'll be a case for putting it back some day, but I'm dubious. Note that we have the ability to push down sorts-on-expressions anyway; that's not constrained by what is in the relation targetlists. > Interesting: so merge joins are an example of us pushing down sorts, > which I assume means (part of) the answer to my question on (2) is > that there's nothing inherently wrong/broken with evaluating > expressions lower down the tree as long as we're careful about what is > safe/unsafe with respect to volatility and parallelism? Right, I don't see any fundamental problem with that, we just have to be careful about these constraints. > I have wondered if we should strictly require the expression to be in > the target list even if nonvolatile, but prepare_sort_from_pathkeys > doesn't seem to think that's necessary, so I'm comfortable with that > unless there's something you know we haven't considered. No, prepare_sort_from_pathkeys is happy to build a sort expression if it can't find it already computed in the input. The secret here is that we should never get to that code with a "dangerous" sort expression, because we should never have made a Path that would request such a thing in the first place. It's fairly obvious to me how we deal with the consideration for volatile sortkeys. We cannot restrict parallel-unsafe sortkeys quite the same way, because the restriction only applies to parallel not non-parallel plans. Maybe it's sufficient to mark Paths as parallel unsafe if they sort by parallel-unsafe sortkeys? Is there anyplace that is Just Assuming that paths it builds will be parallel-safe? regards, tom lane [1] https://www.postgresql.org/message-id/28216.1023340706%40sss.pgh.pa.us
pgsql-hackers by date: