On Thu, Apr 24, 2025 at 5:40 PM Richard Guo <guofenglinux@gmail.com> wrote: > Yeah. Quoting the comments for outersortkeys/innersortkeys: > > * outersortkeys (resp. innersortkeys) is NIL if the outer path > * (resp. inner path) is already ordered appropriately for the > * mergejoin. If it is not NIL then it is a PathKeys list describing > * the ordering that must be created by an explicit Sort node. > > And try_mergejoin_path will detect whether the outer path (resp. inner > path) is already well enough ordered, and suppresses an explicit sort > step if so by setting outersortkeys (resp. innersortkeys) to NIL. > > This reflects a basic assumption: if MergePath.outersortkeys is not > NIL, it means the outer path is not sufficiently ordered. > > Therefore, I think the "Assert(!is_sorted)" when outersortkeys is not > NIL is reasonable. And I think some other code is violating this > assumption. > > I tried this repro query on v17 and got a plan with a redundant Sort > node. It seems that this issue existed prior to commit 828e94c9d, and > the Assert introduced in that commit makes the issue easier to detect.
Yes, I tested this repro query on v17.4 and got the same plan.
It seems the culprit is in GetExistingLocalJoinPath(), which is used to get a local path for EPQ checks. In that function, if the outer or inner path of the selected join path is a ForeignScan, it is replaced with the fdw_outerpath.
The problem arises when the selected join path is a MergePath, and its outer or inner path is a ForeignScan that is not already well enough ordered. In this case, the MergePath will have non-NIL outersortkeys or innersortkeys. If we then replace the outer or inner path with the corresponding fdw_outerpath, and that path is already sufficiently ordered, we end up in an inconsistent state: the MergePath has non-NIL outersortkeys/innersortkeys, and its input path is already properly ordered.
This inconsistency leads to a redundant Sort node prior to commit 828e94c9d, and triggers an Assert failure after that commit.
Yes, your analysis is correct. The commit 828e94c9d brought the issue to light.