Re: BUG #18764: server closed the connection unexpectedly - Mailing list pgsql-bugs

From David Rowley
Subject Re: BUG #18764: server closed the connection unexpectedly
Date
Msg-id CAApHDvo8FfuOiXYnJ1n2fZh8nwrDwkSU7=k6So+u-pYDOkN4YQ@mail.gmail.com
Whole thread Raw
In response to Re: BUG #18764: server closed the connection unexpectedly  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #18764: server closed the connection unexpectedly
List pgsql-bugs
On Sat, 4 Jan 2025 at 06:58, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Actually ... why is that Sort there at all?  The whole plan looks like
>
> regression=# explain (costs off)
> SELECT c2 AS ca2, c2 AS ca3 FROM t0
> UNION
> SELECT DISTINCT ca8 AS ca5, ca7 AS ca6
>   FROM (SELECT c1 AS ca7, c2 AS ca8 FROM t0) AS ta1
>        JOIN
>        (SELECT c1 AS ca10, c1 AS ca11 FROM t0) AS ta2
>        ON TRUE;
>                              QUERY PLAN
> ---------------------------------------------------------------------
>  Unique
>    ->  Sort
>          Sort Key: t0.c2, t0.c2
>          ->  Append
>                ->  Seq Scan on t0
>                ->  Subquery Scan on "*SELECT* 2"
>                      ->  Sort
>                            Sort Key: t0_1.c2, t0_1.c1 USING <
>                            ->  HashAggregate
>                                  Group Key: t0_1.c2, t0_1.c1
>                                  ->  Nested Loop
>                                        ->  Seq Scan on t0 t0_1
>                                        ->  Materialize
>                                              ->  Seq Scan on t0 t0_2
> (14 rows)
>
> There is no value in forcing a sort of the subquery's output,
> and the previous code didn't do so:

The reason for the 2nd sort, (ignoring the invalidity of the pathkeys)
is due to how build_setop_child_paths() tries sorting the
cheapest_input_path.  The problem is that the sorted path gets added
to the same RelOptInfo as the cheapest_input_path so if add_path()
sees they're fuzzily the same cost, the sorted has better pathkeys, so
the actual cheapest path is thrown away. I think the reason this
happens in this case is that there are just not that many rows to
sort, and it's fairly expensive to get those rows.

The reason that the planner doesn't end up doing a Merge Append is
because the sorted path for the first branch of the UNION has the same
column twice in the targetlist and only needs a single PathKey to
sort. Later, that causes the get_cheapest_path_for_pathkeys() in
generate_union_paths() not to find a matching sorted path since
union_pathkeys has 2 PathKeys. I think that might be fixed by the
patch I had being toying around with which I posted in [1].  If we had
something like that then that would make the apparent wasted sort less
of an issue here.

David

[1] https://www.postgresql.org/message-id/CAApHDvqo1rV8O4pMU2-22iTASBXgnm4kbHF6A8_VMqiDR3hG8A%40mail.gmail.com



pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #18765: Inconsistent behaviour and errors with LIKE
Next
From: Thomas Munro
Date:
Subject: Re: pg_rewind fails on Windows where tablespaces are used