Avoid computing ORDER BY junk columns unnecessarily - Mailing list pgsql-hackers
From: Heikki Linnakangas
Subject: Avoid computing ORDER BY junk columns unnecessarily
Msg-id: 2ca5865b-4693-40e5-8f78-f3b45d5378fb@iki.fi
Responses: Re: Avoid computing ORDER BY junk columns unnecessarily (3 replies)
List: pgsql-hackers
Problem
-------

We are using junk columns for (at least) two slightly different purposes:

1. For passing row IDs and other such data from lower plan nodes to
   LockRows / ModifyTable.

2. To represent ORDER BY and GROUP BY columns that don't appear in the
   SELECT list. For example, in a query like:

       SELECT foo FROM mytable ORDER BY bar;

   the parser adds 'bar' to the target list as a junk column.

You can see that with EXPLAIN VERBOSE:

explain (verbose, costs off) select foo from mytable order by bar;
            QUERY PLAN
----------------------------------
 Sort
   Output: foo, bar
   Sort Key: mytable.bar
   ->  Seq Scan on public.mytable
         Output: foo, bar
(5 rows)

The 'bar' column gets filtered away in the executor, by the so-called junk
filter. That's fine for simple cases like the above, but in some cases it
causes the ORDER BY value to be computed unnecessarily. For example:

create table mytable (foo text, bar text);
insert into mytable select g, g from generate_series(1, 10000) g;
create index on mytable (sha256(bar::bytea));

explain verbose select foo from mytable order by sha256(bar::bytea);
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Index Scan using mytable_sha256_idx on public.mytable  (cost=0.29..737.28 rows=10000 width=64)
   Output: foo, sha256((bar)::bytea)
(2 rows)

The index is used to satisfy the ORDER BY, but the expensive ORDER BY
expression is still computed for every row, just to be thrown away by the
junk filter.

This came up with pgvector, as the vector distance functions are pretty
expensive. All vector operations are expensive, so one extra distance
function call per row doesn't necessarily make that much difference, but it
sure looks silly. See
https://github.com/pgvector/pgvector/issues/359#issuecomment-1840786021
(thanks Matthias for the investigation!).

Solution
--------

The obvious solution is that the planner should not include those junk
columns in the plan.
But how exactly to implement that is a different matter. I came up with the
attached patch set, which adds a projection to all the paths at the end of
planning in grouping_planner(). The projection filters out the unnecessary
junk columns. With that, the plan for the above example becomes:

postgres=# explain verbose select foo from mytable order by sha256(bar::bytea);
                                           QUERY PLAN
-----------------------------------------------------------------------------------------------
 Index Scan using mytable_sha256_idx on public.mytable  (cost=0.29..662.24 rows=10000 width=4)
   Output: foo
(2 rows)

Problems with the solution
--------------------------

So this seems to work, but I have a few doubts:

1. Because Sort cannot project, this adds an extra Result node on top of
Sort nodes when the ORDER BY is implemented by sorting:

postgres=# explain verbose select foo from mytable order by bar;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Result  (cost=818.39..843.39 rows=10000 width=4)
   Output: foo
   ->  Sort  (cost=818.39..843.39 rows=10000 width=8)
         Output: foo, bar
         Sort Key: mytable.bar
         ->  Seq Scan on public.mytable  (cost=0.00..154.00 rows=10000 width=8)
               Output: foo, bar
(7 rows)

From a performance point of view, I think that's not as bad as it sounds.
Remember that without this patch, the executor needs to execute the junk
filter to filter out the extra column instead. It's not clear that an extra
Result is worse than that, although I haven't benchmarked it. This does make
plans for queries like the above more verbose, though. Perhaps we should
teach Sort (and MergeAppend) to do projection, just to avoid that?

Another solution would be to continue relying on the junk filter if adding
the projection in the planner would lead to an extra Result node. That's a
bit ugly, because then the final target list of a (sub)query depends on the
Path that's chosen.

2. Instead of tacking on the projection to the paths at the end, I first
tried modifying the code earlier in grouping_planner() that computes the
target lists for the different plan stages. That still feels like a cleaner
approach to me, although I don't think there's any difference in the
generated plans in practice. However, I ran into some problems with that
approach and gave up.

I basically tried to remove the junk columns from 'final_target', and have
create_ordered_paths() create paths with the filtered target list directly.
And if there is no ORDER BY, leave out the junk columns from
'grouping_target' too, and have create_grouping_paths() generate the final
target list directly. However, create_grouping_paths() assumes that the
grouping columns are included in 'grouping_target'. And similarly in
create_ordered_paths(), some partial planning stages assume that the
ordering columns are included in 'final_target'. Those assumptions could
probably be fixed, but I ran out of steam trying to do that.

3. I also considered whether we should stop using junk columns to represent
ORDER BY / GROUP BY columns like this in the first place. Perhaps we should
have a separate list for those and not stash them in the target list. But
that seems like a much bigger change.

4. It would be nice to get rid of the junk filter in the executor
altogether. With this patch, the junk columns used for RowLocks and
ModifyTable are still included in the final target list, and are still
filtered out by the junk filter. But we could add similar projections
between the RowLocks and ModifyTable stages, to eliminate all the junk
columns at the top of the plan. ExecFilterJunk() isn't a lot of code, but it
would feel cleaner to me. I didn't try to implement that.

5. I'm not sure the categorization of junk columns that I implemented here
is the best one. It might make sense to have more categories, and to
distinguish row-id columns from others, for example, and ORDER BY columns
from GROUP BY columns.
Patches
-------

So the attached patches implement that idea, with the above-mentioned
problems. I think this approach is fine as it is, despite those problems,
but my planner-fu is rusty so I really need review and a second opinion on
this.

v1-0001-Add-test-for-Min-Max-optimization-with-kNN-index-.patch
v1-0002-Show-how-ORDER-BY-expression-is-computed-unnecess.patch

These patches just add to existing tests to demonstrate the problem.

v1-0003-Turn-resjunk-into-an-enum.patch

Replace the 'resjunk' boolean with an enum, so that we can distinguish
between different kinds of junk columns. The next patch uses that
information to identify junk columns that can be filtered out. It's a
separate patch for easier review.

v1-0004-Omit-columns-from-final-tlist-that-were-only-need.patch

The main patch in this series.

v1-0005-Fix-regression-tests-caused-by-additional-Result-.patch

Regression test output changes, for all the plans with Sort that now have
Sort + Result. See "Problems with the solution", #1.

-- 
Heikki Linnakangas
Neon (https://neon.tech)