Avoid using list_delete_first in simplify_or/and_arguments - Mailing list pgsql-hackers

From Richard Guo
Subject Avoid using list_delete_first in simplify_or/and_arguments
Date
Msg-id CAMbWs4_ysmb0=-odBusZ9_83MJE1f1bA1wztOR6dLaAJG8nJ7A@mail.gmail.com
Whole thread Raw
List pgsql-hackers
Hi hackers,

While trying to measure if there is any gain from the change as
discussed in [1], I happened to notice another place that is slowed down
by list_delete_first.  I'm using the query as below:

(n=1000000;
printf "explain (summary on) select * from t where "
for ((i=1;i<$n;i++)); do printf "a = $i or "; done;
printf "a = $n;"
) | psql

And I notice that a large part of planning time is spent on the
list_delete_first calls inside simplify_or_arguments().

I think the issue here is clear and straightforward: list_delete_first
has an O(N) cost due to data movement.  And I believe similar issue has
been discussed several times before.

I wonder if we can improve it by using list_delete_last instead, so I
tried the following change:

--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -3612,9 +3612,9 @@ simplify_or_arguments(List *args,
        unprocessed_args = list_copy(args);
        while (unprocessed_args)
        {
-               Node       *arg = (Node *) linitial(unprocessed_args);
+               Node       *arg = (Node *) llast(unprocessed_args);

-               unprocessed_args = list_delete_first(unprocessed_args);
+               unprocessed_args = list_delete_last(unprocessed_args);


With this change, in my box the planning time for the query above is
reduced from 64257.784 ms to 1411.666 ms, a big improvement.  The side
effect is that it results in a lot of plan diffs in regression tests,
but they are all about different order of OR arguments.

I believe simplify_and_arguments() can also benefit from similar
changes. But I'm not sure if we could have such a long AND/OR arguments
in real world. So is this worth doing?

[1] https://www.postgresql.org/message-id/CAMbWs4-RXhgz0i4O1z62gt%2BbTLTM5vXYyYhgnius0j_txLH7hg%40mail.gmail.com

Thanks
Richard

pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Allow single table VACUUM in transaction block
Next
From: vignesh C
Date:
Subject: Re: Support logical replication of DDLs