Thread: BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower

The following bug has been logged on the website:

Bug reference:      17540
Logged by:          William Duclot
Email address:      william.duclot@gmail.com
PostgreSQL version: 14.4
Operating system:   GNU/Linux (Red Hat 8.5.0)
Description:

My application uses prepared statements. This section of the documentation
is going to be very relevant to the rest of this report:
https://www.postgresql.org/docs/current/sql-prepare.html#SQL-PREPARE-NOTES.

This is a minimal reproduction of the problem I observe, which I will
explain below:
https://dbfiddle.uk/?rdbms=postgres_14&fiddle=6b01d161da27379844e7602a16543626

Scenario:
- I create a fairly simple table (id + timestamp). Timestamp is indexed.
- I create a simple-ish prepared statement for `SELECT MIN(id), MAX(id) from
relation_tuple_transaction WHERE timestamp >= $1;`
- I execute the prepared statement multiple times (> 5 times), as sketched below
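
For reference, the fiddle boils down to roughly the following (the exact DDL is in the fiddle; the parameter value here is purely illustrative):

PREPARE q(timestamptz) AS
    SELECT MIN(id), MAX(id)
    FROM relation_tuple_transaction
    WHERE timestamp >= $1;

-- the first five executions get custom plans; from the sixth onwards the
-- cached generic plan may be used instead
EXECUTE q(now() - interval '1 hour');
EXECUTE q(now() - interval '1 hour');
EXECUTE q(now() - interval '1 hour');
EXECUTE q(now() - interval '1 hour');
EXECUTE q(now() - interval '1 hour');
EXECUTE q(now() - interval '1 hour');   -- 6th execution: generic plan kicks in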

From the 6th time onwards, the query plan used by Postgres changes, which
isn't fully unexpected as the documentation linked above does make it clear
that Postgres might decide to change the query plan for a generic query plan
after the 5th execution. And indeed, the estimated "cost" of the generic
plan is lower than the custom plan's: therefore the query planner behaves
correctly according to the documentation.

Now, the problem: the execution of the generic plan is multiple orders of
magnitude slower than the custom query plan ("actual time" for the generic
plan is over 6500x slower), yet Postgres decides to stick with the generic
plan. Very unexpected for me: I was very happy with the first 5 plans, yet
Postgres decides to change the plan for another that's enormously slower and
stick with it.
Passing a different parameter to the prepared statement (eg `now() -
interval '5 days'`) does give a "slow" custom plan (similar to the generic
plan). This means that the query planner does not realise that the actual
parameter value matters a lot, and that the parameters used _in practice_
result in a faster plan than the generic plan (100% of the first 5
executions), and that therefore it shouldn't stick to the generic plan.

It is particularly insidious as actually I wasn't even aware I was using
prepared statements. Like most applications I use a database driver (pgx, in
Go) which I learnt uses `PQexecPrepared` under the hood, which creates a
sort of "unnamed prepared statement" behaving the same as this minimal
reproduction without me ever being aware that prepared statements are
involved anywhere between my code and the database. This makes debugging
very complex as there's no reason to suspect anything
prepared-statement-related and a manual EXPLAIN ANALYZE outside of a
prepared statement won't show the problem.

Note: setting `plan_cache_mode = force_custom_plan` database-wide solved the
immediate problem but is a workaround. It was a very welcome workaround,
though.
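
For reference, the workaround can be applied per-session or database-wide with something like this (the database name is a placeholder):

SET plan_cache_mode = force_custom_plan;                     -- current session only
ALTER DATABASE mydb SET plan_cache_mode = force_custom_plan; -- database-wide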


On Wed, Jul 6, 2022 at 2:41 PM PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:

Bug reference:      17540
Logged by:          William Duclot
Email address:      william.duclot@gmail.com
PostgreSQL version: 14.4
Operating system:   GNU/Linux (Red Hat 8.5.0)
Description:       
 
This means that the query planner does not realise that the actual
parameter value matters a lot, and that the parameters used _in practice_
result in a faster plan than the generic plan (100% of the first 5
executions), and that therefore it shouldn't stick to the generic plan.

I mean, it is the planner and so, no, it doesn't understand that the executor encountered an issue.


It is particularly insidious as actually I wasn't even aware I was using
prepared statements. Like most applications I use a database driver (pgx, in
Go) which I learnt uses `PQexecPrepared` under the hood, which creates a
sort of "unnamed prepared statement" behaving the same as this minimal
reproduction without me ever being aware that prepared statements are
involved anywhere between my code and the database.

Yep, and the core project pretty much says that if you don't like this you need to complain to the driver writer and ask them to provide you an interface to the unnamed parse-bind-execute API which lets you perform parameterization without memory, just safety.

PostgreSQL has built the needed tools to make this less problematic, and has made solid attempts to improve matters in the current state of things.  There doesn't seem to be a bug here.  There is potentially room for improvement but no one presently is working on things in this area.

David J.

On Thu, 7 Jul 2022 at 09:41, PG Bug reporting form
<noreply@postgresql.org> wrote:
> Scenario:
> - I create a fairly simple table (id + timestamp). Timestamp is indexed.
> - I create a simple-ish prepared statement for `SELECT MIN(id), MAX(id) from
> relation_tuple_transaction WHERE timestamp >= $1;`
> - I execute the prepared statement multiple times (> 5 times)
>
> From the 6th time onwards, the query plan used by Postgres changes, which
> isn't fully unexpected as the documentation linked above does make it clear
> that Postgres might decide to change the query plan for a generic query plan
> after the 5th execution. And indeed, the estimated "cost" of the generic
> plan is lower than the custom plan's: therefore the query planner behaves
> correctly according to the documentation.

It's a pretty narrow fix for a fairly generic problem, but I think the
planner wouldn't have picked the pk_rttx index if build_minmax_path()
hadn't added the "id IS NOT NULL" qual.
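
For anyone following along, preprocess_minmax_aggregates() / build_minmax_path() effectively plan each aggregate as an ordered LIMIT 1 scan, so the query behaves roughly like this hand-written equivalent (sketch only):

-- MIN(id); MAX(id) is the same but with ORDER BY id DESC
SELECT id
FROM relation_tuple_transaction
WHERE id IS NOT NULL            -- the qual added by build_minmax_path()
  AND "timestamp" >= $1
ORDER BY id ASC
LIMIT 1;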

I know that Andy Fan has been proposing a patch [1] to add a Bitmapset
field to RelOptInfo to record the non-NULLable columns. That's a
fairly lightweight patch, so it might be worth adding that just so
build_minmax_path() can skip adding the NULL test if the column is a
NOT NULL column.

I see that preprocess_minmax_aggregates() won't touch anything that's
not a query to a single relation, so the Var can't be NULLable from
being on the outside of an outer join. So it looks like to plumb in
Andy's patch, build_minmax_path() would need to be modified to check
if mminfo->target is a plain Var and then test if that Var is NOT
NULLable then skip adding the NullTest.

All seems fairly trivial. It's just a fairly narrow fix to side-step a
more generic costing problem we have for Params.  I just don't have
any bright ideas on how to fix the more generic problem right now.

I've been looking for a good excuse to commit Andy's NOT NULL patch so
that he has some more foundations for the other work he's doing.  This
might be that excuse.

Does anyone think differently?

David

[1] https://www.postgresql.org/message-id/CAKU4AWoZrFaWAkTn9tE2_dd4RYnUiQUiX8xc=ryUywhBWQv89w@mail.gmail.com



Hi,


On 2022-07-06 15:07:46 -0700, David G. Johnston wrote:
> On Wed, Jul 6, 2022 at 2:41 PM PG Bug reporting form <noreply@postgresql.org>
> wrote:
> > It is particularly insidious as actually I wasn't even aware I was using
> > prepared statements. Like most applications I use a database driver (pgx,
> > in
> > Go) which I learnt uses `PQexecPrepared` under the hood, which creates a
> > sort of "unnamed prepared statement" behaving the same as this minimal
> > reproduction without me ever being aware that prepared statements are
> > involved anywhere between my code and the database.
>
>
> Yep, and the core project pretty much says that if you don't like this you
> need to complain to the driver writer and ask them to provide you an
> interface to the unnamed parse-bind-execute API which lets you perform
> parameterization without memory, just safety.
>
> PostgreSQL has built the needed tools to make this less problematic, and
> has made solid attempts to improve matters in the current state of things.
> There doesn't seem to be a bug here.  There is potentially room for
> improvement but no one presently is working on things in this area.

I think the cost for the slow plan being so much cheaper can almost be
qualified as a bug.

The slow plan seems pretty nonsensical to me. ISTM that something in the
costing there is at least almost broken.


Result  (cost=1.06..1.07 rows=1 width=16) (actual time=148.732..148.734 rows=1 loops=1)
  Buffers: shared hit=4935
  InitPlan 1 (returns $0)
    ->  Limit  (cost=0.42..0.53 rows=1 width=8) (actual time=73.859..73.860 rows=0 loops=1)
          Buffers: shared hit=2113
          ->  Index Scan using pk_rttx on relation_tuple_transaction  (cost=0.42..9445.44 rows=86400 width=8) (actual time=73.857..73.858 rows=0 loops=1)
                Index Cond: (id IS NOT NULL)
                Filter: ("timestamp" >= $1)
                Rows Removed by Filter: 259201
                Buffers: shared hit=2113
  InitPlan 2 (returns $1)
    ->  Limit  (cost=0.42..0.53 rows=1 width=8) (actual time=74.869..74.870 rows=0 loops=1)
          Buffers: shared hit=2822
          ->  Index Scan Backward using pk_rttx on relation_tuple_transaction relation_tuple_transaction_1  (cost=0.42..9445.44 rows=86400 width=8) (actual time=74.868..74.868 rows=0 loops=1)
                Index Cond: (id IS NOT NULL)
                Filter: ("timestamp" >= $1)
                Rows Removed by Filter: 259201
                Buffers: shared hit=2822
Planning Time: 0.224 ms
Execution Time: 148.781 ms

The planner assumes the table has 259201 rows. Somehow we end up
assuming that an estimate-less filter reduces the number of rows to 86400
both on a backward and a forward scan.

And for some reason we don't take the filter clause into account *at
all* for the cost of returning the first row.

SET enable_seqscan = false;
EXPLAIN SELECT * FROM relation_tuple_transaction WHERE id IS NOT NULL LIMIT 1;
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                               QUERY PLAN                                                │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Limit  (cost=0.42..0.45 rows=1 width=16)                                                                │
│   ->  Index Scan using pk_rttx on relation_tuple_transaction  (cost=0.42..8797.44 rows=259201 width=16) │
│         Index Cond: (id IS NOT NULL)                                                                    │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(3 rows)

It's also pointless that we use "Index Cond: (id IS NOT NULL)" for a
primary key index, but that's a minor thing.

Greetings,

Andres Freund



On Thu, 7 Jul 2022 at 12:46, Andres Freund <andres@anarazel.de> wrote:
> I think the cost for the slow plan being so much cheaper can almost be
> qualified as a bug.
>
> The slow plan seems pretty nonsensical to me. ISTM that something in the
> costing there is at least almost broken.

I forgot to mention what the "generic problem" is when I posted my
reply.  I should have mentioned that this is how we cost LIMIT. We
assume that we'll find the LIMIT 1 row after incurring the scan cost
multiplied by (1 / 259201).

For the plan with WHERE timestamp >= $1, the seqscan plan looks pretty
cheap for fetching DEFAULT_INEQ_SEL of the 259201 rows considering the
LIMIT multiplies the cost of the scan by (1 / 86400).
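
To spell that out with the numbers from the generic plan upthread (back-of-envelope only, not the exact cost_limit() arithmetic):

-- startup cost + (total - startup) * fraction of rows the LIMIT expects to read
SELECT 0.42 + (9445.44 - 0.42) * (1.0 / 86400);   -- ≈ 0.53, the Limit cost shown in the plan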

David



David Rowley <dgrowleyml@gmail.com> writes:
> I've been looking for a good excuse to commit Andy's NOT NULL patch so
> that he has some more foundations for the other work he's doing.  This
> might be that excuse.

> Does anyone think differently?

While I don't have any problem with tracking column NOT NULL flags
in RelOptInfo once the planner has a use for that info, I'm not sure
that we have a solid use-case for it quite yet.  In particular, the
fact that the table column is marked NOT NULL doesn't mean that any
particular occurrence of that column's Var can be freely assumed to be
non-null.  The patch I'm working on to label Vars that have possibly
been nulled by outer joins [1] seems like essential infrastructure for
doing anything very useful with the info.

Maybe that objection doesn't apply to build_minmax_path's usage in
particular, but that's an awfully narrow use-case.

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/830269.1656693747@sss.pgh.pa.us



Andres Freund <andres@anarazel.de> writes:
> I think the cost for the slow plan being so much cheaper can almost be
> qualified as a bug.
> The slow plan seems pretty nonsensical to me. ISTM that something in the
> costing there is at least almost broken.

I think this is probably an instance of the known problem that a generic
plan is made without knowledge of the actual parameter values, and that
can lead us to make statistical assumptions that are not valid for the
actual values, but nonetheless make one plan look cheaper than another
even though the opposite is true given the actual values.  In essence,
comparing the cost estimate for the generic plan to the cost estimate
for a custom plan is not really logically valid, because those estimates
are founded on different statistics.  I don't know how to fix that :-(.

            regards, tom lane



On Thu, 7 Jul 2022 at 15:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> While I don't have any problem with tracking column NOT NULL flags
> in RelOptInfo once the planner has a use for that info, I'm not sure
> that we have a solid use-case for it quite yet.  In particular, the
> fact that the table column is marked NOT NULL doesn't mean that any
> particular occurrence of that column's Var can be freely assumed to be
> non-null.  The patch I'm working on to label Vars that have possibly
> been nulled by outer joins [1] seems like essential infrastructure for
> doing anything very useful with the info.

I was aware that you'd done that work. I'm interested in it, but just
not found the time to look yet.

> Maybe that objection doesn't apply to build_minmax_path's usage in
> particular, but that's an awfully narrow use-case.

I thought I'd quickly put the idea together and fairly quickly noticed
that we do preprocess_minmax_aggregates() in grouping_planner(), which
is long before we load the RelOptInfo data in
add_base_rels_to_query(), which is called in query_planner(). I
considered if we could move the preprocess_minmax_aggregates(), but
that does not seem right, although, surprisingly, no tests seem to
fail from doing so. I'd have expected at least some EXPLAIN outputs to
have changed from the no-longer-present IS NOT NULL quals.

I imagine a much less narrow case would be to check for redundant
RestrictInfos in distribute_restrictinfo_to_rels().  That would also
catch cases such as WHERE non_nullable_col IS NULL, provided that qual
made it down to baserestrictinfo.  When I realised that, I thought I
might be starting to overlap with your work in the link below.

> [1] https://www.postgresql.org/message-id/flat/830269.1656693747@sss.pgh.pa.us

The 2 attached patches do fix the reported bad plan; it's just that
it's a very roundabout way of fixing it.

Anyway, I've no current plans to take the attached any further. I
think it'll be better to pursue your NULLable-Var stuff and see if we
can do something more generic like remove provably redundant NullTests
from baserestrictinfo.

David

Attachment
David Rowley <dgrowleyml@gmail.com> writes:
> Anyway, I've no current plans to take the attached any further. I
> think it'll be better to pursue your NULLable-Var stuff and see if we
> can do something more generic like remove provably redundant NullTests
> from baserestrictinfo.

Yeah, I suspect that the way forward is to allow
preprocess_minmax_aggregates to do what it does now, and then
remove the IS NOT NULL clause again later when we have the
info available to let us do that in a generic way.

In any case, as you said, it's just a band-aid that happens to
help in this exact scenario.  It's not doing much for the bad
cost estimation that's the root of the problem.

            regards, tom lane



Hi,

On 2022-07-06 23:13:18 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I think the cost for the slow plan being so much cheaper can almost be
> > qualified as a bug.
> > The slow plan seems pretty nonsensical to me. ISTM that something in the
> > costing there is at least almost broken.
>
> I think this is probably an instance of the known problem that a generic
> plan is made without knowledge of the actual parameter values, and that
> can lead us to make statistical assumptions that are not valid for the
> actual values, but nonetheless make one plan look cheaper than another
> even though the opposite is true given the actual values.  In essence,
> comparing the cost estimate for the generic plan to the cost estimate
> for a custom plan is not really logically valid, because those estimates
> are founded on different statistics.  I don't know how to fix that :-(.

I think there's something more fundamentally wrong - somehow we end up
assuming > 50% selectivity on both the min and the max initplan, for the same
condition!  And afaics (although it's a bit hard to see with the precision
explain prints floating point values as) we don't charge cpu_operator_cost /
cpu_tuple_cost. And this is on a table where we can know, despite not knowing the
parameter value, that the column being compared has a correlation of 1.
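
(For reference, the single-column ordering statistic the planner does have is visible in pg_stats, e.g.:)

SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'relation_tuple_transaction'
  AND attname IN ('id', 'timestamp');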

In this case the whole generic plan part seems like a red herring. The generic
plan is *awful* and would still be awful if the value were known, but
somewhere around the middle of the value range.


Here's the OP's tables + query, but without the prepared statement part:

CREATE TABLE relation_tuple_transaction (
    id BIGSERIAL NOT NULL UNIQUE,
    timestamp TIMESTAMP WITHOUT TIME ZONE NOT NULL UNIQUE,
    CONSTRAINT pk_rttx PRIMARY KEY (id)
);
CREATE INDEX ix_relation_tuple_transaction_by_timestamp on relation_tuple_transaction(timestamp);
INSERT INTO relation_tuple_transaction(timestamp) SELECT * FROM generate_series
        ( now() - interval '3 days'
        , now()
        , '1 second'::interval) dd
        ;
vacuum freeze analyze;
EXPLAIN ANALYZE SELECT MIN(id), MAX(id) from relation_tuple_transaction WHERE timestamp >= (now() - interval '1.5 days');

postgres[631148][1]=# EXPLAIN ANALYZE SELECT MIN(id), MAX(id) from relation_tuple_transaction WHERE timestamp >= (now() - interval '1.5 days');

Result  (cost=1.01..1.02 rows=1 width=16) (actual time=113.379..113.381 rows=1 loops=1)
  InitPlan 1 (returns $0)
    ->  Limit  (cost=0.42..0.50 rows=1 width=8) (actual time=113.347..113.348 rows=1 loops=1)
          ->  Index Scan using pk_rttx on relation_tuple_transaction  (cost=0.42..10741.45 rows=127009 width=8) (actual time=113.345..113.345 rows=1 loops=1)
                Index Cond: (id IS NOT NULL)
                Filter: ("timestamp" >= (now() - '1 day 12:00:00'::interval))
                Rows Removed by Filter: 129746
  InitPlan 2 (returns $1)
    ->  Limit  (cost=0.42..0.50 rows=1 width=8) (actual time=0.024..0.024 rows=1 loops=1)
          ->  Index Scan Backward using pk_rttx on relation_tuple_transaction relation_tuple_transaction_1  (cost=0.42..10741.45 rows=127009 width=8) (actual time=0.023..0.023 rows=1 loops=1)
                Index Cond: (id IS NOT NULL)
                Filter: ("timestamp" >= (now() - '1 day 12:00:00'::interval))
Planning Time: 0.370 ms
Execution Time: 113.441 ms
(14 rows)

We're pretty much by definition scanning half the table via the index scans,
and end up with a cost of 1.02 (yes, I'm aware that the paths are costed
separately).


FWIW, manually writing the min/max as ORDER BY timestamp ASC/DESC LIMIT 1
queries yields a *vastly* better plan:

EXPLAIN ANALYZE SELECT
  (SELECT id FROM relation_tuple_transaction WHERE timestamp >= (now() - interval '1.5 days') ORDER BY timestamp ASC LIMIT 1),
  (SELECT id FROM relation_tuple_transaction WHERE timestamp >= (now() - interval '1.5 days') ORDER BY timestamp DESC LIMIT 1);

Result  (cost=0.92..0.93 rows=1 width=16) (actual time=0.110..0.111 rows=1 loops=1)
  InitPlan 1 (returns $0)
    ->  Limit  (cost=0.42..0.46 rows=1 width=16) (actual time=0.079..0.079 rows=1 loops=1)
          ->  Index Scan using ix_relation_tuple_transaction_by_timestamp on relation_tuple_transaction  (cost=0.42..4405.46 rows=129602 width=16) (actual time=0.077..0.078 rows=1 loops=1)
                Index Cond: ("timestamp" >= (now() - '1 day 12:00:00'::interval))
  InitPlan 2 (returns $1)
    ->  Limit  (cost=0.42..0.46 rows=1 width=16) (actual time=0.028..0.028 rows=1 loops=1)
          ->  Index Scan Backward using ix_relation_tuple_transaction_by_timestamp on relation_tuple_transaction relation_tuple_transaction_1  (cost=0.42..4405.46 rows=129602 width=16) (actual time=0.027..0.027 rows=1 loops=1)
                Index Cond: ("timestamp" >= (now() - '1 day 12:00:00'::interval))
Planning Time: 0.270 ms
Execution Time: 0.159 ms
(11 rows)

And it stays sane even if you add a (redundantly evaluated) AND id IS NOT NULL.


EXPLAIN SELECT id FROM relation_tuple_transaction WHERE timestamp >= (now() - interval '1.5 days') AND id IS NOT NULL ORDER BY timestamp ASC LIMIT 1;
QUERY PLAN
Limit  (cost=0.42..0.46 rows=1 width=16)
  ->  Index Scan using ix_relation_tuple_transaction_by_timestamp on relation_tuple_transaction  (cost=0.42..4405.46 rows=129602 width=16)
        Index Cond: ("timestamp" >= (now() - '1 day 12:00:00'::interval))
        Filter: (id IS NOT NULL)
(4 rows)


EXPLAIN SELECT min(id) FROM relation_tuple_transaction WHERE timestamp >= (now() - interval '1.5 days');
QUERY PLAN
Result  (cost=0.50..0.51 rows=1 width=8)
  InitPlan 1 (returns $0)
    ->  Limit  (cost=0.42..0.50 rows=1 width=8)
          ->  Index Scan using pk_rttx on relation_tuple_transaction  (cost=0.42..10741.45 rows=129602 width=8)
                Index Cond: (id IS NOT NULL)
                Filter: ("timestamp" >= (now() - '1 day 12:00:00'::interval))
(6 rows)



Greetings,

Andres Freund



Andres Freund <andres@anarazel.de> writes:
> On 2022-07-06 23:13:18 -0400, Tom Lane wrote:
>> comparing the cost estimate for the generic plan to the cost estimate
>> for a custom plan is not really logically valid, because those estimates
>> are founded on different statistics.  I don't know how to fix that :-(.

> I think there's something more fundamentally wrong - somehow we end up with
> assuming > 50% selectivity on both the min and the max initplan, for the same
> condition!

Well, sure, because it *is* the same condition.  AFAICS this is operating
as designed.  Do I wish it were better?  Sure, but there is no simple fix
here.

The reasoning that's being applied in the generic plan is

(1) default selectivity estimate for a scalar inequality is
#define DEFAULT_INEQ_SEL  0.3333333333333333

(2) therefore, the filter condition on the indexscan will select a random
one-third of the table;

(3) therefore, the LIMIT will be able to stop after about three rows,
whichever direction we scan in (see the arithmetic sketched below).
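
In round numbers, using the estimates from the plans upthread (back-of-envelope only):

SELECT 259201 * 0.3333333333333333 AS assumed_matching_rows,  -- ≈ 86400, the Filter row estimate
       259201 / 86400.0            AS rows_scanned_per_match; -- ≈ 3, hence "about three rows"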

The information that is lacking is that the "id" and "timestamp"
columns are heavily correlated, so that we may have to scan far more
than three rows in "id" order before finding a row satisfying the
inequality on "timestamp".  This is a problem we've understood for
a long time --- I recall talking about it at PGCon a decade ago.

The extended stats machinery provides a framework wherein we could
calculate and save the ordering correlation between the two columns,
but I don't believe it actually calculates that number yet --- I think
the functional-dependency stuff is close but not the right thing.
Even if we had the stats, it's not very clear where to fit this
type of consideration into the planner's estimates.

> In this case the whole generic plan part seems like a red herring. The generic
> plan is *awful* and would still be awful if the value were known, but
> somewhere around the middle of the value range.

If the value were somewhere around the middle (which is more or less
what we're assuming for the generic plan), then an indexscan on the
timestamp column isn't going to be that great either; you'd still
be scanning half the table.

> FWIW, manually writing the min/max as ORDER BY timestamp ASC/DESC LIMIT 1
> queries yields a *vastly* better plan:

Those queries give the wrong answers.  We're looking for the min or max
id, not the id associated with the min or max timestamp.  (They're
accidentally the same with this toy dataset.)

            regards, tom lane



On Thu, 7 Jul 2022 at 15:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > Anyway, I've no current plans to take the attached any further. I
> > think it'll be better to pursue your NULLable-Var stuff and see if we
> > can do something more generic like remove provably redundant NullTests
> > from baserestrictinfo.
>
> Yeah, I suspect that the way forward is to allow
> preprocess_minmax_aggregates to do what it does now, and then
> remove the IS NOT NULL clause again later when we have the
> info available to let us do that in a generic way.

I started looking at a more generic way to fix this.  In the attached
I'm catching quals being added to baserestrictinfo in
distribute_restrictinfo_to_rels() and checking for IS NOT NULL quals
on columns defined with NOT NULL.
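
In other words, the kind of case being targeted looks like this (illustration only, with a made-up table):

CREATE TABLE t (id bigint NOT NULL, payload text);

-- "id" is declared NOT NULL, so this baserestrictinfo qual is always true
-- and can simply be ignored instead of being costed and evaluated
SELECT * FROM t WHERE id IS NOT NULL;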

I did this by adding a new function add_baserestrictinfo_to_rel()
which can be the place where we add any future logic to ignore other
always-true quals. Perhaps in the future, we can add some logic there
to look for quals on partitions which are always true based on the
partition constraint.

I also took the opportunity here to slightly modernise the Bitmapset
code in this area. We previously called bms_membership() and then
bms_singleton_member(), which is not quite optimal. We invented
bms_get_singleton_member() as a more efficient way of getting that.
The empty set case can just be handled more easily now since you
changed empty sets to always be NULL. If it's not an empty set and not
a singleton, then it must contain multiple members.

I'm quite keen to see some forward progress on improving things for
this bug report. It would be good to take some more measures to stop
the planner being tricked into making silly mistakes. This is one
example of somewhere we could do better.

David

Attachment

On Thu, Jul 6, 2023 at 7:55 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Thu, 7 Jul 2022 at 15:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> David Rowley <dgrowleyml@gmail.com> writes:
> > Anyway, I've no current plans to take the attached any further. I
> > think it'll be better to pursue your NULLable-Var stuff and see if we
> > can do something more generic like remove provably redundant NullTests
> > from baserestrictinfo.
>
> Yeah, I suspect that the way forward is to allow
> preprocess_minmax_aggregates to do what it does now, and then
> remove the IS NOT NULL clause again later when we have the
> info available to let us do that in a generic way.

I started looking at a more generic way to fix this.  In the attached
I'm catching quals being added to baserestrictinfo in
distribute_restrictinfo_to_rels() and checking for IS NOT NULL quals
on columns defined with NOT NULL.

I did this by adding a new function add_baserestrictinfo_to_rel()
which can be the place where we add any future logic to ignore other
always-true quals. Perhaps in the future, we can add some logic there
to look for quals on partitions which are always true based on the
partition constraint.

I think this is a good start.  Maybe we can extend it with little effort
to cover OR clauses.  For an OR clause, we can test its sub-clauses and
if one of them is an IS NOT NULL qual on a NOT NULL column then we can know
that the OR clause is always true.
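
For example (illustration only, assuming a table "t" with a NOT NULL column "a" and a nullable column "b"):

-- one arm of the OR is "a IS NOT NULL" on a NOT NULL column, so the whole
-- OR clause is always true and could be dropped
SELECT * FROM t WHERE b = 1 OR a IS NOT NULL;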

Maybe we can also test if the qual is always true according to the
applicable constraint expressions of the given relation, something that
is like the opposite of relation_excluded_by_constraints().  Of course
that would require much more efforts.

Another thing I'm wondering is that since we already have the
outer-join-aware-Var infrastructure, maybe we can also test whether an IS
NOT NULL qual in join clauses is always true.  I imagine we need to test
whether the Var in the IS NOT NULL qual has an empty varnullingrels
besides that the Var is a NOT NULL column.
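
For example (again just an illustration, assuming t1.a is declared NOT NULL):

-- t1 is not on the nullable side of any outer join here, so t1.a has an empty
-- varnullingrels and the "t1.a IS NOT NULL" join qual is always true; if t1.a
-- could have been nulled by a lower outer join, it would not be removable
SELECT * FROM t1 LEFT JOIN t2 ON t1.a = t2.b AND t1.a IS NOT NULL;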

BTW, with this patch the variable ‘rel’ in function
distribute_restrictinfo_to_rels is unused.

Thanks
Richard

On Thu, Jul 6, 2023 at 5:26 PM Richard Guo <guofenglinux@gmail.com> wrote:
On Thu, Jul 6, 2023 at 7:55 AM David Rowley <dgrowleyml@gmail.com> wrote:
I started looking at a more generic way to fix this.  In the attached
I'm catching quals being added to baserestrictinfo in
distribute_restrictinfo_to_rels() and checking for IS NOT NULL quals
on columns defined with NOT NULL.

I did this by adding a new function add_baserestrictinfo_to_rel()
which can be the place where we add any future logic to ignore other
always-true quals. Perhaps in the future, we can add some logic there
to look for quals on partitions which are always true based on the
partition constraint.

I think this is a good start.  Maybe we can extend it with little effort
to cover OR clauses.  For an OR clause, we can test its sub-clauses and
if one of them is IS NOT NULL qual on a NOT NULL column then we can know
that the OR clause is always true.

Maybe we can also test if the qual is always true according to the
applicable constraint expressions of the given relation, something that
is like the opposite of relation_excluded_by_constraints().  Of course
that would require much more efforts.

Another thing I'm wondering is that since we already have the
outer-join-aware-Var infrastructure, maybe we can also test whether a IS
NOT NULL qual in join clauses is always true.  I imagine we need to test
whether the Var in the IS NOT NULL qual has an empty varnullingrels
besides that the Var is a NOT NULL column.

BTW, with this patch the variable ‘rel’ in function
distribute_restrictinfo_to_rels is unused.

Attached is what I have in mind.  The patch extends the logic from two
points.

* it also checks OR clauses to see if it is always true.

* it also checks for join clauses by additionally testing if the nulling
bitmap is empty.

I did not try the logic about testing a qual against the relation's
constraints though.

Thanks
Richard
Attachment
On Fri, 7 Jul 2023 at 19:03, Richard Guo <guofenglinux@gmail.com> wrote:
> Attached is what I have in mind.  The patch extends the logic from two
> points.
>
> * it also checks OR clauses to see if it is always true.
>
> * it also checks for join clauses by additionally testing if the nulling
> bitmap is empty.

Do you mind writing some regression tests for this?

I don't really see an existing test file that would suit, maybe it's
worth adding something like predicate.sql

David




On Mon, Jul 10, 2023 at 10:14 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Fri, 7 Jul 2023 at 19:03, Richard Guo <guofenglinux@gmail.com> wrote:
> Attached is what I have in mind.  The patch extends the logic from two
> points.
>
> * it also checks OR clauses to see if it is always true.
>
> * it also checks for join clauses by additionally testing if the nulling
> bitmap is empty.

Do you mind writing some regression tests for this?

I don't really see an existing test file that would suit, maybe it's
worth adding something like predicate.sql

Here is the v3 patch with regression tests.  I added the new test into the
group where the stats test is, but I'm not sure if this is the right
place.

Thanks
Richard
Attachment

On Mon, Jul 10, 2023 at 2:39 PM Richard Guo <guofenglinux@gmail.com> wrote:
Here is v3 patch with regression tests.  I add the new test into the
group where stats test is in, but I'm not sure if this is the right
place.

cfbot says there is a test failure in postgres_fdw.  So I've updated to v4 to
fix that.

Thanks
Richard
Attachment
On Mon, 10 Jul 2023 at 18:39, Richard Guo <guofenglinux@gmail.com> wrote:
> Here is v3 patch with regression tests.  I add the new test into the
> group where stats test is in, but I'm not sure if this is the right
> place.

Thanks for taking an interest in this.

I spent more time looking at the idea and I wondered why we should
just have it skip distributing IS NOT NULL quals to the relations.
Should we also allow IS NULL quals on non-nullable Vars to be
detected as false?
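
For example (sketch, assuming column "a" is declared NOT NULL):

-- this qual can never be true, so ideally the scan could be reduced to a
-- constant-false / provably empty result
SELECT * FROM t WHERE a IS NULL;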

I did some work on your v3 patch to see if that could be made to work.
I ended up just trying to make a new RestrictInfo with a "false"
clause, but quickly realised that it's not safe to go making new
RestrictInfos during deconstruct_distribute_oj_quals().  A comment
there mentions:

/*
* Each time we produce RestrictInfo(s) from these quals, reset the
* last_rinfo_serial counter, so that the RestrictInfos for the "same"
* qual condition get identical serial numbers.  (This relies on the
* fact that we're not changing the qual list in any way that'd affect
* the number of RestrictInfos built from it.) This'll allow us to
* detect duplicative qual usage later.
*/

I ended up moving the function that looks for the NullTest quals in
the joinlist out so it's done after the quals have been distributed to
the relations.  I'm not really that happy with this as if we ever
found some way to optimise quals that could be made part of an
EquivalenceClass then those quals would already have been
processed to become EquivalenceClasses. I just don't see how to do it
earlier as deconstruct_distribute_oj_quals() calls
remove_nulling_relids() which changes the Var's varnullingrels causing
them to be empty during the processing of the NullTest qual.

It's also not so great that the RestrictInfo gets duplicated in:

CREATE TABLE t1 (a INT NOT NULL, b INT);
CREATE TABLE t2 (c INT NOT NULL, d INT);
CREATE TABLE t3 (e INT NOT NULL, f INT);

postgres=# EXPLAIN (costs off) SELECT * FROM t1 JOIN t2 ON t1.a = 1
LEFT JOIN t3 ON t2.c IS NULL AND t2.d = 1;
                      QUERY PLAN
-------------------------------------------------------
 Nested Loop
   ->  Nested Loop Left Join
         Join Filter: (false AND false AND (t2.d = 1))
         ->  Seq Scan on t2
         ->  Result
               One-Time Filter: false
   ->  Materialize
         ->  Seq Scan on t1
               Filter: (a = 1)
(9 rows)

Adjusting the code to build a new false clause and setting that in the
existing RestrictInfo rather than building a new RestrictInfo seems to
fix that. I wondered if the duplication was a result of the
rinfo_serial number changing.

Checking back to the original MinMaxAgg I'm not sure if this is all
getting more complex than it's worth or not.

I've attached what I've ended up with so far.

David

Attachment

On Tue, Sep 26, 2023 at 9:42 PM David Rowley <dgrowleyml@gmail.com> wrote:
I ended up moving the function that looks for the NullTest quals in
the joinlist out so it's done after the quals have been distributed to
the relations. 

It seems that the RestrictInfos for the "same" qual condition still may
get different serial numbers even if transform_join_clauses() is called
after we've distributed all the quals.  For example,

select * from t1
    left join t2 on t1.a = t2.c
    left join t3 on t2.c = t3.e and t2.c is null;

There are two versions of the qual 't2.c is null': with and without being
marked nullable by the t1/t2 join.  Let's write them as 'c* is null' and 'c
is null'.  They are supposed to have identical serial numbers.  But after
we've transformed 'c is null' to 'false', they no longer have identical
serial numbers.  This may cause problems wherever the logic of
serial numbers is relied on.
 
I'm not really that happy with this as if we ever
found some way to optimise quals that could be made part of an
EquivalenceClass then those quals would have already have been
processed to become EquivalenceClasses. I just don't see how to do it
earlier as deconstruct_distribute_oj_quals() calls
remove_nulling_relids() which changes the Var's varnullingrels causing
them to be empty during the processing of the NullTest qual.

Hmm, I don't think it's a problem that deconstruct_distribute_oj_quals
changes the nullingrels.  It will eventually compute the correct nullingrels
for the different clones of the same qual condition.  We can just check
whatever nullingrels it computes.
 
It's also not so great that the RestrictInfo gets duplicated in:

Adjusting the code to build a new false clause and setting that in the
existing RestrictInfo rather than building a new RestrictInfo seems to
fix that. I wondered if the duplication was a result of the
rinfo_serial number changing.

The RestrictInfo nodes in different joinlists are multiply-linked rather
than copied, so when building the restrictlist for a joinrel we use pointer
equality to remove any duplication.  In your patch new RestrictInfo
nodes are created in transform_join_clauses(), so pointer equality no
longer works and we see duplication in the plan.
 
Checking back to the original MinMaxAgg I'm not sure if this is all
getting more complex than it's worth or not.

It seems that optimizing IS NULL quals is more complex than optimizing
IS NOT NULL quals.  I also wonder if it's worth the trouble to optimize
IS NULL quals.

BTW, there is an Assert failure running regression tests with your
patch.  I haven't looked into it.

Thanks
Richard
On Thu, 28 Sept 2023 at 16:22, Richard Guo <guofenglinux@gmail.com> wrote:
> It seems that optimizing IS NULL quals is more complex than optimizing
> IS NOT NULL quals.  I also wonder if it's worth the trouble to optimize
> IS NULL quals.

I'm happy to reduce the scope of this patch. As for what to cut, I
think if we're doing a subset then we should try to do that subset in
a way that best leaves things open for phase 2 at some later date.

In my view, it would be less surprising if this worked for base quals
and not join quals than if it worked with "Var IS NOT NULL" but not
"Var IS NULL".  I'm unsure if my view is clouded by the fact that I
don't have a clear picture in my head on how this should work for join
quals, however.

Would it be surprising if this didn't work for join quals?  My
thoughts are probably not any more so than the fact that extended
statistics only work for base quals and not join quals, but I'm sure
other people will have different views on that. I don't feel like we
should end up with exactly nothing committed from this patch solely
due to scope creep.

David




On Thu, Sep 28, 2023 at 11:51 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Thu, 28 Sept 2023 at 16:22, Richard Guo <guofenglinux@gmail.com> wrote:
> It seems that optimizing IS NULL quals is more complex than optimizing
> IS NOT NULL quals.  I also wonder if it's worth the trouble to optimize
> IS NULL quals.

I'm happy to reduce the scope of this patch. As for what to cut, I
think if we're doing a subset then we should try to do that subset in
a way that best leaves things open for phase 2 at some later date.

I had a go at supporting IS NULL quals and ended up with the attached.
The patch generates a new constant-FALSE RestrictInfo that is marked
with the same required_relids etc as the original one if it is an IS
NULL qual that can be reduced to FALSE.  Note that the original
rinfo_serial is also copied to the new RestrictInfo.

One thing that is not great is that we may have 'FALSE and otherquals'
in the final plan, as shown by the plan below which is from the new
added test case.

+explain (costs off)
+select * from pred_tab t1 left join pred_tab t2 on true left join pred_tab t3 on t2.a is null and t2.b = 1;
+                    QUERY PLAN
+---------------------------------------------------
+ Nested Loop Left Join
+   ->  Seq Scan on pred_tab t1
+   ->  Materialize
+         ->  Nested Loop Left Join
+               Join Filter: (false AND (t2.b = 1))
+               ->  Seq Scan on pred_tab t2
+               ->  Result
+                     One-Time Filter: false
+(8 rows)

Maybe we can artificially reduce it to 'FALSE', but I'm not sure if it's
worth the trouble.

Thanks
Richard
Attachment
On 8/10/2023 15:26, Richard Guo wrote:
Hi,
> On Thu, Sep 28, 2023 at 11:51 AM David Rowley <dgrowleyml@gmail.com> wrote:
> 
>     On Thu, 28 Sept 2023 at 16:22, Richard Guo <guofenglinux@gmail.com> wrote:
>      > It seems that optimizing IS NULL quals is more complex than
>     optimizing
>      > IS NOT NULL quals.  I also wonder if it's worth the trouble to
>     optimize
>      > IS NULL quals.
> 
>     I'm happy to reduce the scope of this patch. As for what to cut, I
>     think if we're doing a subset then we should try to do that subset in
>     a way that best leaves things open for phase 2 at some later date.
> 
> 
> I had a go at supporting IS NULL quals and ended up with the attached.
> The patch generates a new constant-FALSE RestrictInfo that is marked
> with the same required_relids etc as the original one if it is an IS
> NULL qual that can be reduced to FALSE.  Note that the original
> rinfo_serial is also copied to the new RestrictInfo.
> 
> One thing that is not great is that we may have 'FALSE and otherquals'
> in the final plan, as shown by the plan below which is from the new
> added test case.

Setting aside the thread's subject, I am interested in this feature 
because of its connection with the SJE feature and the same issue raised 
[1] during the discussion.
The attachment is a rebased version of your patch (needed because of
5d8aa8bced).
Although the patch is already in a good state, some improvements can be 
made. Look:
explain (costs off)
SELECT oid,relname FROM pg_class
WHERE oid < 5 OR (oid = 1 AND oid IS NULL);

  Bitmap Heap Scan on pg_class
    Recheck Cond: ((oid < '5'::oid) OR ((oid = '1'::oid) AND (oid IS NULL)))
    ->  BitmapOr
          ->  Bitmap Index Scan on pg_class_oid_index
                Index Cond: (oid < '5'::oid)
          ->  Bitmap Index Scan on pg_class_oid_index
                Index Cond: ((oid = '1'::oid) AND (oid IS NULL))

If we went deeper through the filter, I guess we could replace such buried
clauses as well.

[1] Removing unneeded self joins
https://www.postgresql.org/message-id/CAPpHfdt-0kVV7O%3D%3DaJEbjY2iGYBu%2BXBzTHEbPv_6sVNeC7fffQ%40mail.gmail.com

-- 
regards,
Andrei Lepikhov
Postgres Professional

Attachment

On Tue, Oct 24, 2023 at 12:25 PM Andrei Lepikhov <a.lepikhov@postgrespro.ru> wrote:
Setting aside the thread's subject, I am interested in this feature
because of its connection with the SJE feature and the same issue raised
[1] during the discussion.

Thanks for taking an interest in this.

I rebased this patch over the SJE commit, and found that it can help
discard redundant IS_NOT_NULL quals added by SJE logic if we've
successfully removed some self-joins on primary keys, as shown by the
regression test plan changes, which IMO makes this patch look more
useful in practice.
 
Although the patch is already in a good state, some improvements can be
made. Look:
explain (costs off)
SELECT oid,relname FROM pg_class
WHERE oid < 5 OR (oid = 1 AND oid IS NULL);

  Bitmap Heap Scan on pg_class
    Recheck Cond: ((oid < '5'::oid) OR ((oid = '1'::oid) AND (oid IS NULL)))
    ->  BitmapOr
          ->  Bitmap Index Scan on pg_class_oid_index
                Index Cond: (oid < '5'::oid)
          ->  Bitmap Index Scan on pg_class_oid_index
                Index Cond: ((oid = '1'::oid) AND (oid IS NULL))

If we go deeply through the filter, I guess we could replace such buried
clauses.

Yeah, we can do that by exploring harder on OR clauses.  But for now I
think it's more important for this patch to introduce the
'reduce-quals-to-constant' mechanism.  As a start I think it'd be better
to keep the logic simple for review.  In the future maybe we can extend
it to consider more than just NullTest quals, for example we could also
consider applicable constraint expressions of the given relation.

Thanks
Richard
Attachment
On Wed, 1 Nov 2023 at 15:21, Richard Guo <guofenglinux@gmail.com> wrote:
> I rebased this patch over the SJE commit

I rebased your v7 patch on top of 930d2b442 and updated the expected
results of some new regression tests which now have their NullTest
clauses removed.

I also renamed add_baserestrictinfo_to_rel() to
add_base_clause_to_rel() so that it's more aligned to
add_join_clause_to_rels().

On looking deeper, I see you're overwriting the rinfo_serial of the
const-false RestrictInfo with the one from the original RestrictInfo.
If that's the correct thing to do then the following comment would
need to be updated to mention this exception of why the rinfo_serial
isn't unique.

/*----------
* Serial number of this RestrictInfo.  This is unique within the current
* PlannerInfo context, with a few critical exceptions:
* 1. When we generate multiple clones of the same qual condition to
* cope with outer join identity 3, all the clones get the same serial
* number.  This reflects that we only want to apply one of them in any
* given plan.
* 2. If we manufacture a commuted version of a qual to use as an index
* condition, it copies the original's rinfo_serial, since it is in
* practice the same condition.
* 3. RestrictInfos made for a child relation copy their parent's
* rinfo_serial.  Likewise, when an EquivalenceClass makes a derived
* equality clause for a child relation, it copies the rinfo_serial of
* the matching equality clause for the parent.  This allows detection
* of redundant pushed-down equality clauses.
*----------
*/

Looking at the tests, I see:

select * from pred_tab t1 left join pred_tab t2 on true left join
pred_tab t3 on t2.a is null;

I'm wondering if you can come up with a better test for this? I don't
quite see any reason why varnullingrels can't be empty for t2.a in the
join qual as the "ON true" join condition between t1 and t2 means that
there shouldn't ever be any NULL t2.a rows.  My thoughts are that if
we improve how varnullingrels are set in the future then this test
will be broken.

Also, I like to write exactly what each test is testing so that
it's easier in the future to maintain the expected results.  It's
often tricky when making planner changes to know if some planner
changes makes a test completely useless or if the expected results
just need to be updated.  If someone changes varnullingrels to be
empty for this case, then if they accept the actual results as
expected results then the test becomes useless.  I tend to do this
with comments in the .sql file along the lines of "-- Ensure ..."

I would also rather see the SQLs in the test wrap their lines before
each join and the keywords be upper case.

David

Attachment

On Wed, Nov 29, 2023 at 8:48 AM David Rowley <dgrowleyml@gmail.com> wrote:
I rebased your v7 patch on top of 930d2b442 and updated the expected
results of some new regression tests which now have their NullTest
clauses removed.

Thanks for your rebase.
 
On looking deeper, I see you're overwriting the rinfo_serial of the
const-false RestrictInfo with the one from the original RestrictInfo.
If that's the correct thing to do then the following comment would
need to be updated to mention this exception of why the rinfo_serial
isn't unique.

Right, that's what we need to do.
 
Looking at the tests, I see:

select * from pred_tab t1 left join pred_tab t2 on true left join
pred_tab t3 on t2.a is null;

I'm wondering if you can come up with a better test for this? I don't
quite see any reason why varnullingrels can't be empty for t2.a in the
join qual as the "ON true" join condition between t1 and t2 means that
there shouldn't ever be any NULL t2.a rows.  My thoughts are that if
we improve how varnullingrels are set in the future then this test
will be broken.

Also, I also like to write exactly what each test is testing so that
it's easier in the future to maintain the expected results.  It's
often tricky when making planner changes to know if some planner
changes makes a test completely useless or if the expected results
just need to be updated.  If someone changes varnullingrels to be
empty for this case, then if they accept the actual results as
expected results then the test becomes useless.  I tend to do this
with comments in the .sql file along the lines of "-- Ensure ..."

I also would rather see the SQLs in the test wrap their lines before
each join and the keywords to be upper case.

Thanks for the suggestions on the tests.  I had a go at improving the
test queries and their comments.

BTW, I changed the subject of this patch to 'Reduce NullTest quals to
constant TRUE or FALSE', which seems more accurate to me, because this
patch also reduces IS NULL clauses to constant-FALSE when applicable, in
addition to ignoring redundant NOT NULL clauses.

Thanks
Richard
Attachment