Thread: Fix incorrect start up costs for WindowAgg paths (bug #17862)

Fix incorrect start up costs for WindowAgg paths (bug #17862)

From
David Rowley
Date:
Over on [1], Tim reported that the planner is making some bad choices
when the plan contains a WindowFunc which requires reading all of, or
a large portion of the WindowAgg subnode in order to produce the first
WindowAgg row.

For example:

EXPLAIN (ANALYZE, TIMING OFF)
SELECT COUNT(*) OVER ()
FROM tenk1 t1 INNER JOIN tenk1 t2 ON t1.unique1 = t2.tenthous
LIMIT 1;

With master, we get the following plan:

                                                                    QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.29..0.67 rows=1 width=8) (actual time=47.491..47.492
rows=1 loops=1)
   ->  WindowAgg  (cost=0.29..3815.00 rows=10000 width=8) (actual
time=47.489..47.490 rows=1 loops=1)
         ->  Nested Loop  (cost=0.29..3690.00 rows=10000 width=0)
(actual time=0.026..42.972 rows=10000 loops=1)
               ->  Seq Scan on tenk1 t2  (cost=0.00..445.00 rows=10000
width=4) (actual time=0.009..1.734 rows=10000 loops=1)
               ->  Index Only Scan using tenk1_unique1 on tenk1 t1
(cost=0.29..0.31 rows=1 width=4) (actual time=0.003..0.004 rows=1
loops=10000)
                     Index Cond: (unique1 = t2.tenthous)
                     Heap Fetches: 0
 Planning Time: 0.420 ms
 Execution Time: 48.107 ms

You can see that the time to get the first WindowAgg row (47.489 ms)
is not well aligned to the startup cost (0.29).  This effectively
causes the planner to choose a Nested Loop plan as it thinks it'll
read just 1 row from the join.  Due to the OVER (), we'll read all
rows! Not good.

It's not hard to imagine that a slightly different schema could yield
a *far* worse plan if it opted to use a non-parameterised nested loop
plan and proceed to read all rows from it.

With the attached patch, that turns into:


       QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=928.02..928.02 rows=1 width=8) (actual
time=29.308..29.310 rows=1 loops=1)
   ->  WindowAgg  (cost=928.02..928.07 rows=10000 width=8) (actual
time=29.306..29.308 rows=1 loops=1)
         ->  Hash Join  (cost=395.57..803.07 rows=10000 width=0)
(actual time=10.674..22.032 rows=10000 loops=1)
               Hash Cond: (t1.unique1 = t2.tenthous)
               ->  Index Only Scan using tenk1_unique1 on tenk1 t1
(cost=0.29..270.29 rows=10000 width=4) (actual time=0.036..4.961
rows=10000 loops=1)
                     Heap Fetches: 0
               ->  Hash  (cost=270.29..270.29 rows=10000 width=4)
(actual time=10.581..10.582 rows=10000 loops=1)
                     Buckets: 16384  Batches: 1  Memory Usage: 480kB
                     ->  Index Only Scan using tenk1_thous_tenthous on
tenk1 t2  (cost=0.29..270.29 rows=10000 width=4) (actual
time=0.055..5.437 rows=10000 loops=1)
                           Heap Fetches: 0
 Planning Time: 2.415 ms
 Execution Time: 30.554 ms


I'm not sure if we should consider backpatching a fix for this bug.
We tend not to commit stuff that would destabilise plans in the back
branches.  On the other hand, it's fairly hard to imagine  how we
could make this much worse even given bad estimates.

I do think we should fix this in v16, however.

I'll add this to the "Older bugs affecting stable branches" section of
the PG 16 open items list

David

[1] https://postgr.es/m/17862-1ab8f74b0f7b0611@postgresql.org

Attachment

Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)

From
Andy Fan
Date:


On Wed, Apr 12, 2023 at 5:04 PM David Rowley <dgrowleyml@gmail.com> wrote:

With the attached patch, that turns into:

The concept of startup_tuples for a WindowAgg looks good to me, but I 
can't follow up with the below line:

+ return clamp_row_est(partition_tuples * DEFAULT_INEQ_SEL);

# select count(*) over() from tenk1 limit 1;
 count
-------
 10000  -->  We need to scan all the tuples.  

Should we just return clamp_row_est(partition_tuples)? 
 

--
Best Regards
Andy Fan

Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)

From
David Rowley
Date:
.On Thu, 13 Apr 2023 at 02:28, Andy Fan <zhihui.fan1213@gmail.com> wrote:
> The concept of startup_tuples for a WindowAgg looks good to me, but I
> can't follow up with the below line:
>
> + return clamp_row_est(partition_tuples * DEFAULT_INEQ_SEL);
>
> # select count(*) over() from tenk1 limit 1;
>  count
> -------
>  10000  -->  We need to scan all the tuples.
>
> Should we just return clamp_row_est(partition_tuples)?

For the case you've shown, it will.  It's handled by this code:

if (wc->orderClause == NIL)
    return clamp_row_est(partition_tuples);

It would take something like the following to hit the code you're
concerned about:

explain select count(*) over(order by unique1 rows between unbounded
preceding and 10*random() following) from tenk1;
                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 WindowAgg  (cost=140.23..420.29 rows=10000 width=12)
   ->  Index Only Scan using tenk1_unique1 on tenk1
(cost=0.29..270.29 rows=10000 width=4)
(2 rows)

You can see the startup cost is about 33% of the total cost for that,
which is from the DEFAULT_INEQ_SEL.  I'm not exactly set on that
having to be DEFAULT_INEQ_SEL, but I'm not really sure what we could
put that's better. I don't really follow why assuming all rows are
required is better.  That'll just mean we favour cheap startup plans
less, but there might be a case where a cheap startup plan is
favourable. I was opting for a happy medium when I thought to use
DEFAULT_INEQ_SEL.

I also see I might need to do a bit more work on this as the following
is not handled correctly:

select count(*) over(rows between unbounded preceding and 10
following) from tenk1;

it's assuming all rows due to lack of ORDER BY, but it seems like it
should be 10 rows due to the 10 FOLLOWING end bound.

David



Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)

From
Andy Fan
Date:


On Thu, Apr 13, 2023 at 6:09 AM David Rowley <dgrowleyml@gmail.com> wrote:
.On Thu, 13 Apr 2023 at 02:28, Andy Fan <zhihui.fan1213@gmail.com> wrote:
> The concept of startup_tuples for a WindowAgg looks good to me, but I
> can't follow up with the below line:
>
> + return clamp_row_est(partition_tuples * DEFAULT_INEQ_SEL);
>
> # select count(*) over() from tenk1 limit 1;
>  count
> -------
>  10000  -->  We need to scan all the tuples.
>
> Should we just return clamp_row_est(partition_tuples)?

For the case you've shown, it will.  It's handled by this code:

if (wc->orderClause == NIL)
    return clamp_row_est(partition_tuples);

My fault.  I should have real debugging to double check my
understanding, surely I will next time. 

It would take something like the following to hit the code you're
concerned about:

explain select count(*) over(order by unique1 rows between unbounded
preceding and 10*random() following) from tenk1;
                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 WindowAgg  (cost=140.23..420.29 rows=10000 width=12)
   ->  Index Only Scan using tenk1_unique1 on tenk1
(cost=0.29..270.29 rows=10000 width=4)
(2 rows)

You can see the startup cost is about 33% of the total cost for that,
which is from the DEFAULT_INEQ_SEL.  I'm not exactly set on that
having to be DEFAULT_INEQ_SEL, but I'm not really sure what we could
put that's better. I don't really follow why assuming all rows are
required is better.  That'll just mean we favour cheap startup plans
less, but there might be a case where a cheap startup plan is
favourable. I was opting for a happy medium when I thought to use
DEFAULT_INEQ_SEL.

That looks reasonable to me.  My suggestion came from my misreading
before,  It was a bit late in my time zone when writing. Thanks for the
detailed explanation! 
 

I also see I might need to do a bit more work on this as the following
is not handled correctly:

select count(*) over(rows between unbounded preceding and 10
following) from tenk1;

it's assuming all rows due to lack of ORDER BY, but it seems like it
should be 10 rows due to the 10 FOLLOWING end bound.


True to me.   


--
Best Regards
Andy Fan

Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)

From
David Rowley
Date:
On Thu, 13 Apr 2023 at 10:09, David Rowley <dgrowleyml@gmail.com> wrote:
> I also see I might need to do a bit more work on this as the following
> is not handled correctly:
>
> select count(*) over(rows between unbounded preceding and 10
> following) from tenk1;
>
> it's assuming all rows due to lack of ORDER BY, but it seems like it
> should be 10 rows due to the 10 FOLLOWING end bound.

Well, as it turned out, it was quite a bit more work. The frame
options have had quite a few additions since I last looked in detail.

I've attached v2 of the patch.  I've included a DEBUG1 message which
is useful to check what the estimate comes out as without having to
have a debugger attached all the time.

Here are a few samples of the estimator getting things right:

# select count(*) over (order by four range between unbounded
preceding and 2 following exclude current row) from tenk1 limit 1;
DEBUG:  startup_tuples = 7499
 count
-------
  7499

# select count(*) over (order by four rows between unbounded preceding
and 4000 following) from tenk1 limit 1;
DEBUG:  startup_tuples = 4001
 count
-------
  4001

# select count(*) over (order by four rows between unbounded preceding
and 4000 following exclude group) from tenk1 limit 1;
DEBUG:  startup_tuples = 1501
 count
-------
  1501

You can see in each case, startup_tuples was estimated correctly as
confirmed by count(*) during execution.

I've attached some more of these in sample_tests.txt, which all are
correct with the caveat of get_windowclause_startup_tuples() never
returning 0 due to it using clamp_row_est().  In practice, that's a
non-issue due to the way the startup_tuples value is used to calculate
the startup costs.

David

Attachment

Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)

From
David Rowley
Date:
On Wed, 12 Apr 2023 at 21:03, David Rowley <dgrowleyml@gmail.com> wrote:
> I'm not sure if we should consider backpatching a fix for this bug.
> We tend not to commit stuff that would destabilise plans in the back
> branches.  On the other hand, it's fairly hard to imagine  how we
> could make this much worse even given bad estimates.
>
> I do think we should fix this in v16, however.
>
> I'll add this to the "Older bugs affecting stable branches" section of
> the PG 16 open items list

When I wrote the above, it was very soon after the feature freeze for
PG16. I wondered, since we tend not to do cost changes as part of bug
fixes due to not wanting to destabilise plans between minor versions
if we could instead just fix it in PG16 given the freeze had *just*
started.   That's no longer the case, so I'm just going to move this
out from where I added it in the PG16 Open items "Live issues" section
and just add a July CF entry for it instead.

David



Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)

From
David Rowley
Date:
On Wed, 31 May 2023 at 12:59, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Wed, 12 Apr 2023 at 21:03, David Rowley <dgrowleyml@gmail.com> wrote:
> > I'll add this to the "Older bugs affecting stable branches" section of
> > the PG 16 open items list
>
> When I wrote the above, it was very soon after the feature freeze for
> PG16. I wondered, since we tend not to do cost changes as part of bug
> fixes due to not wanting to destabilise plans between minor versions
> if we could instead just fix it in PG16 given the freeze had *just*
> started.   That's no longer the case, so I'm just going to move this
> out from where I added it in the PG16 Open items "Live issues" section
> and just add a July CF entry for it instead.

I'm keen to move this patch along.  It's not a particularly
interesting patch and don't expect much interest in it, but I feel
it's pretty important to have the planner not accidentally choose a
cheap startup plan when a WindowAgg is going to fetch the entire
subplan's tuples.

I've made another pass over the patch and made a bunch of cosmetic
changes.  As far as mechanical changes, I only changed the EXCLUDE
TIES and EXCLUDE GROUP behaviour when there is no ORDER BY clause in
the WindowClause. If there's no ORDER BY then subtracting 1.0 rows
seems like the right thing to do rather than what the previous patch
did.

I (temporarily) left the DEBUG1 elog in there if anyone wants to test
for themselves (saves debugger use). In the absence of that, I'm
planning on just pushing it to master only tomorrow.  It seems fairly
low risk and unlikely to attract too much interest since it only
affects startup costs of WindowAgg nodes. I'm currently thinking it's
a bad idea to backpatch this but I'd consider it more if someone else
thought it was a good idea or if more people came along complaining
about poor plan choice in plans containing WindowAggs. Currently, it
seems better not to destabilise plans in the back branches. (CC'd Tim,
who reported #17862, as he may have an opinion on this)

The only thought I had while looking at this again aside from what I
changed was if get_windowclause_startup_tuples() should go in
selfuncs.c.  I wondered if it would be neater to use
convert_numeric_to_scalar() instead of the code I had to add to
convert the (SMALL|BIG)INT Consts in <Const> FOLLOWING to double.
Aside from that reason, it seems we don't have many usages of
DEFAULT_INEQ_SEL outside of selfuncs.c. I didn't feel strongly enough
about this to actually move the function.

The updated patch is attached.

Here are the results of my testing (note the DEBUG message matches the
COUNT(*) result in all cases apart from one case where COUNT(*)
returns 0 and the estimated tuples is 1.0).

create table ab (a int, b int);
insert into ab select a,b from generate_series(1,100) a,
generate_series(1,100) b;
analyze ab;
set client_min_messages=debug1;

# select count(*) over () from ab limit 1;
DEBUG:  startup_tuples = 10000
 count
-------
 10000
(1 row)


# select count(*) over (partition by a) from ab limit 1;
DEBUG:  startup_tuples = 100
 count
-------
   100
(1 row)


# select count(*) over (partition by a order by b) from ab limit 1;
DEBUG:  startup_tuples = 1
 count
-------
     1
(1 row)


# select count(*) over (partition by a order by b rows between current
row and unbounded following) from ab limit 1;
DEBUG:  startup_tuples = 100
 count
-------
   100
(1 row)


# select count(*) over (partition by a order by b rows between current
row and 10 following) from ab limit 1;
DEBUG:  startup_tuples = 11
 count
-------
    11
(1 row)


# select count(*) over (partition by a order by b rows between current
row and 10 following exclude current row) from ab limit 1;
DEBUG:  startup_tuples = 10
 count
-------
    10
(1 row)


# select count(*) over (partition by a order by b rows between current
row and 10 following exclude ties) from ab limit 1;
DEBUG:  startup_tuples = 11
 count
-------
    11
(1 row)


# select count(*) over (partition by a order by b range between
current row and 10 following exclude ties) from ab limit 1;
DEBUG:  startup_tuples = 11
 count
-------
    11
(1 row)


# select count(*) over (partition by a order by b range between
current row and unbounded following exclude ties) from ab limit 1;
DEBUG:  startup_tuples = 100
 count
-------
   100
(1 row)


# select count(*) over (partition by a order by b range between
current row and unbounded following exclude group) from ab limit 1;
DEBUG:  startup_tuples = 99
 count
-------
    99
(1 row)


# select count(*) over (partition by a order by b groups between
current row and unbounded following exclude group) from ab limit 1;
DEBUG:  startup_tuples = 99
 count
-------
    99
(1 row)


# select count(*) over (partition by a rows between current row and
unbounded following exclude group) from ab limit 1;
DEBUG:  startup_tuples = 1
 count
-------
     0
(1 row)


# select count(*) over (partition by a rows between current row and
unbounded following exclude ties) from ab limit 1;
DEBUG:  startup_tuples = 1
 count
-------
     1
(1 row)


# select count(*) over (partition by a order by b rows between current
row and unbounded following exclude ties) from ab limit 1;
DEBUG:  startup_tuples = 100
 count
-------
   100
(1 row)


# select count(*) over (partition by a order by b rows between current
row and unbounded following exclude current row) from ab limit 1;
DEBUG:  startup_tuples = 99
 count
-------
    99
(1 row)


# select count(*) over (partition by a order by b range between
current row and 9223372036854775807 following exclude ties) from ab
limit 1;
DEBUG:  startup_tuples = 100
 count
-------
   100
(1 row)

David

Attachment

Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)

From
Andy Fan
Date:
Hi David:

Sorry for feedback at the last minute!  I study the patch and find the
following cases.

1. ORDER BY or PARTITION BY

select *, count(two) over (order by unique1) from tenk1 limit 1;
DEBUG:  startup_tuples = 1
DEBUG:  startup_tuples = 1

select *, count(two) over (partition by unique1) from tenk1 limit 1;
DEBUG:  startup_tuples = 1
DEBUG:  startup_tuples = 1

Due to the Executor of nodeWindowAgg, we have to fetch the next tuple
until it mismatches with the current one, then we can calculate the
WindowAgg function. In the current patch, we didn't count the
mismatched tuple. I verified my thought with 'break at IndexNext'
function and see IndexNext is called twice, so in the above case the
startup_tuples should be 2?


2. ORDER BY and PARTITION BY

select two, hundred,
count(two) over (partition by ten order by hundred)
from tenk1 limit 1;

DEBUG:  startup_tuples = 10
 two | hundred | count
-----+---------+-------
   0 |       0 |   100

If we consider the mismatched tuples, it should be 101?

3. As we can see the log for startup_tuples is logged twice sometimes,
the reason is because it is used in cost_windowagg, so it is calculated
for every create_one_window_path. I think the startup_tuples should be
independent with the physical path, maybe we can cache it somewhere to
save some planning cycles?

Thanks for the patch!

--
Best Regards
Andy Fan

Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)

From
David Rowley
Date:
Thanks for having a look at this.

On Thu, 3 Aug 2023 at 18:49, Andy Fan <zhihui.fan1213@gmail.com> wrote:
> 1. ORDER BY or PARTITION BY
>
> select *, count(two) over (order by unique1) from tenk1 limit 1;
> DEBUG:  startup_tuples = 1
> DEBUG:  startup_tuples = 1
>
> select *, count(two) over (partition by unique1) from tenk1 limit 1;
> DEBUG:  startup_tuples = 1
> DEBUG:  startup_tuples = 1
>
> Due to the Executor of nodeWindowAgg, we have to fetch the next tuple
> until it mismatches with the current one, then we can calculate the
> WindowAgg function. In the current patch, we didn't count the
> mismatched tuple. I verified my thought with 'break at IndexNext'
> function and see IndexNext is called twice, so in the above case the
> startup_tuples should be 2?

You're probably right here. I'd considered that it wasn't that
critical and aimed to attempt to keep the estimate close to the number
of rows that'll be aggregated. I think that's probably not the best
thing to do as if you consider the EXCLUDE options, those just exclude
tuples from aggregation, it does not mean we read fewer tuples from
the subnode. I've updated the patch accordingly.

> 2. ORDER BY and PARTITION BY
>
> select two, hundred,
> count(two) over (partition by ten order by hundred)
> from tenk1 limit 1;
>
> DEBUG:  startup_tuples = 10
>  two | hundred | count
> -----+---------+-------
>    0 |       0 |   100
>
> If we consider the mismatched tuples, it should be 101?

I don't really see how we could do better with the current level of
statistics.  The stats don't know that there are only 10 distinct
"hundred" values for rows which have ten=1. All we have is n_distinct
on tenk1.hundred, which is 100.

> 3. As we can see the log for startup_tuples is logged twice sometimes,
> the reason is because it is used in cost_windowagg, so it is calculated
> for every create_one_window_path. I think the startup_tuples should be
> independent with the physical path, maybe we can cache it somewhere to
> save some planning cycles?

I wondered about that too but I ended up writing off the idea of
caching because the input_tuple count comes from the Path and the
extra calls are coming from other Paths, which could well have some
completely different value for input_tuples.

David

Attachment

Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)

From
Andy Fan
Date:


On Thu, Aug 3, 2023 at 7:29 PM David Rowley <dgrowleyml@gmail.com> wrote:
Thanks for having a look at this.

On Thu, 3 Aug 2023 at 18:49, Andy Fan <zhihui.fan1213@gmail.com> wrote:
> 1. ORDER BY or PARTITION BY
>
> select *, count(two) over (order by unique1) from tenk1 limit 1;
> DEBUG:  startup_tuples = 1
> DEBUG:  startup_tuples = 1
>
> select *, count(two) over (partition by unique1) from tenk1 limit 1;
> DEBUG:  startup_tuples = 1
> DEBUG:  startup_tuples = 1
>
> Due to the Executor of nodeWindowAgg, we have to fetch the next tuple
> until it mismatches with the current one, then we can calculate the
> WindowAgg function. In the current patch, we didn't count the
> mismatched tuple. I verified my thought with 'break at IndexNext'
> function and see IndexNext is called twice, so in the above case the
> startup_tuples should be 2?

You're probably right here. I'd considered that it wasn't that
critical and aimed to attempt to keep the estimate close to the number
of rows that'll be aggregated. I think that's probably not the best
thing to do as if you consider the EXCLUDE options, those just exclude
tuples from aggregation, it does not mean we read fewer tuples from
the subnode. I've updated the patch accordingly.

Thanks. 
 

> 2. ORDER BY and PARTITION BY
>
> select two, hundred,
> count(two) over (partition by ten order by hundred)
> from tenk1 limit 1;
>
> DEBUG:  startup_tuples = 10
>  two | hundred | count
> -----+---------+-------
>    0 |       0 |   100
>
> If we consider the mismatched tuples, it should be 101?

I don't really see how we could do better with the current level of
statistics.  The stats don't know that there are only 10 distinct
"hundred" values for rows which have ten=1. All we have is n_distinct
on tenk1.hundred, which is 100.

Yes,  actually I didn't figure it out before / after my posting.   
 

> 3. As we can see the log for startup_tuples is logged twice sometimes,
> the reason is because it is used in cost_windowagg, so it is calculated
> for every create_one_window_path. I think the startup_tuples should be
> independent with the physical path, maybe we can cache it somewhere to
> save some planning cycles?

I wondered about that too but I ended up writing off the idea of
caching because the input_tuple count comes from the Path and the
extra calls are coming from other Paths, which could well have some
completely different value for input_tuples.


Looks reasonable. 

I have checked the updated patch and LGTM.  

--
Best Regards
Andy Fan

Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)

From
Tim Palmer
Date:
On Thu, 3 Aug 2023 at 05:50, David Rowley <dgrowleyml@gmail.com> wrote:
I'm currently thinking it's
a bad idea to backpatch this but I'd consider it more if someone else
thought it was a good idea or if more people came along complaining
about poor plan choice in plans containing WindowAggs. Currently, it
seems better not to destabilise plans in the back branches. (CC'd Tim,
who reported #17862, as he may have an opinion on this)

I agree it's better not to destabilise plans in the back branches.

Tim 

Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)

From
David Rowley
Date:
On Fri, 4 Aug 2023 at 02:02, Andy Fan <zhihui.fan1213@gmail.com> wrote:
> I have checked the updated patch and LGTM.

Thank you for reviewing.  I've pushed the patch to master only.

David



Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)

From
Nathan Bossart
Date:
On Fri, Aug 04, 2023 at 09:28:51AM +1200, David Rowley wrote:
> Thank you for reviewing.  I've pushed the patch to master only.

I'm seeing some reliable test failures for 32-bit builds on cfbot [0].  At
a glance, it looks like the relations are swapped in the plan.

[0]
https://api.cirrus-ci.com/v1/artifact/task/5728127981191168/testrun/build-32/testrun/regress/regress/regression.diffs

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)

From
Tom Lane
Date:
Nathan Bossart <nathandbossart@gmail.com> writes:
> On Fri, Aug 04, 2023 at 09:28:51AM +1200, David Rowley wrote:
>> Thank you for reviewing.  I've pushed the patch to master only.

> I'm seeing some reliable test failures for 32-bit builds on cfbot [0].  At
> a glance, it looks like the relations are swapped in the plan.

Yeah, I got the same result in a 32-bit FreeBSD VM.  Probably, the two
plans are of effectively-identical estimated cost, and there's some
roundoff effect in those estimates that differs between machines with
4-byte and 8-byte MAXALIGN.

You could likely stabilize the plan choice by joining two tables that
aren't of identical size -- maybe add an additional WHERE constraint
on one of the tables?

            regards, tom lane



Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)

From
David Rowley
Date:
On Fri, 4 Aug 2023 at 11:54, Nathan Bossart <nathandbossart@gmail.com> wrote:
> I'm seeing some reliable test failures for 32-bit builds on cfbot [0].  At
> a glance, it looks like the relations are swapped in the plan.

Thank you for the report. I've just pushed a patch which I'm hoping will fix it.

David



Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)

From
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes:
> Thank you for the report. I've just pushed a patch which I'm hoping will fix it.

Passes now on my VM.

            regards, tom lane