Re: Question: test "aggregates" failed in 32-bit machine - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Question: test "aggregates" failed in 32-bit machine
Date
Msg-id 656578.1664557033@sss.pgh.pa.us
In response to Re: Question: test "aggregates" failed in 32-bit machine  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Question: test "aggregates" failed in 32-bit machine
List pgsql-hackers
I wrote:
> The most likely theory, I think, is that that compiler is generating
> slightly different floating-point code causing different plans to
> be costed slightly differently than what the test case is expecting.
> Probably, the different orderings of the keys in this test case have
> exactly the same cost, or almost exactly, so that different roundoff
> error could be enough to change the selected plan.

I added some debug printouts to get_cheapest_group_keys_order()
and verified that in the two problematic queries, there are two
different orderings that have (on my machine) exactly equal lowest
cost.  So the code picks the first of those and ignores the second.
Different roundoff error would be enough to make it do something
else.
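
As a rough illustration of that tie-breaking behavior (a minimal sketch, not the planner's actual code): the helper pick_cheapest and the cost values below are made up for the example, and a different summation order stands in for whatever rounding difference another compiler or platform might introduce.

```python
def pick_cheapest(candidates):
    """Return the name of the first candidate with the strictly lowest cost."""
    best_name, best_cost = None, float("inf")
    for name, cost in candidates:
        if cost < best_cost:  # strict comparison: on a tie the earlier entry stays
            best_name, best_cost = name, cost
    return best_name


# Two key orderings with exactly equal cost: the first one is kept.
tied = [("a, b", 0.6), ("b, a", 0.6)]

# The same mathematical cost (0.1 + 0.2 + 0.3), summed in a different order.
# The results differ in the last bit, which is enough to flip the choice.
skewed = [("a, b", (0.1 + 0.2) + 0.3), ("b, a", 0.1 + (0.2 + 0.3))]

print(pick_cheapest(tied))
print(pick_cheapest(skewed))
```

With the tied inputs the first ordering is chosen; with the reordered sums the second ordering comes out cheaper by one unit in the last place and is chosen instead.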

I find this problematic because "exactly equal" costs are not going
to be unusual.  That's because the values that cost_sort_estimate
relies on are, sadly, just about completely fictional.  It's expecting
that it can get a good cost estimate based on:

* procost.  In case you hadn't noticed, this is going to be 1 for
just about every function we might be considering here.

* column width.  This is either going to be a constant (e.g. 4
for integers) or, again, largely fictional.  The logic for
converting widths to cost multipliers adds yet another layer
of debatability.

* numdistinct estimates.  Sometimes we know what we're talking
about there, but often we don't.

So what I'm afraid we are dealing with here is usually going to
be garbage in, garbage out.  And we're expending an awful lot
of code and cycles to arrive at these highly questionable choices.
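
To make the "exactly equal" point concrete, here is a toy model (an assumption for illustration only, not the formula the planner uses). Each key's comparison cost is procost times width, weighted by the chance that all earlier keys compared equal. The function toy_sort_cost and the per-column values, including the placeholder ndistinct of 200, are hypothetical.

```python
from itertools import permutations

# Hypothetical per-column inputs: (procost, width, ndistinct).
# All three columns look identical to the cost model, as integer
# columns with no useful statistics plausibly would.
columns = {
    "a": (1.0, 4, 200.0),
    "b": (1.0, 4, 200.0),
    "c": (1.0, 4, 200.0),
}


def toy_sort_cost(order):
    """Toy per-tuple comparison cost for sorting by the keys in `order`."""
    total, tie_fraction = 0.0, 1.0
    for name in order:
        procost, width, ndistinct = columns[name]
        # A later key is only compared when every earlier key was equal.
        total += tie_fraction * procost * width
        tie_fraction /= ndistinct
    return total


costs = {order: toy_sort_cost(order) for order in permutations(columns)}
distinct_costs = set(costs.values())
print(len(costs), "orderings,", len(distinct_costs), "distinct cost value(s)")
```

When every column presents the same inputs, all six orderings receive the same cost, so the choice among them carries no information and is decided purely by evaluation order or roundoff.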

Given the previous complaints about db0d67db2, I wonder if it's not
most prudent to revert it.  I doubt we are going to get satisfactory
behavior out of it until there are fairly substantial improvements in
all these underlying estimates.

            regards, tom lane


