Re: Indexes on expressions with multiple columns and operators - Mailing list pgsql-performance

From Frédéric Yhuel
Subject Re: Indexes on expressions with multiple columns and operators
Date
Msg-id 6ad08c0c-2c03-4b30-b5a7-dc486aa29c52@dalibo.com
Whole thread Raw
In response to Re: Indexes on expressions with multiple columns and operators  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Indexes on expressions with multiple columns and operators
List pgsql-performance

On 9/20/25 18:51, Tom Lane wrote:
> I concluded that maybe I was overthinking this part.  We only really
> need to check the rowcount estimate, since the indexscan cost estimate
> is already okay.  And stats_ext.sql seems to have gotten away with
> assuming that rowcount estimates are reliably reproducible on
> not-too-large tables.  This bug affects use of extended statistics
> too, so a test using those is good enough to show it's fixed; we don't
> really need to use an expression index for the purpose.  So, I added a
> test case in stats_ext.sql and pushed it.

OK, great! Thanks for the detailed explanation.

Regarding extended statistics, it's unfortunate that they cannot be used 
in this case. Multivariate MCV statistics work as long as the number of 
rows in the table is reasonably small (100K) and the STATISTICS value 
for a column is high enough (it doesn't matter which column, because 
only the sample size matters).

I wonder if this is an argument in favour of decoupling the sample size 
and the precision of the statistics. Here, we basically want the sample 
size to be as big as the table in order to include the few (NULL, 
WARNING) values.

Or maybe we need a different kind of extended statistics?



pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Indexes on expressions with multiple columns and operators
Next
From: Andrei Lepikhov
Date:
Subject: Re: Indexes on expressions with multiple columns and operators