Re: Indexes on expressions with multiple columns and operators - Mailing list pgsql-performance

From Frédéric Yhuel
Subject Re: Indexes on expressions with multiple columns and operators
Date
Msg-id 493a013c-63d1-467a-b9ec-352f77baf37a@dalibo.com
In response to Re: Indexes on expressions with multiple columns and operators  (Andrei Lepikhov <lepihov@gmail.com>)
Responses Re: Indexes on expressions with multiple columns and operators
List pgsql-performance

On 9/22/25 15:57, Andrei Lepikhov wrote:
> On 22/9/2025 15:37, Frédéric Yhuel wrote:
>> I wonder if this is an argument in favour of decoupling the sample 
>> size and the precision of the statistics. Here, we basically want the 
>> sample size to be as big as the table in order to include the few 
>> (NULL, WARNING) values.
> I also have seen how repeating ANALYZE on the same database drastically 
> changes query plans ;(.
> It seems to me that with massive samples, many of the ANALYZE algorithms 
> should be rewritten. In principle, statistical hooks exist. So, it is 
> possible to invent an independent table analyser which will scan the 
> whole table to get precise statistics.
> 

Interesting! I wonder how difficult it would be.
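
To put a rough number on the sampling concern quoted above, here is a
minimal sketch of the chance that a sample misses every rare row. It
assumes a uniform sample without replacement, which only approximates
ANALYZE's two-stage block sampling. The function name and the table
sizes are hypothetical; the 30000-row sample corresponds to 300 times
the default statistics target of 100.

```python
def miss_probability(total_rows, rare_rows, sample_rows):
    """Probability that a uniform sample without replacement
    contains none of the rare rows (hypergeometric, zero hits)."""
    p = 1.0
    for i in range(rare_rows):
        # Chance that the i-th rare row also falls outside the sample,
        # given that the previous rare rows did.
        p *= (total_rows - sample_rows - i) / (total_rows - i)
    return p


# Hypothetical table: 10 million rows, 5 of them (NULL, WARNING).
p = miss_probability(10_000_000, 5, 30_000)
print(f"chance the sample sees no rare row: {p:.3f}")
```

With these assumed numbers the sample misses all five rows about 98.5%
of the time, and the probability only reaches zero when the sample is
as large as the table.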

However, in this specific case, I realised that it wouldn't solve the 
issue of ANALYZE running at a time when the table contains zero rows 
with (ackid, crit) = (NULL, WARNING).

Partitioning would still work in this case, though, because ackid's 
null_frac would be zero for the partition containing the 'WARNING' value.
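
A small arithmetic sketch of why the per-partition null_frac helps. It
assumes the planner multiplies the two selectivities as if the columns
were independent, and that row estimates are clamped to at least one
row. All fractions and row counts below are made up for illustration,
and the function names are hypothetical.

```python
def independent_estimate(total_rows, null_frac, warning_frac):
    # Whole-table estimate: selectivities multiplied as if
    # "ackid IS NULL" and "crit = 'WARNING'" were independent.
    return max(1.0, total_rows * null_frac * warning_frac)


def partition_estimate(partition_rows, partition_null_frac):
    # Inside the WARNING partition only ackid's null_frac matters,
    # and the result is clamped to at least one row.
    return max(1.0, partition_rows * partition_null_frac)


# Assumed numbers: 10M rows, 30% NULL ackid overall, 1% WARNING.
print(independent_estimate(10_000_000, 0.3, 0.01))
# WARNING partition analysed while it holds no NULL ackid.
print(partition_estimate(100_000, 0.0))
```

Under these assumptions the unpartitioned estimate is in the tens of
thousands of rows, while the partition estimate collapses to a single
row, which is what makes the partitioned plan behave.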

I wonder if we could devise another kind of extended statistic that 
would provide these "partitioned statistics" without actually partitioning.
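
As a purely hypothetical illustration of what such a statistic might
store, the sketch below computes ackid's null fraction separately for
each crit value. Nothing like this exists in PostgreSQL's extended
statistics today as far as I know; the function name and the sample
rows are invented.

```python
from collections import defaultdict


def conditional_null_frac(rows):
    """rows: iterable of (ackid, crit) pairs.
    Returns {crit: fraction of rows with that crit whose ackid is None}."""
    totals = defaultdict(int)
    nulls = defaultdict(int)
    for ackid, crit in rows:
        totals[crit] += 1
        if ackid is None:
            nulls[crit] += 1
    return {crit: nulls[crit] / totals[crit] for crit in totals}


# Invented sample: NULL ackid occurs only for crit = 'OK'.
sample = [(None, "OK"), (1, "OK"), (2, "WARNING"), (3, "WARNING")]
print(conditional_null_frac(sample))
```

A per-value null fraction of zero for 'WARNING' would give the same
information as the partition-level null_frac, without partitioning.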



