On 9/22/25 15:57, Andrei Lepikhov wrote:
> On 22/9/2025 15:37, Frédéric Yhuel wrote:
>> I wonder if this is an argument in favour of decoupling the sample
>> size and the precision of the statistics. Here, we basically want the
>> sample size to be as big as the table in order to include the few
>> (NULL, WARNING) values.
> I also have seen how repeating ANALYZE on the same database drastically
> changes query plans ;(.
> It seems to me that with massive samples, many of the ANALYZE algorithms
> should be rewritten. In principle, statistical hooks exist. So, it is
> possible to invent an independent table analyser which will scan the
> whole table to get precise statistics.
>
Interesting! I wonder how difficult it would be.
However, in this specific case, I realised that it wouldn't solve the
issue: even a whole-table analyser finds zero rows with (ackid, crit) =
(NULL, WARNING) if ANALYZE happens to be triggered at a moment when none
exist.
Partitioning would still work in this case, though, because ackid's
null_frac would be zero for the partition containing the 'WARNING' value.
I wonder if we could devise another kind of extended statistic that
would provide these "partitioned statistics" without actually
partitioning the table.
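If I am not mistaken, the closest thing available today is a
multivariate MCV list on the column pair (PostgreSQL 12 and later),
sketched here against an un-partitioned variant of the made-up events
table above:

-- An MCV list on (ackid, crit) lets the planner estimate clauses like
--   WHERE ackid IS NULL AND crit = 'WARNING'
-- but only for combinations that actually show up in the ANALYZE sample,
-- so it doesn't give the per-'WARNING' null_frac that partitioning would.
CREATE STATISTICS events_ackid_crit (mcv) ON ackid, crit FROM events;
ANALYZE events;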