Re: planner chooses incremental but not the best one - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	Re: planner chooses incremental but not the best one
Date	February 15, 2024 11:10:29
Msg-id	66431c21-5172-4e17-9422-8ec8b97a4efd@enterprisedb.com
In response to	Re: planner chooses incremental but not the best one (Andrei Lepikhov <a.lepikhov@postgrespro.ru>)
Responses	Re: planner chooses incremental but not the best one
List	pgsql-hackers


On 2/15/24 07:50, Andrei Lepikhov wrote:
> On 18/12/2023 19:53, Tomas Vondra wrote:
>> On 12/18/23 11:40, Richard Guo wrote:
>> The challenge is where to get usable information about correlation
>> between columns. I only have a couple of very rough ideas of what we might
>> try. For example, if we have multi-column ndistinct statistics, we might
>> look at ndistinct(b,c) and ndistinct(b,c,d) and deduce something from
>>
>>      ndistinct(b,c,d) / ndistinct(b,c)
>>
>> If we know how many distinct values we have for the predicate column, we
>> could then estimate the number of groups. I mean, we know that for the
>> restriction "WHERE b = 3" we only have 1 distinct value, so we could
>> estimate the number of groups as
>>
>>      1 * ndistinct(b,c)
> Did you mean here ndistinct(c,d) and the formula:
> ndistinct(b,c,d) / ndistinct(c,d) ?

Yes, I think that's probably a more correct ... Essentially, the idea is
to estimate the change in number of distinct groups after adding a
column (or restricting it in some way).
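
To make the ratio concrete, here is a minimal Python sketch of the arithmetic. It is only an illustration, not planner code: the toy tables, the ndistinct() helper and the growth_factor() helper are all hypothetical, and the "multi-column ndistinct statistics" are simply computed exactly from the rows.

```python
from itertools import product


def ndistinct(rows, cols):
    """Count distinct combinations of the given columns (exact, not sampled)."""
    return len({tuple(row[c] for c in cols) for row in rows})


def growth_factor(rows, base_cols, added_col):
    """ndistinct(base + added) / ndistinct(base): how much the number of
    groups grows when added_col is appended to the grouping columns."""
    return ndistinct(rows, base_cols + (added_col,)) / ndistinct(rows, base_cols)


# Hypothetical table 1: b, c and d are fully independent.
independent = [
    {"b": b, "c": c, "d": d}
    for b, c, d in product(range(4), range(3), range(5))
]

# Hypothetical table 2: b is fully determined by c (b = c % 2).
correlated = [
    {"b": c % 2, "c": c, "d": d}
    for c, d in product(range(3), range(5))
]

nd_cd = ndistinct(independent, ("c", "d"))
nd_bcd = ndistinct(independent, ("b", "c", "d"))
ratio_independent = growth_factor(independent, ("c", "d"), "b")
ratio_correlated = growth_factor(correlated, ("c", "d"), "b")

# With "WHERE b = 3" the column b contributes a single distinct value,
# so one possible estimate divides out the growth caused by b.
estimated_groups = nd_bcd / ratio_independent

# Ground truth for comparison: apply the restriction, then count groups.
filtered = [row for row in independent if row["b"] == 3]
actual_groups = ndistinct(filtered, ("b", "c", "d"))

print(nd_cd, nd_bcd, ratio_independent, ratio_correlated)
print(estimated_groups, actual_groups)
```

In the independent table, adding b multiplies the number of groups by the number of distinct b values. In the correlated table the ratio is 1.0, meaning b adds no new groups once (c,d) is known. How such a ratio should be combined with a restriction on b in the general case is exactly the open question in this thread.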

> 
> Do you implicitly bear in mind here the necessity of tracking clauses
> that were applied to the data up to the moment of grouping?
> 

I don't recall what exactly I considered two months ago when writing the
message, but I don't see why we would need to track that beyond what we
already have. Shouldn't it be enough for the grouping to simply inspect
the conditions on the lower levels?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
