Re: using extended statistics to improve join estimates - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: using extended statistics to improve join estimates
Msg-id 2ed75657-e084-9539-c6de-597e5675014c@enterprisedb.com
In response to using extended statistics to improve join estimates  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: using extended statistics to improve join estimates
List pgsql-hackers
Hi,

Here's a slightly improved / cleaned up version of the PoC patch, 
removing a bunch of XXX and FIXMEs, adding comments, etc.

The approach is sound in principle, I think, although there's still a 
bunch of things to address:

1) statext_compare_mcvs only really deals with equijoins / inner joins 
at the moment, as it's based on eqjoinsel_inner. It's probably desirable 
to add support for additional join types (inequality and outer joins).

2) Some of the steps are performed multiple times - e.g. matching base 
restrictions to statistics, etc. Those probably can be cached somehow, 
to reduce the overhead.

3) The logic of picking the statistics to apply is somewhat simplistic, 
and maybe could be improved in some way. OTOH the number of candidate 
statistics is likely low, so this is not a big issue.

4) statext_compare_mcvs is based on eqjoinsel_inner and makes a bunch of 
assumptions similar to the original, but some of those assumptions may 
be wrong in the multi-column case, particularly when working with a subset 
of columns. For example, (ndistinct - size(MCV)) may not be the number of 
distinct combinations outside the MCV when some columns are ignored. The 
same goes for nullfract, and so on. I'm not sure we can do much more than 
pick some reasonable approximation.

5) There are no regression tests at the moment. Clearly a gap.
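
To make the concern in (4) a bit more concrete, here is a minimal Python
sketch. It is not code from the patch: the function names, the dict-based
MCV representation and the sample numbers are all made up for
illustration. It projects a two-column MCV list onto one column (items
collapse, so the number of MCV items no longer matches the number of
distinct combinations it covers) and then combines two such lists using
arithmetic modeled on eqjoinsel_inner.

```python
from collections import defaultdict


def project_mcv(mcv, keep):
    """Project a multi-column MCV list onto the column positions in `keep`.

    `mcv` is a list of (values_tuple, frequency) pairs (hypothetical format).
    Items that collapse to the same combination have their frequencies summed.
    """
    merged = defaultdict(float)
    for values, freq in mcv:
        merged[tuple(values[i] for i in keep)] += freq
    return dict(merged)


def eqjoin_sel_from_mcvs(mcv1, nd1, nullfrac1, mcv2, nd2, nullfrac2):
    """Equijoin selectivity from two MCV dicts {combination: frequency}.

    Arithmetic modeled on eqjoinsel_inner; nd1/nd2 are ndistinct estimates
    for the joined columns and are assumed to be supplied by the caller.
    """
    matched = mcv1.keys() & mcv2.keys()
    nmatches = len(matched)

    # Contribution of combinations present in both MCV lists.
    matchprodfreq = sum(mcv1[k] * mcv2[k] for k in matched)
    matchfreq1 = sum(mcv1[k] for k in matched)
    matchfreq2 = sum(mcv2[k] for k in matched)

    # MCV frequency that found no partner on the other side.
    unmatchfreq1 = sum(mcv1.values()) - matchfreq1
    unmatchfreq2 = sum(mcv2.values()) - matchfreq2

    # Frequency mass of non-NULL combinations outside each MCV list.
    otherfreq1 = max(0.0, 1.0 - nullfrac1 - matchfreq1 - unmatchfreq1)
    otherfreq2 = max(0.0, 1.0 - nullfrac2 - matchfreq2 - unmatchfreq2)

    def one_side(unmatch_a, other_a, unmatch_b, other_b, nd_b, nvalues_b):
        sel = matchprodfreq
        # Unmatched MCVs on side A may match non-MCV combinations on side B.
        if nd_b > nvalues_b:
            sel += unmatch_a * other_b / (nd_b - nvalues_b)
        # Non-MCV combinations on side A may match anything unmatched on B.
        if nd_b > nmatches:
            sel += other_a * (other_b + unmatch_b) / (nd_b - nmatches)
        return sel

    totalsel1 = one_side(unmatchfreq1, otherfreq1,
                         unmatchfreq2, otherfreq2, nd2, len(mcv2))
    totalsel2 = one_side(unmatchfreq2, otherfreq2,
                         unmatchfreq1, otherfreq1, nd1, len(mcv1))
    # Use the smaller of the two estimates.
    return min(totalsel1, totalsel2)


# MCV lists on columns (a, b); the join clause only uses column a.
mcv_t1 = [((1, 1), 0.3), ((1, 2), 0.2), ((2, 1), 0.1)]
mcv_t2 = [((1, 5), 0.4), ((3, 5), 0.2)]

proj1 = project_mcv(mcv_t1, keep=[0])  # three items collapse into two
proj2 = project_mcv(mcv_t2, keep=[0])

# ndistinct = 10 on each side is an assumed input, not derived from the MCVs.
sel = eqjoin_sel_from_mcvs(proj1, 10, 0.0, proj2, 10, 0.0)
print(len(mcv_t1), len(proj1), sel)
```

Note how the original MCV has three items but only two distinct values of
"a", so plugging the original item count into (ndistinct - size(MCV))
would undercount the combinations outside the MCV. The sketch sidesteps
that by merging items first and taking ndistinct as an input, which is
just one possible approximation.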


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

