Re: Improve statistics estimation considering GROUP-BY as a 'uniqueiser' - Mailing list pgsql-hackers

From	Alexander Korotkov
Subject	Re: Improve statistics estimation considering GROUP-BY as a 'uniqueiser'
Date	February 19 13:00:51
Msg-id	CAPpHfduirKBHvBuy-ZAht5fxv++jCUeToHnRWxuGHmzAmcb54A@mail.gmail.com
In response to	Re: Improve statistics estimation considering GROUP-BY as a 'uniqueiser' (Heikki Linnakangas <hlinnaka@iki.fi>)
List	pgsql-hackers

On Tue, Feb 18, 2025 at 2:52 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
> On 17/2/2025 02:06, Alexander Korotkov wrote:
> > On Thu, Nov 28, 2024 at 4:39 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
> >> Here we could also count the number of scanned NULLs separately in
> >> vardata_extra and use it in the upper GROUP-BY estimation.
> >
> > What could be the type of vardata_extra?  And what information could
> > it store?  It still seems too sketchy for me to understand.
> It is actually sketchy. Our estimation routines have no information
> about intermediate modifications of the data. NULLs generated by a
> left join are a good example here. So, my vague idea is to maintain
> that info and change statistical estimations somehow.
> Of course, it is out of scope here.
> >
> > But I think for now we should go with the original patch.  It seems
> > to be a quite straightforward extension to what 4767bc8ff2 does.  I've
> > revised the commit message and applied pg_indent to the sources.  I'm
> > going to push this if there are no objections.
> Ok, I added one regression test to check that the feature works properly.

Andrei, thank you.  I've pushed the patch, with some simplification
of the regression test.

------
Regards,
Alexander Korotkov
Supabase
