Re: [BUG?] estimate_hash_bucket_stats uses wrong ndistinct for avgfreq - Mailing list pgsql-hackers

From	Tender Wang
Subject	Re: [BUG?] estimate_hash_bucket_stats uses wrong ndistinct for avgfreq
Date	March 1 06:40:11
Msg-id	CAHewXNnYQSCRQ9PaQyViBEB6UKC08nqCzE6YjNcZxuvbThRBgg@mail.gmail.com
In response to	Re: [BUG?] estimate_hash_bucket_stats uses wrong ndistinct for avgfreq (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: [BUG?] estimate_hash_bucket_stats uses wrong ndistinct for avgfreq
List	pgsql-hackers

Hi all,
>Yeah, in my last email, I said I tried this way. But I worried that
>rel->tuples may be zero for an empty relation.
In my previous email, I worried that rel->tuples might be zero for an empty relation.
But it is safe here, because an empty relation has no rows in pg_statistic,
so we will not enter the if (HeapTupleIsValid(vardata.statsTuple)) block.
Sorry for the noise.
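
To make the reasoning concrete, here is a minimal sketch of the guard as I understand it. It is not the real planner code: SketchRel, SketchVarData, has_stats_tuple and sketch_scale_factor are hypothetical stand-ins, and the sketch assumes that a stats tuple exists only for a relation that had rows when it was analyzed.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the planner structures. */
typedef struct SketchRel
{
	double		tuples;		/* total tuples in the relation */
	double		rows;		/* estimated rows after restriction clauses */
} SketchRel;

typedef struct SketchVarData
{
	const SketchRel *rel;
	bool		has_stats_tuple;	/* models HeapTupleIsValid(vardata.statsTuple) */
} SketchVarData;

/*
 * Return rows / tuples when statistics exist, otherwise the caller's
 * fallback value.
 *
 * Assumption: a stats tuple is present only for a relation that was
 * non-empty when analyzed, so tuples > 0 inside the branch and the
 * division is safe.
 */
static double
sketch_scale_factor(const SketchVarData *vardata, double fallback)
{
	if (vardata->has_stats_tuple)
		return vardata->rel->rows / vardata->rel->tuples;

	/* No statistics (e.g. an empty relation): never touch tuples. */
	return fallback;
}
```

With tuples = 1000, rows = 250 and a stats tuple present, this returns 0.25. For an empty relation without a stats tuple, it returns the fallback and the division is never reached.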

Tom Lane <tgl@sss.pgh.pa.us> wrote on Sunday, March 1, 2026 at 08:08:


> Hmm ... doesn't this contradict your argument that avgfreq and
> mcv_freq need to be calculated on the same basis?  Admittedly
> that was just a heuristic, but I'm not seeing why it's wrong.
>

Agreed.

> > The reason for this is that estfract is calculated as:
> >     estfract = 1.0 / ndistinct;
> > where ndistinct has been adjusted to account for restriction clauses.
> > Therefore, we must also use the adjusted avgfreq when adjusting
> > estfract here:
>
> It feels like that might end up double-counting the effects of
> the restriction clauses.
>
> Anyway, we all seem to agree that s/rel->rows/rel->tuples/ is the
> correct fix for a newly-introduced bug.  I'm inclined to proceed
> by committing that fix (along with any regression test fallout)
> and then investigating the avgfreq change as an independent matter.

+1


--
Thanks,
Tender Wang
