Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate. - Mailing list pgsql-hackers

From Alexander Kuzmenkov
Subject Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
Date
Msg-id 101dd073-1cb7-b471-7eeb-aa9ee928a1c9@postgrespro.ru
Whole thread Raw
In response to Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuplesinaccurate.
List pgsql-hackers
On 02.03.2018 02:49, Tom Lane wrote:
> I looked at this and don't think it really answers the question.  What
> happens is that, precisely because we only slowly adapt our estimate of
> density towards the new measurement, we will have an overestimate of
> density if the true density is decreasing (even if the new measurement is
> spot-on), and that corresponds exactly to an overestimate of reltuples.
> No surprise there.  The question is why it fails to converge to reality
> over time.

The calculation I made for the first step applies to the next steps too, 
with minor differences. So, the estimate increases at each step. Just 
out of interest, I plotted the reltuples for 60 steps, and it doesn't 
look like it's going to converge anytime soon (see attached).
Looking at the formula, this overshoot term is created when we multiply 
the old density by the new number of pages. I'm not sure how to fix 
this. I think we could average the number of tuples, not the densities. 
The attached patch demonstrates what I mean.

-- 
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: ON CONFLICT DO UPDATE for partitioned tables
Next
From: Robert Haas
Date:
Subject: Re: [HACKERS] path toward faster partition pruning