Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate. - Mailing list pgsql-hackers

From Alexander Kuzmenkov
Subject Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
Date
Msg-id fa8122c2-ea7a-ff38-49ac-e6616aec38ea@postgrespro.ru
In response to [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.  (David Gould <daveg@sonic.net>)
Responses Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.  (David Gould <daveg@sonic.net>)
List pgsql-hackers
Hi David,

I was able to reproduce the problem using your script. 
analyze_counts.awk is missing, though.

The idea of using the result of ANALYZE as-is, without additional 
averaging, was discussed when vac_estimate_reltuples() was introduced 
originally. Ultimately, it was decided not to do so. You can find the 
discussion in this thread: 

https://www.postgresql.org/message-id/flat/BANLkTinL6QuAm_Xf8teRZboG2Mdy3dR_vw%40mail.gmail.com#BANLkTinL6QuAm_Xf8teRZboG2Mdy3dR_vw@mail.gmail.com

The core problem here seems to be that this moving-average calculation
does not converge in your scenario. It can be shown that when the number
of live tuples is constant and the number of pages grows, the estimated
number of tuples increases at each step. Do you think we could use some
other formula that converges in this scenario but still filters the
noise in ANALYZE results? I couldn't think of one yet.
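
To make the non-convergence concrete, here is a minimal Python sketch of
the moving-average update. It is modelled on my reading of
vac_estimate_reltuples() and is not a copy of the C code: the function
and parameter names, the 30000-page sample size, the 10% page growth per
step, and the noise-free sample are all assumptions chosen for
illustration.

```python
import math


def estimate_reltuples(old_pages, old_tuples, total_pages,
                       scanned_pages, scanned_tuples):
    """Moving-average update of the tuple estimate (assumed form)."""
    if scanned_pages >= total_pages:
        # Whole table was scanned: trust the count as-is.
        return float(scanned_tuples)
    if scanned_pages == 0:
        # Nothing scanned: keep the old estimate.
        return float(old_tuples)
    if old_pages == 0:
        # No previous density: extrapolate from the sample alone.
        return math.floor(scanned_tuples / scanned_pages * total_pages + 0.5)

    old_density = old_tuples / old_pages          # uses the OLD page count
    new_density = scanned_tuples / scanned_pages  # density seen in the sample
    multiplier = scanned_pages / total_pages      # weight of the new sample
    updated_density = old_density + (new_density - old_density) * multiplier
    # The averaged density is multiplied by the NEW page count.
    return math.floor(updated_density * total_pages + 0.5)


def simulate(live_tuples=1_000_000, start_pages=100_000, growth=1.1,
             sample_pages=30_000, steps=10):
    """Constant live tuples, growing page count, idealised ANALYZE sample."""
    pages = start_pages
    estimate = float(live_tuples)  # start from the correct value
    history = []
    for _ in range(steps):
        new_pages = int(pages * growth)
        scanned = min(sample_pages, new_pages)
        # Hypothetical noise-free sample: it sees exactly the true density.
        scanned_tuples = live_tuples * scanned / new_pages
        estimate = estimate_reltuples(pages, estimate, new_pages,
                                      scanned, scanned_tuples)
        history.append(estimate)
        pages = new_pages
    return history


if __name__ == "__main__":
    for step, est in enumerate(simulate(), start=1):
        print(step, est)
```

Even though every simulated sample reports the true density, the
estimate starts at the correct value and moves further above it at each
step, because the stale density (old tuples over old pages) is scaled by
the new, larger page count and the sample only corrects a
scanned_pages / total_pages fraction of that error.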

-- 
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


