Re: Reduce maximum error in tuples estimation after vacuum. - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Reduce maximum error in tuples estimation after vacuum.
Date
Msg-id 00d001ce7329$5fcd0c50$1f6724f0$@kapila@huawei.com
In response to Re: Reduce maximum error in tuples estimation after vacuum.  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List pgsql-hackers
On Wednesday, June 26, 2013 7:40 AM Kyotaro HORIGUCHI wrote:
> I've recovered from messing up.
> 
> <snip>
> > Please let me have a bit of time to diagnose this.
> 
> I was completely confused and heading down the wrong path. I looked into
> the vacuum for UPDATEs, not DELETEs, so it's quite reasonable to have
> such results.
> 
> The renewed test script attached shows the verbose output of vacuum
> after the deletes. I got the following output from it.
> 
> # I believe this runs for you.
> 
> | INFO: "t": found 989999 removable, 110 nonremovable row
> |       versions in 6308 out of 10829 pages
> 
> In such a partially scanned case, lazy_scan_heap() tries to estimate the
> resulting num_tuples in vac_estimate_reltuples(), assuming uniform
> tuple density, which fails for such a strong imbalance
> made by bulk updates.
> 
> Do you find any differences between what you get and the
> following that I got?

I could see the same output with your latest script. I could also reproduce
the result by running the test as individual SQL statements.
One of the main points for reproducing it with individual statements was to
keep autovacuum = off.

Now I can look into it further. I have still not gone through your new
approach to calculating reltuples in detail, but I am wondering whether
there is any way the estimates can be improved with a different
calculation in vac_estimate_reltuples().

One thing I have observed is that the 2nd parameter, is_analyze, of
vac_estimate_reltuples() is currently not used.
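
For reference while reviewing, here is a rough Python sketch of the
calculation as I understand it from the current source: the old
reltuples/relpages density is blended with the density seen on the scanned
pages, weighted by the fraction of pages scanned. It is a simplified model,
not a copy of vac_estimate_reltuples(). The function name
estimate_reltuples and the old statistics in the example (1,000,000 tuples
in 10829 pages) are assumptions for illustration; only the 110 tuples and
the 6308 of 10829 pages come from the vacuum output quoted above.

```python
import math


def estimate_reltuples(old_rel_pages, old_rel_tuples, total_pages,
                       scanned_pages, scanned_tuples, is_analyze=False):
    """Simplified model of the moving-average reltuples estimate.

    is_analyze is accepted only to mirror the C signature; it is unused
    here, matching the observation above.
    """
    # Whole table scanned: the count is exact, use it as-is.
    if scanned_pages >= total_pages:
        return float(scanned_tuples)
    # Nothing scanned: keep the old estimate.
    if scanned_pages == 0:
        return float(old_rel_tuples)
    # No old density available: scale the observed density to the table.
    if old_rel_pages == 0:
        return float(math.floor(
            scanned_tuples / scanned_pages * total_pages + 0.5))

    # Blend old and observed densities (tuples per page), weighting the
    # observation by the fraction of the table that was scanned.
    old_density = old_rel_tuples / old_rel_pages
    new_density = scanned_tuples / scanned_pages
    multiplier = scanned_pages / total_pages
    updated_density = old_density + (new_density - old_density) * multiplier
    return float(math.floor(updated_density * total_pages + 0.5))


if __name__ == "__main__":
    # Old statistics below are hypothetical, not taken from the test run.
    estimate = estimate_reltuples(
        old_rel_pages=10829, old_rel_tuples=1_000_000,
        total_pages=10829, scanned_pages=6308, scanned_tuples=110)
    print(estimate)
```

Under these assumed old statistics the estimate stays in the hundreds of
thousands even though only 110 live tuples were seen on the scanned pages,
because the unscanned pages are still credited with the old density. That
is the kind of error the uniformity assumption produces after a bulk
delete.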

I cannot work on it till early next week, so others are welcome to join
the review.

With Regards,
Amit Kapila.



