Re: Improving N-Distinct estimation by ANALYZE - Mailing list pgsql-hackers

From Jim C. Nasby
Subject Re: Improving N-Distinct estimation by ANALYZE
Msg-id 20060105195818.GV43311@pervasive.com
In response to Re: Improving N-Distinct estimation by ANALYZE  (Greg Stark <gsstark@mit.edu>)
List pgsql-hackers
On Thu, Jan 05, 2006 at 10:12:29AM -0500, Greg Stark wrote:
> Worse, my recollection from the paper I mentioned earlier was that sampling
> small percentages like 3-5% didn't get you an acceptable accuracy. Before you
> got anything reliable you found you were sampling very large percentages of
> the table. And note that if you have to sample anything over 10-20% you may as
> well just read the whole table. Random access reads are that much slower.

If I'm reading backend/commands/analyze.c right, the heap is accessed
linearly: only the blocks selected for the sample are read, but they are
read in heap order, which shouldn't be anywhere near as bad as random access.
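
To make the access pattern concrete, here is a minimal sketch of choosing a random set of blocks while still visiting them in ascending block order. It uses selection sampling (Knuth's Algorithm S) and is an illustration only, not the code in analyze.c; the function name and parameters are made up for the example.

```python
import random


def sample_blocks_in_heap_order(total_blocks, sample_size, rng=None):
    """Return sample_size distinct block numbers in ascending order.

    Hypothetical helper for illustration: it makes one forward pass over
    the block numbers 0..total_blocks-1 and decides for each one whether
    it belongs to the sample (selection sampling, Knuth's Algorithm S).
    """
    rng = rng or random.Random()
    remaining_to_pick = min(sample_size, total_blocks)
    chosen = []
    for block in range(total_blocks):
        if remaining_to_pick == 0:
            break  # sample is complete, no need to look further
        blocks_left = total_blocks - block
        # Pick this block with probability remaining_to_pick / blocks_left.
        # When the two are equal the test always passes, so the sample
        # always reaches the requested size.
        if rng.random() * blocks_left < remaining_to_pick:
            chosen.append(block)
            remaining_to_pick -= 1
    return chosen


if __name__ == "__main__":
    # Assumed numbers: a 100000-block table and a 300-block sample.
    blocks = sample_blocks_in_heap_order(100_000, 300, random.Random(42))
    print(blocks[:10])
```

Because the chosen block numbers only ever increase, the reads skip forward through the file and never seek backward, which is the property that distinguishes this pattern from fetching the same blocks in arbitrary order.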
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

