Re: ANALYZE sampling is too good - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: ANALYZE sampling is too good
Date
Msg-id 52A637A4.5090801@nasby.net
In response to Re: ANALYZE sampling is too good  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: ANALYZE sampling is too good
List pgsql-hackers
On 12/8/13 1:49 PM, Heikki Linnakangas wrote:
> On 12/08/2013 08:14 PM, Greg Stark wrote:
>> The whole accounts table is 1.2GB and contains 10 million rows. As
>> expected with rows_per_block set to 1 it reads 240MB of that
>> containing nearly 2 million rows (and takes nearly 20s -- doing a full
>> table scan for select count(*) only takes about 5s):
>
> One simple thing we could do, without or in addition to changing the algorithm, is to issue posix_fadvise() calls for
> the blocks we're going to read. It should at least be possible to match the speed of a plain sequential scan that way.
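
As a rough illustration of the posix_fadvise() suggestion quoted above, here is a minimal sketch in Python rather than PostgreSQL's C. It assumes an 8 kB block size and a plain file standing in for a relation segment; `read_sampled_blocks` and `make_demo_file` are hypothetical helper names, not anything in the ANALYZE code. The idea is simply to hand the kernel the whole list of sampled blocks with POSIX_FADV_WILLNEED before reading any of them, so it can start fetching them ahead of the reads.

```python
import os
import tempfile

BLOCK_SIZE = 8192  # assumed: PostgreSQL's default block size


def read_sampled_blocks(path, block_numbers, block_size=BLOCK_SIZE):
    """Hint the kernel about every sampled block, then read them in block order."""
    blocks = sorted(set(block_numbers))
    fd = os.open(path, os.O_RDONLY)
    try:
        # os.posix_fadvise is not available on every platform, so it is optional here.
        if hasattr(os, "posix_fadvise"):
            for blkno in blocks:
                os.posix_fadvise(fd, blkno * block_size, block_size,
                                 os.POSIX_FADV_WILLNEED)
        # The reads themselves are ordinary positional reads.
        return {blkno: os.pread(fd, block_size, blkno * block_size)
                for blkno in blocks}
    finally:
        os.close(fd)


def make_demo_file(nblocks, block_size=BLOCK_SIZE):
    """Hypothetical helper: write a scratch file whose block N holds byte N % 256."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        for blkno in range(nblocks):
            f.write(bytes([blkno % 256]) * block_size)
    return f.name
```

Whether this gets anywhere near sequential-scan speed depends on the kernel and storage; the sketch shows only the call pattern, not a measurement.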
 

Hrm... maybe it wouldn't be very hard to use async IO here, either? I'm thinking the stage2 work could be done in the
callback routine...
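
To make the shape of that idea concrete, here is a small sketch in Python. It does not use real kernel async IO: a thread pool with completion callbacks stands in for it, and `stage2` is a hypothetical per-block function (presumably the row-sampling pass over each chosen block). The 8 kB block size and the name `sample_with_callbacks` are likewise assumptions for illustration.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 8192  # assumed block size


def sample_with_callbacks(path, block_numbers, stage2, workers=4):
    """Read blocks concurrently and run stage2(blkno, data) as each read completes."""
    results = {}
    lock = threading.Lock()
    fd = os.open(path, os.O_RDONLY)

    def on_done(blkno, future):
        # Runs when the read for blkno has finished. Note that exceptions raised
        # inside a done-callback are logged by concurrent.futures, not propagated.
        value = stage2(blkno, future.result())
        with lock:
            results[blkno] = value

    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            for blkno in block_numbers:
                fut = pool.submit(os.pread, fd, BLOCK_SIZE, blkno * BLOCK_SIZE)
                fut.add_done_callback(lambda f, b=blkno: on_done(b, f))
        # Leaving the with-block waits for all reads and their callbacks.
    finally:
        os.close(fd)
    return results
```

The point of the callback structure is that per-block processing overlaps with the reads still in flight, instead of waiting for every block to arrive first.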
 
-- 
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net


