Re: ANALYZE sampling is too good - Mailing list pgsql-hackers

From	Jim Nasby
Subject	Re: ANALYZE sampling is too good
Date	December 9, 2013 21:35:49
Msg-id	52A637A4.5090801@nasby.net
In response to	Re: ANALYZE sampling is too good (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses	Re: ANALYZE sampling is too good
List	pgsql-hackers

On 12/8/13 1:49 PM, Heikki Linnakangas wrote:
> On 12/08/2013 08:14 PM, Greg Stark wrote:
>> The whole accounts table is 1.2GB and contains 10 million rows. As
>> expected with rows_per_block set to 1 it reads 240MB of that
>> containing nearly 2 million rows (and takes nearly 20s -- doing a full
>> table scan for select count(*) only takes about 5s):
>
> One simple thing we could do, without or in addition to changing the algorithm, is to issue posix_fadvise() calls for
> the blocks we're going to read. It should at least be possible to match the speed of a plain sequential scan that way.
 

Hrm... maybe it wouldn't be very hard to use async IO here either? I'm thinking it wouldn't be very hard to do the
stage2 work in the callback routine...
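
For reference, the posix_fadvise() idea quoted above might look roughly like the sketch below. This is not PostgreSQL code: the helper name prefetch_blocks(), its parameters, and the use of a raw file descriptor with a fixed block size are all assumptions made for illustration. It only shows hinting each sampled block to the kernel with POSIX_FADV_WILLNEED before the blocks are read.

```c
#define _XOPEN_SOURCE 600   /* expose posix_fadvise() and fileno() */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper (name and signature are assumptions, not a
 * PostgreSQL API): advise the kernel that each block in `blocks`
 * will be read soon, so the reads can be started ahead of time.
 *
 * Returns the number of blocks for which the hint was accepted.
 */
static size_t
prefetch_blocks(int fd, const uint32_t *blocks, size_t nblocks,
                size_t block_size)
{
    size_t hinted = 0;

    assert(block_size > 0);

    for (size_t i = 0; i < nblocks; i++)
    {
        /* Byte offset of the block within the file. */
        off_t offset = (off_t) blocks[i] * (off_t) block_size;

        /* posix_fadvise() returns an error number; it does not set errno. */
        int rc = posix_fadvise(fd, offset, (off_t) block_size,
                               POSIX_FADV_WILLNEED);

        if (rc == 0)
            hinted++;
        else
            fprintf(stderr, "posix_fadvise(block %u): %s\n",
                    (unsigned) blocks[i], strerror(rc));
    }
    return hinted;
}
```

A caller would presumably pick the sample block numbers first, pass them to this helper, and only then read the blocks. The hint is advisory, so whether it actually closes the gap to a plain sequential scan would have to be measured.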
 
-- 
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net
