Re: [PoC] pgstattuple2: block sampling to reduce physical read - Mailing list pgsql-hackers

From Satoshi Nagayasu
Subject Re: [PoC] pgstattuple2: block sampling to reduce physical read
Date
Msg-id 52340DDA.9090606@uptime.jp
Whole thread Raw
In response to Re: [PoC] pgstattuple2: block sampling to reduce physical read  (Greg Smith <greg@2ndQuadrant.com>)
Responses Re: [PoC] pgstattuple2: block sampling to reduce physical read
List pgsql-hackers
(2013/07/23 20:02), Greg Smith wrote:
> On 7/23/13 2:16 AM, Satoshi Nagayasu wrote:
>> I've been working on new pgstattuple function to allow
>> block sampling [1] in order to reduce block reads while
>> scanning a table. A PoC patch is attached.
> 
> Take a look at all of the messages linked in
> https://commitfest.postgresql.org/action/patch_view?id=778
> 
> Jaime and I tried to do what you're working on then, including a random
> block sampling mechanism modeled on the stats_target mechanism.  We
> didn't do that as part of pgstattuple though, which was a mistake.
> 
> Noah created some test cases as part of his thorough review that were
> not computing the correct results.  Getting the results correct for all
> of the various types of PostgreSQL tables and indexes ended up being
> much harder than the sampling part.  See
> http://www.postgresql.org/message-id/20120222052747.GE8592@tornado.leadboat.com
> in particular for that.

Thanks for the info. I have read the previous discussion.

I'm looking forward to seeing more feedback on this approach,
in terms of design and performance improvement.
So, I have submitted this for the next CF.

>> This new function, pgstattuple2(), samples only 3,000 blocks
>> (which accounts 24MB) from the table randomly, and estimates
>> several parameters of the entire table.
> 
> There should be an input parameter to the function for how much sampling
> to do, and if it's possible to make the scale for it to look like
> ANALYZE that's helpful too.
> 
> I have a project for this summer that includes reviving this topic and
> making sure it works on some real-world systems.  If you want to work on
> this too, I can easily combine that project into what you're doing.

Yeah, I'm interested in that. Something can be shared?

Regards,
-- 
Satoshi Nagayasu <snaga@uptime.jp>
Uptime Technologies, LLC. http://www.uptime.jp



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Review: Patch to compute Max LSN of Data Pages
Next
From: Robert Haas
Date:
Subject: Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE