(2013/07/23 20:02), Greg Smith wrote:
> On 7/23/13 2:16 AM, Satoshi Nagayasu wrote:
>> I've been working on new pgstattuple function to allow
>> block sampling [1] in order to reduce block reads while
>> scanning a table. A PoC patch is attached.
>
> Take a look at all of the messages linked in
> https://commitfest.postgresql.org/action/patch_view?id=778
>
> Jaime and I tried to do what you're working on then, including a random
> block sampling mechanism modeled on the stats_target mechanism. We
> didn't do that as part of pgstattuple though, which was a mistake.
>
> Noah created some test cases as part of his thorough review that were
> not computing the correct results. Getting the results correct for all
> of the various types of PostgreSQL tables and indexes ended up being
> much harder than the sampling part. See
> http://www.postgresql.org/message-id/20120222052747.GE8592@tornado.leadboat.com
> in particular for that.
Thanks for the info. I have read the previous discussion.
I'm looking forward to seeing more feedback on this approach,
both on the design and on the performance improvement.
So I have submitted this to the next CF.
>> This new function, pgstattuple2(), samples only 3,000 blocks
>> (which accounts for 24MB) from the table at random, and estimates
>> several parameters of the entire table.
>
> There should be an input parameter to the function for how much sampling
> to do, and if it's possible to make its scale look like ANALYZE's,
> that's helpful too.
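For illustration, a call with an explicit sample size (in blocks) might
look roughly like this; the second argument and the table name are only
hypothetical, since the current PoC has no such parameter and always
samples 3,000 blocks:

    -- hypothetical form with an explicit block-sample size;
    -- the PoC as posted takes no such argument
    SELECT * FROM pgstattuple2('pgbench_accounts', 3000);
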
>
> I have a project for this summer that includes reviving this topic and
> making sure it works on some real-world systems. If you want to work on
> this too, I can easily combine that project into what you're doing.
Yeah, I'm interested in that. Is there anything you can share?
Regards,
--
Satoshi Nagayasu <snaga@uptime.jp>
Uptime Technologies, LLC. http://www.uptime.jp