Thread: [RFC] pgstattuple/pgstatindex enhancement

[RFC] pgstattuple/pgstatindex enhancement

From
Satoshi Nagayasu
Date:
Hi,

I'm thinking of pgstattuple/pgstatindex enhancement.

There are a few things I need to change, but I'd like to have
some comments and suggestions from hackers before tackling.

(1) Fix pgstatindex arguments to work same as pgstattuple.

As the document describes, pgstattuple accepts 'schema.table'
expression and oid of the table, but pgstatindex doesn't.
(because I didn't add that when I created pgstatindex...)

http://www.postgresql.org/docs/devel/static/pgstattuple.html

So, I'd like to change pgstatindex arguments to allow
schema name and oid.

Does it make sense?

(2) Enhance pgstattuple/pgstatindex to allow block sampling.

Now, we have large tables and indexes in PostgreSQL, and these are
growing day by day.

pgstattuple and pgstatindex are both very important to keep database
performance well, but doing full-scans on large tables and indexes
would generate big performance impact.

So, now I think pgstattuple and pgstatindex should support
'block sampling' to collect block statistics with avoiding full-scans.

With this block sampling feature, pgstattuple/pgstatindex would be
able to collect block statistics from 1~10% of the blocks in the
table/index if the table/index is large (maybe 10GB or more).

It would allow us to run pgstattuple/pgstatindex easier.

Is it worth having?

Any comments?

Regards,
-- 
Satoshi Nagayasu <snaga@uptime.jp>
Uptime Technologies, LLC. http://www.uptime.jp



Re: [RFC] pgstattuple/pgstatindex enhancement

From
Robert Haas
Date:
On Tue, Feb 12, 2013 at 10:22 AM, Satoshi Nagayasu <snaga@uptime.jp> wrote:
> (1) Fix pgstatindex arguments to work same as pgstattuple.
>
> As the document describes, pgstattuple accepts 'schema.table'
> expression and oid of the table, but pgstatindex doesn't.
> (because I didn't add that when I created pgstatindex...)
>
> http://www.postgresql.org/docs/devel/static/pgstattuple.html
>
> So, I'd like to change pgstatindex arguments to allow
> schema name and oid.
>
> Does it make sense?

Not sure.  It seems nice, but it's also a backward-compatibility
break.  So I don't know.

> (2) Enhance pgstattuple/pgstatindex to allow block sampling.
>
> Now, we have large tables and indexes in PostgreSQL, and these are
> growing day by day.
>
> pgstattuple and pgstatindex are both very important to keep database
> performance well, but doing full-scans on large tables and indexes
> would generate big performance impact.
>
> So, now I think pgstattuple and pgstatindex should support
> 'block sampling' to collect block statistics with avoiding full-scans.
>
> With this block sampling feature, pgstattuple/pgstatindex would be
> able to collect block statistics from 1~10% of the blocks in the
> table/index if the table/index is large (maybe 10GB or more).

Now that sounds really nice.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [RFC] pgstattuple/pgstatindex enhancement

From
Satoshi Nagayasu
Date:
(2013/02/15 1:55), Robert Haas wrote:
> On Tue, Feb 12, 2013 at 10:22 AM, Satoshi Nagayasu <snaga@uptime.jp> wrote:
>> (1) Fix pgstatindex arguments to work same as pgstattuple.
>>
>> As the document describes, pgstattuple accepts 'schema.table'
>> expression and oid of the table, but pgstatindex doesn't.
>> (because I didn't add that when I created pgstatindex...)
>>
>> http://www.postgresql.org/docs/devel/static/pgstattuple.html
>>
>> So, I'd like to change pgstatindex arguments to allow
>> schema name and oid.
>>
>> Does it make sense?
> 
> Not sure.  It seems nice, but it's also a backward-compatibility
> break.  So I don't know.

Yeah, actually, the backward-compatibility issue is the first thing
I have considered, and now I think we can keep it.

Now, pgstattuple() function accepts following syntax:
 pgstattuple('table')          -- table name (searches in search_path) pgstattuple('schema.table')   -- schema and
tablename pgstattuple(1234)             -- oid
 

and pgstatindex() function only accepts below so far:
 pgstatindex('index')          -- index name (searches in search_path)

Then, we can easily add new syntax:
 pgstatindex('schema.index')   -- schema and index name pgstatindex(1234)             -- oid

I think this would allow us to modify pgstatindex() without breaking
the backward-compatibility.

>> (2) Enhance pgstattuple/pgstatindex to allow block sampling.
>>
>> Now, we have large tables and indexes in PostgreSQL, and these are
>> growing day by day.
>>
>> pgstattuple and pgstatindex are both very important to keep database
>> performance well, but doing full-scans on large tables and indexes
>> would generate big performance impact.
>>
>> So, now I think pgstattuple and pgstatindex should support
>> 'block sampling' to collect block statistics with avoiding full-scans.
>>
>> With this block sampling feature, pgstattuple/pgstatindex would be
>> able to collect block statistics from 1~10% of the blocks in the
>> table/index if the table/index is large (maybe 10GB or more).
>
> Now that sounds really nice.

Thanks. I will try it.

Regards,
-- 
Satoshi Nagayasu <snaga@uptime.jp>
Uptime Technologies, LLC. http://www.uptime.jp