Thread: [RFC] pgstattuple/pgstatindex enhancement
Hi, I'm thinking of pgstattuple/pgstatindex enhancement. There are a few things I need to change, but I'd like to have some comments and suggestions from hackers before tackling. (1) Fix pgstatindex arguments to work same as pgstattuple. As the document describes, pgstattuple accepts 'schema.table' expression and oid of the table, but pgstatindex doesn't. (because I didn't add that when I created pgstatindex...) http://www.postgresql.org/docs/devel/static/pgstattuple.html So, I'd like to change pgstatindex arguments to allow schema name and oid. Does it make sense? (2) Enhance pgstattuple/pgstatindex to allow block sampling. Now, we have large tables and indexes in PostgreSQL, and these are growing day by day. pgstattuple and pgstatindex are both very important to keep database performance well, but doing full-scans on large tables and indexes would generate big performance impact. So, now I think pgstattuple and pgstatindex should support 'block sampling' to collect block statistics with avoiding full-scans. With this block sampling feature, pgstattuple/pgstatindex would be able to collect block statistics from 1~10% of the blocks in the table/index if the table/index is large (maybe 10GB or more). It would allow us to run pgstattuple/pgstatindex easier. Is it worth having? Any comments? Regards, -- Satoshi Nagayasu <snaga@uptime.jp> Uptime Technologies, LLC. http://www.uptime.jp
On Tue, Feb 12, 2013 at 10:22 AM, Satoshi Nagayasu <snaga@uptime.jp> wrote: > (1) Fix pgstatindex arguments to work same as pgstattuple. > > As the document describes, pgstattuple accepts 'schema.table' > expression and oid of the table, but pgstatindex doesn't. > (because I didn't add that when I created pgstatindex...) > > http://www.postgresql.org/docs/devel/static/pgstattuple.html > > So, I'd like to change pgstatindex arguments to allow > schema name and oid. > > Does it make sense? Not sure. It seems nice, but it's also a backward-compatibility break. So I don't know. > (2) Enhance pgstattuple/pgstatindex to allow block sampling. > > Now, we have large tables and indexes in PostgreSQL, and these are > growing day by day. > > pgstattuple and pgstatindex are both very important to keep database > performance well, but doing full-scans on large tables and indexes > would generate big performance impact. > > So, now I think pgstattuple and pgstatindex should support > 'block sampling' to collect block statistics with avoiding full-scans. > > With this block sampling feature, pgstattuple/pgstatindex would be > able to collect block statistics from 1~10% of the blocks in the > table/index if the table/index is large (maybe 10GB or more). Now that sounds really nice. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
(2013/02/15 1:55), Robert Haas wrote: > On Tue, Feb 12, 2013 at 10:22 AM, Satoshi Nagayasu <snaga@uptime.jp> wrote: >> (1) Fix pgstatindex arguments to work same as pgstattuple. >> >> As the document describes, pgstattuple accepts 'schema.table' >> expression and oid of the table, but pgstatindex doesn't. >> (because I didn't add that when I created pgstatindex...) >> >> http://www.postgresql.org/docs/devel/static/pgstattuple.html >> >> So, I'd like to change pgstatindex arguments to allow >> schema name and oid. >> >> Does it make sense? > > Not sure. It seems nice, but it's also a backward-compatibility > break. So I don't know. Yeah, actually, the backward-compatibility issue is the first thing I have considered, and now I think we can keep it. Now, pgstattuple() function accepts following syntax: pgstattuple('table') -- table name (searches in search_path) pgstattuple('schema.table') -- schema and tablename pgstattuple(1234) -- oid and pgstatindex() function only accepts below so far: pgstatindex('index') -- index name (searches in search_path) Then, we can easily add new syntax: pgstatindex('schema.index') -- schema and index name pgstatindex(1234) -- oid I think this would allow us to modify pgstatindex() without breaking the backward-compatibility. >> (2) Enhance pgstattuple/pgstatindex to allow block sampling. >> >> Now, we have large tables and indexes in PostgreSQL, and these are >> growing day by day. >> >> pgstattuple and pgstatindex are both very important to keep database >> performance well, but doing full-scans on large tables and indexes >> would generate big performance impact. >> >> So, now I think pgstattuple and pgstatindex should support >> 'block sampling' to collect block statistics with avoiding full-scans. >> >> With this block sampling feature, pgstattuple/pgstatindex would be >> able to collect block statistics from 1~10% of the blocks in the >> table/index if the table/index is large (maybe 10GB or more). > > Now that sounds really nice. Thanks. I will try it. Regards, -- Satoshi Nagayasu <snaga@uptime.jp> Uptime Technologies, LLC. http://www.uptime.jp