Thread: Tracking Full Table Scans
Hello Everyone,
I am preparing a plan to track the tables that undergo full table scans most often.
If I track seq_scan from pg_stat_user_tables, will that help (considering the most recently analyzed tables)?
Please help!
Thanks
VB
Venkat Balaji <venkat.balaji@verse.in> wrote:

> I am preparing a plan to track the tables undergoing Full Table
> Scans for most number of times.
>
> If i track seq_scan from the pg_stat_user_tables, will that help
> (considering the latest analyzed ones) ?

Well, yeah; but be careful not to assume that a sequential scan is
always a bad thing. Here's our top ten tables for sequential scans
in a database which is performing quite well:

cc=> select seq_scan, n_live_tup, relname
cc->   from pg_stat_user_tables
cc->   order by seq_scan desc
cc->   limit 10;
 seq_scan | n_live_tup |      relname
----------+------------+--------------------
 81264339 |         20 | MaintCode
 16840299 |          3 | DbTranImageStatus
 14905181 |         18 | ControlFeature
 11908114 |         10 | AgingBoundary
  8789288 |         22 | CtofcTypeCode
  7786110 |          6 | PrefCounty
  6303959 |          9 | ProtOrderHistEvent
  5835430 |          1 | ControlRecord
  5466806 |          1 | ControlAccounting
  5202028 |         12 | ProtEventOrderType
(10 rows)

You'll notice that they are all very small tables. In all cases the
entire heap fits in one page, so any form of indexed scan would at
least double the number of pages visited, and slow things down.

If you have queries which are not performing to expectations, your
best bet might be to pick one of them and post it here, following
the advice on this page:

http://wiki.postgresql.org/wiki/SlowQueryQuestions

-Kevin
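[Editor's note: building on Kevin's observation, one possible refinement (a sketch that was not part of the original thread) is to filter out the tiny lookup tables, so only sequential scans of tables large enough to matter are surfaced. The 1 MB threshold is an arbitrary illustrative cutoff.]

```sql
-- Sketch: sequential scans on tables whose heap exceeds ~1 MB,
-- excluding the small tables where a seq scan is the right plan.
SELECT s.seq_scan,
       s.seq_tup_read,
       pg_size_pretty(pg_relation_size(s.relid)) AS table_size,
       s.relname
FROM   pg_stat_user_tables s
WHERE  pg_relation_size(s.relid) > 1024 * 1024   -- arbitrary cutoff
ORDER  BY s.seq_scan DESC
LIMIT  10;
```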
Thanks a lot Kevin !!
Yes. I intend to track full table scans first, to ensure that only small tables, or tables with very few pages, are (as you said) being scanned in full.
I have yet to identify the slow-running queries. I will surely come back with them in the future.
Thanks
VB
On Tue, Sep 27, 2011 at 8:02 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
I would like to know the difference between "n_tup_upd" and "n_tup_hot_upd".
Thanks
VB
On Tue, Sep 27, 2011 at 9:56 PM, Venkat Balaji <venkat.balaji@verse.in> wrote:
Venkat Balaji <venkat.balaji@verse.in> wrote:

> I would like to know the difference between "n_tup_upd" and
> "n_tup_hot_upd".

A HOT update is used when none of the updated columns are used in an
index and there is room for the new tuple (version of the row) on
the same page as the old tuple. This is faster for a number of
reasons, and cleanup of the old tuple is a little different.

If you want the gory implementation details, take a look at this
file in the source tree:

src/backend/access/heap/README.HOT

-Kevin
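[Editor's note: a sketch, not from the original thread. Since both counters Kevin describes live in pg_stat_user_tables, the fraction of updates that were HOT can be computed directly; a low percentage on a heavily updated table may suggest indexed columns are being updated or pages are too full.]

```sql
-- Sketch: share of updates that were HOT, per table.
SELECT relname,
       n_tup_upd,
       n_tup_hot_upd,
       round(100.0 * n_tup_hot_upd / NULLIF(n_tup_upd, 0), 1) AS hot_pct
FROM   pg_stat_user_tables
WHERE  n_tup_upd > 0
ORDER  BY n_tup_upd DESC
LIMIT  10;
```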
On 09/28/2011 12:26 AM, Venkat Balaji wrote:

> Thanks a lot Kevin !!
>
> Yes. I intended to track full table scans first to ensure that only
> small tables or tables with very less pages are (as you said) getting
> scanned full.

It can also be best to do a full table scan of a big table for some
queries. If the query needs to touch all the data in a table - for
example, for an aggregate - then the query will often complete fastest
and with less disk use by using a sequential scan.

I guess what you'd really want to know is to find out about queries
that do seqscans to match relatively small fractions of the total
tuples scanned, ie low-selectivity seqscans. I'm not sure whether or
not it's possible to gather this data with PostgreSQL's current level
of stats detail.

--
Craig Ringer
Thanks Kevin !!
I will have a look at the source tree.
Regards
VB
On Tue, Sep 27, 2011 at 10:45 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
Yes. I am looking for the justified full table scans.
If bigger tables are being scanned, I would like to know the percentage of rows scanned versus the percentage of rows returned as output.
If the query needs, say, 80% of the rows as output, then a full table scan is usually better.
I believe there is a possibility for this in PostgreSQL. I think we can get this using pg_stat_user_tables, pg_statio_user_tables and pg_stats.
I will post the calculation once I get it.
Thanks
VB
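[Editor's note: the thread never posted the final calculation, but one possible starting point (a sketch, with the caveat Craig raises: the views record only cumulative totals, not per-query selectivity) is the average number of tuples read per sequential scan, compared against the table's live tuple count.]

```sql
-- Sketch: average tuples read per seq scan vs. live tuples.
-- A per-scan average close to n_live_tup means whole-table reads.
SELECT relname,
       seq_scan,
       n_live_tup,
       round(seq_tup_read::numeric / NULLIF(seq_scan, 0), 1)
           AS avg_tup_per_scan
FROM   pg_stat_user_tables
WHERE  seq_scan > 0
ORDER  BY seq_tup_read DESC
LIMIT  10;
```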
On Wed, Sep 28, 2011 at 6:25 AM, Craig Ringer <ringerc@ringerc.id.au> wrote: