Thread: Row level stats
It would be nice if PGAvd could receive row level stats without a large hit to simple queries. Ran a simple test. Calling pgstat_report_tabstat() at a frequency of once per second reduces the time taken for row level stats to be negligible: 500k select TRUE statements took: 6:50 with stats off 8:35 with row level stats on 6:52 with the below applied and row level stats on Anyone object to delaying the submission of stats by that timeframe? Are row level stats enough for PGAvd? More or less the below for the change on line 1912 in postgres.c currtme = GetCurrentAbsoluteTime(); if (lasttme < currtme) {pgstat_report_tabstat();lasttme = currtme; } -- Rod Taylor <rbt@rbt.ca> PGP Key: http://www.rbt.ca/rbtpub.asc
Rod Taylor <rbt@rbt.ca> writes: > [ optimizing for small frequent queries ] What if the client doesn't come back with another query for awhile? regards, tom lane
On Thu, 2003-03-06 at 01:13, Tom Lane wrote: > Rod Taylor <rbt@rbt.ca> writes: > > [ optimizing for small frequent queries ] > > What if the client doesn't come back with another query for awhile? Yup.. Thats why I said you basically lose 1 seconds worth of stats (or rather can't count on when they arrive). But, that doesn't matter for most cases as you cannot cause a lot of trouble with 1 seconds worth of activity. That said, I'd make the actual value a GUC. However, having the numbers off by a little (the %age they're off by decreases with increased activity) is still a lot more useful than not having any numbers at all. Anyway, the basic proposal would be to change the on / off flags for row & block level into a 'stats delay'. A large number would be off, 0 would have the same properties as today. Default to be determined (but 1 second seems to be plenty). -- Rod Taylor <rbt@rbt.ca> PGP Key: http://www.rbt.ca/rbtpub.asc
On Thu, 2003-03-06 at 09:16, Rod Taylor wrote: > Anyway, the basic proposal would be to change the on / off flags for row > & block level into a 'stats delay'. A large number would be off, 0 > would have the same properties as today. Default to be determined (but > 1 second seems to be plenty). "A large number" seems messy. Why not "-1"? I also don't really like using "0" to mean "enabled", but I can't think of anything better... And the default should be to disable row-level stats, naturally. Cheers, Neil -- Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
On Thu, 2003-03-06 at 11:40, Neil Conway wrote: > On Thu, 2003-03-06 at 09:16, Rod Taylor wrote: > > Anyway, the basic proposal would be to change the on / off flags for row > > & block level into a 'stats delay'. A large number would be off, 0 > > would have the same properties as today. Default to be determined (but > > 1 second seems to be plenty). > > "A large number" seems messy. Why not "-1"? I also don't really like > using "0" to mean "enabled", but I can't think of anything better... > > And the default should be to disable row-level stats, naturally. Not if we expect pg_autovacuum to be able to work. It *needs* this information. I really want an auto-vacuum in the backend, and figured the first step was to fix stats so they don't have such a high cost. -- Rod Taylor <rbt@rbt.ca> PGP Key: http://www.rbt.ca/rbtpub.asc
On Thu, 2003-03-06 at 11:43, Rod Taylor wrote: > Not if we expect pg_autovacuum to be able to work [ by default ] I don't regard that as a very high priority at this stage of the game: since I think pg_avd should be in contrib/ for the next release at the very least, I don't think anyone capable of installing a contrib/ module will gripe about tweaking a GUC parameter. Cheers, Neil -- Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
Rod Taylor <rbt@rbt.ca> writes: > Not if we expect pg_autovacuum to be able to work. It *needs* this > information. Why? The implementation I've had in mind for autovacuum would rely on FSM not the pg_stats daemon. regards, tom lane
On Thu, 2003-03-06 at 11:49, Tom Lane wrote: > Rod Taylor <rbt@rbt.ca> writes: > > Not if we expect pg_autovacuum to be able to work. It *needs* this > > information. > > Why? > > The implementation I've had in mind for autovacuum would rely on FSM not > the pg_stats daemon. FSM as I understand it gets it information from vacuum, and has an space marker removed when an insert uses the tuple. Determining when vacuum runs would be based more on Deletes / updates than anything wouldn't it? Anyhow, I'd be curious to know your full thoughts on autovacuum and how it may work. -- Rod Taylor <rbt@rbt.ca> PGP Key: http://www.rbt.ca/rbtpub.asc
Rod Taylor <rbt@rbt.ca> writes: > On Thu, 2003-03-06 at 11:49, Tom Lane wrote: >> The implementation I've had in mind for autovacuum would rely on FSM not >> the pg_stats daemon. > FSM as I understand it gets it information from vacuum, and has an space > marker removed when an insert uses the tuple. FSM doesn't, but easily could, keep track of the number of space requests it gets. We could also add some code to use it to keep count of tuple deletions. Unlike the pg_stats daemon, FSM doesn't necessarily try to track every single table --- but it would track all the heavily-used ones, which is plenty for AVD purposes. I tend to think of pg_stats as a bit of a failed experiment. The overhead is excessive for what it accomplishes. Maybe we should rethink the communication mechanism ... although I have no immediate ideas for what else to do. regards, tom lane