Thread: Row level stats

Row level stats

From
Rod Taylor
Date:
It would be nice if PGAvd could receive row level stats without a large
hit to simple queries.

Ran a simple test.  Calling pgstat_report_tabstat() at a frequency of
once per second reduces the time taken for row level stats to be
negligible:

500k select TRUE statements took:
6:50 with stats off
8:35 with row level stats on
6:52 with the below applied and row level stats on

Anyone object to delaying the submission of stats by that timeframe?
Are row level stats enough for PGAvd?


More or less the below for the change on line 1912 in postgres.c

currtme = GetCurrentAbsoluteTime();
if (lasttme < currtme)
{pgstat_report_tabstat();lasttme = currtme;
}

--
Rod Taylor <rbt@rbt.ca>

PGP Key: http://www.rbt.ca/rbtpub.asc

Re: Row level stats

From
Tom Lane
Date:
Rod Taylor <rbt@rbt.ca> writes:
> [ optimizing for small frequent queries ]

What if the client doesn't come back with another query for awhile?
        regards, tom lane


Re: Row level stats

From
Rod Taylor
Date:
On Thu, 2003-03-06 at 01:13, Tom Lane wrote:
> Rod Taylor <rbt@rbt.ca> writes:
> > [ optimizing for small frequent queries ]
>
> What if the client doesn't come back with another query for awhile?

Yup.. Thats why I said you basically lose 1 seconds worth of stats (or
rather can't count on when they arrive).

But, that doesn't matter for most cases as you cannot cause a lot of
trouble with 1 seconds worth of activity.  That said, I'd make the
actual value a GUC.

However, having the numbers off by a little (the %age they're off by
decreases with increased activity) is still a lot more useful than not
having any numbers at all.


Anyway, the basic proposal would be to change the on / off flags for row
& block level into a 'stats delay'.  A large number would be off, 0
would have the same properties as today.  Default to be determined (but
1 second seems to be plenty).

--
Rod Taylor <rbt@rbt.ca>

PGP Key: http://www.rbt.ca/rbtpub.asc

Re: Row level stats

From
Neil Conway
Date:
On Thu, 2003-03-06 at 09:16, Rod Taylor wrote:
> Anyway, the basic proposal would be to change the on / off flags for row
> & block level into a 'stats delay'.  A large number would be off, 0
> would have the same properties as today.  Default to be determined (but
> 1 second seems to be plenty).

"A large number" seems messy. Why not "-1"? I also don't really like
using "0" to mean "enabled", but I can't think of anything better...

And the default should be to disable row-level stats, naturally.

Cheers,

Neil
-- 
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC





Re: Row level stats

From
Rod Taylor
Date:
On Thu, 2003-03-06 at 11:40, Neil Conway wrote:
> On Thu, 2003-03-06 at 09:16, Rod Taylor wrote:
> > Anyway, the basic proposal would be to change the on / off flags for row
> > & block level into a 'stats delay'.  A large number would be off, 0
> > would have the same properties as today.  Default to be determined (but
> > 1 second seems to be plenty).
>
> "A large number" seems messy. Why not "-1"? I also don't really like
> using "0" to mean "enabled", but I can't think of anything better...
>
> And the default should be to disable row-level stats, naturally.

Not if we expect pg_autovacuum to be able to work. It *needs* this
information.  I really want an auto-vacuum in the backend, and figured
the first step was to fix stats so they don't have such a high cost.

--
Rod Taylor <rbt@rbt.ca>

PGP Key: http://www.rbt.ca/rbtpub.asc

Re: Row level stats

From
Neil Conway
Date:
On Thu, 2003-03-06 at 11:43, Rod Taylor wrote:
> Not if we expect pg_autovacuum to be able to work [ by default ]

I don't regard that as a very high priority at this stage of the game:
since I think pg_avd should be in contrib/ for the next release at the
very least, I don't think anyone capable of installing a contrib/ module
will gripe about tweaking a GUC parameter.

Cheers,

Neil
-- 
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC





Re: Row level stats

From
Tom Lane
Date:
Rod Taylor <rbt@rbt.ca> writes:
> Not if we expect pg_autovacuum to be able to work. It *needs* this
> information.

Why?

The implementation I've had in mind for autovacuum would rely on FSM not
the pg_stats daemon.
        regards, tom lane


Re: Row level stats

From
Rod Taylor
Date:
On Thu, 2003-03-06 at 11:49, Tom Lane wrote:
> Rod Taylor <rbt@rbt.ca> writes:
> > Not if we expect pg_autovacuum to be able to work. It *needs* this
> > information.
>
> Why?
>
> The implementation I've had in mind for autovacuum would rely on FSM not
> the pg_stats daemon.

FSM as I understand it gets it information from vacuum, and has an space
marker removed when an insert uses the tuple.

Determining when vacuum runs would be based more on Deletes / updates
than anything wouldn't it?


Anyhow, I'd be curious to know your full thoughts on autovacuum and how
it may work.

--
Rod Taylor <rbt@rbt.ca>

PGP Key: http://www.rbt.ca/rbtpub.asc

Re: Row level stats

From
Tom Lane
Date:
Rod Taylor <rbt@rbt.ca> writes:
> On Thu, 2003-03-06 at 11:49, Tom Lane wrote:
>> The implementation I've had in mind for autovacuum would rely on FSM not
>> the pg_stats daemon.

> FSM as I understand it gets it information from vacuum, and has an space
> marker removed when an insert uses the tuple.

FSM doesn't, but easily could, keep track of the number of space
requests it gets.  We could also add some code to use it to keep count
of tuple deletions.  Unlike the pg_stats daemon, FSM doesn't necessarily
try to track every single table --- but it would track all the
heavily-used ones, which is plenty for AVD purposes.

I tend to think of pg_stats as a bit of a failed experiment.  The
overhead is excessive for what it accomplishes.  Maybe we should rethink
the communication mechanism ... although I have no immediate ideas for
what else to do.
        regards, tom lane