Thread: pg_stat_transaction patch

pg_stat_transaction patch

From
Joel Jacobson
Date:
Hi,

I propose a set of new statistics functions and system views.

I need these functions in order to do automated testing of our system, consisting of hundreds of stored procedures in plpgsql.
My plan is to develop some additional functions to pgTAP, benefiting from the new system tables I've added.

The patch should apply to 9.0beta or HEAD, but I created it using 8.4.3 because that's the version I'm using.

I'm thankful for your feedback.

My apologies if the packaging of the patch does not conform to your guidelines, feedback on this is also welcome.

-- 
Best regards,

Joel Jacobson
Glue Finance

E: jj@gluefinance.com
T: +46 70 360 38 01

Postal address:
Glue Finance AB
Box  549
114 11  Stockholm
Sweden

Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden


README:

Background
==========
The views pg_stat_user_tables and pg_stat_user_functions shows statistics on tables and functions.
The underlying functions named pg_stat_get_* fetches recent data from the statistics collector, and returns the requested value for the given "oid" (i.e. "tableid/relationid" or "functionid").
In the end of each transaction[1], the collected statistics are sent to the statistics collector[2].

[1] upon COMMIT/ROLLBACK, or a bit later (the report frequency is controlled by the PGSTAT_STAT_INTERVAL setting, default value is 500 ms)
[2] if you do a ps aux, it is the process named "postgres: stats collector process"

Problem
=======
Within a current transaction, there was no way of accessing the internal data structures which contains the so far collected statistics.
I wanted to check exactly what data changes my functions made and what functions they called, without having to commit the transaction
and without mixing the statistics data with all the other simultaneously running transactions.

Solution
========
I have exported get accessor methods to the internal data structure containing so far collected statistics for the current transaction.

I have also exported the method pgstat_report_stat to make it possible to force a "report and reset" of the so far collected statistics.
This was necessary to avoid not-yet-reported statistics for a previous transaction to affect the current transaction.

I used the unused_oids script to find unused oids and choosed the range between 3030-3044 for the new functions.

Functions
=========
test=# \df+ pg_catalog.pg_stat_get_transaction_*
                                                                                                                        List of functions
   Schema   |                    Name                    | Result data type | Argument data types |  Type  | Volatility | Owner | Language |                Source code                 |                               Description                               
------------+--------------------------------------------+------------------+---------------------+--------+------------+-------+----------+--------------------------------------------+-------------------------------------------------------------------------
 pg_catalog | pg_stat_get_transaction_blocks_fetched     | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_blocks_fetched     | statistics: number of blocks fetched in current transaction
 pg_catalog | pg_stat_get_transaction_blocks_hit         | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_blocks_hit         | statistics: number of blocks found in cache in current transaction
 pg_catalog | pg_stat_get_transaction_dead_tuples        | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_dead_tuples        | statistics: number of dead tuples in current transaction
 pg_catalog | pg_stat_get_transaction_function_calls     | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_function_calls     | statistics: number of function calls in current transaction
 pg_catalog | pg_stat_get_transaction_function_self_time | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_function_self_time | statistics: self execution time of function in current transaction
 pg_catalog | pg_stat_get_transaction_function_time      | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_function_time      | statistics: execution time of function in current transaction
 pg_catalog | pg_stat_get_transaction_live_tuples        | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_live_tuples        | statistics: number of live tuples in current transaction
 pg_catalog | pg_stat_get_transaction_numscans           | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_numscans           | statistics: number of scans done for table/index in current transaction
 pg_catalog | pg_stat_get_transaction_tuples_deleted     | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_tuples_deleted     | statistics: number of tuples deleted in current transaction
 pg_catalog | pg_stat_get_transaction_tuples_fetched     | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_tuples_fetched     | statistics: number of tuples fetched by idxscan in current transaction
 pg_catalog | pg_stat_get_transaction_tuples_hot_updated | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_tuples_hot_updated | statistics: number of tuples hot updated in current transaction
 pg_catalog | pg_stat_get_transaction_tuples_inserted    | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_tuples_inserted    | statistics: number of tuples inserted in current transaction
 pg_catalog | pg_stat_get_transaction_tuples_returned    | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_tuples_returned    | statistics: number of tuples read by seqscan in current transaction
 pg_catalog | pg_stat_get_transaction_tuples_updated     | bigint           | oid                 | normal | stable     | joel  | internal | pg_stat_get_transaction_tuples_updated     | statistics: number of tuples updated in current transaction
(14 rows)

I also had to create a new internal function, "get_funcstat_entry".
This function find or create a PgStat_BackendFunctionEntry entry for the given oid (functionid).
The name and behaviour is similar to the existing function "get_tabstat_entry".

System views
============
pg_stat_transaction_tables - shows so far collected table statistics for the current transaction (almost identical structure as pg_stat_user_tables, but lacks the last_* columns)
pg_stat_transaction_functions - shows so far collected function statistics for the current transaction (identical structure as pg_stat_user_functions)

Test/Use case
=============


Patched files
=============
/doc/src/sgml/monitoring.sgml
/src/backend/catalog/system_views.sql
/src/backend/postmaster/pgstat.c
/src/backend/utils/adt/pgstatfuncs.c
/src/include/catalog/pg_proc.h
/src/include/pgstat.h

Attachment

Re: pg_stat_transaction patch

From
Alvaro Herrera
Date:
Excerpts from Joel Jacobson's message of jue may 06 09:51:41 -0400 2010:
> Hi,
> 
> I propose a set of new statistics functions and system views.

Hi,

Please add your patch to the next commitfest: http://commitfest.postgresql.org

Thanks
-- 


Re: pg_stat_transaction patch

From
Takahiro Itagaki
Date:
Joel Jacobson <joel@gluefinance.com> wrote:

> I propose a set of new statistics functions and system views.
>
> I need these functions in order to do automated testing of our system,
> consisting of hundreds of stored procedures in plpgsql.
> My plan is to develop some additional functions to pgTAP, benefiting from
> the new system tables I've added.

I ported your patch into 9.0beta, but it doesn't work well.
I had two assertion failures from the run.sql:

TRAP: FailedAssertion("!(entry->trans == ((void *)0))", File: "pgstat.c", Line: 715)
TRAP: FailedAssertion("!(tabstat->trans == trans)", File: "pgstat.c", Line: 1756)

Also, pg_stat_transaction_functions returned no rows from the test case even
after I removed those assertions. There are no rows in your test/run.out, too.

I like your idea itself, but more works are required for the implementation.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center


Attachment

Re: pg_stat_transaction patch

From
Joel Jacobson
Date:
Hajimemashite Takahiro,

Thanks for your feedback.

I applied all the changes on 9.0beta manually and then it compiled without any assertion failures.

I also changed the oids to a different unused range, since the ones I used before had been taken in 9.0beta1.

There are still some problems though. I get 0 back from the functions supposed to return the number of inserts/updates for the current transaction.

I suspect it is because get_tabstat_entry for some reason returns NULL, in for example pg_stat_get_transaction_tuples_inserted(PG_FUNCTION_ARGS).

Does the function look valid? If you can find the error in it, the other functions probably have the same problem.

It is strange though the function "pg_stat_get_transaction_numscans" works fine, and it looks like it works the same way.

I added run.out843 and run.out90b1, showing the output from both patched versions.

run.out843 is the intended output, while run.out90b1 gives 0 on the columns n_tup_ins and n_tup_upd (and probably n_tup_del etc also).

I hope someone can help locating the problem.

Thanks.

Best regards,

Joel

2010/5/7 Takahiro Itagaki <itagaki.takahiro@oss.ntt.co.jp>

Joel Jacobson <joel@gluefinance.com> wrote:

> I propose a set of new statistics functions and system views.
>
> I need these functions in order to do automated testing of our system,
> consisting of hundreds of stored procedures in plpgsql.
> My plan is to develop some additional functions to pgTAP, benefiting from
> the new system tables I've added.

I ported your patch into 9.0beta, but it doesn't work well.
I had two assertion failures from the run.sql:

TRAP: FailedAssertion("!(entry->trans == ((void *)0))", File: "pgstat.c", Line: 715)
TRAP: FailedAssertion("!(tabstat->trans == trans)", File: "pgstat.c", Line: 1756)

Also, pg_stat_transaction_functions returned no rows from the test case even
after I removed those assertions. There are no rows in your test/run.out, too.

I like your idea itself, but more works are required for the implementation.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center




--
Best regards,

Joel Jacobson
Glue Finance

E: jj@gluefinance.com
T: +46 70 360 38 01

Postal address:
Glue Finance AB
Box  549
114 11  Stockholm
Sweden

Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden
Attachment

Re: pg_stat_transaction patch

From
Takahiro Itagaki
Date:
Joel Jacobson <joel@gluefinance.com> wrote:

> I applied all the changes on 9.0beta manually and then it compiled without
> any assertion failures.
> 
> I also changed the oids to a different unused range, since the ones I used
> before had been taken in 9.0beta1.

Thanks, but you still need to test your patch:
- You need to check your patch with "make check", because it requires  adjustments in "rule" test; Your
pg_stat_transaction_functionis the  longest name in the system catalog.
 
- You need to configure postgres with --enable-cassert to enable internal  varidations. The attached test case failed
withthe following TRAP.
 
TRAP: FailedAssertion("!(entry->trans == ((void *)0))", File: "pgstat.c", Line: 715)
TRAP: FailedAssertion("!(tabstat->trans == trans)", File: "pgstat.c", Line: 1758)

> I suspect it is because get_tabstat_entry for some reason returns NULL, in
> for example pg_stat_get_transaction_tuples_inserted(PG_FUNCTION_ARGS).
> 
> Does the function look valid? If you can find the error in it, the other
> functions probably have the same problem.

For the above trap, we can see the comment:   /* Shouldn't have any pending transaction-dependent counts */
We don't expect to read stats entries during transactions. I'm not sure
whether accessing transitional stats during transaction is safe or not.

We might need to go other directions, for example: - Use "session stats" instead "transaction stats". You can see the
same  information in difference of counters between before and after the   transaction. - Export pgBufferUsage instead
ofrelation counters. They are   buffer counters for all relations, but we can obviously export   them because they are
justplain variables.
 

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center




Re: pg_stat_transaction patch

From
Joel Jacobson
Date:
Hi Takahiro,

Here is an updated version of the patch.

Thanks Magnus H for the help :)

1.4: Ported to head. Updated tests. Removed pg_stat_report.


2010/5/25 Takahiro Itagaki <itagaki.takahiro@oss.ntt.co.jp>

Joel Jacobson <joel@gluefinance.com> wrote:

> I applied all the changes on 9.0beta manually and then it compiled without
> any assertion failures.
>
> I also changed the oids to a different unused range, since the ones I used
> before had been taken in 9.0beta1.

Thanks, but you still need to test your patch:

 - You need to check your patch with "make check", because it requires
  adjustments in "rule" test; Your pg_stat_transaction_function is the
  longest name in the system catalog.

 - You need to configure postgres with --enable-cassert to enable internal
  varidations. The attached test case failed with the following TRAP.
TRAP: FailedAssertion("!(entry->trans == ((void *)0))", File: "pgstat.c", Line: 715)
TRAP: FailedAssertion("!(tabstat->trans == trans)", File: "pgstat.c", Line: 1758)

> I suspect it is because get_tabstat_entry for some reason returns NULL, in
> for example pg_stat_get_transaction_tuples_inserted(PG_FUNCTION_ARGS).
>
> Does the function look valid? If you can find the error in it, the other
> functions probably have the same problem.

For the above trap, we can see the comment:
   /* Shouldn't have any pending transaction-dependent counts */
We don't expect to read stats entries during transactions. I'm not sure
whether accessing transitional stats during transaction is safe or not.

We might need to go other directions, for example:
 - Use "session stats" instead "transaction stats". You can see the same
   information in difference of counters between before and after the
   transaction.
 - Export pgBufferUsage instead of relation counters. They are
   buffer counters for all relations, but we can obviously export
   them because they are just plain variables.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center





--
Best regards,

Joel Jacobson
Glue Finance

E: jj@gluefinance.com
T: +46 70 360 38 01

Postal address:
Glue Finance AB
Box  549
114 11  Stockholm
Sweden

Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden
Attachment

Re: pg_stat_transaction patch

From
Itagaki Takahiro
Date:
"Accessor functions to get so far collected statistics for the current
transaction"
https://commitfest.postgresql.org/action/patch_view?id=301

The latest version of the patch works as expected, and also well-formed.
I'll mark the patch to "Ready for Committer".
http://archives.postgresql.org/message-id/AANLkTikSYdRwATGAP5U6O6zwIO4b_WNJXIbUd2y2tI01@mail.gmail.com

The added views and functions only accumulates activity counters
in the same transaction where the stat views are queried. So we
need to check the view before COMMIT or ROLLBACK.

The only issue in the patch is too long view and function names: - pg_stat_transaction_user_tables (31 chars) -
pg_stat_get_transaction_tuples_hot_updated(42 chars) - pg_stat_get_transaction_function_self_time (42 chars)
 

Since we've already used _xact_ in some system objects, we could replace
_transaction_ parts with _xact_. It will save 7 key types per query ;-)

-- 
Itagaki Takahiro


Re: pg_stat_transaction patch

From
Magnus Hagander
Date:
On Mon, Jul 12, 2010 at 11:36, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:
> "Accessor functions to get so far collected statistics for the current
> transaction"
> https://commitfest.postgresql.org/action/patch_view?id=301
>
> The latest version of the patch works as expected, and also well-formed.
> I'll mark the patch to "Ready for Committer".
> http://archives.postgresql.org/message-id/AANLkTikSYdRwATGAP5U6O6zwIO4b_WNJXIbUd2y2tI01@mail.gmail.com
>
> The added views and functions only accumulates activity counters
> in the same transaction where the stat views are queried. So we
> need to check the view before COMMIT or ROLLBACK.
>
> The only issue in the patch is too long view and function names:
>  - pg_stat_transaction_user_tables (31 chars)
>  - pg_stat_get_transaction_tuples_hot_updated (42 chars)
>  - pg_stat_get_transaction_function_self_time (42 chars)
>
> Since we've already used _xact_ in some system objects, we could replace
> _transaction_ parts with _xact_. It will save 7 key types per query ;-)

+1 on shortening that down, they're scary long now :-)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: pg_stat_transaction patch

From
Tom Lane
Date:
Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:
> "Accessor functions to get so far collected statistics for the current
> transaction"
> https://commitfest.postgresql.org/action/patch_view?id=301

> The latest version of the patch works as expected, and also well-formed.
> I'll mark the patch to "Ready for Committer".

I'm working through this patch now.  I kind of think that we ought to
drop the functions and view columns that claim to report live/dead
tuples.  In the first place, they're misnamed, because what they're
actually reporting is delta values (ie, new live tuples or new dead
tuples).  In the second place, they don't seem very useful.  The
live_tuples count is absolutely, positively guaranteed to read out as
zero, because a transaction that hasn't reached commit cannot have
created any known-live tuples.  The dead_tuples count can read out as
positive under certain circumstances, for example if a subtransaction
inserted some tuples and was then rolled back --- we know for certain
those tuples are dead and so the t_delta_dead_tuples count gets
incremented at subtransaction rollback.  But for the most part the
dead_tuples count is going to be a lot less than people might expect
based on what the transaction's done so far.

If we keep these we're going to have to document them a lot better
than the submitted patch does.  But I think we should just take 'em out.
Comments?
        regards, tom lane


Re: pg_stat_transaction patch

From
Tom Lane
Date:
Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:
> "Accessor functions to get so far collected statistics for the current
> transaction"
> https://commitfest.postgresql.org/action/patch_view?id=301

> The only issue in the patch is too long view and function names:
>   - pg_stat_transaction_user_tables (31 chars)
>   - pg_stat_get_transaction_tuples_hot_updated (42 chars)
>   - pg_stat_get_transaction_function_self_time (42 chars)

> Since we've already used _xact_ in some system objects, we could replace
> _transaction_ parts with _xact_. It will save 7 key types per query ;-)

Applied, with assorted corrections -

* Renamed *_transaction_* to *_xact_* as suggested by Itagaki-san.

* Removed functions and view columns for delta live/dead tuple counts.

* Marked functions as volatile ... they certainly aren't stable.

* Got rid of use of get_tabstat_entry() to fetch table entries.  That
function forcibly creates tabstat entries if they weren't there before,
which was absolutely not what we want here: it'd result in bloating the
tabstat arrays with entries for tables the current transaction actually
never touched.  Worse, since you weren't passing the correct isshared
flag for the particular relation, the entries could be created with the
wrong isshared setting, leading to misbehavior if they did get used later
in the transaction.  We have to use a find-don't-create function here.

* Fixed bogus handling of inserted/updated/deleted counts --- you need to
add on the pending counts for all open levels of subtransaction.

* Assorted docs improvement and other minor polishing.

BTW, I notice that the patch provides pg_stat_get_xact_blocks_fetched()
and pg_stat_get_xact_blocks_hit(), but doesn't build any views on top of
them.  Was this intentional?  Providing a full complement of
pg_statio_xact_* views seems like overkill to me, but maybe that was where
you were intending to go and forgot.  If the functions are there then
anyone who needs the functionality can easily build their own views atop
them, so this might be an intentional compromise position, but I'm not
sure.  Or maybe we should decide that intratransaction statio numbers
aren't likely to be of interest to anybody, and drop the functions too.
        regards, tom lane


Re: pg_stat_transaction patch

From
Joel Jacobson
Date:
2010/8/8 Tom Lane <tgl@sss.pgh.pa.us>
Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:
> "Accessor functions to get so far collected statistics for the current
> transaction"
> https://commitfest.postgresql.org/action/patch_view?id=301

> The only issue in the patch is too long view and function names:
>   - pg_stat_transaction_user_tables (31 chars)
>   - pg_stat_get_transaction_tuples_hot_updated (42 chars)
>   - pg_stat_get_transaction_function_self_time (42 chars)

> Since we've already used _xact_ in some system objects, we could replace
> _transaction_ parts with _xact_. It will save 7 key types per query ;-)

Applied, with assorted corrections -

* Renamed *_transaction_* to *_xact_* as suggested by Itagaki-san.

* Removed functions and view columns for delta live/dead tuple counts.

* Marked functions as volatile ... they certainly aren't stable.

* Got rid of use of get_tabstat_entry() to fetch table entries.  That
function forcibly creates tabstat entries if they weren't there before,
which was absolutely not what we want here: it'd result in bloating the
tabstat arrays with entries for tables the current transaction actually
never touched.  Worse, since you weren't passing the correct isshared
flag for the particular relation, the entries could be created with the
wrong isshared setting, leading to misbehavior if they did get used later
in the transaction.  We have to use a find-don't-create function here.

* Fixed bogus handling of inserted/updated/deleted counts --- you need to
add on the pending counts for all open levels of subtransaction.

* Assorted docs improvement and other minor polishing.

BTW, I notice that the patch provides pg_stat_get_xact_blocks_fetched()
and pg_stat_get_xact_blocks_hit(), but doesn't build any views on top of
them.  Was this intentional?  Providing a full complement of
pg_statio_xact_* views seems like overkill to me, but maybe that was where
you were intending to go and forgot.  If the functions are there then
anyone who needs the functionality can easily build their own views atop
them, so this might be an intentional compromise position, but I'm not
sure.  Or maybe we should decide that intratransaction statio numbers
aren't likely to be of interest to anybody, and drop the functions too.

When I created the views, I just copied the existing pg_stat_user_* views without knowing if any columns where irrelevant for current transaction data.
I guess if someone would need the blocks_fetched/hit, they could build their own view.
 

                       regards, tom lane



--
Best regards,

Joel Jacobson
Glue Finance

E: jj@gluefinance.com
T: +46 70 360 38 01

Postal address:
Glue Finance AB
Box  549
114 11  Stockholm
Sweden

Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden