Thread: [PATCH] Enhancements to pg_stat_statements contrib extension
Hi -hackers,
queryid bigint,
query text,
start timestamp, *
total_time float8,
rows int8,
shared_blks_hit int8,
shared_blks_read int8,
shared_blks_dirtied int8,
shared_blks_written int8,
local_blks_hit int8,
local_blks_read int8,
local_blks_dirtied int8,
local_blks_written int8,
temp_blks_read int8,
temp_blks_written int8,
blk_read_time float8,
blk_write_time float8,
wal_records int8,
wal_fpi int8,
wal_bytes numeric
--
This is my first patch here in the mailing list, so I tried to explain the "why" and the "how" of this enhancement.
Needs :
Formerly, Oracle DBA, I used to query v$sql to know the latest queries issued by each session with their timestamp. I found postgresql pg_stat_statements very useful for this need, but the aggregation did not always permitted me to analyze correctly the queries issued (at least for a buffer stats per query overview). So, I enhanced the existing pg_stat_statements.
Changes overview :
- new configuration pg_stat_statements.track_every = (TRUE|FALSE)
-> generating per query data in a new view : pg_stat_sql
-> can be resetted by using : pg_stat_sql_reset(userid,dbid,queryid)
- added the timestamp per query to the view (replacing the number of calls of pg_stat_statements)
-> the view column itself :
userid oid,
dbid oid,queryid bigint,
query text,
start timestamp, *
total_time float8,
rows int8,
shared_blks_hit int8,
shared_blks_read int8,
shared_blks_dirtied int8,
shared_blks_written int8,
local_blks_hit int8,
local_blks_read int8,
local_blks_dirtied int8,
local_blks_written int8,
temp_blks_read int8,
temp_blks_written int8,
blk_read_time float8,
blk_write_time float8,
wal_records int8,
wal_fpi int8,
wal_bytes numeric
- data are stored in a hash in a new shared memory area.
- query texts are still stored in the same file.
The goal was to avoid generating too much data with track_every option enabled.
- added some tests to the sql/pg_stat_statements.sql
- added views to pg_stat_statements--1.7--1.8.sql
Bug fix :
- with UTF8 encoding, the "\0" to delimit the end of the query text was buggy; modified to query[query_len]=0;
Note :
I didn't want to change version number by myself, the attached files are still pointing to 1.8
This is my first code for pgsql.
I wanted to share with you this enhancement, hope you'll find it useful.
Regards,
Yoan SULTAN
Yoan SULTAN
Attachment
Hi, On Sun, Mar 28, 2021 at 09:05:10AM +0200, Yoan SULTAN wrote: > > *Needs :* > Formerly, Oracle DBA, I used to query v$sql to know the latest queries > issued by each session with their timestamp. I found postgresql > pg_stat_statements very useful for this need, but the aggregation did not > always permitted me to analyze correctly the queries issued (at least for a > buffer stats per query overview). So, I enhanced the existing > pg_stat_statements. I'm not an Oracle DBA so I may be lacking some background, but I don't really understand what this feature is about. Is it about having the statistics for the last query execution for each backend, or simply the last N queries executed even if they're all from the same backend? More details of what this feature should do and how it should interact with existing version would be welcome. > *Changes overview :* > - new configuration pg_stat_statements.track_every = (TRUE|FALSE) > -> generating per query data in a new view : pg_stat_sql > -> can be resetted by using : pg_stat_sql_reset(userid,dbid,queryid) > - added the timestamp per query to the view (replacing the number of calls > of pg_stat_statements) > -> the view column itself : > > *userid oid,* > *dbid oid,queryid bigint,query text,start timestamp, > *total_time float8,rows int8,shared_blks_hit int8, > shared_blks_read int8, shared_blks_dirtied int8, > shared_blks_written int8, local_blks_hit int8, local_blks_read > int8, local_blks_dirtied int8, local_blks_written int8, > temp_blks_read int8, temp_blks_written int8, blk_read_time > float8, blk_write_time float8, wal_records int8, wal_fpi > int8, wal_bytes numeric* > > > - data are stored in a hash in a new shared memory area. > - query texts are still stored in the same file.* If this is about storing the last query (or N queries), shouldn't it be stored on some kind of ring buffer rather than a hash table? I also don't understand what is the EVERY_FACTOR supposed to do, given that pgss_every_max seems to be used to request more memory but then never used to actually store additional entries in the hash table. > *Bug fix :* > - with UTF8 encoding, the "\0" to delimit the end of the query text was > buggy; modified to query[query_len]=0; Do you have an example on how to reproduce that? Also, if there's a bug it should be fixed outside of a new feature proposal. > *Note :* > I didn't want to change version number by myself, the attached files are > still pointing to 1.8 > This is my first code for pgsql. I tried to have a look at the patch but it's unfortunately a bit hard to do with the current format. You should instead submit a diff (e.g. with "git diff") or a patch (git format-patch) to ease review, ideally with a version number. If you're looking for pointers you can look at https://wiki.postgresql.org/wiki/Submitting_a_Patch, and for larger documentation maybe https://wiki.postgresql.org/wiki/So,_you_want_to_be_a_developer%3F and https://wiki.postgresql.org/wiki/Developer_FAQ.