[PATCH] optional cleaning queries stored in pg_stat_statements - Mailing list pgsql-hackers

From Tomas Vondra
Subject [PATCH] optional cleaning queries stored in pg_stat_statements
Msg-id 4EB5E9A8.9060309@fuzzy.cz
List pgsql-hackers
Hi everyone,

I propose a patch that would allow optional "cleaning" of queries
tracked in pg_stat_statements, compressing the result and making it more
readable.

The default behavior is that when the same query is run with different
parameter values, each variant is stored as a separate query (the
strings do differ).

A small example: when you run "pgbench -S" you'll get queries like
these

   SELECT abalance FROM pgbench_accounts WHERE aid = 12433
   SELECT abalance FROM pgbench_accounts WHERE aid = 2322
   SELECT abalance FROM pgbench_accounts WHERE aid = 52492

and so on, and each one is listed separately in pg_stat_statements.
This often pollutes pg_stat_statements with many near-identical entries.

The patch implements a simple "cleaning" that replaces the parameter
values with generic strings - e.g. numbers are turned into ":n", so the
queries mentioned above are turned into

   SELECT abalance FROM pgbench_accounts WHERE aid = :n

and thus tracked as a single query in pg_stat_statements.
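
To illustrate the effect (this is not the code from the patch, just a
minimal Python sketch), the replacement of integer literals can be
approximated with a regular expression. The helper name replace_numbers
is hypothetical, and the pattern only covers plain integers:

```python
import re

# Hypothetical helper, not part of the patch: replace standalone integer
# literals with the ":n" placeholder. Digits that are part of an
# identifier (e.g. "t1") or of a decimal number are left alone.
_NUMBER = re.compile(r"(?<![\w.])\d+(?![\w.])")

def replace_numbers(query: str) -> str:
    return _NUMBER.sub(":n", query)

queries = [
    "SELECT abalance FROM pgbench_accounts WHERE aid = 12433",
    "SELECT abalance FROM pgbench_accounts WHERE aid = 2322",
    "SELECT abalance FROM pgbench_accounts WHERE aid = 52492",
]

# All three queries collapse into a single cleaned string.
cleaned = {replace_numbers(q) for q in queries}
```

With the three pgbench queries above, "cleaned" ends up holding just one
entry, which is the point of the exercise.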

The patch provides an enum GUC (pg_stat_statements.clean) with three
options - none, basic and aggressive. The default option is "none";
"basic" performs the basic value replacement (described above), and
"aggressive" performs some additional cleaning - for example, it replaces
multiple spaces with a single one.
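
A rough Python sketch of how the three levels relate to each other,
assuming the only extra step in "aggressive" is collapsing runs of
spaces (the patch may do more). The function clean_query and the MODES
tuple are hypothetical names, and the number handling is simplified to
plain integers:

```python
import re

# The three values of the GUC described above.
MODES = ("none", "basic", "aggressive")

def clean_query(query: str, mode: str = "none") -> str:
    """Hypothetical model of the three cleaning levels."""
    if mode not in MODES:
        raise ValueError(f"unknown cleaning mode: {mode!r}")
    if mode == "none":
        return query  # default: the query text is stored unchanged
    # "basic": replace integer literals with the ":n" placeholder
    cleaned = re.sub(r"(?<![\w.])\d+(?![\w.])", ":n", query)
    if mode == "aggressive":
        # additional cleaning: collapse runs of spaces into one space
        cleaned = re.sub(r" {2,}", " ", cleaned)
    return cleaned
```

Note that this sketch collapses spaces everywhere, including inside
string literals it does not recognize, so it only shows the layering of
the modes and not a safe implementation.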

The parsing is intentionally very simple and cleans the query in a
single pass. It currently handles three literal types:

 a) string (basic, C-style escaped, Unicode-escaped, dollar-quoted)
 b) numeric (although 1.925e-3 is replaced by :n-:n)
 c) boolean (true/false)

There is probably room for improvement (e.g. handling UESCAPE).
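
A single-pass scanner over the three literal types listed above might
look roughly like the Python sketch below. It is an illustration under
several assumptions, not the patch itself: only basic '...' strings are
handled (no E'', U&'' or dollar-quoted forms), the ":s" and ":b"
placeholders are made up here (only ":n" is what the patch uses), and
the way numbers are consumed is just one plausible reading that happens
to reproduce the ":n-:n" result for 1.925e-3.

```python
def clean_single_pass(query: str) -> str:
    """Hypothetical single-pass cleaner for basic strings, numbers, booleans."""
    out = []
    i, n = 0, len(query)
    while i < n:
        ch = query[i]
        if ch == "'":
            # Basic string literal; a doubled '' is an embedded quote.
            i += 1
            while i < n:
                if query[i] == "'":
                    if i + 1 < n and query[i + 1] == "'":
                        i += 2
                        continue
                    i += 1
                    break
                i += 1
            out.append(":s")  # assumed placeholder for strings
        elif ch.isdigit():
            # Consume digits, dots and an exponent marker, but not a sign,
            # so "1.925e-3" is read as "1.925e", "-", "3".
            while i < n and (query[i].isdigit() or query[i] in ".eE"):
                i += 1
            out.append(":n")
        elif ch.isalpha() or ch == "_":
            # Keywords and identifiers are copied whole, so digits inside
            # names such as "t1" are never touched.
            start = i
            while i < n and (query[i].isalnum() or query[i] == "_"):
                i += 1
            word = query[start:i]
            is_bool = word.lower() in ("true", "false")
            out.append(":b" if is_bool else word)  # assumed placeholder
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

For example, clean_single_pass("SELECT 1.925e-3") gives "SELECT :n-:n"
in this sketch, because the scanner stops at the sign of the exponent.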


Tomas

