pg_stat_statements with query tree based normalization

From
Greg Smith
Date:
Attached is a patch and associated test program that adds query
normalization to pg_stat_statements, based on transforming the query
tree into a series of integers and using them to match against previous
queries.  Currently pg_stat_statements only works usefully if all
queries executed use prepared statements.  This is a troublesome
limitation for a couple of reasons.  Prepared statements can easily have
performance regressions compared to the unprepared version, since the
optimizer won't necessarily be able to use as much data about MCVs etc.
as in the regular case.  People regularly avoid them for that
performance reason, and there are ORM limitations here too.  Someone was
telling me the other day that Ruby, as one example, only recently added
prepared statement support.

That means that instead of using the existing pg_stat_statements
extension, most sites have to log statements to a text file, and then
parse those logs using programs like pgfouine.  The whole process is
inefficient, makes the database look clumsy, and is a major source of
complaints about PostgreSQL.  In many shared/managed hosting
environments, looking at things exposed at the database level is
vastly easier than getting at the log files on the server.  There are
several legitimate complaints here that make this problem rise to the
level of an advocacy/adoption issue as far as I'm concerned.

Code and this whole idea are by Peter Geoghegan, with me as first-round
reviewer and diff-minimizing nitpicker.  Thanks to attendees and
sponsors of the PgWest conference for helping to fund initial
exploration of server-side query normalization using regular
expressions.  The more complicated query tree approach used here was
sponsored by Heroku.  Hurray for parallel development tracks; I was
happy to take my regex idea to the bike shed and put it out of its
misery once this one worked.

Here's a quick demo that's based on the existing sample in the docs.
Build the server and the contrib modules for pg_stat_statements and
pgbench, edit postgresql.conf to add:

shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 10000
pg_stat_statements.track = all

Then test like this using pgbench:

createdb bench
psql -d bench -c "CREATE EXTENSION pg_stat_statements"
pgbench -i bench
psql -d bench -c "SELECT pg_stat_statements_reset()"
pgbench -c10 -t30 bench
psql -x -d bench -c "SELECT query, calls, total_time, rows FROM
pg_stat_statements ORDER BY total_time DESC LIMIT 5;"

Unlike the example in the docs, there's no "-M prepared" here, but as
hoped all the similar statements are grouped together anyway.  The
output, with no log file parsing or regexes necessary, looks like this:

-[ RECORD 1 ]------------------------------------------------------------------
query      | UPDATE pgbench_branches SET bbalance = bbalance + ? WHERE bid = ?;
calls      | 300
total_time | 1.084803
rows       | 300
-[ RECORD 2 ]------------------------------------------------------------------
query      | UPDATE pgbench_tellers SET tbalance = tbalance + ? WHERE tid = ?;
calls      | 300
total_time | 0.875213
rows       | 300
...

Interestingly it even showed vacuum running against pgbench_branches in
my test case.

I've been focused on usability testing, and we could really use someone
familiar with the query node internals to review those changes too.
Probably the most controversial detail here is how exactly the "query"
field is generated, and that's the newest code too.  I feel strongly
that we need to have a stable string there for common use cases, so
that successive views of this data across time or even server restarts
will have the same key for that query's name.  That's what
administrators really expect this sort of feature to do.  I consider it
a feature that you can tell this form of normalization apart from the
kind you get from a proper prepared statement, where these parameters
would be named "$1" etc.  There are surely some edge cases where this
may need working around; I haven't really dug into looking for those yet.

The approach Peter used adds a single integer to the Const structure in
order to have enough information to substitute "?" in place of those
constants.  Adding and maintaining that is the only change outside of
the extension made here, and that overhead is paid by everyone--not
just consumers of this new code.  Neither of us likes that, but if
we're going to increase something, Const is not a bad choice; it's not
that slim a structure already.  There are 7 int-length fields, a Datum,
and two booleans in there.  It's going from about 44 bytes to 48, so
maybe a 10% size increase in that one node type.  I can't even measure
a performance change on simple pgbench tests; run-to-run variation is
way bigger.  Might be easier to measure if your query was mostly Const
nodes, now that I think about it.

If someone can come up with another way to implement this that avoids
that overhead, obviously we'd do whatever we could to eliminate it.  I
don't think it's unreasonable server bloat given how valuable this
feature is.  As I recently mentioned in the other thread that whacks
around this extension, it's possible to install and use most of this
code as an extension; you just don't have this part, and accordingly
the query text is less stable.  That's not a code consideration, but
presuming it doesn't complicate the 9.2 implementation, I hear enough
requests for this feature that I think it is a useful bonus for the
community.  If some form of this gets committed, I'd expect the updated
pg_stat_statements to be fairly popular as an extension, packaged to add
this to 8.4 through 9.1 servers.

Back to the code...an extensive regression testing program for this new
feature is included, which has made it much easier to modify the
underlying implementation without breaking it.  (If David Wheeler has
read this far down, he's now laughing at me.)  It verifies that many
different statements which should be considered the same actually are.
Running the program requires Python, psycopg2, and the Dell Store 2
database:
http://pgfoundry.org/forum/forum.php?forum_id=603  Exactly how much
testing of this should be done using the PostgreSQL regression tests
instead is one of the things I don't have a good recommendation on yet.
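
To give a flavor of the kind of check involved, here is a minimal
sketch, using the pgbench tables from the demo above rather than the
Dell Store schema the real program runs against; this is illustrative,
not the actual test program:

import psycopg2

conn = psycopg2.connect("dbname=bench")
conn.autocommit = True
cur = conn.cursor()

# Two statements that differ only in their constants...
cur.execute("SELECT pg_stat_statements_reset()")
cur.execute("SELECT abalance FROM pgbench_accounts WHERE aid = 3")
cur.execute("SELECT abalance FROM pgbench_accounts WHERE aid = 23")

# ...should collapse into a single normalized pg_stat_statements entry
cur.execute("""SELECT count(*) FROM pg_stat_statements
               WHERE query LIKE 'SELECT abalance FROM pgbench_accounts%'""")
assert cur.fetchone()[0] == 1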

The latest version of this patch right now is at
https://github.com/greg2ndQuadrant/postgres/tree/pg_stat_statements_norm
with their neato diff view at
https://github.com/greg2ndQuadrant/postgres/compare/master...pg_stat_statements_norm
That currently has only minor cleanup drift from Peter's development one
at https://github.com/Peter2ndQuadrant/postgres/branches which is the
place to watch for updates.  The git repo tree includes both the changes
and the regression test program, but I don't expect the test program to
be committed in core.

There are some known bugs/limitations that Peter and I have found in
recent testing of the program, all of which are fixable given a bit more
development time.  Just to really invert the status quo, my testing
showed this doesn't report correct numbers when using prepared
statements right now.  Peter tells me the code presently assumes that
there is one call to the planner hook followed by calls to the normal
executor hooks, each set of calls corresponding to one statement.  And
that assumption falls down when prepared statements are used.  Tom
mentioned something about synchronization issues in this area on the
other pg_stat_statements thread too, so it's not a surprise so much as a
known sticky point in the implementation.  Peter has a fix in mind for
this already; I wanted to get community feedback moving rather than
block waiting for that.

Timestamp literals aren't normalized properly yet; that fix just
slipped past the submission deadline.  There are some DEBUG1 logging
statements still around that Peter wanted to kill before submission; I
thought they should stay around for initial review at least.

As for documentation, I consider this so valuable to server operation
that I think I want to mention it beyond just the notes in contrib on
how to use it.  Still muddling over what I want to say there.

I also wonder if it makes sense to allow disabling this feature, just
for people who want the rest of pg_stat_statements without paying for
this part.  The main use case I could see for that is people who are
using prepared statements and are happy with the existing
implementation.  Again, I didn't want to dive too deep into measuring
overhead when it's quite possible we'll get feedback that requires
rework, making that effort redundant.  Suggestions on anything from the
usage to the implementation approach are welcome.  As I'm sure is clear
by the end of this long commentary, this is considered an important
feature here.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


Attachment

Re: pg_stat_statements with query tree based normalization

From
Peter Geoghegan
Date:
On 14 November 2011 04:42, Greg Smith <greg@2ndquadrant.com> wrote:
> The approach Peter used adds a single integer to the Const structure in
> order to have enough information to substitute "?" in place of those.
>  Adding and maintaining that is the only change outside of the extension
> made here, and that overhead is paid by everyone--not just consumers of this
> new code.

I've attempted to isolate that overhead, so far unsuccessfully. Attached are:

1. A simple Python + psycopg2 script for repeatedly running a
succession of similar queries that EXPLAIN would show as containing a
single "Result" node.  Each query contains 300 integer Const nodes by
default, which are simply selected.  (A sketch of this kind of script
appears after this list.)

2. The results of running the script on Greg's server, which has CPU
frequency scaling disabled.  That's an ODS spreadsheet.  Out of
consideration for file size, I've deleted the query column in each
sheet, which wasn't actually useful information.
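
Here's a minimal sketch of that kind of micro-benchmark, assuming a
locally reachable database; the connection string, query shape, and
counts are illustrative, and this is not the attached script:

import time
import psycopg2

N_CONSTS = 300   # integer Const nodes per query
N_RUNS = 1000    # queries per timed run

conn = psycopg2.connect("dbname=postgres")
conn.autocommit = True
cur = conn.cursor()

start = time.time()
for i in range(N_RUNS):
    # Vary the literals so each query is textually distinct, while the
    # plan remains a single Result node full of integer Consts.
    consts = ", ".join(str(i + j) for j in range(N_CONSTS))
    cur.execute("SELECT " + consts)
    cur.fetchall()
print("elapsed: %.3f s" % (time.time() - start))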

The results are...taking the median value of each set of runs as
representative, my patch appears to run marginally faster than head.
Of course, there is no reason to believe that it should, and I'm
certain that the difference can be explained by noise, even though
I've naturally strived to minimise it.

If someone could suggest a more telling test case, or even a
worst-case, that would be useful. This was just my first run at this.
I know that the overhead will also exist in code not well-exercised by
these queries, but I imagine that any real-world query that attempts
to exercise them all is going to add other costs that dwarf the
additional overhead and further muddy the waters.

I intend to work through the known issues with this patch in the next
couple of days.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Attachment

Re: pg_stat_statements with query tree based normalization

From
Marti Raudsepp
Date:
On Wed, Dec 7, 2011 at 03:19, Peter Geoghegan <peter@2ndquadrant.com> wrote:
> The results are...taking the median value of each set of runs as
> representative, my patch appears to run marginally faster than head.
> Of course, there is no reason to believe that it should, and I'm
> certain that the difference can be explained by noise, even though
> I've naturally strived to minimise noise.

You should use the t-test to distinguish whether two data sets show a
consistent difference or whether it's just noise.  Excel/OpenOffice
have the TTEST() function for this purpose.  For statistics doofuses
like me, just pick mode=2 (two-tailed) and type=3 (two-sample, unequal
variance), as that's the most conservative.

If the TTEST result is less than 0.05 then you have 95% confidence that
the two data sets are consistently different.  If not, you need more
consistent data.
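
For anyone scripting the comparison instead of using a spreadsheet,
here is a minimal sketch with Python and SciPy; the timing values are
made-up placeholders, and equal_var=False selects the unequal-variance
(Welch's) test, the analogue of type=3:

from scipy import stats

head_times  = [1.084, 1.092, 1.079, 1.101, 1.088]   # placeholder data
patch_times = [1.080, 1.095, 1.083, 1.099, 1.090]   # placeholder data

# Two-sided, unequal-variance t-test
t_stat, p_value = stats.ttest_ind(head_times, patch_times, equal_var=False)
print("p = %.3f" % p_value)   # p < 0.05 suggests a consistent difference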

More information here:
http://www.gifted.uconn.edu/siegle/research/t-test/t-test.html

Regards,
Marti


Re: pg_stat_statements with query tree based normalization

From
Greg Smith
Date:
On 12/06/2011 08:19 PM, Peter Geoghegan wrote:
> I intend to work through the known issues with this patch in the next
> couple of days.
>

I heard about some bitrot creeping in here too, but it seems gone now; I
had no problem merging Peter's development branch against master.  I've
attached a newer patch of the main code, which fixes most of the earlier
issues there were disclaimers about.  The two test programs from
contrib/pg_stat_statements are attached as separate files for a cleaner
diff.  Merged and rebased version now available from
https://github.com/greg2ndQuadrant/postgres/tree/pg_stat_statements_norm
too.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


Attachment

Re: pg_stat_statements with query tree based normalization

From
Peter Geoghegan
Date:
On 10 December 2011 13:56, Greg Smith <greg@2ndquadrant.com> wrote:
> I heard about some bitrot creeping in here too, but it seems gone now; I had
> no problem merging Peter's development branch against master.  I've attached
> a newer patch of the main code, which fixes most of the earlier issues there
> were disclaimers about.

I'm aware of some further bugs in the patch that Greg posted regarding
the synchronisation of executor and planner plugins, so please bear
with me while I squash them.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


Re: pg_stat_statements with query tree based normalization

From
Peter Geoghegan
Date:
The latest revision of the patch, along with the latest revision of the
regression tests, is attached.

This patch handles synchronisation of the planner and executor
plugins. This was always going to be a problem, regardless of what
implementation was chosen (with the exception of plan hashing). I
think that the idea of hashing Plans themselves from an ExecutorEnd
hooked-in function, while interesting and probably worth pursuing at
some other point, is not a good match for pg_stat_statements today.
Any useful tool that did that would be orthogonal to what
pg_stat_statements is supposed to be - it would be affected by
completely external factors like planner cost constants. What would
the canonicalized representation of the query string look like then,
and how would that interact with the selectivity of constants,
particularly over time? What would the DBA's workflow be to isolate
poorly performing plans - how exactly is this tool useful? These are
questions that do not have simple answers, and the fact that I'm not
aware of a successful precedent makes me hesitant to sink too much
time into the idea just yet.

The other proposal that Tom had in the other thread, to directly hash
the raw parse tree, had a number of disadvantages to my mind. Firstly,
it prevents us from benefiting from additional normalisation performed
by the parser itself in later stages, like ignoring noise words.
Secondly, it prevents us from creating a less-complete version of this
that can be used with Postgres 8.4+ stock binaries. We therefore
cannot get field experience or bug reports with the tool in advance of
the release of 9.2.  Tom was pretty sceptical of this second reason.
However, I think that it's an issue of major concern to the PostgreSQL
community.  For example, the immediate reaction of Depesz, writing on
pg_stat_statements when it was initially committed in 2009, was:

"To sum it up. I think that functionality of the module would be
greatly enhanced if it would store queries without parameters (instead
of “select 2 + 3″ -> “select $1 + $2″, or something like this) –
otherwise, with real-world databases, the buffer for queries will fill
up too soon, and it will not “catch” the fact that “select * from
table where id = 3″ and “select * from table where id = 23″ is
practically the same."

It would be great to have this functionality, but it would be even
better if we could get it to people today, rather than in a couple of
years' time, assured by the fact that it is largely the same as the
pg_stat_statements that ships with 9.2.  That was not the primary
consideration that informed the design of this patch, though - I
genuinely thought that after the rewriting stage was the correct time
to hash the query tree, as it is at that point that it most accurately
reflects what the query will actually do, without involving how it will
do it.  This made good sense to me.

I'm now sometimes using a selective serialisation (or "jumble", as
I've termed it to reflect the fact that it is selective), and other
times using the query string instead. Basically, I'll only use a
jumble if:

* The query is not nested (i.e. it is directly issued by a client).
This generally isn't a problem, as nested queries are almost always
prepared.  The one case where this is a problem that I'm aware of is if
someone is dynamically executing SQL with SPI, using plpgsql's EXECUTE,
for example.  However, that pattern already has plenty of problems (it
doesn't set FOUND, etc.).

* We're not in legacy/compatibility mode (and we're not by default)

* The query is not prepared.  It's not possible to synchronise planner
and executor hooks as things stand, at least not without adding
additional infrastructure and complex logic for keeping track of
things.

* The query is not a utility statement (including DECLARE CURSOR
statements) - it is a SELECT, UPDATE, DELETE or INSERT.  (See the
sketch after this list for these rules restated as pseudocode.)
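
A sketch of those rules in Python-style pseudocode (the names here are
hypothetical, purely to summarise the conditions above; the patch's
actual logic is of course C):

def should_use_jumble(query, legacy_mode):
    # If any test fails, fall back to keying on the query string.
    return (not query.is_nested        # issued directly by a client
            and not legacy_mode        # off by default
            and not query.is_prepared  # hooks can't be synchronised yet
            and query.command_type in ("SELECT", "INSERT",
                                       "UPDATE", "DELETE"))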

You might find the fact that it serialises at this later stage
objectionable, because of the potential for that to interact with
things like search_path in ways that might not be appreciated - this
was something that I thought was an advantage.  You might also find it
strange that it does one or the other: it might use the query string
for some query that is prepared, but a jumble for another similar one
that is not prepared.  One solution to this could be to hash
relationOids along with the query string in pgss_ExecutorEnd for
prepared queries.  I don't think this inconsistency is especially worth
concerning ourselves with, though - a better solution might be to
simply document it, as it isn't hard to understand.

Likely other areas of concern with this patch are:

* My modifications to scan.l and, to a lesser extent, gram.y to
accommodate the new length field in Const nodes are not particularly
elegant.  It would be useful to get an expert's input here.  In
particular, note the strlen() calls within the SET_YYCCOC_LEN() macro.
It is unfortunate that making the YYLTYPE representation closer to
that of the default (which has column and row positions for both the
start and end of lexemes) reverberates further than I'd like,
including necessitating modifications to plpgsql, which is why the
patch is so large.  It's a mostly mechanical change though.

* The fact that JUMBLE_SIZE is a compile-time constant, whereas the
tracked query size is not and has not been one; this is important for
the purposes of determining query equivalency with large queries.
Basically, it's not quite obvious when you've run out of space to
store a query jumble, whereas it is obvious when that happens with
the query string.

* The possibility of two substantively different queries having the
same jumble, and therefore actually being considered equivalent.  The
basic problem is that I'm not serialising NULL pointers - the
serialisation logic is currently faulty; a toy illustration of the
problem appears after this list.  It's pretty simple to do this with
something like a binary tree, but I thought I'd get input here before
doing something better - it's probably possible to cheat and not have
to serialise every single NULL pointer, which would be quite wasteful
(consider the last bullet point), but I wouldn't like to attempt that
without input.

* Lack of documentation. The code is fairly heavily commented though.
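
To make the NULL-pointer problem concrete, here is a toy model in
Python (not the patch's actual serialisation code): if NULL children
are silently skipped, two differently-shaped trees built from the same
nodes produce identical jumbles, while emitting a placeholder for each
NULL pointer keeps them distinct.

class Node:
    def __init__(self, tag, left=None, right=None):
        self.tag, self.left, self.right = tag, left, right

def jumble(node, out, mark_nulls):
    if node is None:
        if mark_nulls:
            out.append(0)   # placeholder for a NULL pointer
        return
    out.append(node.tag)
    jumble(node.left, out, mark_nulls)
    jumble(node.right, out, mark_nulls)

a = Node(1, left=Node(2))    # same tags, different shapes
b = Node(1, right=Node(2))

for mark_nulls in (False, True):
    ja, jb = [], []
    jumble(a, ja, mark_nulls)
    jumble(b, jb, mark_nulls)
    print(mark_nulls, ja == jb)   # prints: False True, then True False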

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Attachment