Re: pg_stat_statements with query tree based normalization - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: pg_stat_statements with query tree based normalization
Date
Msg-id CAEYLb_WX50qKziuc2BHejYDmDYnBqbgc0cWvrBtK9G5iGSqnxA@mail.gmail.com
In response to pg_stat_statements with query tree based normalization  (Greg Smith <greg@2ndQuadrant.com>)
Responses Re: pg_stat_statements with query tree based normalization
Re: pg_stat_statements with query tree based normalization
List pgsql-hackers
On 14 November 2011 04:42, Greg Smith <greg@2ndquadrant.com> wrote:
> The approach Peter used adds a single integer to the Const structure in
> order to have enough information to substitute "?" in place of those.
>  Adding and maintaining that is the only change outside of the extension
> made here, and that overhead is paid by everyone--not just consumers of this
> new code.

I've attempted to isolate that overhead, so far unsuccessfully. Attached are:

1. A simple Python + psycopg2 script that repeatedly runs a
succession of similar queries, each of which EXPLAIN would show as
containing a single "Result" node.  By default, each query simply
selects 300 integer Const nodes.

2. The results of running the script on Greg's server, which has CPU
frequency scaling disabled, as an ODS spreadsheet. To keep the file
size down, I've deleted the query column from each sheet, since it
wasn't actually useful information.
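
For readers without the attachment, a script of the kind described in item 1 might look roughly like the sketch below. This is not the attached script: the function names, the DSN string, the run count and the range of the random integers are all assumptions made for illustration. Only the use of psycopg2 and the default of 300 integer constants per query come from the description above.

```python
import random
import statistics
import time


def build_query(n_consts=300, rng=random):
    """Build 'SELECT c1, c2, ...' with n_consts integer literals.

    Each call picks fresh values, so successive queries are similar
    but not textually identical (value range is an assumption).
    """
    consts = ", ".join(str(rng.randint(0, 1000000)) for _ in range(n_consts))
    return "SELECT " + consts


def time_queries(execute, n_queries=1000, n_consts=300):
    """Run n_queries generated queries through execute(sql).

    Returns the wall-clock time of each call, in seconds.
    """
    timings = []
    for _ in range(n_queries):
        sql = build_query(n_consts)
        start = time.perf_counter()
        execute(sql)
        timings.append(time.perf_counter() - start)
    return timings


def main(dsn="dbname=postgres"):  # hypothetical DSN, adjust for your server
    # Imported here so the helpers above can be used without a server.
    import psycopg2

    conn = psycopg2.connect(dsn)
    conn.autocommit = True
    cur = conn.cursor()

    def execute(sql):
        cur.execute(sql)
        cur.fetchall()

    timings = time_queries(execute)
    print("median seconds per query:", statistics.median(timings))
    cur.close()
    conn.close()
```

Calling main() against a running server prints the median per-query time for one set of runs. The timing includes client and network overhead, so it only bounds the server-side cost rather than isolating it.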

The results: taking the median value of each set of runs as
representative, my patch appears to run marginally faster than head.
Of course, there is no reason to believe that it should, and I'm
certain that the difference can be explained by noise, even though
I've naturally tried to minimise noise.
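
The median comparison itself is simple to reproduce. The sketch below assumes two lists of per-query timings, one from head and one from the patched build. The function name and the idea of reporting a relative change are illustrative assumptions, not something taken from the spreadsheet.

```python
import statistics


def compare_medians(head_times, patched_times):
    """Return (head_median, patched_median, relative_change).

    relative_change is (patched - head) / head, so a negative value
    means the patched build had the lower median.
    """
    head_median = statistics.median(head_times)
    patched_median = statistics.median(patched_times)
    relative_change = (patched_median - head_median) / head_median
    return head_median, patched_median, relative_change
```

A small negative relative change would match "marginally faster than head", but without some measure of run-to-run spread it cannot be distinguished from noise.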

If someone could suggest a more telling test case, or even a
worst case, that would be useful. This was just my first run at this.
I know that the overhead will also exist in code paths not
well-exercised by these queries, but I imagine that any real-world
query that attempts to exercise all of those paths is going to add
other costs that dwarf the additional overhead and further muddy the
waters.

I intend to work through the known issues with this patch in the next
couple of days.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Attachment
