Re: [HACKERS] [PATCH] Use $ parameters as replacement characters for pg_stat_statements - Mailing list pgsql-hackers

From Lukas Fittl
Subject Re: [HACKERS] [PATCH] Use $ parameters as replacement characters for pg_stat_statements
Date
Msg-id CAP53PkxyzaQMqEbPiFmN0sp8VRRjgrY1vLhCEeB_NqsvDAUSgQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] [PATCH] Use $ parameters as replacement characters for pg_stat_statements  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, Mar 6, 2017 at 9:36 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Sat, Mar 4, 2017 at 1:52 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> In my opinion, we expose query id (and dbid, and userid) as the
> canonical identifier for each pg_stat_statements entry, and have done
> so for some time. That's the stable API -- not query text. I'm aware
> of cases where query text was used as an identifier, but that ended up
> being hashed anyway.
>
> Query text is just for human consumption.

Lukas evidently thinks otherwise, based on the original post.

I actually agree with Peter that the queryid+userid+dbid is the canonical identifier,
not the query text.

There is however value in parsing the query, e.g. to find out which statement
type something is, or to determine which table names a query references
(assuming one knows the search_path) programatically.

It is for that latter reason I'm interested in parsing the query, and avoiding the
ambiguity that ? carries, since its also an operator.

Based on some hackery, I've previously built a little example script that
filters pg_stat_statements output: https://github.com/lfittl/pg_qtop#usage

This script currently breaks in complex cases of ? operators, since the
pg_stat_statements query text is ambiguous.
 
> I'd be in favor of a change
> that makes it easier to copy and paste a query, to run EXPLAIN and so
> on. Lukas probably realizes that there are no guarantees that the
> query text that appears in pg_stat_statements will even appear as
> normalized in all cases. The "sticky entry" stuff is intended to
> maximize the chances of that happening, but it's still generally quite
> possible (e.g. pg_stat_statements never swaps constants in a query
> like "SELECT 5, pg_stat_statements_reset()"). This means that we
> cannot really say that this buys us a machine-readable query text
> format, at least not without adding some fairly messy caveats.

Well, Lukas's original suggestion of using $n for a placeholder would
do that, unless there's already a $n with the same numerical value,
but Andres's proposal to use $-n or $:n would not.

Yes, and I do think that making it easier to run EXPLAIN would be the
primary user-visible benefit in core.

I'd be happy to add a docs section showing how to use this, if there is
some consensus that its worth pursuing this direction.

Thanks for all the comments, appreciate the discussion.

Best,
Lukas

--
Lukas Fittl

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [HACKERS] Declarative partitioning optimization for large amountof partitions
Next
From: Peter Eisentraut
Date:
Subject: Re: [HACKERS] Enabling replication connections by default inpg_hba.conf