Thread: Odd behavior with pg_stat_statements and queries called from SQL functions

Odd behavior with pg_stat_statements and queries called from SQL functions

From
Maciek Sakrejda
Date:
I noticed an odd behavior today in pg_stat_statements query
normalization for queries called from SQL-language functions. If I
have three functions that call an essentially identical query (the
functions are only marked SECURITY DEFINER to prevent inlining):

maciek=# create or replace function f1(f1param text) returns text
language sql as 'select f1param' security definer;
CREATE FUNCTION
maciek=# create or replace function f2(f2param text) returns text
language sql as 'select f2param' security definer;
CREATE FUNCTION
maciek=# create or replace function f3(text) returns text language sql
as 'select $1' security definer;
CREATE FUNCTION

and I have pg_stat_statements.track = 'all', so that queries called
from functions are tracked, these all end up with the same query id in
pg_stat_statements, but the query text includes the parameter name (if
one is referenced in the query in the function). E.g., if I call f1
first, then f2 and f3, I get:

maciek=# select queryid, query, calls from pg_stat_statements where
queryid = 6741491046520556186;
       queryid       |     query      | calls
---------------------+----------------+-------
 6741491046520556186 | select f1param |     3
(1 row)

If I call f3 first, then f2 and f1, I get

maciek=# select queryid, query, calls from pg_stat_statements where
queryid = 6741491046520556186;
       queryid       |   query   | calls
---------------------+-----------+-------
 6741491046520556186 | select $1 |     3
(1 row)

I understand that the query text may be captured differently for
different queries that map to the same id, but it seems confusing that
parameter names referenced in queries called from functions are not
normalized away, since they're not germane to the query execution
itself, and the context of the function is otherwise stripped away by
this point. I would expect that all three of these queries end up in
pg_stat_statements with the query text "select $1".Thoughts?

Thanks,
Maciek



Re: Odd behavior with pg_stat_statements and queries called from SQL functions

From
Julien Rouhaud
Date:
Hi,

On Wed, Nov 16, 2022 at 11:26:09PM -0800, Maciek Sakrejda wrote:
> I noticed an odd behavior today in pg_stat_statements query
> normalization for queries called from SQL-language functions. If I
> have three functions that call an essentially identical query (the
> functions are only marked SECURITY DEFINER to prevent inlining):
>
> maciek=# create or replace function f1(f1param text) returns text
> language sql as 'select f1param' security definer;
> CREATE FUNCTION
> maciek=# create or replace function f2(f2param text) returns text
> language sql as 'select f2param' security definer;
> CREATE FUNCTION
> maciek=# create or replace function f3(text) returns text language sql
> as 'select $1' security definer;
> CREATE FUNCTION
> [...]
> maciek=# select queryid, query, calls from pg_stat_statements where
> queryid = 6741491046520556186;
>        queryid       |     query      | calls
> ---------------------+----------------+-------
>  6741491046520556186 | select f1param |     3
> (1 row)
>
> If I call f3 first, then f2 and f1, I get
>
> maciek=# select queryid, query, calls from pg_stat_statements where
> queryid = 6741491046520556186;
>        queryid       |   query   | calls
> ---------------------+-----------+-------
>  6741491046520556186 | select $1 |     3
> (1 row)
>
> I understand that the query text may be captured differently for
> different queries that map to the same id, but it seems confusing that
> parameter names referenced in queries called from functions are not
> normalized away, since they're not germane to the query execution
> itself, and the context of the function is otherwise stripped away by
> this point. I would expect that all three of these queries end up in
> pg_stat_statements with the query text "select $1".Thoughts?

None of those queries actually contain any constant, so the query text is just
saved as-is in all the versions.

I'm not sure that doing normalization for parameters would give way better
results.  It's true that a parameter name can change between different
functions running the exact same statements, but is it really likely to happen?
And what if the two functions have different number of parameters in different
orders?  $1 could mean different things in different cases, and good luck
finding out which one it is.  At least with the parameter name you have a
chance to figure out what the parameter was exactly.



Re: Odd behavior with pg_stat_statements and queries called from SQL functions

From
Ashutosh Bapat
Date:
On Thu, Nov 17, 2022 at 2:44 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
>
> Hi,
>
> On Wed, Nov 16, 2022 at 11:26:09PM -0800, Maciek Sakrejda wrote:
> > I noticed an odd behavior today in pg_stat_statements query
> > normalization for queries called from SQL-language functions. If I
> > have three functions that call an essentially identical query (the
> > functions are only marked SECURITY DEFINER to prevent inlining):
> >
> > maciek=# create or replace function f1(f1param text) returns text
> > language sql as 'select f1param' security definer;
> > CREATE FUNCTION
> > maciek=# create or replace function f2(f2param text) returns text
> > language sql as 'select f2param' security definer;
> > CREATE FUNCTION
> > maciek=# create or replace function f3(text) returns text language sql
> > as 'select $1' security definer;
> > CREATE FUNCTION
> > [...]
> > maciek=# select queryid, query, calls from pg_stat_statements where
> > queryid = 6741491046520556186;
> >        queryid       |     query      | calls
> > ---------------------+----------------+-------
> >  6741491046520556186 | select f1param |     3
> > (1 row)
> >
> > If I call f3 first, then f2 and f1, I get
> >
> > maciek=# select queryid, query, calls from pg_stat_statements where
> > queryid = 6741491046520556186;
> >        queryid       |   query   | calls
> > ---------------------+-----------+-------
> >  6741491046520556186 | select $1 |     3
> > (1 row)
> >
> > I understand that the query text may be captured differently for
> > different queries that map to the same id, but it seems confusing that
> > parameter names referenced in queries called from functions are not
> > normalized away, since they're not germane to the query execution
> > itself, and the context of the function is otherwise stripped away by
> > this point. I would expect that all three of these queries end up in
> > pg_stat_statements with the query text "select $1".Thoughts?
>
> None of those queries actually contain any constant, so the query text is just
> saved as-is in all the versions.
>
> I'm not sure that doing normalization for parameters would give way better
> results.  It's true that a parameter name can change between different
> functions running the exact same statements, but is it really likely to happen?

Multiple functions running the same query is quite possible. I am
wondering why it took so long to identify this behaviour.

> And what if the two functions have different number of parameters in different
> orders?  $1 could mean different things in different cases, and good luck
> finding out which one it is.  At least with the parameter name you have a
> chance to figure out what the parameter was exactly.
>
Reporting one of the parameters as is is a problem yes. Can the
parameters be converted into some normailzed form like replacing
parameters with ? (or some constant string indicating parameter)
everywhere.


-- 
Best Wishes,
Ashutosh Bapat