Thread: RangeTblEntry jumble omissions
I think there are some fields from the RangeTblEntry struct missing in the jumble (function _jumbleRangeTblEntry()). Probably, some of these were really just forgotten, in other cases this might be an intentional decision, but then it might be good to document it. This has come up in thread [0] and there is a patch [1], but I figured I'd start a new thread here to get the attention of those who know more about pg_stat_statements. I think the following fields are missing. (See also attached patch.) - alias Currently, two queries like SELECT * FROM t1 AS foo SELECT * FROM t1 AS bar are counted together by pg_stat_statements -- that might be ok, but they both get listed under whichever one is run first, so here if you are looking for the "AS bar" query, you won't find it. - join_using_alias Similar situation, currently SELECT * FROM t1 JOIN t2 USING (a, b) SELECT * FROM t1 JOIN t2 USING (a, b) AS x are counted together. - funcordinality This was probably just forgotten. It should be included because the WITH ORDINALITY clause changes the query result. - lateral Also probably forgotten. A query specifying LATERAL is clearly different from one without it. Thoughts? Anything else missing perhaps? [0]: https://www.postgresql.org/message-id/flat/4b27fc50-8cd6-46f5-ab20-88dbaadca645@eisentraut.org [1]: https://www.postgresql.org/message-id/attachment/154249/v2-0002-Remove-custom-_jumbleRangeTblEntry.patch
Attachment
On 2024-Feb-23, Peter Eisentraut wrote: > - alias > > Currently, two queries like > > SELECT * FROM t1 AS foo > SELECT * FROM t1 AS bar > > are counted together by pg_stat_statements -- that might be ok, but they > both get listed under whichever one is run first, so here if you are looking > for the "AS bar" query, you won't find it. Another, similar but not quite: if you do SET search_path TO foo; SELECT * FROM t1; SET search_path TO bar; SELECT * FROM t1; and you have both foo.t1 and bar.t1, you'll get two identical-looking queries in pg_stat_statements with different jumble IDs, with no way to know which is which. Not sure if the jumbling of the RTE (which includes the OID of the table in question) itself is to blame, or whether we want to store the relevant schemas with the entry somehow, or what. Obviously, failing to differentiate them would not be an improvement. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
Hi, On Fri, Feb 23, 2024 at 04:26:53PM +0100, Peter Eisentraut wrote: > > - alias > > Currently, two queries like > > SELECT * FROM t1 AS foo > SELECT * FROM t1 AS bar > > are counted together by pg_stat_statements -- that might be ok, but they > both get listed under whichever one is run first, so here if you are looking > for the "AS bar" query, you won't find it. I think this one is intentional. This alias won't change the query behavior or the field names so it's good to avoid extraneous entries. It's true that you then won't find something matching "AS bar", but it's not something you can rely on anyway. If you first execute "select * from t1 as foo" and then "SELECT * FROM t1 AS foo" then you won't find anything matching "AS foo" either. There isn't even any guarantee that the stored query text will be jumbled. > - join_using_alias > > Similar situation, currently > > SELECT * FROM t1 JOIN t2 USING (a, b) > SELECT * FROM t1 JOIN t2 USING (a, b) AS x > > are counted together. IMHO same as above. > - funcordinality > > This was probably just forgotten. It should be included because the WITH > ORDINALITY clause changes the query result. Agreed. > - lateral > > Also probably forgotten. A query specifying LATERAL is clearly different > from one without it. Agreed.
Julien Rouhaud <rjuju123@gmail.com> writes: > On Fri, Feb 23, 2024 at 04:26:53PM +0100, Peter Eisentraut wrote: >> - funcordinality >> This was probably just forgotten. It should be included because the WITH >> ORDINALITY clause changes the query result. > Agreed. Seems OK. >> - lateral >> Also probably forgotten. A query specifying LATERAL is clearly different >> from one without it. > Agreed. Nah ... I think that LATERAL should be ignored on essentially the same grounds on which you argue for ignoring aliases. If it affects the query's semantics, it's because there is a lateral reference in the subject subquery or function, and that reference already contributes to the query hash. If there is no such reference, then LATERAL is a noise word. It doesn't help any that LATERAL is actually optional for functions, making it certainly a noise word there. IIRC, the parser+planner cooperatively fix things so that the final state of an RTE's lateral field reflects reality. But if we are hashing before that's happened, it's not worth all that much. regards, tom lane
On Fri, Feb 23, 2024 at 11:00:41PM +0100, Alvaro Herrera wrote: > > Another, similar but not quite: if you do > > SET search_path TO foo; > SELECT * FROM t1; > SET search_path TO bar; > SELECT * FROM t1; > > and you have both foo.t1 and bar.t1, you'll get two identical-looking > queries in pg_stat_statements with different jumble IDs, with no way to > know which is which. Not sure if the jumbling of the RTE (which > includes the OID of the table in question) itself is to blame, or > whether we want to store the relevant schemas with the entry somehow, or > what. Obviously, failing to differentiate them would not be an > improvement. Yeah that's also a very old known problem. This has been raised multiple times (on top of my head [1], [2], [3]). At this point I'm not exactly holding my breath. [1]: https://www.postgresql.org/message-id/flat/8f54c609-17c6-945b-fe13-8b07c0866420%40dalibo.com [2]: https://www.postgresql.org/message-id/flat/9baf5c06-d6ab-c688-010c-843348e3d98c%40gmail.com [3]: https://www.postgresql.org/message-id/flat/3aa097d7-7c47-187b-5913-db8366cd4491%40gmail.com
On Fri, Feb 23, 2024 at 06:52:54PM -0500, Tom Lane wrote: > Julien Rouhaud <rjuju123@gmail.com> writes: >> On Fri, Feb 23, 2024 at 04:26:53PM +0100, Peter Eisentraut wrote: >>> - funcordinality >>> This was probably just forgotten. It should be included because the WITH >>> ORDINALITY clause changes the query result. > >> Agreed. > > Seems OK. +1. >>> - lateral >>> Also probably forgotten. A query specifying LATERAL is clearly different >>> from one without it. > >> Agreed. > > Nah ... I think that LATERAL should be ignored on essentially the > same grounds on which you argue for ignoring aliases. If it > affects the query's semantics, it's because there is a lateral > reference in the subject subquery or function, and that reference > already contributes to the query hash. If there is no such > reference, then LATERAL is a noise word. It doesn't help any that > LATERAL is actually optional for functions, making it certainly a > noise word there. Sounds like a fair argument to me. Btw, I think that you should add a few queries to the tests of pg_stat_statements to track the change of behavior when you have aliases, as an effect of the fields added in the jumbling. -- Michael
Attachment
On 26.02.24 02:08, Michael Paquier wrote: > On Fri, Feb 23, 2024 at 06:52:54PM -0500, Tom Lane wrote: >> Julien Rouhaud <rjuju123@gmail.com> writes: >>> On Fri, Feb 23, 2024 at 04:26:53PM +0100, Peter Eisentraut wrote: >>>> - funcordinality >>>> This was probably just forgotten. It should be included because the WITH >>>> ORDINALITY clause changes the query result. >> >>> Agreed. >> >> Seems OK. > > +1. Ok, I have added funcordinality for the RTE_FUNCTION case, and left the others alone.