Re: pg_stat_statements and "IN" conditions - Mailing list pgsql-hackers

From Dmitry Dolgov
Subject Re: pg_stat_statements and "IN" conditions
Date
Msg-id 20220314145734.x3re4hcr6ukmppwi@ddolgov.remote.csb
In response to Re: pg_stat_statements and "IN" conditions  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: pg_stat_statements and "IN" conditions  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
> On Mon, Mar 14, 2022 at 10:17:57AM -0400, Robert Haas wrote:
> On Sat, Mar 12, 2022 at 9:11 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote:
> > Here is the limited version of list collapsing functionality, which
> > doesn't utilize eval_const_expressions and ignores most of the stuff
> > except ArrayExprs. Any thoughts/more suggestions?
>
> The proposed commit message says this commit intends to "Make Consts
> contribute nothing to the jumble hash if they're part of a series and
> at position further that specified threshold." I'm not sure whether
> that's what the patch actually implements because I can't immediately
> understand the new logic you've added, but I think if we did what that
> sentence said then, supposing the threshold is set to 1, it would
> result in producing the same hash for "x in (1,2)" that we do for "x
> in (1,3)" but a different hash for "x in (2,3)" which does not sound
> like what we want. What I would have thought we'd do is: if the list
> is all constants and long enough to satisfy the threshold then nothing
> in the list gets jumbled.

Well, yeah, the commit message is somewhat clumsy in this regard. It
works almost the way you've described, with one exception: if the list
is all constants and long enough to satisfy the threshold, then the
first N elements (where N == threshold) will still be jumbled, to leave
at least some trace of the list in pgss.
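
To illustrate the rule above, here is a rough Python sketch. It is a toy
model, not the patch: the function name jumble_array, the (kind, type_name)
tuples, and the use of SHA-256 are all made up for the example. It also
assumes that a constant contributes only its type to the hash, never its
value.

```python
import hashlib


def jumble_array(elements, threshold):
    """Return a hex digest for a list of (kind, type_name) elements.

    Hypothetical model: only an element's kind and type are hashed,
    never the value of a constant.
    """
    all_consts = all(kind == "const" for kind, _ in elements)
    if all_consts and len(elements) >= threshold:
        # Long all-constant list: hash only the first N elements,
        # so the list still leaves a trace in the result.
        hashed = elements[:threshold]
    else:
        # Short or mixed list: hash every element as usual.
        hashed = elements
    digest = hashlib.sha256()
    for kind, type_name in hashed:
        digest.update(f"{kind}:{type_name};".encode())
    return digest.hexdigest()


five = [("const", "int4")] * 5
ten = [("const", "int4")] * 10
mixed = [("const", "int4")] * 4 + [("param", "int4")]

# Lists of different length collapse to one entry once past the threshold.
print(jumble_array(five, 2) == jumble_array(ten, 2))
# A list containing a non-constant is not collapsed.
print(jumble_array(five, 2) == jumble_array(mixed, 2))
```

In this model, all-constant lists of 5 and 10 elements produce the same
digest with a threshold of 2, while a list containing a non-constant
element does not.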

> I'm a little surprised that there's not more context-awareness in this
> code. It seems that it applies to every ArrayExpr found in the query,
> which I think would extend to cases beyond something = IN(whatever).
> In particular, any use of ARRAY[] in the query would be impacted. Now,
> the comments seem to imply that's pretty intentional, but from the
> user's point of view, WHERE x in (1,3) and x = any(array[1,3]) are two
> different things. If anything like this is to be adopted, we certainly
> need to be precise about exactly what it is doing and which cases are
> covered.

I'm not sure I follow the last point. WHERE x in (1,3) and x =
any(array[1,3]) are two different things for sure, but in what way would
this change mix them together? My goal was to make only the following
transformations, without leaving any ambiguity:

WHERE x in (1, 2, 3, 4, 5) -> WHERE x in (1, 2, ...)
WHERE x = any(array[1, 2, 3, 4, 5]) -> WHERE x = any(array[1, 2, ...])
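
A small Python sketch of the shape of that transformation. It is an
illustration only: collapse_list is a hypothetical helper, and the
threshold of 2 is inferred from the examples above rather than taken
from the patch.

```python
def collapse_list(items, threshold):
    """Keep the first `threshold` items and replace the rest with '...'."""
    if len(items) <= threshold:
        # Nothing would be dropped, so the list is rendered unchanged.
        return ", ".join(items)
    return ", ".join(items[:threshold]) + ", ..."


values = ["1", "2", "3", "4", "5"]
print(f"WHERE x in ({collapse_list(values, 2)})")
print(f"WHERE x = any(array[{collapse_list(values, 2)}])")
```

The same helper is applied inside both forms, so the IN form and the
ARRAY[] form each keep their own surrounding syntax and are not merged
into one another.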

> I thought of looking at the documentation to see whether you'd tried
> to clarify this there, and found that you hadn't written any.
>
> In short, I think this patch is not really very close to being in
> committable shape even if nobody were objecting to the concept.

Sure, I'll add documentation. To be honest, I'm not targeting PG15 with
this; I just want to make some progress. Thanks for the feedback, I'm
glad to see it coming!


