Re: Bug in pg_stat_statements - Mailing list pgsql-hackers

From Sami Imseih
Subject Re: Bug in pg_stat_statements
Date
Msg-id CAA5RZ0uU2p_BXevGL17N8bva5Tvke+3xTcdpYbVegyb+kDfj_Q@mail.gmail.com
Whole thread Raw
In response to Re: Bug in pg_stat_statements  (Sami Imseih <samimseih@gmail.com>)
Responses Re: Bug in pg_stat_statements
List pgsql-hackers
> > > > Having said
> > > > that it seems that another solution would be to check for duplicated constants
> > > > in fill_in_constant_length
> > >
> > > Yes, I started thinking along these lines as well, where we check for
> > > duplicates
> > > in fill_in_constant_length; or after jumbling, we de-duplicate locations if we
> > > have a squashable list, which is what I have attached with updated test cases.
> > >
> > > This means we do need to scan the locations one extra time during jumbling,
> > > but I don't see that as a problem. Maybe there is another better way?
> >
> > Why? What I hand in mind is something like this, after a quick test it seems to
> > be able to address both the original case and the one you've discovered.
>
> ahh, what you have works because clocations is sorted by location before
> the loop starts in `fill_in_constant_lengths` , so comparing locations
> makes sense in this case. I am OK with this.

v3-0001 does what Dimiti suggests with some slight tweaks:

1/ Moving "last_loc = loc;" earlier in the loop
2/ removing a comment for fill_in_constant_lengths that was
an oversight from 0f65f3eec, as we no longer assume that
the constant location is -1 when entering that routine. Also
added the test cases that were discovered, besides the
one from the original report.

> I was really hoping that the fix could be inside of query jumbling, as I think
> pg_stat_statements or any consumers of query jumbling don't need to
> care more about squashing ( or other internals of jumbling ).
> So now I also wonder if we should also move:
>
> ```
> static char *generate_normalized_query(JumbleState *jstate, const char *query,
> int query_loc, int *query_len_p);
>
> static void fill_in_constant_lengths(JumbleState *jstate, const char *query,
> int query_loc);
> ```
>
> from pg_stat_statements.c to queryjumblefuncs.c?
>
>

v3-0002 moved both generate_normalized_query and fill_in_constant_lengths
to queryjumblefuncs.c, as these routines deal with the internal of jumbling
and should not be a concern of pg_stat_statements. I think this is a good
cleanup, but others may have different opinions on this.

--
Sami Imseih
Amazon Web Services (AWS)

Attachment

pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: use SIMD in GetPrivateRefCountEntry()
Next
From: Peter Geoghegan
Date:
Subject: Re: another autovacuum scheduling thread