On Sun, Mar 06, 2022 at 07:10:49PM -0800, Zhihong Yu wrote:
> On Sun, Mar 6, 2022 at 6:23 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
> 
> > On Sun, Mar 06, 2022 at 12:37:00PM -0800, Zhihong Yu wrote:
> > >
> > > Here is one example (same query, q, is concerned).
> > > At t1, q is performed, leaving one row in pg_stat_statements with
> > mean_time
> > > of 10.
> > > At t2, operator examines pg_stat_statements and provides some suggestion
> > > for tuning q (which is carried out).
> > > At t3, q is run again leaving the row with mean_time of 9.
> > > Now with two rows for q, how do we know whether the row written at t3 is
> > > prior to or after implementing the suggestion made at t2 ?
> >
> > Well, if pg_stat_statements is read by people doing performance tuning
> > shouldn't they be able to distinguish which query text is the one they just
> > rewrote?
> >
> Did I mention rewriting ?
How else would you end up with two entries in pg_stat_statements?
> As you said below, adding index is one way of tuning which doesn't involve
> rewriting.
Yes, and in that case you have a single row for that query, and mean_time is
useless.  You need to compute it yourself using snapshots of
pg_stat_statements if you want to know how that query performed since the
optimization.
> So some information in pg_stat_statements (or related table) is needed to
> disambiguate.
In my opinion that's not pg_stat_statements' job.  Like all other similar
infrastructure in postgres it only provides cumulated counters.  You would
have exactly the same issue with e.g. pg_stat_user_indexes or pg_stat_bgwriter.