On Mon, Apr 15, 2024 at 11:20:16AM +1200, David Rowley wrote:
> I was recently asked internally about the stability guarantees we
> offer for queryid. My answer consisted of:
>
> 1. We cannot change Node enums in minor versions
> 2. We're *unlikely* to add fields to Node types in minor versions, and
> if we did we'd likely be leaving them out of the jumble calc, plus it
> seems highly unlikely any new field we wedged into the padding would
> relate at all to the parsed query.

Since 16, such new fields are included in the jumbling by default
unless the node attribute query_jumble_ignore is attached to them. I
agree that this may not be entirely intuitive when it comes to
enforcing compatibility across the same major version. Could there be
cases where it is worth breaking compatibility and including something
more in the jumbling, though? I've not seen such a case in recent
years, even in stable branches.

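
To illustrate the default described above, here is a rough standalone
sketch. It is not PostgreSQL code: DemoNode, demo_jumble(), demo_check()
and the FNV-1a hash are made up for this example, and the real jumble
functions are generated from the node definitions rather than written by
hand. The only thing taken from the tree is the pg_node_attr() marker,
which expands to nothing at compile time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expands to nothing for the compiler; it is only a marker in the source. */
#define pg_node_attr(...)

/* Hypothetical node type, for illustration only. */
typedef struct DemoNode
{
    int kind;   /* jumbled by default */
    int flags;  /* imagine this one was added later: also jumbled by default */
    int location pg_node_attr(query_jumble_ignore); /* left out of the jumble */
} DemoNode;

/* FNV-1a, standing in for the real hash (an assumption of this sketch). */
static uint64_t
fnv1a(uint64_t h, const void *data, size_t len)
{
    const unsigned char *p = data;

    for (size_t i = 0; i < len; i++)
    {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Hand-written stand-in for a generated jumble function. */
static uint64_t
demo_jumble(const DemoNode *n)
{
    uint64_t h = 14695981039346656037ULL;

    h = fnv1a(h, &n->kind, sizeof(n->kind));
    h = fnv1a(h, &n->flags, sizeof(n->flags));
    /* n->location is skipped: it is marked query_jumble_ignore */
    return h;
}

/* Small self-check showing which fields affect the result. */
void
demo_check(void)
{
    DemoNode a = {1, 2, 10};
    DemoNode b = {1, 2, 99};    /* differs only in the ignored field */
    DemoNode c = {1, 3, 10};    /* differs in a jumbled field */

    assert(demo_jumble(&a) == demo_jumble(&b));
    assert(demo_jumble(&a) != demo_jumble(&c));
}
```

In this toy version, a field added to DemoNode without the marker would
have to be fed into demo_jumble(), and the resulting value would change
for every query touching that node.
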
> Maybe the paragraph starting with "Consumers of" can detail the
> reasons queryid might be unstable and the following paragraph can
> describe the scenario for when the queryid can generally assumed to be
> stable.
>
> <para>
> As a rule of thumb, <structfield>queryid</structfield> values can be assumed to be
> - stable and comparable only so long as the underlying server version and
> - catalog metadata details stay exactly the same. Two servers
> + stable and comparable only between <productname>PostgreSQL</productname> instances
> + which are running the same major version of <productname>PostgreSQL</productname>
> + and are running on the same machine architecture and catalog metadata details match. Two servers
> participating in replication based on physical WAL replay can be expected
> to have identical <structfield>queryid</structfield> values for the same query.
> However, logical replication schemes do not promise to keep replicas

Assuming that a query ID will always be stable across major versions
is overconfident, I think. As Peter said, like for WAL, we may face
cases where a slight breakage for a subset of queries could be
justified, and pg_stat_statements would be able to cope with that by
discarding the oldest entries in its hash table.
--
Michael