Some oversights in query_id calculation - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Some oversights in query_id calculation
Date
Msg-id 20210425081119.ulyzxqz23ueh3wuj@nol
Responses Re: Some oversights in query_id calculation
List pgsql-hackers
Hi,

While doing some sanity checks on the regression tests, I found some queries
that are semantically different but end up with identical query_id.

Two are old issues:

- the "ONLY" in FROM [ONLY] isn't hashed
- the agglevelsup field in GROUPING isn't hashed

Another one was introduced in pg13: WITH TIES isn't hashed.

The last one is new in pg14: the "DISTINCT" in "GROUP BY [DISTINCT]" isn't hashed.
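
All four issues have the same shape: a field that changes the meaning of the
query never reaches the hash. As a minimal sketch of that mechanism, here is a
toy model in C. It is not the actual jumbling code: the struct, the FNV-1a
hash and every function name below are hypothetical and chosen only for
illustration, using the "ONLY" case as the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a range table entry; NOT the real PostgreSQL struct. */
typedef struct ToyRangeTableEntry
{
    uint32_t relid;  /* which table is scanned */
    bool     inh;    /* true for "FROM t", false for "FROM ONLY t" */
} ToyRangeTableEntry;

/* FNV-1a 64-bit offset basis, used as the initial hash state. */
#define TOY_HASH_INIT UINT64_C(14695981039346656037)

/* Fold len bytes into the running hash h (FNV-1a, 64-bit). */
static uint64_t
toy_hash_bytes(uint64_t h, const void *data, size_t len)
{
    const unsigned char *p = data;

    for (size_t i = 0; i < len; i++)
    {
        h ^= p[i];
        h *= UINT64_C(1099511628211);   /* FNV prime */
    }
    return h;
}

/* Hashes only relid: "FROM t" and "FROM ONLY t" collide. */
static uint64_t
toy_query_id_without_inh(const ToyRangeTableEntry *rte)
{
    return toy_hash_bytes(TOY_HASH_INIT, &rte->relid, sizeof(rte->relid));
}

/* Also hashes the inheritance flag, so the two queries differ. */
static uint64_t
toy_query_id_with_inh(const ToyRangeTableEntry *rte)
{
    unsigned char inh = rte->inh ? 1 : 0;
    uint64_t      h;

    h = toy_hash_bytes(TOY_HASH_INIT, &rte->relid, sizeof(rte->relid));
    return toy_hash_bytes(h, &inh, sizeof(inh));
}
```

With two entries that differ only in the flag, the first function returns the
same value for both, while the second returns different values. The other
three cases would follow the same pattern with a different field.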

I'm attaching a patch that fixes those, with regression tests to reproduce each
problem.

There are also 2 additional cases where it's debatable whether the difference
is semantic or not:

- aliases aren't hashed.  That's usually not a problem, except when you use
  row_to_json(), since you'll get different keys

- the NAME in XmlExpr (eg: xmlpi(NAME foo,...)) isn't hashed, so you generate
  different elements

