Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET - Mailing list pgsql-hackers
From | Bykov Ivan |
---|---|
Subject | Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET |
Date | |
Msg-id | ca447b72d15745b9a877fad7e258407a@localhost.localdomain Whole thread Raw |
List | pgsql-hackers |
Hi, all!
It seems I have found a bug in the Query ID calculation.
Problem
=======
In some cases, we could have same IDs for not identical query trees.
For two structurally similar query subnodes, we may have query
trees that look like this:
----
QueryA->subNodeOne = Value X;
QueryA->subNodeTwo = NULL;
QueryB->subNodeOne = NULL;
QueryB->subNodeTwo = Value X;
----
When the query jumble walker calculates the query ID, it traverses the
Query members from top to bottom and generates the same IDs for these
two queries because it does not change the hash value when visiting
an empty node (= NULL).
Here is an example (the collection of jstate->jumble is omitted):
----
/* QueryA_ID = AAAA */
QueryA->subNodeOne = &(Value X);
/* QueryA_ID = hash_any_extended(&(Value X), sizeof(Value X), QueryA_ID) = BBBB */
QueryA->subNodeTwo = NULL;
/* QueryA_ID = BBBB; */
/* QueryB_ID = AAAA */
QueryB->subNodeOne = NULL;
/* QueryB_ID = AAAA; */
QueryB->subNodeTwo = &(Value X);
/* QueryB_ID = hash_any_extended(&(Value X), sizeof(Value X), QueryB_ID) = BBBB */
----
There are two pairs of subnodes that can trigger this error:
- distinctClause and sortClause (both contain a list of SortGroupClauses);
- limitOffset and limitCount (both contain an int8 expression).
Here is an example of such errors (all queries run on REL_17_0):
----
SET compute_query_id = on;
/* DISTINCT / ORDER BY *************************************/
EXPLAIN (VERBOSE) SELECT DISTINCT "oid" FROM pg_class;
/* Query Identifier: 751948508603549510 */
EXPLAIN (VERBOSE) SELECT "oid" FROM pg_class ORDER BY "oid";
/* Query Identifier: 751948508603549510 */
/* LIMIT / OFFSET ******************************************/
EXPLAIN (VERBOSE) SELECT "oid" FROM pg_class LIMIT 1;
/* Query Identifier: 5185884322440896420 */
EXPLAIN (VERBOSE) SELECT "oid" FROM pg_class OFFSET 1;
/* Query Identifier: 5185884322440896420 */
----
Solution One
============
The simplest way to fix the problem is to place the scalar field used in
the query ID calculation between similar subnodes. A patch for this
solution is attached below (0001-Query-ID-Calculation-Fix-Variant-A.patch).
Here is an example:
----
/* QueryA_ID = AAAA */
QueryA->subNodeOne = &(Value X);
/* QueryA_ID = hash_any_extended(&(Value X), sizeof(Value X), QueryA_ID) = BBBB */
QueryA->scalar = Value Y;
/* QueryA_ID = hash_any_extended(&(Value Y), sizeof(Value Y), QueryA_ID) = CCCC */
QueryA->subNodeTwo = NULL;
/* QueryA_ID = CCCC; */
/* QueryB_ID = AAAA */
QueryB->subNodeOne = NULL;
/* QueryB_ID = AAAA; */
QueryB->scalar = Value Y;
/* QueryB_ID = hash_any_extended(&(Value Y), sizeof(Value Y), QueryB_ID) = DDDD */
QueryB->subNodeOne = &(Value X);
/* QueryB_ID = hash_any_extended(&(Value X), sizeof(Value X), QueryB_ID) = EEEE */
----
Solution Two
============
Alternatively, we can change the hash sum when we encounter an empty node.
This approach may impact performance but will protect us from such errors
in the future. A patch for this solution is attached below
(0001-Query-ID-Calculation-Fix-Variant-B.patch).
Here is an example:
----
/* QueryA_ID = AAAA */
QueryA->subNodeOne = &(Value X);
/* QueryA_ID = hash_any_extended(&(Value Y), sizeof(Value Y), QueryA_ID) = BBBB */
QueryA->subNodeTwo = NULL;
/* QueryA_ID = hash_any_extended(&('\0'), sizeof(char), QueryA_ID) = CCCC */
/* QueryB_ID = AAAA */
QueryB->subNodeOne = NULL;
/* QueryB_ID = hash_any_extended(&('\0'), sizeof(char), QueryB_ID) = DDDD */
QueryB->subNodeOne = &(Value X);
/* QueryB_ID = hash_any_extended(&(Value X), sizeof(Value X), QueryB_ID) = EEEE */
----
pgsql-hackers by date: