Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET
Date
Msg-id Z9NrnVk7MtMe8uNF@paquier.xyz
Whole thread Raw
In response to Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Query ID Calculation Fix for DISTINCT / ORDER BY and LIMIT / OFFSET
List pgsql-hackers
On Wed, Mar 12, 2025 at 03:28:27PM +1300, David Rowley wrote:
> If we account for the NULLs, those two cases become: "<windowClause
> bytes><distinctClause bytes><byte for NULL sortClause>" and
> "<windowClause bytes><byte for NULL distinct clause><sortClause
> bytes>", which is going to be highly unlikely to hash to the same
> value due to the buffer not being the same.
>
> Can you explain where my understanding is wrong?

(I am a bit busy this week, sorry for the delay)

I have looked at the patches sent at
https://www.postgresql.org/message-id/5ac172e0b77a4baba50671cd1a15285f@localhost.localdomain,
and they don't really address what I'm trying to point at here.

Custom jumbling functions could still be problematic, making
uneffective a NULL check at the beginning of jumbleNode().

Here is an example, possible by design, that would still be a problem:
static void
_jumbleNodeFoo(JumbleState *jstate, Node *node)
{
   Foo *expr = (Foo *) node;

   if (expr->field1 == 0)
       return;

    JUMBLE_FIELD(field2);
   [..]
}

typedef struct Foo
{
   pg_node_attr(custom_query_jumble)

   NodeTag     type;
   int field1;
   char *field2;
   [..]
}

Then we have the same problem with another Node using this Foo node
two times in a row, depending on how it's used by the query parsing
and transform:
typedef struct FooBar
{
   pg_node_attr(custom_query_jumble)

   NodeTag     type;
   Foo *foo1;
   Foo *foo2;
   [..]
}

Adding a depth level incremented once we go through a Node would add
more entropy, but it may not ve completely bullet-proof either, even
if we add an offset.  I am pretty sure I could find a Node structure
that acts as counter example even if you add a depth level and an
offset in the jumbling..
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Jacob Champion
Date:
Subject: Re: ecdh support causes unnecessary roundtrips
Next
From: Thomas Munro
Date:
Subject: Re: md.c vs elog.c vs smgrreleaseall() in barrier