Re: [survey] New "Stable" QueryId based on normalized query text - Mailing list pgsql-hackers

From Jim Finnerty
Subject Re: [survey] New "Stable" QueryId based on normalized query text
Date
Msg-id 1565400424579-0.post@n3.nabble.com
Whole thread Raw
In response to Re: [survey] New "Stable" QueryId based on normalized query text  (Julien Rouhaud <rjuju123@gmail.com>)
Responses Re: [survey] New "Stable" QueryId based on normalized query text  (Julien Rouhaud <rjuju123@gmail.com>)
Re: [survey] New "Stable" QueryId based on normalized query text  (legrand legrand <legrand_legrand@hotmail.com>)
List pgsql-hackers
I missed this thread.  I'd be happy to post the code for what we use as the
stable query identifier, but we could definitely come up with a more
efficient algorithm if we're willing to assume that the sql statements are
the same if and only if the parse tree structure is the same.

Currently what we do for the sql hash is to simply replace all the literals
and then hash the resulting SQL string, because for our use case we wanted
to be insensitive to the even the structure of the parse tree from one
release to the next.  That may be too conservative for other use cases.  If
it's ok to assume that the structure of the Query tree doesn't change, then
you could define a stable identifier for each node type, ignore literal
constants, and hash fully-qualified object names instead of OIDs.  That
should be pretty fast.

We also compute a plan hash that converts Plan tree node id's into stable
identifiers, and computes a cheap hash function over all nodes in the plan. 
This is fast and efficient.  It's also pretty straightforward to convert
node id's to stable identifiers.

A complication that we recently had to deal with was hashing and normalizing
the text of queries inside pl/pgsql functions, where variables are converted
to parameter markers.  In that case the sql text is transformed to contain
both parameter markers and literal replacement markers before computing the
sql hash.



-----
Jim Finnerty, AWS, Amazon Aurora PostgreSQL
--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html



pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: Add "password_protocol" connection parameter to libpq
Next
From: Craig Ringer
Date:
Subject: Re: Global temporary tables