[POC] Allow an extension to add data into Query and PlannedStmt nodes - Mailing list pgsql-hackers

From Andrey Lepikhov
Subject [POC] Allow an extension to add data into Query and PlannedStmt nodes
Msg-id e321eec2-b91c-1378-250a-e38dcf0ed827@postgrespro.ru
Responses Re: [POC] Allow an extension to add data into Query and PlannedStmt nodes  (Julien Rouhaud <rjuju123@gmail.com>)
List pgsql-hackers
Hi,

Previously, this mailing list has seen some controversial opinions on
queryid generation and the jumbling technique. We don't intend to solve
those problems here, but to help extensions at least avoid conflicting
with one another over the queryId value.

Extensions may need to pass some query-related data through all stages
of query planning and execution. As a trivial example,
pg_stat_statements uses the queryid at the end of execution to save some
statistics. One more reason: extensions currently conflict over the
queryid value and the logic for changing it. With this patch, that can
be managed.

This patch introduces the structure 'ExtensionData', which manages a
list of entries through a pair of interface functions,
addExtensionDataToNode() and GetExtensionData(). Keep in mind that this
structure may be hidden from the public interface in the future.
An extension should invent a symbolic key to identify its data. It may
invent as many additional keys as it wants, but the best option is no
more than one entry per extension.
Usage of this machinery is demonstrated with pg_stat_statements as an
example: here we introduce a Bigint node just to store the queryId
value natively.
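
To make the idea concrete, here is a minimal standalone sketch of a
keyed entry list attached to a node. It is not the code from the patch:
SketchNode, ExtEntry and the sketch_* functions are hypothetical
stand-ins, and the real signatures of addExtensionDataToNode() and
GetExtensionData() may well differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one keyed entry; not the patch's layout. */
typedef struct ExtEntry
{
    const char      *key;   /* symbolic key chosen by the extension */
    void            *data;  /* extension-owned payload */
    struct ExtEntry *next;
} ExtEntry;

/* Hypothetical stand-in for a Query or PlannedStmt node. */
typedef struct SketchNode
{
    ExtEntry *ext_data;     /* head of the entry list, NULL when empty */
} SketchNode;

/* Find the entry stored under key, or NULL if there is none. */
ExtEntry *
sketch_find_entry(const SketchNode *node, const char *key)
{
    assert(node != NULL && key != NULL);
    for (ExtEntry *e = node->ext_data; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return e;
    return NULL;
}

/*
 * Attach data under key. Returns 0 on success, -1 if the key is already
 * taken or the allocation fails. The key string is not copied, so it
 * must outlive the node.
 */
int
sketch_add_extension_data(SketchNode *node, const char *key, void *data)
{
    ExtEntry *e;

    if (sketch_find_entry(node, key) != NULL)
        return -1;
    e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    e->key = key;
    e->data = data;
    e->next = node->ext_data;
    node->ext_data = e;
    return 0;
}

/* Return the payload stored under key, or NULL if the key is absent. */
void *
sketch_get_extension_data(const SketchNode *node, const char *key)
{
    ExtEntry *e = sketch_find_entry(node, key);

    return e != NULL ? e->data : NULL;
}
```

In this sketch an extension would add its entry once, early in
processing, and look it up by the same key in a later hook. Rejecting a
duplicate key reflects the "one entry per extension" recommendation.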

A crude pgbench benchmark shows that we get some overhead:
1.6% in default mode
4% in prepared mode
~0.1% in extended mode

An optimization that avoids copying the queryId by storing it directly
in the node pointer field keeps this overhead within about 0.5% for all
these modes, but it increases complexity. So here we demonstrate the
unoptimized variant.
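
For illustration only, one way to keep a small integer inside the entry
itself rather than behind a separately allocated node might look like
the sketch below. It uses a tagged union instead of casting an integer
to a pointer, which is an assumption made here for portability and not
necessarily how the optimized variant works. InlineEntry and its
functions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry that holds either a pointer or an inline integer. */
typedef struct InlineEntry
{
    const char *key;
    bool        is_inline;  /* true: value.i64 is valid, not value.ptr */
    union
    {
        void   *ptr;        /* payload allocated elsewhere */
        int64_t i64;        /* small value stored in place, no allocation */
    } value;
} InlineEntry;

/* Store a 64-bit integer directly in the entry. */
void
inline_entry_set_int64(InlineEntry *e, const char *key, int64_t v)
{
    assert(e != NULL && key != NULL);
    e->key = key;
    e->is_inline = true;
    e->value.i64 = v;
}

/* Fetch the inline integer; returns false if the entry holds a pointer. */
bool
inline_entry_get_int64(const InlineEntry *e, int64_t *out)
{
    assert(e != NULL && out != NULL);
    if (!e->is_inline)
        return false;
    *out = e->value.i64;
    return true;
}
```

The extra flag is the kind of added complexity mentioned above: every
reader of the entry has to check which member of the union is valid.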

Some questions remain open:
- QueryRewrite: should we copy extension fields from the parent
parsetree to the rewritten ones?
- Do we need to invent a registration procedure to do away with the
names of entries and use some compact integer IDs?
- Do we need to optimize this structure to avoid a copy for simple data
types, for example, by inventing something like A_Const?
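
As a rough illustration of the second question, a registration
procedure could map each symbolic name to a small integer once and let
later lookups compare integers instead of strings. The sketch below is
purely hypothetical: the table size, the names and the function are
assumptions, and nothing like it exists in the patch.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_MAX_KEYS 32  /* arbitrary limit chosen for this sketch */

static const char *sketch_keys[SKETCH_MAX_KEYS];
static int         sketch_nkeys = 0;

/*
 * Return the compact ID for name, registering it on first use.
 * Registering the same name again returns the same ID.
 * Returns -1 when the table is full. The name string is not copied.
 */
int
sketch_register_key(const char *name)
{
    assert(name != NULL);
    for (int i = 0; i < sketch_nkeys; i++)
        if (strcmp(sketch_keys[i], name) == 0)
            return i;
    if (sketch_nkeys == SKETCH_MAX_KEYS)
        return -1;
    sketch_keys[sketch_nkeys] = name;
    return sketch_nkeys++;
}
```

An extension would call this once at load time and keep the returned
ID. The string comparison then happens only at registration, not on
every query.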

All in all, in our opinion, this issue tends to grow with the
increasing number of extensions that use planner and executor hooks
for various purposes. So any thoughts would be useful.

-- 
Regards
Andrey Lepikhov
Postgres Professional
