Re: [POC] Allow an extension to add data into Query and PlannedStmt nodes - Mailing list pgsql-hackers

From Andrey Lepikhov
Subject Re: [POC] Allow an extension to add data into Query and PlannedStmt nodes
Date
Msg-id 9209b8f9-5e63-9f6f-b011-84274823f89e@postgrespro.ru
Whole thread Raw
In response to Re: [POC] Allow an extension to add data into Query and PlannedStmt nodes  (Julien Rouhaud <rjuju123@gmail.com>)
List pgsql-hackers
On 30/3/2023 12:57, Julien Rouhaud wrote:
>> Extensions could need to pass some query-related data through all stages of
>> the query planning and execution. As a trivial example, pg_stat_statements
>> uses queryid at the end of execution to save some statistics. One more
>> reason - extensions now conflict on queryid value and the logic of its
>> changing. With this patch, it can be managed.
> 
> I just had a quick lookc at the patch, and IIUC it doesn't really help on that
> side, as there's still a single official "queryid" that's computed, stored
> everywhere and later used by pg_stat_statements (which does then store in
> additionally to that official queryid).
Thank you for the attention!
This patch doesn't try to solve the problem of oneness of queryId. In 
this patch we change pg_stat_statements and it doesn't set 0 into 
queryId field according to its internal logic. And another extension 
should do the same - use queryId on your own but not erase it - erase 
your private copy in the ext_field.

>> Ruthless pgbench benchmark shows that we got some overhead:
>> 1.6% - in default mode
>> 4% - in prepared mode
>> ~0.1% in extended mode.
> 
> That's a quite significant overhead.  But the only reason to accept such a
> change is to actually use it to store additional data, so it would be
> interesting to see what the overhead is like once you store at least 2
> different values there.
Yeah, but as I said earlier, it can be reduced to much smaller value 
just with simple optimization. Here I intentionally avoid it to discuss 
the core concept.
> 
>> - Are we need to invent a registration procedure to do away with the names
>> of entries and use some compact integer IDs?
> 
> Note that the patch as proposed doesn't have any defense for two extensions
> trying to register something with the same name, or update a stored value, as
> AddExtensionDataToNode() simply prepend the new value to the list.  You can
> actually update the value by just storing the new value, but it will add a
> significant overhead to every other extension that want to read another value.
Thanks a lot! Patch in attachment implements such an idea - extension 
can allocate some entries and use these private IDs to add entries. I 
hope, an extension would prefer to use only one entry for all the data 
to manage overhead, but still.
> 
> The API is also quite limited as each stored value has a single identifier.
> What if your extension needs to store multiple things?  Since it's all based on
> Node you can't really store some custom struct, so you probably have to end up
> with things like "my_extension.my_val1", "my_extension.my_val2" which isn't
> great.
Main idea here - if an extension holds custom struct and want to pass it 
along all planning and execution stages it should use extensible node 
with custom read/write/copy routines.

-- 
regards,
Andrey Lepikhov
Postgres Professional

Attachment

pgsql-hackers by date:

Previous
From: Corey Huinker
Date:
Subject: Thoughts on using Text::Template for our autogenerated code?
Next
From: Tom Lane
Date:
Subject: Re: Thoughts on using Text::Template for our autogenerated code?