On 30/3/2023 12:57, Julien Rouhaud wrote:
>> Extensions could need to pass some query-related data through all stages of
>> the query planning and execution. As a trivial example, pg_stat_statements
>> uses queryid at the end of execution to save some statistics. One more
>> reason - extensions now conflict on queryid value and the logic of its
>> changing. With this patch, it can be managed.
>
> I just had a quick lookc at the patch, and IIUC it doesn't really help on that
> side, as there's still a single official "queryid" that's computed, stored
> everywhere and later used by pg_stat_statements (which does then store in
> additionally to that official queryid).
Thank you for the attention!
This patch doesn't try to solve the problem of oneness of queryId. In
this patch we change pg_stat_statements and it doesn't set 0 into
queryId field according to its internal logic. And another extension
should do the same - use queryId on your own but not erase it - erase
your private copy in the ext_field.
>> Ruthless pgbench benchmark shows that we got some overhead:
>> 1.6% - in default mode
>> 4% - in prepared mode
>> ~0.1% in extended mode.
>
> That's a quite significant overhead. But the only reason to accept such a
> change is to actually use it to store additional data, so it would be
> interesting to see what the overhead is like once you store at least 2
> different values there.
Yeah, but as I said earlier, it can be reduced to much smaller value
just with simple optimization. Here I intentionally avoid it to discuss
the core concept.
>
>> - Are we need to invent a registration procedure to do away with the names
>> of entries and use some compact integer IDs?
>
> Note that the patch as proposed doesn't have any defense for two extensions
> trying to register something with the same name, or update a stored value, as
> AddExtensionDataToNode() simply prepend the new value to the list. You can
> actually update the value by just storing the new value, but it will add a
> significant overhead to every other extension that want to read another value.
Thanks a lot! Patch in attachment implements such an idea - extension
can allocate some entries and use these private IDs to add entries. I
hope, an extension would prefer to use only one entry for all the data
to manage overhead, but still.
>
> The API is also quite limited as each stored value has a single identifier.
> What if your extension needs to store multiple things? Since it's all based on
> Node you can't really store some custom struct, so you probably have to end up
> with things like "my_extension.my_val1", "my_extension.my_val2" which isn't
> great.
Main idea here - if an extension holds custom struct and want to pass it
along all planning and execution stages it should use extensible node
with custom read/write/copy routines.
--
regards,
Andrey Lepikhov
Postgres Professional