Re: Refactor query normalization into core query jumbling - Mailing list pgsql-hackers
| From | Sami Imseih |
|---|---|
| Subject | Re: Refactor query normalization into core query jumbling |
| Date | |
| Msg-id | CAA5RZ0tKhUXQcyqOqKaBXfmjMZnYVkx44=3DHneomRuBBsZ4bA@mail.gmail.com |
| In response to | Re: Refactor query normalization into core query jumbling (Michael Paquier <michael@paquier.xyz>) |
| List | pgsql-hackers |
> > This way, any extension that wishes to return a normalized string from
> > the same JumbleState can invoke this callback and get consistent results.
> > pg_stat_statements and other extensions with a need to normalize a query
> > string based on the locations of a JumbleState do not need to care about the
> > internals of normalization, they simply invoke the callback and
> > receive the final string.
>
> Hmm. I did not wrap completely my head with your problem, but,
> assuming that what you are proposing goes in the right direction,

The first goal is to move all query-normalization-related infrastructure
that pg_stat_statements (and other extensions) rely on into core, so
extensions no longer need to copy or reimplement normalization logic and
can all depend on a single, shared implementation.

In addition, query normalization necessarily modifies JumbleState (to
record constant locations and lengths). This responsibility should not
fall to extensions and should instead be delegated to core. I will argue
that the current design, in which extensions handle this directly, is a
layering violation.

As a first step, we can move generate_normalized_query to core as a
global function, allowing extensions to simply call it.

> I am wondering if we should not expose a bit more the jumble query APIs so
> as the normal default callback can be reused by out-of-core rather
> than hide it entirely. This would mean exposing
> GenerateNormalizedQuery(), which also giving a way for callers of
> JumbleQuery() to pass down a custom callback? This would imply
> thinking harder about the initialization state we expect in the
> structure, but I think that we should try to design things so as
> extensions do not need to copy-paste more code from the core tree at
> the end, just less of it.

... and this will be taking the next step which is providing callbacks
and making more jumbling utilities global.
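To illustrate what "normalizing a query string based on recorded constant locations" involves, here is a minimal standalone sketch. It is not the pg_stat_statements code: `ConstLocation` and `sketch_normalize_query` are hypothetical names, the lengths are assumed to be known already (the real code has to work them out from the query text, which is why it modifies JumbleState), and placeholders are simply numbered from `$1`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the per-constant data in JumbleState. */
typedef struct ConstLocation
{
    int location;   /* byte offset of the constant in the query text */
    int length;     /* length of the constant in bytes */
} ConstLocation;

/*
 * Return a malloc'd copy of "query" with each recorded constant replaced
 * by $1, $2, ...  Assumes locations are sorted, non-overlapping, and lie
 * within the query string.  The caller frees the result.
 */
static char *
sketch_normalize_query(const char *query, const ConstLocation *locs, int nlocs)
{
    size_t qlen = strlen(query);
    /* each placeholder needs at most 11 bytes: '$' plus up to 10 digits */
    size_t cap = qlen + (size_t) nlocs * 11 + 1;
    char  *out = malloc(cap);
    size_t n = 0;       /* bytes written to out so far */
    size_t src = 0;     /* next unread byte of query */

    if (out == NULL)
        return NULL;

    for (int i = 0; i < nlocs; i++)
    {
        size_t loc = (size_t) locs[i].location;

        assert(loc >= src && loc + (size_t) locs[i].length <= qlen);

        /* copy the text between the previous constant and this one */
        memcpy(out + n, query + src, loc - src);
        n += loc - src;

        /* emit the placeholder in place of the constant */
        n += (size_t) snprintf(out + n, cap - n, "$%d", i + 1);
        src = loc + (size_t) locs[i].length;
    }

    /* copy whatever follows the last constant */
    memcpy(out + n, query + src, qlen - src);
    n += qlen - src;
    out[n] = '\0';
    return out;
}
```

With the function in core, an extension would only pass in the query text and the recorded locations and get the final string back, without reimplementing the loop above.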
This will require more discussion, but I would think we would expose
InitJumble() and it will do the bare minimum to initialize a JumbleState,
and some fields that can define callbacks after the fact. There will be a
callback for a normalization function and a callback function that will
allow the user to implement jumbling functions for nodes that are
currently not included in queryjumblefuncs.switch.c, or perhaps they can
override the existing logic in this generated file.

> Of course, this sentence is written with the same line of thoughts as
> previously mentioned in the other thread we have discussed: extensions
> should not be allowed to update a JumbleState after it's been set by
> the backend code, so as once the same JumbleState pointer is passed
> down across multiple extensions they don't get confused. If an
> extension wants to use their own policy within the JumbleState, they
> had better recreate a new independent one if they are unhappy about
> has been generated previously.

Yes, correct. If we provide the interface to create an additional
JumbleState, they can create an independent state.

For this thread, I would like to focus on the first goal. What do you think?

--
Sami Imseih
Amazon Web Services (AWS)
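
The callback idea discussed above (an init function that sets defaults, with callback fields an extension can set on its own independent state) could look roughly like the following standalone sketch. Everything here is hypothetical and for illustration only: `SketchJumbleState`, `sketch_init_jumble`, the callback type, and the extension function are invented names, not an existing or proposed PostgreSQL API, and the default callback is a placeholder that does no real normalization.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct SketchJumbleState SketchJumbleState;

/* Hypothetical callback type: return a normalized form of "query". */
typedef const char *(*sketch_normalize_cb) (const SketchJumbleState *state,
                                            const char *query);

/* Hypothetical, heavily reduced stand-in for JumbleState. */
struct SketchJumbleState
{
    int                 nconsts;    /* stand-in for recorded constant data */
    sketch_normalize_cb normalize;  /* normalization callback */
};

/* Placeholder default: a real one would replace constants with $n. */
static const char *
sketch_default_normalize(const SketchJumbleState *state, const char *query)
{
    (void) state;
    return query;
}

/* Bare-minimum initialization: zero the state and install the default. */
static void
sketch_init_jumble(SketchJumbleState *state)
{
    assert(state != NULL);
    memset(state, 0, sizeof(*state));
    state->normalize = sketch_default_normalize;
}

/* A hypothetical extension policy, installed only on the extension's own state. */
static const char *
my_ext_normalize(const SketchJumbleState *state, const char *query)
{
    (void) state;
    (void) query;
    return "<redacted>";
}
```

In this sketch an extension that wants a different policy calls `sketch_init_jumble()` on its own state and sets `normalize` there, so the state produced by the backend is never modified.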