Re: WIP: Generic functions for Node types using generated metadata - Mailing list pgsql-hackers
From | Fabien COELHO |
---|---|
Subject | Re: WIP: Generic functions for Node types using generated metadata |
Date | |
Msg-id | alpine.DEB.2.21.1908301414100.28828@lancre |
In response to | WIP: Generic functions for Node types using generated metadata (Andres Freund <andres@anarazel.de>) |
Responses | Re: WIP: Generic functions for Node types using generated metadata |
List | pgsql-hackers |
Hello Andres,

Just my 0.02 €:

> There's been a lot of complaints over the years about how annoying it is
> to keep the out/read/copy/equalfuncs.c functions in sync with the actual
> underlying structs.
>
> There've been various calls for automating their generation, but no
> actual patches that I am aware of.

I started something a while back, AFAICR after spending stupid time
looking for a stupid missing field copy or whatever. I wrote a (simple)
perl script deriving all (most) node utility functions from the header
files.

I gave up as the idea did not gather much momentum from committers, so I
assumed the effort would be rejected in the end. AFAICR the feedback
spirit was something like "node definitions do not change often, we can
manage it by hand".

> There also recently has been discussion about generating more efficient
> memory layout for node trees that we know are read only (e.g. plan trees
> inside the plancache), and about copying such trees more efficiently
> (e.g. by having one big allocation, and then just adjusting pointers).

If pointers are relative to the start, it could be just indexes that do
not need much adjusting.

> One way to approach this problem would be to parse the type
> definitions, and directly generate code for the various functions. But
> that does mean that such a code-generator needs to be expanded for each
> such function.

No big deal for the effort I made. The issue was more dealing with
exceptions (e.g. "we do not serialize this field because it is not used
for some reason") and understanding some implicit assumptions in the
struct declarations.

> An alternative approach is to have a parser of the node definitions that
> doesn't generate code directly, but instead generates metadata. And then
> use that metadata to write node aware functions. This seems more
> promising to me.

Hmmm. The approach we had in an (old) research project was to write the
metadata, and derive all struct & utility functions from these. It is
simpler this way because you save parsing some C, and it can be made
language agnostic (i.e. serializing the data structure from a language
and reading its value from another).

> I'm fairly sure this metadata can also be used to write the other
> currently existing node functions.

Beware of strange exceptions…

> With regards to using libclang for the parsing: I chose that because it
> seemed the easiest to experiment with, compared to annotating all the
> structs with enough metadata to be able to easily parse them from a perl
> script.

I did not find this an issue when I tried, because the annotation needed
is basically the type name of the field.

> The node definitions are after all distributed over quite a few headers.

Yep.

> I think it might even be the correct way forward, over inventing our own
> mini-languages and writing ad-hoc parsers for those. It sure is easier
> to understand plain C code, compared to having to understand various
> embedded mini-languages consisting out of macros.

Dunno.

> The obvious drawback is that it'd require more people to install
> libclang - a significant imposition.

Indeed. A perl-only dependence would be much simpler than relying on a
particular library from a particular compiler to compile postgres,
possibly with an unrelated compiler.

> Alternatively we could annotate the code enough to be able to write our
> own parser, or use some other C parser.

If you can dictate some conventions, e.g. one line/one field, simple perl
regexps would work well I think; you would not need a parser per se.

> I don't really want to invest significantly more time into this without
> first debating the general idea.

That's what I did, and I quit quickly :-)

On the general idea, I'm 100% convinced that stupid utility functions
should be either generic or generated, not maintained by hand.

-- 
Fabien.
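
As an illustration of the "one line/one field plus simple regexps" idea in
the message above, here is a minimal sketch. It is written in Python rather
than perl, and it is not the script mentioned in the message. The `Demo`
struct, the helper names (`parse_struct`, `emit_equal`) and the type-to-macro
mapping are all assumptions made for the example; only the `COMPARE_*_FIELD`
macro names are borrowed from the hand-written equalfuncs.c. Skipping the
`NodeTag` field stands in for the kind of exception the message warns about.

```python
import re
from typing import List, NamedTuple, Tuple

# Hypothetical header fragment that follows a "one line, one field" convention.
HEADER = """
typedef struct Demo
{
    NodeTag     type;
    int         count;      /* plain scalar */
    char       *name;       /* C string */
    List       *args;       /* pointer to another node */
} Demo;
"""

class Field(NamedTuple):
    ctype: str      # base type name, e.g. "int" or "List"
    pointer: bool   # True when the declaration carries a '*'
    name: str       # field name

# Struct name plus everything between the braces.
STRUCT_RE = re.compile(r"typedef\s+struct\s+(\w+)\s*\{(.*?)\}", re.S)
# "<type words> [*] <name>;" on a single line.
FIELD_RE = re.compile(
    r"^\s*(?P<ctype>\w+(?:\s+\w+)*?)(?P<sep>\s*\*+\s*|\s+)(?P<name>\w+)\s*;\s*$"
)

def parse_struct(header: str) -> Tuple[str, List[Field]]:
    """Extract the struct name and its fields; no real C parsing involved."""
    m = STRUCT_RE.search(header)
    if m is None:
        raise ValueError("no struct definition found")
    fields = []
    for line in m.group(2).splitlines():
        line = line.split("/*")[0]          # drop a trailing one-line comment
        fm = FIELD_RE.match(line)
        if fm:
            fields.append(Field(fm["ctype"], "*" in fm["sep"], fm["name"]))
    return m.group(1), fields

def emit_equal(name: str, fields: List[Field]) -> str:
    """Generate C text for an equality function (assumed mapping rules)."""
    out = ["static bool", f"_equal{name}(const {name} *a, const {name} *b)", "{"]
    for f in fields:
        if f.ctype == "NodeTag":
            continue                        # exception: the tag is not compared here
        if f.pointer and f.ctype == "char":
            macro = "COMPARE_STRING_FIELD"
        elif f.pointer:
            macro = "COMPARE_NODE_FIELD"
        else:
            macro = "COMPARE_SCALAR_FIELD"
        out.append(f"\t{macro}({f.name});")
    out += ["", "\treturn true;", "}"]
    return "\n".join(out)

if __name__ == "__main__":
    print(emit_equal(*parse_struct(HEADER)))
```

The list of `Field` tuples is itself a small piece of metadata: the same list
could feed generators for the copy, out and read functions, or be emitted as
a table for generic functions, at the price of maintaining an explicit list
of exceptions.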