On 06.12.23 22:08, Matthias van de Meent wrote:
> PFA a patch that reduces the output size of nodeToString by 50%+ in
> most cases (measured on pg_rewrite), which on my system reduces the
> total size of pg_rewrite by 33% to 472KiB. This does keep the textual
> pg_node_tree format alive, but reduces its size significantly.
>
> The basic techniques used are
> - Don't emit scalar fields when they contain a default value, and
> make the reading code aware of this.
> - Reasonable defaults are set for most datatypes, and overrides can
> be added with new pg_node_attr() attributes. No introspection into
> non-null Node/Array/etc. is being done though.
> - Reset more fields to their default values before storing the values.
> - Don't write trailing 0s in outDatum calls for by-ref types. This
> saves many bytes for Name fields, but also some other pre-existing
> entry points.
Based on our discussions, my understanding is that you wanted to produce
an updated patch set that is split up a bit.

My suggestion is to make incremental patches along these lines:
- Omit from output all fields that have value zero.
- Omit location fields that have value -1.
- Omit trailing zeroes for scalar values.
- Reset location fields before storing in pg_rewrite (or possibly
catalogs in general?)
- And then whatever is left, including the "default" value system that
you have proposed.
The last one I have some doubts about, as previously expressed, but the
first few seem sensible to me. By splitting it up we can consider these
incrementally.
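
To make the first two steps concrete, here is a minimal standalone sketch of
the idea: the writer skips a field whose value equals its default (0 for most
fields, -1 for location), and the reader fills in that default when the field
is absent. This is only an illustration under assumptions. ToyNode,
toy_write_int, toy_out, toy_read_int and toy_in are hypothetical names, and
none of this is the actual outfuncs.c/readfuncs.c code or the code from the
patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a parse node; not a real PostgreSQL struct. */
typedef struct ToyNode
{
	int			varno;
	int			varattno;
	int			location;	/* -1 means "unknown" */
} ToyNode;

/* Append " :name value" to buf, unless value equals the field's default. */
static void
toy_write_int(char *buf, size_t buflen, const char *name, int value, int defval)
{
	size_t		used = strlen(buf);
	int			n;

	if (value == defval)
		return;					/* omitted: the reader will assume defval */

	n = snprintf(buf + used, buflen - used, " :%s %d", name, value);
	assert(n >= 0 && (size_t) n < buflen - used);
}

/* Serialize a node, leaving out zero fields and a location of -1. */
static void
toy_out(const ToyNode *node, char *buf, size_t buflen)
{
	assert(buflen >= sizeof("{TOY"));
	strcpy(buf, "{TOY");
	toy_write_int(buf, buflen, "varno", node->varno, 0);
	toy_write_int(buf, buflen, "varattno", node->varattno, 0);
	toy_write_int(buf, buflen, "location", node->location, -1);
	assert(strlen(buf) + 2 <= buflen);
	strcat(buf, "}");
}

/* Look up " :name " in str; return defval if the field was not written. */
static int
toy_read_int(const char *str, const char *name, int defval)
{
	char		key[64];
	const char *p;
	int			n = snprintf(key, sizeof(key), " :%s ", name);

	assert(n > 0 && (size_t) n < sizeof(key));
	p = strstr(str, key);
	if (p == NULL)
		return defval;
	return (int) strtol(p + n, NULL, 10);
}

/* Deserialize a node, substituting defaults for omitted fields. */
static ToyNode
toy_in(const char *str)
{
	ToyNode		node;

	node.varno = toy_read_int(str, "varno", 0);
	node.varattno = toy_read_int(str, "varattno", 0);
	node.location = toy_read_int(str, "location", -1);
	return node;
}
```

With this sketch, a node with varno = 1, varattno = 0 and location = -1 is
written as "{TOY :varno 1}" and reads back with the same three values. The
point is that writer and reader have to agree on each field's default, which
is why it seems easier to review the fixed defaults (zero, -1) separately
from a configurable default mechanism.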