Re: Reducing output size of nodeToString - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: Reducing output size of nodeToString
Msg-id CAEze2WhfRn0cdNer0Vkye_61BwAmMqM6D9_cJp8i6JmZ8U4wAA@mail.gmail.com
In response to Re: Reducing output size of nodeToString  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Responses Re: Reducing output size of nodeToString
List pgsql-hackers
On Thu, 15 Feb 2024 at 15:37, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
>
> On Thu, 15 Feb 2024 at 13:59, Peter Eisentraut <peter@eisentraut.org> wrote:
> >
> > Thanks, this patch set is a good way to incrementally work through these
> > changes.
> >
> > I have looked at
> > v4-0001-pg_node_tree-Omit-serialization-of-fields-with-de.patch today.
> > Here are my thoughts:
> >
> > I believe we had discussed offline to not omit enum fields with value 0
> > (WRITE_ENUM_FIELD).  This is because the values of enum fields are
> > implementation artifacts, and this could be confusing for readers.
>
> Thanks for reminding me, I didn't remember this when I worked on
> updating the patchset. I'll update this soon.

This has been split into patch 0008 in the set. A query on ev_action
shows that enum default-0-omission is effective on 1994 fields:

select match, count(*)
from pg_rewrite,
    lateral (
        select unnest(regexp_matches(ev_action, '(:\w+ 0)[^0-9]', 'g')) match
    )
group by 1 order by 2 desc;
     match      | count
-----------------+-------
 :funcformat 0   |   587
 :rtekind 0      |   449
 :limitOption 0  |   260
 :querySource 0  |   260
 :override 0     |   260
 :jointype 0     |   156
 :aggsplit 0     |    15
 :subLinkType 0  |     5
 :nulltesttype 0 |     2
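
For readers unfamiliar with the idea, default-0-omission can be sketched roughly as below. This is only an illustration, not the patch's code: write_int_field, read_int_field and the omit_zero flag are hypothetical names, and the strstr-based lookup is a simplification (the real reader consumes tokens in order). The writer skips a field whose value is 0, and the reader treats a missing field as 0.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical writer: emit " :label value" into buf, unless omit_zero is
 * set and the value is 0.  Passing omit_zero = false models fields that
 * are always written out, which is what was asked for enum fields.
 * Returns the number of characters written (0 if the field was omitted).
 */
static int
write_int_field(char *buf, size_t buflen, const char *label,
                int value, bool omit_zero)
{
    if (omit_zero && value == 0)
    {
        if (buflen > 0)
            buf[0] = '\0';
        return 0;
    }
    return snprintf(buf, buflen, " :%s %d", label, value);
}

/*
 * Hypothetical reader: search the serialized text for " :label " and
 * parse the integer that follows.  A missing field reads as the default
 * value 0.  (Simplified: a real reader would walk the tokens in order.)
 */
static int
read_int_field(const char *text, const char *label)
{
    char        key[64];
    const char *p;

    snprintf(key, sizeof key, " :%s ", label);
    p = strstr(text, key);
    if (p == NULL)
        return 0;               /* field omitted, assume default */
    return atoi(p + strlen(key));
}
```

With omit_zero set, each of the matches counted above (e.g. ":funcformat 0") would simply not appear in ev_action, and reading the node back would still yield 0 for that field.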

> > On the reading side, the macro nesting has gotten a bit out of hand. :)
> > We had talked earlier in the thread about the _DIRECT macros and you
> > said they were left over from something else you want to try, but I see
> > nothing else in this patch set uses this.  I think this could all be
> > much simpler, like (omitting required punctuation)
> [...]
> > Not only is this simpler, but it might also have better performance,
> > because we don't have separate pg_strtok_next() and pg_strtok() calls in
> > sequence.
>
> Good points. I'll see what I can do here.

Attached is the updated version of the patchset, based on 5497daf3,
which incorporates this last round of feedback. It moves the
default-0-omission for enums into the newly added 0008, and checks the
sign in the float default checks to deal with +0/-0 issues.
See below for updated numbers.
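
To show why the sign check matters for floats, here is a minimal standalone sketch. It is not the code from the patch: float_is_default_zero and write_float_field are hypothetical names, and using signbit() is my assumption about one way to do the check. The point is that -0.0 == 0.0 compares as true in C, so a plain equality test would omit -0.0 and the reader would then restore it as +0.0.

```c
#include <math.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: true only for positive zero, i.e. the value a
 * reader would assume when the field is absent from the output.
 */
static bool
float_is_default_zero(double v)
{
    /* v == 0.0 also holds for -0.0, so the sign bit is checked too. */
    return v == 0.0 && !signbit(v);
}

/*
 * Hypothetical writer: emit " :label value" into buf unless the value is
 * the default.  Returns the number of characters written (0 if omitted).
 */
static int
write_float_field(char *buf, size_t buflen, const char *label, double v)
{
    if (float_is_default_zero(v))
    {
        if (buflen > 0)
            buf[0] = '\0';
        return 0;
    }
    return snprintf(buf, buflen, " :%s %.17g", label, v);
}
```

In this sketch +0.0 is omitted, while -0.0 is still written, so it survives a write/read round trip.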

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

New numbers:

select 'master' as "version"
     , pg_database_size('template0') as "template0"
     , pg_total_relation_size('pg_rewrite') as "rel_total"
     , pg_relation_size('pg_rewrite', 'main') as "rel_main"
     , sum(pg_column_size(ev_action)) as "toasted"
     , sum(octet_length(ev_action)) as "raw"
from pg_rewrite;

 version | template0 | rel_total | rel_main | toasted |   raw
---------+-----------+-----------+----------+---------+---------
 master  |   7528975 |    770048 |   114688 |  574051 | 3002981
 0001    |   7348751 |    630784 |   131072 |  448495 | 1972854
 0002    |   7250447 |    589824 |   131072 |  412261 | 1866880
 0003    |   7242255 |    581632 |   131072 |  410476 | 1864843
 0004    |   7225871 |    565248 |   139264 |  393801 | 1678735
 0005    |   7225871 |    565248 |   139264 |  393556 | 1675165
 0006    |   7217679 |    557056 |   139264 |  379062 | 1654178
 0007    |   7160335 |    491520 |   155648 |  322145 | 1363885
 0008    |   7135759 |    475136 |   155648 |  311294 | 1337649
