Re: Reducing output size of nodeToString - Mailing list pgsql-hackers

From	Matthias van de Meent
Subject	Re: Reducing output size of nodeToString
Date	January 31, 2024 16:17:03
Msg-id	CAEze2Wgd1Z+7Z2bb8Q4Nnk1ki55aH0acWxAyO7TfesMozVs5JQ@mail.gmail.com
In response to	Re: Reducing output size of nodeToString (Peter Eisentraut <peter@eisentraut.org>)
Responses	Re: Reducing output size of nodeToString
List	pgsql-hackers

On Wed, 31 Jan 2024, 09:16 Peter Eisentraut, <peter@eisentraut.org> wrote:

> On 30.01.24 12:26, Matthias van de Meent wrote:
>>> Most of the other defaults I'm doubtful about. First, we are colliding
>>> here between the goals of minimizing the storage size and making the
>>> debug output more readable.
>> I've never really wanted to make the output "more readable". The
>> current one is too verbose, yes.
>
> My motivations at the moment to work in this area are (1) to make the
> output more readable, and (2) to reduce maintenance burden of node
> support functions.
>
> There can clearly be some overlap with your goals. For example, a less
> verbose and less redundant output can ease readability. But it can also
> go the opposite direction; a very minimalized output can be less readable.
>
> I would like to understand your target more. You have shown some
> figures how these various changes reduce storage size in pg_rewrite.
> But it's a few hundred kilobytes, if I read this correctly, maybe some
> megabytes if you add a lot of user views. Does this translate into any
> other tangible benefits, like you can store more views, or processing
> views is faster, or something like that?

I was also thinking about smaller per-attribute expression storage: index attribute expressions, table default expressions, and functions. Beyond that, a smaller serialized form of these constructs means less memory overhead, which also helps catalog cache sizes, etc.

People complained about the size of a fresh initdb, and I agreed with them, so I started looking for low-hanging fruit, and this is one such item.

I haven't yet tested whether it's more performant in general. I'd expect the new code to do a bit better, given the extremely verbose nature of the data and the rather complex byte-at-a-time token read method used, but this is currently only a hypothesis.

I do think that serialization itself may be slightly slower, but given that this generally happens only in DDL, and that we have to grow the output buffer less often, this too may still be a net win (but, again, this is an untested hypothesis).
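To make the buffer-growth part of that argument concrete, here is a minimal Python sketch. It is not PostgreSQL code, only a toy model under stated assumptions: the size reduction comes from omitting fields that still hold their default value (as the "defaults" in the quoted discussion suggest), and the output buffer starts at an assumed 1024 bytes and doubles whenever it is full. `ToyVar`, `serialize`, and `doublings_needed` are hypothetical names made up for this illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class ToyVar:
    """Hypothetical stand-in for a parse-tree node; not a real PostgreSQL struct."""
    varno: int = 0
    varattno: int = 0
    vartype: int = 0
    varlevelsup: int = 0
    location: int = -1


def serialize(node, skip_defaults):
    """Render the node as '{TOYVAR :field value ...}'.

    With skip_defaults=True, fields equal to their declared default are omitted.
    """
    parts = ["{TOYVAR"]
    for f in fields(node):
        value = getattr(node, f.name)
        if skip_defaults and value == f.default:
            continue
        parts.append(f":{f.name} {value}")
    return " ".join(parts) + "}"


def doublings_needed(total_len, initial=1024):
    """Count how often a buffer that doubles when full must grow to hold total_len bytes.

    The initial capacity and doubling policy are assumptions for this sketch.
    """
    capacity, grows = initial, 0
    while capacity < total_len:
        capacity *= 2
        grows += 1
    return grows


# 1000 toy nodes where only three of the five fields differ from their defaults.
nodes = [ToyVar(varno=1, varattno=i, vartype=23) for i in range(1, 1001)]
verbose = "".join(serialize(n, skip_defaults=False) for n in nodes)
compact = "".join(serialize(n, skip_defaults=True) for n in nodes)

print("verbose:", len(verbose), "bytes,", doublings_needed(len(verbose)), "buffer growths")
print("compact:", len(compact), "bytes,", doublings_needed(len(compact)), "buffer growths")
```

The sketch only models output size and the number of buffer growths. It says nothing about the CPU cost of the extra default checks during serialization, so that part of the hypothesis stays untested.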

Kind regards,

Matthias van de Meent

Neon (https://neon.tech)
