Re: WIP Patch: pg_dump structured - Mailing list pgsql-hackers

From Attila Soki
Subject Re: WIP Patch: pg_dump structured
Date
Msg-id 374850DA-783D-4616-BE03-3CCAA4AB7FB2@attilasoki.com
Whole thread Raw
In response to Re: WIP Patch: pg_dump structured  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: WIP Patch: pg_dump structured  (Attila Soki <pgsql@attilasoki.com>)
List pgsql-hackers

> On 12 Mar 2023, at 21:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Attila Soki <pgsql@attilasoki.com> writes:
>> This patch adds the structured output format to pg_dump.
>> This format is a plaintext output split up into multiple files and the
>> resulting small files are stored in a directory path based on the dumped object.
>
> Won't this fail completely with SQL objects whose names aren't suitable
> to be pathname components?  "A/B" is a perfectly good name so far as
> SQL is concerned.  You could also have problems with collisions on
> case-insensitive filesystems.

The “A/B” case is handled in _CleanFilename function, the slash and other
problematic characters are replaced.
You are right about the case-insensivity, this is not handled and will fail. I forgot
to handle that. I trying to find a way to handle this.

>
>> This format can be restored by feeding its plaintext toc file (restore-dump.sql)
>> to psql. The output is also suitable for manipulating the files with standard
>> editing tools.
>
> This seems a little contradictory: if you want to edit the individual
> files, you'd have to also update restore-dump.sql, or else it's pointless.
> It might make more sense to consider this as a write-only dump format
> and not worry about whether it can be restored directly.

The main motivation was to track changes with VCS at the file (object) level,
editing small files was intended as a second possible use case.
I did not know that a write-only format would go.

>
>> What do you think of this feature, any chance it will be added to pg_dump once
>> the patch is ready?
>
> I'm not clear on how big the use-case is.  It's not really obvious to
> me that this'd have any benefit over the existing plain-text dump
> capability.  You can edit those files too, at least till the schema
> gets too big for your editor.  (But if you've got many many thousand
> SQL objects, a file-per-SQL-object directory will also be no fun to
> deal with.)


I use something like this (a previous version) to track several thousand
objects. But I'm not sure if that would have a wide user base.
Therefore the wip to see if there is interest in this feature.
I think the advantage of having many small files is that it is recognizable
which file (object) is involved in a commit and that the SQL functions and
tables get a change history.

Thank you for your feedback.

Regards,
Attila Soki


pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Next
From: Justin Pryzby
Date:
Subject: Re: Progress report of CREATE INDEX for nested partitioned tables