On Mon, 14 Apr 2025, Dimitrios Apostolou wrote:
> Hello list,
>
> I implemented --clean support for --data-only, in order to avoid logging to
> the WAL while populating the database. The attached patch issues a TRUNCATE
> before COPY on each worker process, and provides a significant speed
> advantage if the cluster is configure with wal_level=minimal.
>
> It also provides a safer way to load the database, as avoiding WAL logging
> also avoids potential and painful ENOSPACE on the WAL partition as I
> experienced in [1]. In other words it makes things much better for my use
> case.
>
> [1] https://www.postgresql.org/message-id/flat/076464ad-3d70-dd25-9e8f-e84f27decfba%40gmx.net
>
Rebased and attached v2 of the patch.
It needed some adjustments for the new flags --with-schema and
--with-data.
I have used this patch several times to pg_restore terabytes of tables
without logging through the WAL, and it performs great.
> But it has some rough edges. I would appreciate guidance and feedback.
The rough edges remain: TRUNCATE fails if there are foreign keys. So if
you try pg_restore --data-only --clean to a table referenced via foreign
keys, the patch will not work, as mentioned below.
>
> * When the table-to-be-TRUNCATEd is referenced as foreign key from other
> table, the whole transaction fails with:
>
> ERROR: cannot truncate a table referenced in a foreign key constraint
>
> 1. As a first step, when TRUNCATE fails I want to try a DELETE FROM
> instead, which has more chances of succeeding, and continuing with
> the COPY. How to detect the failure of ahprintf("TRUNCATE") and do
> the alternative without failing the whole transaction?
>
> 2. Why doesn't --disable-triggers help?
> To test this, I have manually issued
>
> ALTER TABLE x DISABLE TRIGGER ALL
>
> to every table and issued manual TRUNCATE still fails. Shouldn't
> postgres skip the referential integrity checks?
>
> 3. In my tests, all my tables start empty since I have just created the
> schema. Then pg_restore --data-only --clean first populates
> the /referencing/ tables, which is allowed because of disabled
> triggers, and then it tries to load the /referenced/ table.
>
> At this point the referential integrity is already broken. Getting an
> error when TRUNCATing the empty /referenced/ table doesn't make
> sense.
So is there a way to turn off the referential checks for a TRUNCATE?
Do you have any other feedback for this patch?
Thanks,
Dimitris