pg_dump seems to be broken in regards to the "--exclude-table-data" option on Windows. - Mailing list pgsql-bugs

From tutiluren@tutanota.com
Subject pg_dump seems to be broken in regards to the "--exclude-table-data" option on Windows.
Date
Msg-id MCymWR3--3-2@tutanota.com
List pgsql-bugs
After a longer break from this problem, I have now run a fresh, clean, controlled experiment to truly get to the bottom of this annoying issue once and for all.

The issue is that pg_dump refuses to dump the database, and prints bizarre errors, whenever the "--exclude-table-data" option is included in the command *and* its value contains anything but lowercase a-z. See for yourselves:

The database in question is verified to use "UTF8" as its encoding. It has one schema called "Test schäma", which contains one table called "Test täble", which has one column called "Test cålumn". The table holds a single row whose column value is the text "This should not be in the dump." (so that I can easily check whether it ends up in the dump).

First, I set cmd.exe to the UTF-8 code page (65001), just to be sure:

C:\pg_dump_test>chcp 65001
Active code page: 65001

Now I try this series of commands:

C:\pg_dump_test>pg_dump --format plain --verbose --file "testdump.txt" --host="localhost" --port="5432" --username="postgres" --dbname="test"
pg_dump: last built-in OID is 16383
[retracted]
pg_dump: creating CONSTRAINT "Test schäma.Test täble Test täble_pkey"

= WORKS. The dump was successful. (In spite of weird output chars.)

C:\pg_dump_test>pg_dump --format plain --verbose --file "testdump.txt" --exclude-table-data="test" --host="localhost" --port="5432" --username="postgres" --dbname="test"
pg_dump: last built-in OID is 16383
[retracted]
pg_dump: creating CONSTRAINT "Test schäma.Test täble Test täble_pkey"

= WORKS. The dump was successful. (In spite of weird output chars.)

C:\pg_dump_test>pg_dump --format plain --verbose --file "testdump.txt" --exclude-table-data="\"Test schäma\".\"Test täble\"" --host="localhost" --port="5432" --username="postgres" --dbname="test"
pg_dump: last built-in OID is 16383
pg_dump: [archiver (db)] query failed: ERROR:  invalid byte sequence for encoding "UTF8": 0xe4 0x62 0x6c
pg_dump: [archiver (db)] query was: SELECT c.oid
FROM pg_catalog.pg_class c
LEFT JOIN pg_catalog.pg_namespace n
ON n.oid OPERATOR(pg_catalog.=) c.relnamespace
WHERE c.relkind OPERATOR(pg_catalog.=) ANY
(array['r', 'S', 'v', 'm', 'f', 'p'])
  AND c.relname OPERATOR(pg_catalog.~) '^(Test täble)$'
  AND n.nspname OPERATOR(pg_catalog.~) '^(Test schäma)$'

= FAILED. The dump was aborted with these nonsensical errors.

C:\pg_dump_test>pg_dump --format plain --verbose --file "testdump.txt" --exclude-table-data="Test schäma"."Test täble" --host="localhost" --port="5432" --username="postgres" --dbname="test"
pg_dump: last built-in OID is 16383
pg_dump: [archiver (db)] query failed: ERROR:  invalid byte sequence for encoding "UTF8": 0xe4 0x62 0x6c
pg_dump: [archiver (db)] query was: SELECT c.oid
FROM pg_catalog.pg_class c
LEFT JOIN pg_catalog.pg_namespace n
ON n.oid OPERATOR(pg_catalog.=) c.relnamespace
WHERE c.relkind OPERATOR(pg_catalog.=) ANY
(array['r', 'S', 'v', 'm', 'f', 'p'])
  AND c.relname OPERATOR(pg_catalog.~) '^(test täble)$'
  AND n.nspname OPERATOR(pg_catalog.~) '^(test schäma)$'

= FAILED. The dump was aborted with these nonsensical errors.
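
The two failing queries so far differ only in letter case: the escaped-quote form produced '^(Test täble)$', while the form with plain quotes produced '^(test täble)$'. That is consistent with unquoted pattern text being folded to lower case and double-quoted text being kept verbatim, with the plain quotes presumably consumed by the Windows command-line parsing before pg_dump sees them. The following Python sketch only illustrates that folding rule as observed in the output above; `split_pattern` and `to_regexes` are hypothetical helper names, wildcards such as * and ? are ignored, and this is not pg_dump's actual code.

```python
def split_pattern(pattern):
    """Split a schema.table pattern on unquoted dots.

    Unquoted characters are folded to lower case; text inside double
    quotes is kept verbatim (a doubled quote inside quotes is a literal quote).
    """
    parts, current, in_quotes = [], [], False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == '"':
            if in_quotes and i + 1 < len(pattern) and pattern[i + 1] == '"':
                current.append('"')  # "" inside quotes -> one literal quote
                i += 1
            else:
                in_quotes = not in_quotes  # opening or closing quote
        elif ch == "." and not in_quotes:
            parts.append("".join(current))  # dot separates schema from table
            current = []
        else:
            current.append(ch if in_quotes else ch.lower())
        i += 1
    parts.append("".join(current))
    return parts


def to_regexes(pattern):
    """Wrap each name part the way the logged queries show: ^(name)$"""
    return ["^(" + part + ")$" for part in split_pattern(pattern)]


if __name__ == "__main__":
    # Value as received when the quotes were escaped on the command line
    print(to_regexes('"Test schäma"."Test täble"'))
    # Value as received when the quotes were not escaped
    print(to_regexes("Test schäma.Test täble"))
```

Under this reading, the case difference between the two failing queries is expected behaviour and unrelated to the error; both fail on the same byte sequence.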

C:\pg_dump_test>pg_dump --format plain --verbose --file "testdump.txt" --exclude-table-data="Test schäma.Test täble" --host="localhost" --port="5432" --username="postgres" --dbname="test"
pg_dump: last built-in OID is 16383
pg_dump: [archiver (db)] query failed: ERROR:  invalid byte sequence for encoding "UTF8": 0xe4 0x62 0x6c
pg_dump: [archiver (db)] query was: SELECT c.oid
FROM pg_catalog.pg_class c
LEFT JOIN pg_catalog.pg_namespace n
ON n.oid OPERATOR(pg_catalog.=) c.relnamespace
WHERE c.relkind OPERATOR(pg_catalog.=) ANY
(array['r', 'S', 'v', 'm', 'f', 'p'])
  AND c.relname OPERATOR(pg_catalog.~) '^(test täble)$'
  AND n.nspname OPERATOR(pg_catalog.~) '^(test schäma)$'

= FAILED. The dump was aborted with these nonsensical errors.

Finally, I tried the same command again like this:

--exclude-table-data="ä"

= FAILS!

--exclude-table-data="a"

= WORKS!

I looked everywhere for some kind of "client-encoding" option in the pg_dump manual, but could not find one. The only thing I can think of is that the client's encoding (that is, pg_dump's) is for some reason not set to "UTF8", even though that is the encoding of the "test" database which I'm connecting to.
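
As a quick sanity check on the encoding theory, the three bytes from the error message (0xe4 0x62 0x6c) can be decoded outside of PostgreSQL. The Python sketch below assumes, without proof from the experiment above, that the option value may have reached pg_dump in a single-byte Windows code page such as Windows-1252 rather than in UTF-8; `try_decode` is a hypothetical helper name.

```python
# The byte sequence quoted in the server's error message.
reported = bytes([0xE4, 0x62, 0x6C])


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


as_utf8 = try_decode(reported, "utf-8")      # 0xE4 must be followed by continuation bytes
as_cp1252 = try_decode(reported, "cp1252")   # single-byte code page, every byte is one char
utf8_bytes = "äbl".encode("utf-8")           # what a UTF-8 client would have sent

if __name__ == "__main__":
    print("as UTF-8:        ", as_utf8)
    print("as Windows-1252: ", as_cp1252)
    print("UTF-8 form of 'äbl':", utf8_bytes.hex(" "))
```

The bytes are invalid as UTF-8 but read as "äbl" in Windows-1252, which matches the middle of "täble". That would fit a value that was sent in a Windows code page while the server expected UTF-8, although it does not by itself show where the conversion went wrong.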

I believe that I have taken every reasonable step at this point to debug this on my own, and it truly appears that pg_dump has some broken internal logic which fails to account for non-trivial identifiers, specifically in the code behind the "--exclude-table-data" option. However, it *could* still be "my fault"... although I don't see how!
