Re: pg_dump's "--exclude-table" and "--exclude-table-data" options are ignored and/or cause the dump to fail entirely unless both the schema and table name use 1950s-era identifiers. - Mailing list pgsql-bugs

From Juan José Santamaría Flecha
Subject Re: pg_dump's "--exclude-table" and "--exclude-table-data" options are ignored and/or cause the dump to fail entirely unless both the schema and table name use 1950s-era identifiers.
Date
Msg-id CAC+AXB0-vx6wzfYg93f=YgZNUNgow1n+9ertuoV5pUVzjHJtOQ@mail.gmail.com
In response to pg_dump's "--exclude-table" and "--exclude-table-data" options are ignored and/or cause the dump to fail entirely unless both the schema and table name use 1950s-era identifiers.  (tutiluren@tutanota.com)
List pgsql-bugs
Please keep the list in CC for future reference, and so the subscribers can contribute.

On Tue, Jul 21, 2020 at 7:32 PM <tutiluren@tutanota.com> wrote:
Jul 21, 2020, 11:12 AM by juanjo.santamaria@gmail.com:

On Tue, Jul 21, 2020 at 8:30 AM <tutiluren@tutanota.com> wrote:

Try it out yourself, by creating a test schema called "Personal stöff" and a table in it called "My däiary". Then create a text column and make it PK and then add the text "This is supposed to be ignored.". Then try to run this command:

pg_dump --format plain --verbose --file "C:\test.txt" --exclude-table-data="Personal stöff"."My däiary" --host="localhost" --port="5432" --username="postgres" --dbname="TestDB"

Just to avoid wasting time, when the command doesn't work at all, it outputs things like this:

pg_dump: [archiver (db)] query failed: ERROR:  invalid byte sequence for encoding "UTF8": 0xf6 0x72 0x66 0x72
pg_dump: [archiver (db)] query was: SELECT c.oid
FROM pg_catalog.pg_class c
LEFT JOIN pg_catalog.pg_namespace n
ON n.oid OPERATOR(pg_catalog.=) c.relnamespace
WHERE c.relkind OPERATOR(pg_catalog.=) ANY
(array['r', 'S', 'v', 'm', 'f', 'p'])
  AND c.relname OPERATOR(pg_catalog.~) '^(table name)$'
  AND n.nspname OPERATOR(pg_catalog.~) '^(schema name)$'


The problem stems from how CMD handles UTF8 (or fails to). The bytes in your error message are encoded in code page Windows-1252 [1], where 0xf6 is ö, but pg_dump expects UTF8, so the query fails.
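
To make the encoding mismatch concrete, here is a minimal Python sketch. It is only an illustration, not anything pg_dump does internally: the sample string "stöff" is taken from the schema name in the report, the four bytes are copied from the error message, and the helper `is_valid_utf8` is a hypothetical name introduced for this example.

```python
# Illustration only: compare how the same text is encoded in Windows-1252
# and in UTF-8, then check the bytes reported in the error message.

name = "stöff"  # fragment of the schema name from the report

cp1252_bytes = name.encode("cp1252")  # ö becomes the single byte 0xf6
utf8_bytes = name.encode("utf-8")     # ö becomes the two bytes 0xc3 0xb6

print(cp1252_bytes.hex(" "))  # 73 74 f6 66 66
print(utf8_bytes.hex(" "))    # 73 74 c3 b6 66 66


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The byte sequence quoted in the error: 0xf6 0x72 0x66 0x72
reported = bytes([0xF6, 0x72, 0x66, 0x72])

print(is_valid_utf8(reported))    # False: 0xf6 cannot start a UTF-8 sequence
print(is_valid_utf8(utf8_bytes))  # True
```

Decoded as Windows-1252, the reported bytes give readable text starting with ö, which is consistent with the console having passed the argument in that code page rather than in UTF8.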

You can try to configure UTF8 as your CMD encoding, see [2]. Please tell us if this works for you.


I actually have very carefully made sure (from past problems) that the cmd.exe uses UTF-8 and the same goes for my databases and the connection and everything. It truly doesn't seem to have anything to do with this. Isn't it obvious from the output that pg_dump is lowercasing/changing the input?

The problem with that query is not that case folding makes it return no rows. It fails because it expects UTF8 input but receives something else: "pg_dump: [archiver (db)] query failed: ERROR:  invalid byte sequence for encoding "UTF8": 0xf6 0x72 0x66 0x72"

I can reproduce the test case on an English_United States.1252 Windows 10 machine, and enabling the setting "Beta: Use Unicode UTF-8 for worldwide language support", as mentioned above, fixed it in that case.

Regards,

Juan José Santamaría Flecha
