Encoding problems with "COMMENT ON DATABASE .." causing pg_restore (and pg_upgrade) to fail - Mailing list pgsql-bugs
From | Jan Lentfer |
---|---|
Subject | Encoding problems with "COMMENT ON DATABASE .." causing pg_restore (and pg_upgrade) to fail |
Date | |
Msg-id | 666abab3fe03561c77dbfc661b023985@neslonek.homeunix.org Whole thread Raw |
Responses |
Re: Encoding problems with "COMMENT ON DATABASE .." causing
pg_restore (and pg_upgrade) to fail
|
List | pgsql-bugs |
We found a problem with comments on databases using german umlauts (=C3=A4= =C3=B6=C3=BC)=20 - or more generally, we found encoding problems when doing "COMMENT ON=20 database foo ..." when foo is UTF-8 encoded and the database you are=20 connected to (e.g. postgres) is SQL_ASCII. I analyzed this with the help of Andrew Gierth (and others) on IRC - it=20 seems in that constellation you can write an non-UTF-8 comment onto a=20 UTF-8 database. This leads to the problem, that wen trying to use pg_restore -C with a=20 dump created with -Fc the restore will fail when trying to do the=20 "COMMENT ON DATABASE.." We stumpled across this because also pg_upgrade fails on this (during=20 the schema part), but doesn't detect this situation with the -c (check)=20 option beforehand. This is 9.4.5 on Solaris 11 - but we did have that problem already when=20 using pg_upgrade from 8.4 to 9.1 a few years back. I am sorry we didn't=20 analyze further back then, but it was "just a comment". But now it bit=20 us again.... Here is what I did so far to analyze the problem (with instrucions from=20 Andrew Gierth). I hope this makes it clear, otherwise please don't=20 hesitate to request more details. postgres=3D# create database comment_test template template0 encoding=20 'UTF-8'; CREATE DATABASE ##### ---> use pgadmin3 on Windows to set comment on database coment_test with=20 string "f=C3=BCr", while being connected to postgres (SQL_ASCII encoded) ##### $ pg_dump -Fc comment_test -f comment_test.pgdump postgres=3D# alter database comment_test rename to comment_test_orig; ALTER DATABASE ##### $ pg_restore -C -d template1 comment_test.pgdump pg_restore: [archiver (db)] Error while PROCESSING TOC: pg_restore: [archiver (db)] Error from TOC entry 1989; 1262 641528=20 COMMENT comment_test postgres pg_restore: [archiver (db)] could not execute query: ERROR: invalid=20 byte sequence for encoding "UTF8": 0xfc Command was: COMMENT ON DATABASE comment_test IS 'f=C2=A6r'; ### postgres@sz-pp-pg02-dev[~] $ pg_restore comment_test.pgdump -- -- PostgreSQL database dump -- SET statement_timeout =3D 0; SET lock_timeout =3D 0; SET client_encoding =3D 'UTF8'; SET standard_conforming_strings =3D on; SET check_function_bodies =3D false; SET client_min_messages =3D warning; -- -- Name: comment_test; Type: COMMENT; Schema: -; Owner: postgres -- COMMENT ON DATABASE comment_test IS 'f=C2=A6r'; -- -- Name: plpgsql; Type: EXTENSION; Schema: -; Owner: -- CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog; -- -- Name: EXTENSION plpgsql; Type: COMMENT; Schema: -; Owner: -- COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language'; -- -- Name: public; Type: ACL; Schema: -; Owner: postgres -- REVOKE ALL ON SCHEMA public FROM PUBLIC; REVOKE ALL ON SCHEMA public FROM postgres; GRANT ALL ON SCHEMA public TO postgres; GRANT ALL ON SCHEMA public TO PUBLIC; -- -- PostgreSQL database dump complete -- ####### $ pg_restore comment_test.pgdump | head -16 | tail -1 |xxd 0000000: 434f 4d4d 454e 5420 4f4e 2044 4154 4142 COMMENT ON DATAB 0000010: 4153 4520 636f 6d6d 656e 745f 7465 7374 ASE comment_test 0000020: 2049 5320 2766 fc72 273b 0a IS 'f.r';. ###### postgres=3D# select=20 convert_to(shobj_description(oid,'pg_database'),'SQL_ASCII') from=20 pg_database where datname=3D'comment_test_orig'; convert_to ------------ \x66fc72 (1 row) ###### postgres=3D# \l List of databases Name | Owner | Encoding | Collate | =20 Ctype | Access privileges --------------------+-----------+-----------+------------------+---------= ---------+----------------------- comment_test | postgres | UTF8 | C | C = =20 | [...] postgres | postgres | SQL_ASCII | C | C = =20 | template0 | postgres | SQL_ASCII | C | C = =20 | =3Dc/postgres + | | | | = =20 | postgres=3DCTc/postgres template1 | postgres | SQL_ASCII | C | C = =20 | postgres=3DCTc/postgres+ | | | | = =20 | =3Dc/postgres
pgsql-bugs by date: