Thread: BUG #15342: pg_dump - XML with mixed content types generates invalid backup file
BUG #15342: pg_dump - XML with mixed content types generates invalid backup file
From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference: 15342
Logged by: Ryan Lambert
Email address: ryan@rustprooflabs.com
PostgreSQL version: 9.6.7
Operating system: Ubuntu 16; Ubuntu 18; Raspbian (Pi)
Description:

Greetings!

It seems that `pg_dump` is unable to provide reliable database backups that include specific combinations of XML data. The following SQL Fiddle creates a table with three rows of XML data. The first row, "Document, no DOCTYPE", is the only row of the three that will always load from a backup made by `pg_dump`. I've tried this on a few sub-versions of 9.6 and 9.5.

http://sqlfiddle.com/#!17/78a83/1/0

The second row includes a DOCTYPE declaration in the XML. Restoring this row from the pg_dump file fails unless you add `SET XML OPTION DOCUMENT;`. Trying to restore the pg_dump file without adding `SET XML OPTION DOCUMENT;` returns:

```
ERROR: invalid XML content
DETAIL: line 2: StartTag: invalid element name
<!DOCTYPE document SYSTEM "subjects.dtd">
 ^
CONTEXT: COPY xml_doc, line 2, column data: "<?xml version="1.0" standalone="no"?> <!DOCTYPE document SYSTEM "subjects.dtd"> <document> <..."
```

The third row restores with the default setting but fails if `SET XML OPTION DOCUMENT;` is set:

```
ERROR: invalid XML document
DETAIL: line 1: Start tag expected, '<' not found
abc<foo>bar</foo><bar>foo</bar>
^
CONTEXT: COPY xml_doc, line 3, column data: "abc<foo>bar</foo><bar>foo</bar>"
```

So it seems that if a database holds both XML that includes <!DOCTYPE> and XML that is just fragments, pg_dump won't work without manual tinkering and headaches. The specific data that is hanging me up is the QGIS layer style data (stored in `public.layer_styles`).
Re: BUG #15342: pg_dump - XML with mixed content types generates invalid backup file
From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes:
> It seems that `pg_dump` is unable to provide reliable database backups
> that include specific combinations of XML data. The following SQL Fiddle
> creates a table with three rows of XML data. The first row, "Document, no
> DOCTYPE" is the only row of the three that will always load from a backup
> from `pg_dump`. I've tried this on a few sub-versions of 9.6 and 9.5.

Hm. So there are two problems here: pg_dump neglects to force a safe value of xmloption for the restore step, plus there doesn't seem to be a safe value for it to force :-(. The first part of that is trivial to fix, the second perhaps not so much.

However, the fine manual quoth (in 8.13 XML Type)

    SET xmloption TO { DOCUMENT | CONTENT };

    The default is CONTENT, so all forms of XML data are allowed.

which makes it seem that the CONTENT setting was intended to work for this. Perhaps somebody just got overenthusiastic about throwing errors?

			regards, tom lane
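
The "no safe value to force" point can be illustrated with a rough Python model of the two settings as they behave in this report. This is only a sketch under stated assumptions: PostgreSQL does its XML parsing with libxml2, not with this code, and the helper names (`accepts_as_document`, `accepts_as_content`, `safe_modes`) and the element bodies of rows 1 and 2 are hypothetical. Only the row 2 prologue and the row 3 value are taken from the error messages in the report.

```python
import re
import xml.etree.ElementTree as ET

# Leading XML declaration, e.g. <?xml version="1.0" standalone="no"?>
XML_DECL = re.compile(r'^\s*<\?xml\s[^?]*\?>')

def accepts_as_document(text: str) -> bool:
    """DOCUMENT-like check: the value must be one complete XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

def accepts_as_content(text: str) -> bool:
    """CONTENT-like check as reported for 9.5/9.6 (an assumption of this model):
    drop the XML declaration, then require a balanced fragment.
    A DOCTYPE is not legal inside an element, so it is rejected here."""
    body = XML_DECL.sub('', text, count=1)
    try:
        ET.fromstring('<wrapper>' + body + '</wrapper>')
        return True
    except ET.ParseError:
        return False

# Sample rows; the <subject> bodies are hypothetical stand-ins.
ROWS = {
    'document, no DOCTYPE':
        '<?xml version="1.0"?>\n<document><subject>abc</subject></document>',
    'document with DOCTYPE':
        '<?xml version="1.0" standalone="no"?>\n'
        '<!DOCTYPE document SYSTEM "subjects.dtd">\n'
        '<document><subject>abc</subject></document>',
    'fragment':
        'abc<foo>bar</foo><bar>foo</bar>',
}

MODES = {'DOCUMENT': accepts_as_document, 'CONTENT': accepts_as_content}

def safe_modes(rows):
    """Return the modes under which every row would load."""
    return [name for name, check in MODES.items()
            if all(check(value) for value in rows.values())]

if __name__ == '__main__':
    for label, value in ROWS.items():
        verdicts = {name: check(value) for name, check in MODES.items()}
        print(label, verdicts)
    print('modes that accept every row:', safe_modes(ROWS))
```

In this model the DOCTYPE row loads only in the DOCUMENT-like mode and the fragment row only in the CONTENT-like mode, so `safe_modes(ROWS)` comes back empty. That matches the situation described above: forcing either value in the dump would still leave one of the three rows unrestorable.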