Add GUC to enable libxml2's XML_PARSE_HUGE - Mailing list pgsql-hackers
From | Jim Jones |
---|---|
Subject | Add GUC to enable libxml2's XML_PARSE_HUGE |
Date | |
Msg-id | 074d9029-45df-4bed-b3c7-58981bd4b545@uni-muenster.de |
Responses |
Re: Add GUC to enable libxml2's XML_PARSE_HUGE
|
List | pgsql-hackers |
Hi,

In commit 71c0921 we re-introduced the use of xmlParseBalancedChunkMemory in order to allow parsing of large XML documents with certain libxml2 versions [1]. While that fixed a regression, it still leaves the handling of very large or deeply nested XML documents tied to libxml2's internal limits and behaviour.

To address this, Erik and I would like to propose a new GUC, xml_parse_huge, which controls libxml2's XML_PARSE_HUGE option. This makes the handling of large XML documents explicit and independent of libxml2 version quirks.

The new predefined role pg_xml_parse_huge allows superusers to grant session-level use of this option without granting full superuser rights, so DBAs can flexibly delegate the capability in a controlled manner.

Examples:

$ /usr/local/postgres-dev/bin/psql postgres
psql (19devel)
Type "help" for help.

postgres=# CREATE USER u1;
CREATE ROLE
postgres=# CREATE DATABASE db OWNER u1;
CREATE DATABASE
postgres=# \q

# By default a user cannot set this parameter and the default value is 'off'

$ /usr/local/postgres-dev/bin/psql -d db -U u1
psql (19devel)
Type "help" for help.

db=> SHOW xml_parse_huge;
 xml_parse_huge
----------------
 off
(1 row)

db=> SET xml_parse_huge TO on;
ERROR:  permission denied to set parameter "xml_parse_huge"
HINT:  You must be a superuser or a member of the "pg_xml_parse_huge" role to set this option.
db=> ALTER SYSTEM SET xml_parse_huge TO on;
ERROR:  permission denied to set parameter "xml_parse_huge"

# This leads libxml2 to raise an error for text nodes exceeding XML_MAX_TEXT_LENGTH

db=> CREATE TABLE t1 AS SELECT ('<root>' || repeat('X',10000001) || '</root>')::xml;
ERROR:  invalid XML content
DETAIL:  line 1: Resource limit exceeded: Text node too long, try XML_PARSE_HUGE
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

# The role pg_xml_parse_huge allows the user to set the new parameter

$ /usr/local/postgres-dev/bin/psql postgres
psql (19devel)
Type "help" for help.

postgres=# GRANT pg_xml_parse_huge TO u1;
GRANT ROLE
postgres=# \q

$ /usr/local/postgres-dev/bin/psql -d db -U u1
psql (19devel)
Type "help" for help.

db=> SET xml_parse_huge TO on;
SET
db=> CREATE TABLE t1 AS SELECT ('<root>' || repeat('X',10000001) || '</root>')::xml;
SELECT 1

# It is also possible to enable this feature by default for a user

$ /usr/local/postgres-dev/bin/psql postgres
psql (19devel)
Type "help" for help.

postgres=# CREATE USER u2;
CREATE ROLE
postgres=# GRANT pg_xml_parse_huge TO u2;
GRANT ROLE
postgres=# ALTER USER u2 SET xml_parse_huge TO on;
ALTER ROLE
postgres=# \q

$ /usr/local/postgres-dev/bin/psql -d db -U u2
psql (19devel)
Type "help" for help.

db=> SHOW xml_parse_huge ;
 xml_parse_huge
----------------
 on
(1 row)

# A superuser can enable this feature for a whole database (or the whole cluster via postgresql.conf):

$ /usr/local/postgres-dev/bin/psql postgres
psql (19devel)
Type "help" for help.

postgres=# CREATE DATABASE db2;
CREATE DATABASE
postgres=# ALTER DATABASE db2 SET xml_parse_huge TO on;
ALTER DATABASE
postgres=# SHOW xml_parse_huge ;
 xml_parse_huge
----------------
 off
(1 row)

postgres=# \c db2
You are now connected to database "db2" as user "jim".
db2=# SHOW xml_parse_huge ;
 xml_parse_huge
----------------
 on
(1 row)

Attached is a first draft.

* I'm CC'ing Tom and Michael since they were involved in the earlier discussion.

Initially we considered creating a second GUC instead of a role, but decided that would be confusing and less manageable than having a single GUC with role-based delegation.

Any thoughts or comments?

[1] https://www.postgresql.org/message-id/flat/a8771e75-60ee-4c99-ae10-ca4832e1ec8d%40uni-muenster.de#1cfece11b1d62fbd43ed644e1f9710e2

Best regards,
Jim
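
For readers who want a picture of what the proposal amounts to at the parser level, here is a rough sketch of how a boolean setting could be folded into libxml2's parser option bitmask. It is not taken from the attached draft. The helper names, the SKETCH_ macros and the stand-in variable are all hypothetical, and the two numeric values are copied from libxml2's headers as an assumption so that the snippet compiles without libxml2 installed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed value of XML_PARSE_HUGE from <libxml/parser.h> (1 << 19),
 * copied here so the sketch builds without libxml2. */
#define SKETCH_XML_PARSE_HUGE (1 << 19)

/* Assumed value of libxml2's default text node cap (XML_MAX_TEXT_LENGTH). */
#define SKETCH_XML_MAX_TEXT_LENGTH 10000000

/* Hypothetical stand-in for the variable behind the xml_parse_huge GUC. */
static bool xml_parse_huge = false;

/* Hypothetical helper: add the HUGE flag to the caller's parser options
 * only when the setting is on. */
static int
sketch_parser_options(int base_options)
{
    if (xml_parse_huge)
        return base_options | SKETCH_XML_PARSE_HUGE;
    return base_options;
}

/* Simplified model of the limit hit in the examples above. With the HUGE
 * flag the text node is treated as unlimited here, which is a
 * simplification: libxml2 may still apply a much larger cap, and its exact
 * boundary condition may differ from this one. */
static bool
sketch_text_node_allowed(size_t text_len, int options)
{
    if (options & SKETCH_XML_PARSE_HUGE)
        return true;
    return text_len <= SKETCH_XML_MAX_TEXT_LENGTH;
}
```

In this model, the 10000001-character text node from the examples is rejected while the setting is off and accepted once it is on, which mirrors the two CREATE TABLE results shown in the sessions above.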