Re: [PATCH] Add pretty-printed XML output option - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [PATCH] Add pretty-printed XML output option
Date
Msg-id 532259.1678393296@sss.pgh.pa.us
In response to Re: [PATCH] Add pretty-printed XML output option  (Peter Smith <smithpb2250@gmail.com>)
Responses Re: [PATCH] Add pretty-printed XML output option
Re: [PATCH] Add pretty-printed XML output option
List pgsql-hackers
Peter Smith <smithpb2250@gmail.com> writes:
> The patch v19 LGTM.

I've looked through this now, and have some minor complaints and a major
one.  The major one is that it doesn't work for XML that doesn't satisfy
IS DOCUMENT.  For example,

regression=# select '<bar><val x="y">42</val></bar><foo></foo>'::xml is document;
 ?column?
----------
 f
(1 row)

regression=# select xmlserialize (content '<bar><val x="y">42</val></bar><foo></foo>' as text);
               xmlserialize
-------------------------------------------
 <bar><val x="y">42</val></bar><foo></foo>
(1 row)

regression=# select xmlserialize (content '<bar><val x="y">42</val></bar><foo></foo>' as text indent);
ERROR:  invalid XML document
DETAIL:  line 1: Extra content at the end of the document
<bar><val x="y">42</val></bar><foo></foo>
                              ^

This is not what the documentation promises, and I don't think it's
good enough --- the SQL spec has no restriction saying you can't
use INDENT with CONTENT.  I tried adjusting things so that we call
xml_parse() with the appropriate DOCUMENT or CONTENT xmloption flag,
but all that got me was empty output (except for a document header).
It seems like xmlDocDumpFormatMemory is not the thing to use, at least
not in the CONTENT case.  But libxml2 has a few other "dump"
functions, so maybe we can use a different one?  I see we are using
xmlNodeDump elsewhere, and that has a format option, so maybe there's
a way forward there.

A lesser issue is that INDENT tacks on a document header (XML declaration)
whether there was one or not.  I'm not sure whether that's an appropriate
thing to do in the DOCUMENT case, but it sure seems weird in the CONTENT
case.  We have code that can strip off the header again, but we
need to figure out exactly when to apply it.

I also suspect that it's outright broken to attach a header claiming
the data is now in UTF8 encoding.  If the database encoding isn't
UTF8, then either that's a lie or we now have an encoding violation.

Another thing that's mildly irking me is that the current
factorization of this code will result in xml_parse'ing the data
twice, if you have both DOCUMENT and INDENT specified.  We could
consider avoiding that if we merged the indentation functionality
into xmltotext_with_xmloption, but it's probably premature to do so
when we haven't figured out how to get the output right --- we might
end up needing two xml_parse calls anyway with different parameters,
perhaps.

I also had a bunch of cosmetic complaints (mostly around this having
a bad case of add-at-the-end-itis), which I've cleaned up in the
attached v20.  This doesn't address any of the above, however.

            regards, tom lane

diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 467b49b199..53d59662b9 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -4460,14 +4460,18 @@ xml '<foo>bar</foo>'
     <type>xml</type>, uses the function
     <function>xmlserialize</function>:<indexterm><primary>xmlserialize</primary></indexterm>
 <synopsis>
-XMLSERIALIZE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable> AS <replaceable>type</replaceable> )
+XMLSERIALIZE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable> AS <replaceable>type</replaceable> [ [NO] INDENT] )
 </synopsis>
     <replaceable>type</replaceable> can be
     <type>character</type>, <type>character varying</type>, or
     <type>text</type> (or an alias for one of those).  Again, according
     to the SQL standard, this is the only way to convert between type
     <type>xml</type> and character types, but PostgreSQL also allows
-    you to simply cast the value.
+    you to simply cast the value. The option <type>INDENT</type> allows to
+    indent the serialized xml output - the default is <type>NO INDENT</type>.
+    It is designed to indent XML strings of type <type>DOCUMENT</type>, but it can also
+   be used with <type>CONTENT</type> as long as <replaceable>value</replaceable>
+   contains a well-formed XML.
    </para>

    <para>
diff --git a/src/backend/catalog/sql_features.txt b/src/backend/catalog/sql_features.txt
index 0fb9ab7533..bb4c135a7f 100644
--- a/src/backend/catalog/sql_features.txt
+++ b/src/backend/catalog/sql_features.txt
@@ -621,7 +621,7 @@ X061    XMLParse: character string input and DOCUMENT option            YES
 X065    XMLParse: binary string input and CONTENT option            NO
 X066    XMLParse: binary string input and DOCUMENT option            NO
 X068    XMLSerialize: BOM            NO
-X069    XMLSerialize: INDENT            NO
+X069    XMLSerialize: INDENT            YES
 X070    XMLSerialize: character string serialization and CONTENT option            YES
 X071    XMLSerialize: character string serialization and DOCUMENT option            YES
 X072    XMLSerialize: character string serialization            YES
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 19351fe34b..3dcd15d5f0 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -3829,6 +3829,7 @@ ExecEvalXmlExpr(ExprState *state, ExprEvalStep *op)
             {
                 Datum       *argvalue = op->d.xmlexpr.argvalue;
                 bool       *argnull = op->d.xmlexpr.argnull;
+                text       *result;

                 /* argument type is known to be xml */
                 Assert(list_length(xexpr->args) == 1);
@@ -3837,8 +3838,12 @@ ExecEvalXmlExpr(ExprState *state, ExprEvalStep *op)
                     return;
                 value = argvalue[0];

-                *op->resvalue = PointerGetDatum(xmltotext_with_xmloption(DatumGetXmlP(value),
-                                                                         xexpr->xmloption));
+                result = xmltotext_with_xmloption(DatumGetXmlP(value),
+                                                  xexpr->xmloption);
+                if (xexpr->indent)
+                    result = xmlserialize_indent(result);
+
+                *op->resvalue = PointerGetDatum(result);
                 *op->resnull = false;
             }
             break;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index a0138382a1..efe88ccf9d 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -613,7 +613,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 %type <node>    xml_root_version opt_xml_root_standalone
 %type <node>    xmlexists_argument
 %type <ival>    document_or_content
-%type <boolean> xml_whitespace_option
+%type <boolean>    xml_indent_option xml_whitespace_option
 %type <list>    xmltable_column_list xmltable_column_option_list
 %type <node>    xmltable_column_el
 %type <defelt>    xmltable_column_option_el
@@ -702,7 +702,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
     HANDLER HAVING HEADER_P HOLD HOUR_P

     IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P INCLUDE
-    INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
+    INCLUDING INCREMENT INDENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
     INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
     INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION

@@ -15532,13 +15532,14 @@ func_expr_common_subexpr:
                     $$ = makeXmlExpr(IS_XMLROOT, NULL, NIL,
                                      list_make3($3, $5, $6), @1);
                 }
-            | XMLSERIALIZE '(' document_or_content a_expr AS SimpleTypename ')'
+            | XMLSERIALIZE '(' document_or_content a_expr AS SimpleTypename xml_indent_option ')'
                 {
                     XmlSerialize *n = makeNode(XmlSerialize);

                     n->xmloption = $3;
                     n->expr = $4;
                     n->typeName = $6;
+                    n->indent = $7;
                     n->location = @1;
                     $$ = (Node *) n;
                 }
@@ -15592,6 +15593,11 @@ document_or_content: DOCUMENT_P                        { $$ = XMLOPTION_DOCUMENT; }
             | CONTENT_P                                { $$ = XMLOPTION_CONTENT; }
         ;

+xml_indent_option: INDENT                            { $$ = true; }
+            | NO INDENT                                { $$ = false; }
+            | /*EMPTY*/                                { $$ = false; }
+        ;
+
 xml_whitespace_option: PRESERVE WHITESPACE_P        { $$ = true; }
             | STRIP_P WHITESPACE_P                    { $$ = false; }
             | /*EMPTY*/                                { $$ = false; }
@@ -16828,6 +16834,7 @@ unreserved_keyword:
             | INCLUDE
             | INCLUDING
             | INCREMENT
+            | INDENT
             | INDEX
             | INDEXES
             | INHERIT
@@ -17384,6 +17391,7 @@ bare_label_keyword:
             | INCLUDE
             | INCLUDING
             | INCREMENT
+            | INDENT
             | INDEX
             | INDEXES
             | INHERIT
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 78221d2e0f..2331417552 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -2331,6 +2331,7 @@ transformXmlSerialize(ParseState *pstate, XmlSerialize *xs)
     typenameTypeIdAndMod(pstate, xs->typeName, &targetType, &targetTypmod);

     xexpr->xmloption = xs->xmloption;
+    xexpr->indent = xs->indent;
     xexpr->location = xs->location;
     /* We actually only need these to be able to parse back the expression. */
     xexpr->type = targetType;
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 079bcb1208..4d2549ed03 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -631,6 +631,39 @@ xmltotext_with_xmloption(xmltype *data, XmlOptionType xmloption_arg)
 }


+text *
+xmlserialize_indent(text *data)
+{
+#ifdef USE_LIBXML
+    text       *result;
+    xmlDocPtr    doc;
+    xmlChar    *xmlbuf;
+    int            nbytes;
+
+    doc = xml_parse(data, XMLOPTION_DOCUMENT, false,
+                    GetDatabaseEncoding(), NULL);
+    Assert(doc);
+
+    /* Reformat with indenting requested */
+    xmlDocDumpFormatMemory(doc, &xmlbuf, &nbytes, 1);
+
+    xmlFreeDoc(doc);
+
+    if (!nbytes)
+        elog(ERROR, "could not indent the given XML document");
+
+    result = cstring_to_text_with_len((const char *) xmlbuf, nbytes);
+
+    xmlFree(xmlbuf);
+
+    return result;
+#else
+    NO_XML_SUPPORT();
+    return NULL;
+#endif
+}
+
+
 xmltype *
 xmlelement(XmlExpr *xexpr,
            Datum *named_argvalue, bool *named_argnull,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 371aa0ffc5..028588fb33 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -840,6 +840,7 @@ typedef struct XmlSerialize
     XmlOptionType xmloption;    /* DOCUMENT or CONTENT */
     Node       *expr;
     TypeName   *typeName;
+    bool        indent;            /* [NO] INDENT */
     int            location;        /* token location, or -1 if unknown */
 } XmlSerialize;

diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 4220c63ab7..8fb5b4b919 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -1464,7 +1464,7 @@ typedef enum XmlExprOp
     IS_XMLPARSE,                /* XMLPARSE(text, is_doc, preserve_ws) */
     IS_XMLPI,                    /* XMLPI(name [, args]) */
     IS_XMLROOT,                    /* XMLROOT(xml, version, standalone) */
-    IS_XMLSERIALIZE,            /* XMLSERIALIZE(is_document, xmlval) */
+    IS_XMLSERIALIZE,            /* XMLSERIALIZE(is_document, xmlval, indent) */
     IS_DOCUMENT                    /* xmlval IS DOCUMENT */
 } XmlExprOp;

@@ -1489,6 +1489,8 @@ typedef struct XmlExpr
     List       *args;
     /* DOCUMENT or CONTENT */
     XmlOptionType xmloption pg_node_attr(query_jumble_ignore);
+    /* INDENT option for XMLSERIALIZE */
+    bool        indent;
     /* target type/typmod for XMLSERIALIZE */
     Oid            type pg_node_attr(query_jumble_ignore);
     int32        typmod pg_node_attr(query_jumble_ignore);
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index bb36213e6f..753e9ee174 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -205,6 +205,7 @@ PG_KEYWORD("in", IN_P, RESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("include", INCLUDE, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("including", INCLUDING, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("increment", INCREMENT, UNRESERVED_KEYWORD, BARE_LABEL)
+PG_KEYWORD("indent", INDENT, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("index", INDEX, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("indexes", INDEXES, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("inherit", INHERIT, UNRESERVED_KEYWORD, BARE_LABEL)
diff --git a/src/include/utils/xml.h b/src/include/utils/xml.h
index 311da06cd6..a1dfe4c631 100644
--- a/src/include/utils/xml.h
+++ b/src/include/utils/xml.h
@@ -78,6 +78,7 @@ extern xmltype *xmlpi(const char *target, text *arg, bool arg_is_null, bool *res
 extern xmltype *xmlroot(xmltype *data, text *version, int standalone);
 extern bool xml_is_document(xmltype *arg);
 extern text *xmltotext_with_xmloption(xmltype *data, XmlOptionType xmloption_arg);
+extern text *xmlserialize_indent(text *data);
 extern char *escape_xml(const char *str);

 extern char *map_sql_identifier_to_xml_name(const char *ident, bool fully_escaped, bool escape_period);
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index ad852dc2f7..ddbf0ca16b 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -486,6 +486,112 @@ SELECT xmlserialize(content 'good' as char(10));

 SELECT xmlserialize(document 'bad' as text);
 ERROR:  not an XML document
+-- indent
+SELECT xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>' AS text INDENT);
+              xmlserialize
+----------------------------------------
+ <?xml version="1.0" encoding="UTF-8"?>+
+ <foo>                                 +
+   <bar>                               +
+     <val x="y">42</val>               +
+   </bar>                              +
+ </foo>                                +
+
+(1 row)
+
+SELECT xmlserialize(CONTENT  '<foo><bar><val x="y">42</val></bar></foo>' AS text INDENT);
+              xmlserialize
+----------------------------------------
+ <?xml version="1.0" encoding="UTF-8"?>+
+ <foo>                                 +
+   <bar>                               +
+     <val x="y">42</val>               +
+   </bar>                              +
+ </foo>                                +
+
+(1 row)
+
+-- no indent
+SELECT xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>' AS text NO INDENT);
+               xmlserialize
+-------------------------------------------
+ <foo><bar><val x="y">42</val></bar></foo>
+(1 row)
+
+SELECT xmlserialize(CONTENT  '<foo><bar><val x="y">42</val></bar></foo>' AS text NO INDENT);
+               xmlserialize
+-------------------------------------------
+ <foo><bar><val x="y">42</val></bar></foo>
+(1 row)
+
+\set VERBOSITY terse
+-- indent malformed xml
+SELECT xmlserialize(DOCUMENT '<foo></foo><bar><val x="y">42</val></bar>' AS text INDENT);
+ERROR:  not an XML document
+SELECT xmlserialize(CONTENT  '<foo></foo><bar><val x="y">42</val></bar>' AS text INDENT);
+ERROR:  invalid XML document
+-- indent empty string
+SELECT xmlserialize(DOCUMENT '' AS text INDENT);
+ERROR:  not an XML document
+SELECT xmlserialize(CONTENT  '' AS text INDENT);
+ERROR:  invalid XML document
+-- whitespaces
+SELECT xmlserialize(DOCUMENT '  ' AS text INDENT);
+ERROR:  not an XML document
+SELECT xmlserialize(CONTENT  '  ' AS text INDENT);
+ERROR:  invalid XML document
+\set VERBOSITY default
+-- indent null
+SELECT xmlserialize(DOCUMENT NULL AS text INDENT);
+ xmlserialize
+--------------
+
+(1 row)
+
+SELECT xmlserialize(CONTENT  NULL AS text INDENT);
+ xmlserialize
+--------------
+
+(1 row)
+
+-- indent different encoding (returns UTF-8)
+SELECT xmlserialize(DOCUMENT '<?xml version="1.0" encoding="ISO-8859-1"?><foo><bar><val>42</val></bar></foo>' AS text INDENT);
+              xmlserialize
+----------------------------------------
+ <?xml version="1.0" encoding="UTF-8"?>+
+ <foo>                                 +
+   <bar>                               +
+     <val>42</val>                     +
+   </bar>                              +
+ </foo>                                +
+
+(1 row)
+
+SELECT xmlserialize(CONTENT  '<?xml version="1.0" encoding="ISO-8859-1"?><foo><bar><val>42</val></bar></foo>' AS text INDENT);
+              xmlserialize
+----------------------------------------
+ <?xml version="1.0" encoding="UTF-8"?>+
+ <foo>                                 +
+   <bar>                               +
+     <val>42</val>                     +
+   </bar>                              +
+ </foo>                                +
+
+(1 row)
+
+-- 'no indent' = not using 'no indent'
+SELECT xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>' AS text) = xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>' AS text NO INDENT);
+ ?column?
+----------
+ t
+(1 row)
+
+SELECT xmlserialize(CONTENT '<foo><bar><val x="y">42</val></bar></foo>' AS text) = xmlserialize(CONTENT '<foo><bar><val x="y">42</val></bar></foo>' AS text NO INDENT);
+ ?column?
+----------
+ t
+(1 row)
+
 SELECT xml '<foo>bar</foo>' IS DOCUMENT;
  ?column?
 ----------
diff --git a/src/test/regress/expected/xml_1.out b/src/test/regress/expected/xml_1.out
index 70fe34a04f..2944f84103 100644
--- a/src/test/regress/expected/xml_1.out
+++ b/src/test/regress/expected/xml_1.out
@@ -309,6 +309,80 @@ ERROR:  unsupported XML feature
 LINE 1: SELECT xmlserialize(document 'bad' as text);
                                      ^
 DETAIL:  This functionality requires the server to be built with libxml support.
+-- indent
+SELECT xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>' AS text INDENT);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val><...
+                                     ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+SELECT xmlserialize(CONTENT  '<foo><bar><val x="y">42</val></bar></foo>' AS text INDENT);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xmlserialize(CONTENT  '<foo><bar><val x="y">42</val><...
+                                     ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+-- no indent
+SELECT xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>' AS text NO INDENT);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val><...
+                                     ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+SELECT xmlserialize(CONTENT  '<foo><bar><val x="y">42</val></bar></foo>' AS text NO INDENT);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xmlserialize(CONTENT  '<foo><bar><val x="y">42</val><...
+                                     ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+\set VERBOSITY terse
+-- indent malformed xml
+SELECT xmlserialize(DOCUMENT '<foo></foo><bar><val x="y">42</val></bar>' AS text INDENT);
+ERROR:  unsupported XML feature at character 30
+SELECT xmlserialize(CONTENT  '<foo></foo><bar><val x="y">42</val></bar>' AS text INDENT);
+ERROR:  unsupported XML feature at character 30
+-- indent empty string
+SELECT xmlserialize(DOCUMENT '' AS text INDENT);
+ERROR:  unsupported XML feature at character 30
+SELECT xmlserialize(CONTENT  '' AS text INDENT);
+ERROR:  unsupported XML feature at character 30
+-- whitespaces
+SELECT xmlserialize(DOCUMENT '  ' AS text INDENT);
+ERROR:  unsupported XML feature at character 30
+SELECT xmlserialize(CONTENT  '  ' AS text INDENT);
+ERROR:  unsupported XML feature at character 30
+\set VERBOSITY default
+-- indent null
+SELECT xmlserialize(DOCUMENT NULL AS text INDENT);
+ xmlserialize
+--------------
+
+(1 row)
+
+SELECT xmlserialize(CONTENT  NULL AS text INDENT);
+ xmlserialize
+--------------
+
+(1 row)
+
+-- indent different encoding (returns UTF-8)
+SELECT xmlserialize(DOCUMENT '<?xml version="1.0" encoding="ISO-8859-1"?><foo><bar><val>42</val></bar></foo>' AS text INDENT);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xmlserialize(DOCUMENT '<?xml version="1.0" encoding="...
+                                     ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+SELECT xmlserialize(CONTENT  '<?xml version="1.0" encoding="ISO-8859-1"?><foo><bar><val>42</val></bar></foo>' AS text INDENT);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xmlserialize(CONTENT  '<?xml version="1.0" encoding="...
+                                     ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+-- 'no indent' = not using 'no indent'
+SELECT xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>' AS text) = xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>' AS text NO INDENT);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val><...
+                                     ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+SELECT xmlserialize(CONTENT '<foo><bar><val x="y">42</val></bar></foo>' AS text) = xmlserialize(CONTENT '<foo><bar><val x="y">42</val></bar></foo>' AS text NO INDENT);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xmlserialize(CONTENT '<foo><bar><val x="y">42</val></...
+                                    ^
+DETAIL:  This functionality requires the server to be built with libxml support.
 SELECT xml '<foo>bar</foo>' IS DOCUMENT;
 ERROR:  unsupported XML feature
 LINE 1: SELECT xml '<foo>bar</foo>' IS DOCUMENT;
diff --git a/src/test/regress/expected/xml_2.out b/src/test/regress/expected/xml_2.out
index 4f029d0072..60dcb3d36a 100644
--- a/src/test/regress/expected/xml_2.out
+++ b/src/test/regress/expected/xml_2.out
@@ -466,6 +466,112 @@ SELECT xmlserialize(content 'good' as char(10));

 SELECT xmlserialize(document 'bad' as text);
 ERROR:  not an XML document
+-- indent
+SELECT xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>' AS text INDENT);
+              xmlserialize
+----------------------------------------
+ <?xml version="1.0" encoding="UTF-8"?>+
+ <foo>                                 +
+   <bar>                               +
+     <val x="y">42</val>               +
+   </bar>                              +
+ </foo>                                +
+
+(1 row)
+
+SELECT xmlserialize(CONTENT  '<foo><bar><val x="y">42</val></bar></foo>' AS text INDENT);
+              xmlserialize
+----------------------------------------
+ <?xml version="1.0" encoding="UTF-8"?>+
+ <foo>                                 +
+   <bar>                               +
+     <val x="y">42</val>               +
+   </bar>                              +
+ </foo>                                +
+
+(1 row)
+
+-- no indent
+SELECT xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>' AS text NO INDENT);
+               xmlserialize
+-------------------------------------------
+ <foo><bar><val x="y">42</val></bar></foo>
+(1 row)
+
+SELECT xmlserialize(CONTENT  '<foo><bar><val x="y">42</val></bar></foo>' AS text NO INDENT);
+               xmlserialize
+-------------------------------------------
+ <foo><bar><val x="y">42</val></bar></foo>
+(1 row)
+
+\set VERBOSITY terse
+-- indent malformed xml
+SELECT xmlserialize(DOCUMENT '<foo></foo><bar><val x="y">42</val></bar>' AS text INDENT);
+ERROR:  not an XML document
+SELECT xmlserialize(CONTENT  '<foo></foo><bar><val x="y">42</val></bar>' AS text INDENT);
+ERROR:  invalid XML document
+-- indent empty string
+SELECT xmlserialize(DOCUMENT '' AS text INDENT);
+ERROR:  not an XML document
+SELECT xmlserialize(CONTENT  '' AS text INDENT);
+ERROR:  invalid XML document
+-- whitespaces
+SELECT xmlserialize(DOCUMENT '  ' AS text INDENT);
+ERROR:  not an XML document
+SELECT xmlserialize(CONTENT  '  ' AS text INDENT);
+ERROR:  invalid XML document
+\set VERBOSITY default
+-- indent null
+SELECT xmlserialize(DOCUMENT NULL AS text INDENT);
+ xmlserialize
+--------------
+
+(1 row)
+
+SELECT xmlserialize(CONTENT  NULL AS text INDENT);
+ xmlserialize
+--------------
+
+(1 row)
+
+-- indent different encoding (returns UTF-8)
+SELECT xmlserialize(DOCUMENT '<?xml version="1.0" encoding="ISO-8859-1"?><foo><bar><val>42</val></bar></foo>' AS text INDENT);
+              xmlserialize
+----------------------------------------
+ <?xml version="1.0" encoding="UTF-8"?>+
+ <foo>                                 +
+   <bar>                               +
+     <val>42</val>                     +
+   </bar>                              +
+ </foo>                                +
+
+(1 row)
+
+SELECT xmlserialize(CONTENT  '<?xml version="1.0" encoding="ISO-8859-1"?><foo><bar><val>42</val></bar></foo>' AS text INDENT);
+              xmlserialize
+----------------------------------------
+ <?xml version="1.0" encoding="UTF-8"?>+
+ <foo>                                 +
+   <bar>                               +
+     <val>42</val>                     +
+   </bar>                              +
+ </foo>                                +
+
+(1 row)
+
+-- 'no indent' = not using 'no indent'
+SELECT xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>' AS text) = xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>' AS text NO INDENT);
+ ?column?
+----------
+ t
+(1 row)
+
+SELECT xmlserialize(CONTENT '<foo><bar><val x="y">42</val></bar></foo>' AS text) = xmlserialize(CONTENT '<foo><bar><val x="y">42</val></bar></foo>' AS text NO INDENT);
+ ?column?
+----------
+ t
+(1 row)
+
 SELECT xml '<foo>bar</foo>' IS DOCUMENT;
  ?column?
 ----------
diff --git a/src/test/regress/sql/xml.sql b/src/test/regress/sql/xml.sql
index 24e40d2653..fea875adfd 100644
--- a/src/test/regress/sql/xml.sql
+++ b/src/test/regress/sql/xml.sql
@@ -132,6 +132,32 @@ SELECT xmlserialize(content data as character varying(20)) FROM xmltest;
 SELECT xmlserialize(content 'good' as char(10));
 SELECT xmlserialize(document 'bad' as text);

+-- indent
+SELECT xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>' AS text INDENT);
+SELECT xmlserialize(CONTENT  '<foo><bar><val x="y">42</val></bar></foo>' AS text INDENT);
+-- no indent
+SELECT xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>' AS text NO INDENT);
+SELECT xmlserialize(CONTENT  '<foo><bar><val x="y">42</val></bar></foo>' AS text NO INDENT);
+\set VERBOSITY terse
+-- indent malformed xml
+SELECT xmlserialize(DOCUMENT '<foo></foo><bar><val x="y">42</val></bar>' AS text INDENT);
+SELECT xmlserialize(CONTENT  '<foo></foo><bar><val x="y">42</val></bar>' AS text INDENT);
+-- indent empty string
+SELECT xmlserialize(DOCUMENT '' AS text INDENT);
+SELECT xmlserialize(CONTENT  '' AS text INDENT);
+-- whitespaces
+SELECT xmlserialize(DOCUMENT '  ' AS text INDENT);
+SELECT xmlserialize(CONTENT  '  ' AS text INDENT);
+\set VERBOSITY default
+-- indent null
+SELECT xmlserialize(DOCUMENT NULL AS text INDENT);
+SELECT xmlserialize(CONTENT  NULL AS text INDENT);
+-- indent different encoding (returns UTF-8)
+SELECT xmlserialize(DOCUMENT '<?xml version="1.0" encoding="ISO-8859-1"?><foo><bar><val>42</val></bar></foo>' AS text INDENT);
+SELECT xmlserialize(CONTENT  '<?xml version="1.0" encoding="ISO-8859-1"?><foo><bar><val>42</val></bar></foo>' AS text INDENT);
+-- 'no indent' = not using 'no indent'
+SELECT xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>' AS text) = xmlserialize(DOCUMENT '<foo><bar><val x="y">42</val></bar></foo>' AS text NO INDENT);
+SELECT xmlserialize(CONTENT '<foo><bar><val x="y">42</val></bar></foo>' AS text) = xmlserialize(CONTENT '<foo><bar><val x="y">42</val></bar></foo>' AS text NO INDENT);

 SELECT xml '<foo>bar</foo>' IS DOCUMENT;
 SELECT xml '<foo>bar</foo><bar>foo</bar>' IS DOCUMENT;
