Re: [PATCH] Add CANONICAL option to xmlserialize - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: [PATCH] Add CANONICAL option to xmlserialize
Date
Msg-id CAFj8pRB11UDhubGinA9sLVVx678wXMkvxStLFTLvn+H3D2M=Qg@mail.gmail.com
In response to Re: [PATCH] Add CANONICAL option to xmlserialize  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers


On Sun, 25 Aug 2024 at 20:57, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hi

On Sat, 24 Aug 2024 at 7:40, Jim Jones <jim.jones@uni-muenster.de> wrote:

On 19.06.24 10:59, Jim Jones wrote:
> On 09.02.24 14:19, Jim Jones wrote:
>> v9 attached with rebase due to changes done to primnodes.h in 615f5f6
>>
> v10 attached with rebase due to changes in primnodes, parsenodes.h, and
> gram.y
>
v11 attached with rebase due to changes in xml.c

I tried to review this patch.

There is unwanted whitespace in the patch:

-<-><--><-->xmlFreeDoc(doc);
+<->else if (format == XMLSERIALIZE_CANONICAL || format == XMLSERIALIZE_CANONICAL_WITH_NO_COMMENTS)
+ <>{
+<-><-->xmlChar    *xmlbuf = NULL;
+<-><-->int         nbytes;
+<-><-->int    

1. The XML is always serialized to a UTF-8 string, but when the target type is varchar or text, the result should always be converted to the database encoding. It is not possible to store a UTF-8 string in a varchar in a latin2 database.

2. The proposed feature may add to the confusion around the implementation of NO INDENT. I am not an expert in this area, so I checked other databases. DB2 does not have anything similar, but Oracle's "NO INDENT" clause is very similar to the proposed "CANONICAL". Unfortunately, NO INDENT behaves differently in the two systems: Oracle really removes the formatting, while Postgres does nothing.
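
To illustrate point 1, here is a minimal standalone sketch of why UTF-8 output cannot be stored unchanged in a latin2 database: the same character has a different byte representation in each encoding, so a conversion step is needed. The helper names (codepoint_to_latin2, utf8_to_latin2) and the tiny mapping table are hypothetical and only for illustration; this is not code from the patch, and in the backend the conversion would go through PostgreSQL's own encoding conversion routines rather than anything like this.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of PostgreSQL or the patch).
 * Maps ASCII and a handful of code points to their ISO-8859-2 (latin2)
 * byte. Returns -1 for anything outside this tiny illustrative table.
 */
static int
codepoint_to_latin2(unsigned int cp)
{
    if (cp < 0x80)
        return (int) cp;        /* ASCII is identical in both encodings */

    switch (cp)
    {
        case 0x00E1: return 0xE1;   /* a with acute */
        case 0x010D: return 0xE8;   /* c with caron */
        case 0x0159: return 0xF8;   /* r with caron */
        case 0x017E: return 0xBE;   /* z with caron */
        default:     return -1;     /* not in this sketch's table */
    }
}

/*
 * Hypothetical helper: convert a NUL-terminated UTF-8 string to latin2.
 * Only 1-byte and 2-byte UTF-8 sequences are handled, which covers the
 * latin2 repertoire. Returns the number of bytes written (without the
 * terminating NUL), or -1 on malformed input, an untranslatable
 * character, or a too-small destination buffer.
 */
static long
utf8_to_latin2(const unsigned char *src, unsigned char *dst, size_t dstlen)
{
    size_t      out = 0;

    assert(src != NULL && dst != NULL);

    if (dstlen == 0)
        return -1;

    while (*src != '\0')
    {
        unsigned int cp;
        int         b;

        if (src[0] < 0x80)
        {
            cp = src[0];
            src += 1;
        }
        else if (src[0] >= 0xC2 && src[0] <= 0xDF &&
                 (src[1] & 0xC0) == 0x80)
        {
            /* 2-byte sequence: 110xxxxx 10xxxxxx */
            cp = ((src[0] & 0x1Fu) << 6) | (src[1] & 0x3Fu);
            src += 2;
        }
        else
            return -1;          /* longer or malformed sequence */

        b = codepoint_to_latin2(cp);
        if (b < 0 || out + 1 >= dstlen)
            return -1;          /* untranslatable, or no room for NUL */

        dst[out++] = (unsigned char) b;
    }

    dst[out] = '\0';
    return (long) out;
}
```

For example, the UTF-8 bytes 0xC4 0x8D (U+010D) become the single latin2 byte 0xE8, while a character such as the euro sign (a 3-byte UTF-8 sequence) has no latin2 equivalent and must be reported as an error rather than stored as raw bytes.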

Regards

I read https://www.w3.org/TR/xml-c14n11/ and, if I understand this document correctly, CANONICAL is not the same thing as "NO INDENT"?

Regards

Pavel



Pavel



--
Jim
