On 06/17/2011 11:29 AM, Nicolas Barbier wrote:
> 2011/6/17, Andrew Dunstan<andrew@dunslane.net>:
>
>> On 06/17/2011 10:55 AM, Radosław Smogura wrote:
>>
>>> XML canonicalization preserves whitespace, if I remember
>>> correctly; I think there is an example of this.
>>>
>>> In any case, if I store an image in XML (I've seen this done), preserving
>>> whitespace and newlines is important.
>> If you store images you should encode them anyway, in base64 or hex.
> Whitespace that is not at certain obviously irrelevant places (such as
> right after "<", between attributes, outside of the whole document,
> etc), and that is not defined to be irrelevant by some schema (if the
> parser is schema-aware), is relevant. You cannot just muck around with
> it and consider that correct.

Sure, but if you're storing arbitrary binary data such as images,
whitespace is the least of your problems. That's why I've always encoded
them in base64.
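
As a rough sketch of what I mean (Python standard library only; the
helper names embed_binary and extract_binary are made up for
illustration, not anything Postgres provides), the payload survives even
if something reflows the whitespace inside the element, because the
decoder skips characters outside the base64 alphabet by default:

```python
import base64
import xml.etree.ElementTree as ET


def embed_binary(tag, payload):
    """Return an element whose text is the base64 form of payload (bytes)."""
    elem = ET.Element(tag)
    elem.text = base64.b64encode(payload).decode("ascii")
    return elem


def extract_binary(elem):
    """Decode the element text back to bytes.

    b64decode() with the default validate=False discards characters that
    are not in the base64 alphabet, so added spaces and newlines are ignored.
    """
    return base64.b64decode(elem.text)


payload = bytes(range(256))  # stand-in for image data
doc = ET.tostring(embed_binary("image", payload))

# Round trip through serialization and parsing.
reparsed = ET.fromstring(doc)
roundtrip_ok = extract_binary(reparsed) == payload

# Simulate a tool that re-indents and wraps the element content.
text = reparsed.text
reparsed.text = "\n  " + text[:40] + "\n  " + text[40:] + "\n"
reflowed_ok = extract_binary(reparsed) == payload
```
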
>> More generally, data that needs that sort of preservation should
>> possibly be in CDATA nodes.
> CDATA sections are just syntactic sugar (a form of escaping):
>
> <URL:http://www.w3.org/TR/xml-infoset/#omitted>
>
>
Yeah. OTOH, doesn't an empty CDATA section force a child node, where a
pure empty element does not?
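
For the non-empty case, the "just escaping" point is easy to see with a
small sketch (Python's xml.etree.ElementTree; text_of is a hypothetical
helper name). This says nothing about the empty-CDATA case, which
depends on how a given parser or DOM builds its tree:

```python
import xml.etree.ElementTree as ET


def text_of(xml):
    """Parse a one-element document and return the element's text content."""
    return ET.fromstring(xml).text


# The same character data written three ways: entity-escaped,
# wrapped entirely in a CDATA section, and with CDATA used only
# around the character that needs escaping.
escaped = text_of("<a>x &lt; y  </a>")
cdata = text_of("<a><![CDATA[x < y  ]]></a>")
mixed = text_of("<a>x <![CDATA[<]]> y  </a>")

# All three yield identical text, trailing spaces included.
same_text = escaped == cdata == mixed
```
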

Anyway, we're getting a bit far from what Postgres needs to be doing.

cheers

andrew