Re: contrib/xml2: add function xml_encode_special_chars - Mailing list pgsql-patches

From John Gray
Subject Re: contrib/xml2: add function xml_encode_special_chars
Date
Msg-id pan.2004.11.08.14.53.57.515728@azuli.co.uk
In response to contrib/xml2: add function xml_encode_special_chars  (Markus Bertheau <twanger@bluetwanger.de>)
List pgsql-patches
On Sun, 07 Nov 2004 13:03:33 +0000, Simon Riggs wrote:

> On Sun, 2004-11-07 at 12:56, Markus Bertheau wrote:
>> On Sun, 07.11.2004, at 09:33, Simon Riggs wrote:
>> > On Sat, 2004-11-06 at 23:42, Markus Bertheau wrote:
>> > > On Sat, 06.11.2004, at 23:13, Simon Riggs wrote:
>> > > > On Sat, 2004-11-06 at 00:36, Markus Bertheau wrote:
>> > > > > On Sat, 06.11.2004, at 01:24, Peter Eisentraut wrote:
>> > > > > > Markus Bertheau wrote:
>> > > > > > > attached is a patch that adds the function xml_encode_special_chars
>> > > > > > > to the xml2 contrib module. It's against 8.0beta4. It's intended for
>> > > > > > > commit.
>> > > > > >
>> > > > > > Would you also tell us what this function does?
>> > > > >
>> > > > > It calls the similarly named function from libxml2. It replaces
>> > > > > characters that carry a special meaning in XML (<, >, &, " and \r) with
>> > > > > their respective XML entities.
>> > > >
>> > > > Wow! Hadn't noticed xml2 didn't do that. That's pretty important...
>> > >
>> > > What do you mean, it didn't do that? Where had you expected it to do
>> > > that?
>> >
>> > eh? I'm agreeing that your patch is important...
>>
>> I didn't question that :) I just don't understand what you mean by
>> "xml2 doesn't do that" - do you mean that you thought that that function
>> was already there? Or that special character encoding already takes
>> place somewhere else in xml2? I can't imagine where that would be, so I
>> asked :)
>
> I mistakenly assumed that the special character encoding took place
> automatically, without calling a specific function.
>
> It's pretty fragile without that, but you could go a long way before the
> lack of it hit you in the face, then no further.

It's not really fragile, considering the usage scope of contrib/xml2 -
Peter E points this out elsewhere, but if the characters are not already
escaped then the document is not well-formed XML. The routines in
contrib/xml2 deal with processing incoming XML, in which the characters
have to be escaped already - if you used this routine automatically, you'd
escape all the XML tags in your document and turn it into one long text
string!
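
To make that concrete, here is a rough sketch in Python rather than SQL. It
does not call contrib/xml2 or libxml2; encode_special_chars is a
hypothetical stand-in built on the standard library's escape(), extended to
cover the same five characters mentioned above. The sample document is made
up for illustration.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# escape() handles &, < and > by default; the extra map adds " and \r so the
# stand-in covers the five characters named in the thread.
EXTRA = {'"': "&quot;", "\r": "&#13;"}


def encode_special_chars(text: str) -> str:
    """Hypothetical stand-in for xml_encode_special_chars (not the real patch)."""
    return escape(text, EXTRA)


# An incoming document that is already well-formed XML.
doc = '<note to="Tom">a &lt; b</note>'

# Escaping it "automatically" turns every tag into literal text.
escaped = encode_special_chars(doc)

# Wrapped in an element and parsed, the result has no child elements at all:
# the whole original document has become one text node.
root = ET.fromstring("<wrapper>" + escaped + "</wrapper>")
print(escaped)
print(len(root), repr(root.text))
```

So the escaping step belongs on plain text going into a document, not on
documents that are already XML.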

It's certainly a worthwhile routine to add. I'd consider the lack of
exposure of character set management APIs to be a bigger failure of
contrib/xml2 (the Brazilian webpage appears to cover this point).

This function is (sort of) a "cast" function from text to XML. One of the
things I would really like to do (if I can ever find the time!) is produce
a wrapper type for XML possessing relevant operators for XPath etc. and
enforcing well-formedness on data. xml_encode_special_chars is then
essentially one data-conversion function for composing simple XML
documents out of ordinary unescaped text. (Strictly, you'd also have to
wrap it in tags to make it a well-formed document.)
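
A minimal sketch of that text-to-XML "cast", again in Python and again
without touching contrib/xml2 itself. The function name text_to_xml and the
default tag name "value" are assumptions made for the example; the escaping
is done with the standard library's escape(), not with libxml2.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET


def text_to_xml(text: str, tag: str = "value") -> str:
    """Escape plain text and wrap it in one element (hypothetical helper)."""
    # Cover the same characters named in the thread: <, >, &, " and \r.
    escaped = escape(text, {'"': "&quot;", "\r": "&#13;"})
    return f"<{tag}>{escaped}</{tag}>"


# Ordinary unescaped text, including characters that are special in XML.
raw = 'if a < b && c > "d"\r\n'

xml_doc = text_to_xml(raw)

# Parsing succeeds only if the result is well-formed; the text content of
# the root element should then be the original string.
parsed = ET.fromstring(xml_doc)
print(xml_doc)
print(parsed.text == raw)
```

Parsing the result back is only a sanity check here; a real wrapper type
would enforce well-formedness on input instead.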

My email address is in README.xml2 (hidden in a paragraph near the bottom).

Regards

John

