Re: Regarding bytea column in Posgresql - Mailing list pgsql-general

From Craig Ringer
Subject Re: Regarding bytea column in Posgresql
Date
Msg-id CAMsr+YHycF3KLuypOqSH4HDuJG4MaDeFhs5t-APRgr0x5tuVpQ@mail.gmail.com
In response to Re: Regarding bytea column in Posgresql  (John R Pierce <pierce@hogranch.com>)
List pgsql-general


On 10 April 2015 at 03:27, John R Pierce <pierce@hogranch.com> wrote:
> one possible rationale for using BYTEA is that the data could be in various encodings, which the application wishes to preserve, and keeps track of somewhere else (perhaps in a field within the XML?).

Thanks for bringing this up, as it's a good reason to use bytea for XML.

XML actually has an encoding field in the XML declaration (the prolog at the top of the document, not the DTD), e.g.

    <?xml version="1.0" encoding="UTF-8"?>

It is common - and of dubious correctness - for applications to store XML in a 'text' or 'xml' field without changing the 'encoding' field in the XML declaration to reflect the encoding at rest.
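
To illustrate the mismatch, here is a minimal Python sketch using only the standard library. The sample document and variable names are made up for the example. It assumes a document that was declared and encoded as ISO-8859-1, then stored as text and handed back as UTF-8 bytes with the declaration left untouched:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the declaration says ISO-8859-1.
document = '<?xml version="1.0" encoding="ISO-8859-1"?><name>café</name>'

# Bytes that match the declaration parse as expected.
latin1_bytes = document.encode("iso-8859-1")
good = ET.fromstring(latin1_bytes).text

# The same characters re-encoded as UTF-8, declaration unchanged.
# The parser trusts the declaration and decodes the bytes as ISO-8859-1.
utf8_bytes = document.encode("utf-8")
bad = ET.fromstring(utf8_bytes).text

print(good)  # café
print(bad)   # cafÃ©
```

Storing the original bytes in bytea avoids this, since nothing re-encodes the document behind the declaration's back.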

Personally I wish the 'xml' type in Pg knew how to change the encoding declaration dynamically, but I know it's a hairy problem. For example, if client_encoding is iso-8859-1 and the declaration were rewritten to match, but the client then converts the XML document to utf-8 internally, the declared encoding will be wrong again unless the client also updates it.
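
On the client side, keeping the declaration in step with the bytes could look roughly like the sketch below. This is not anything Pg does; `transcode_xml` and `DECL_RE` are hypothetical names, and the code assumes an ASCII-compatible source encoding with no byte order mark, so it will not cope with UTF-16 input:

```python
import re

# Matches the encoding name inside a leading XML declaration.
# Assumption: the declaration is plain ASCII at the very start of the data.
DECL_RE = re.compile(
    rb'^<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

def transcode_xml(data: bytes, target: str = "UTF-8") -> bytes:
    """Re-encode an XML document and update its encoding declaration."""
    match = DECL_RE.match(data)
    # With no declared encoding, fall back to UTF-8.
    source = match.group(1).decode("ascii") if match else "UTF-8"
    text = data.decode(source)
    if match:
        # The declaration is ASCII, so byte offsets equal character offsets.
        start, end = match.span(1)
        text = text[:start] + target + text[end:]
    return text.encode(target)

latin1_doc = '<?xml version="1.0" encoding="ISO-8859-1"?><name>café</name>'.encode("iso-8859-1")
utf8_doc = transcode_xml(latin1_doc, "UTF-8")
```

Here `utf8_doc` carries `encoding="UTF-8"` in its declaration and the body is encoded to match. It does nothing for the mixed-encoding CDATA case; nothing sane can.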

I've also run into XML documents that shove data in different encodings into CDATA sections. This is wrong, of course, but apps sometimes do it anyway.


--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
