Re: Bug in UTF8-Validation Code? - Mailing list pgsql-hackers

From Mario Weilguni
Subject Re: Bug in UTF8-Validation Code?
Date
Msg-id 200703161217.15110.mweilguni@sime.com
In response to Re: Bug in UTF8-Validation Code?  (Michael Paesold <mpaesold@gmx.at>)
List pgsql-hackers
On Wednesday, 14 March 2007 08:01, Michael Paesold wrote:
> Andrew Dunstan wrote:
> >
> > This strikes me as essential. If the db has a certain encoding ISTM we
> > are promising that all the text data is valid for that encoding.
> >
> > The question in my mind is how we help people to recover from the fact
> > that we haven't done that.
>
> I would also say that it's a bug that escape sequences can get characters
> into the database that are not valid in the specified encoding. If you
> compare the encoding to table constraints, there is no way to simply
> "escape" a constraint check.
>
> This seems to violate the principle of consistency in ACID. Additionally,
> if you include pg_dump into ACID, it also violates durability, since it
> cannot restore what it wrote itself.
> Is there anything in the SQL spec that asks for such a behaviour? I guess
> not.
>
> A DBA will usually not even learn about this issue until they are presented
> with a failing restore.

Is there anything I can do to help with this problem? Maybe implement a new
GUC variable that turns off acceptance of invalidly encoded sequences (so DBAs
can still re-enable it if they really depend on it)?
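
A minimal sketch of what such a switch could do, written in Python for
brevity rather than in backend C. The names accept_input and reject_invalid
are hypothetical and are not PostgreSQL APIs or settings; the sketch only
illustrates checking the bytes after escape processing, with an opt-out for
installations that depend on the old behaviour.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw is a well-formed UTF-8 byte sequence."""
    try:
        raw.decode("utf-8")  # strict decoding raises on malformed input
        return True
    except UnicodeDecodeError:
        return False


def accept_input(raw: bytes, reject_invalid: bool = True) -> bytes:
    """Return the bytes to store, or raise if they are invalid UTF-8.

    reject_invalid stands in for the proposed (hypothetical) setting:
    True  -> refuse invalid sequences, like a constraint check;
    False -> old behaviour, store the bytes unchecked.
    """
    if reject_invalid and not is_valid_utf8(raw):
        raise ValueError("invalid byte sequence for encoding UTF8")
    return raw


if __name__ == "__main__":
    good = "März".encode("utf-8")    # valid UTF-8
    bad = "März".encode("latin-1")   # 0xE4 followed by 'r' is not valid UTF-8
    print(is_valid_utf8(good), is_valid_utf8(bad))
```

With the check enabled, the Latin-1 bytes are refused at input time instead
of surfacing later as a failing restore.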

For me,

Best regards,
Mario Weilguni

