Re: Implementing full UTF-8 support (aka supporting 0x00) - Mailing list pgsql-hackers

From Álvaro Hernández Tortosa
Subject Re: Implementing full UTF-8 support (aka supporting 0x00)
Date
Msg-id a7346dd0-a677-d3f2-814a-15705641f8cf@8kdata.com
In response to Re: Implementing full UTF-8 support (aka supporting 0x00)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers

On 03/08/16 17:23, Tom Lane wrote:
> Álvaro Hernández Tortosa <aht@8kdata.com> writes:
>>       As has been previously discussed (see
>> https://www.postgresql.org/message-id/BAY7-F17FFE0E324AB3B642C547E96890%40phx.gbl
>> for instance) varlena fields cannot accept the literal 0x00 value.
> Yup.
>
>>       What would it take to support it?
> One key reason why that's hard is that datatype input and output
> functions use nul-terminated C strings as the representation of the
> text form of any datatype.  We can't readily change that API without
> breaking huge amounts of code, much of it not under the core project's
> control.
>
> There may be other places where nul-terminated strings would be a hazard
> (mumble fgets mumble), but offhand that API seems like the major problem
> so far as the backend is concerned.
>
> There would be a slew of client-side problems as well.  For example this
> would assuredly break psql and pg_dump, along with every other client that
> supposes that it can treat PQgetvalue() as returning a nul-terminated
> string.  This end of it would possibly be even worse than fixing the
> backend, because so little of the affected code is under our control.
>
> In short, the problem is not with having an embedded nul in a stored
> text value.  The problem is the reams of code that suppose that the
> text representation of any data value is a nul-terminated C string.
>
>             regards, tom lane
    Wow. That seems like a daunting task.
    I guess, then, that even implementing a new datatype based on bytea 
but using the text I/O functions (rather than send/recv) to show up as 
text would not work either, right?
    Thanks for the input,
    Álvaro


-- 

Álvaro Hernández Tortosa


-----------
8Kdata



