Re: 7.4 Wishlist - Mailing list pgsql-hackers

From Joe Conway
Subject Re: 7.4 Wishlist
Msg-id 3DEAD54D.5080303@joeconway.com
In response to Re: 7.4 Wishlist  (David Wheeler <david@wheeler.net>)
Responses Re: 7.4 Wishlist
List pgsql-hackers
David Wheeler wrote:
> My understanding is that the nul character is legal in a byte sequence, 
> but if it's not properly escaped, it'll be parsed as the end of the 
> statement. Unfortunately, I think that it's a very tough problem to solve.

No question wrt '\0' bytes -- they would have to be escaped when casting from 
bytea to text.

The harder issue is that there are apparently many other multibyte 
sequences that, while valid in an ASCII encoding, are not valid in one or more 
multibyte encodings. See this thread:

http://archives.postgresql.org/pgsql-hackers/2002-04/msg00236.php
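
To make the problem concrete, here is a minimal sketch in Python (not backend code). It assumes Python's "latin-1" and "utf-8" codecs are reasonable stand-ins for a single-byte and a multibyte server encoding, and is_valid_in is a hypothetical helper name used only for illustration:

```python
def is_valid_in(data: bytes, encoding: str) -> bool:
    """Return True if `data` decodes cleanly under `encoding`."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# 0xE9 is a complete character in a single-byte encoding such as LATIN1,
# but in UTF-8 it is a lead byte that must be followed by continuation bytes.
sample = b"caf\xe9"

print(is_valid_in(sample, "latin-1"))  # single-byte: every byte maps to a character
print(is_valid_in(sample, "utf-8"))    # multibyte: the sequence is incomplete
```

The same bytes pass in one encoding and fail in the other, which is why arbitrary bytea contents cannot simply be relabeled as text.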

This is why currently all "non-printable characters" are escaped (which I 
think is all bytes > 127). Text, on the other hand, is already known to be valid 
for a particular encoding, so it doesn't need escaping.
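
As a rough illustration of that kind of escaping, the following Python sketch turns arbitrary bytes into a pure-ASCII string. It is not the backend's implementation. It assumes the rule is: double the backslash, and write every byte outside the printable ASCII range (including the nul byte and all bytes above 127) as a three-digit octal escape. The name escape_bytea is hypothetical.

```python
def escape_bytea(data: bytes) -> str:
    """Render bytes as printable ASCII using backslash-octal escapes.

    Assumed rule (a sketch, not the backend's code):
      - backslash (0x5C) is doubled
      - bytes below 0x20 or above 0x7E become \\ooo (three octal digits)
      - all other bytes are emitted as-is
    """
    out = []
    for b in data:
        if b == 0x5C:
            out.append("\\\\")
        elif b < 0x20 or b > 0x7E:
            out.append("\\%03o" % b)
        else:
            out.append(chr(b))
    return "".join(out)


# A nul byte, a high-bit byte, and a literal backslash all get escaped.
print(escape_bytea(b"a\x00b\xff\\"))
```

Because the result contains only printable ASCII, it stays valid no matter which encoding the text is later interpreted in, at the cost of a longer string.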

I'm not sure what happens when the backend encoding and client encoding don't 
match -- I'd guess there is some probability of invalid byte sequences in that 
case too.

Joe


