Re: have you feel anything when you read this ? - Mailing list pgsql-sql

From PFC
Subject Re: have you feel anything when you read this ?
Date
Msg-id op.s6pt28eacigqcu@apollo13
Whole thread Raw
In response to Re: have you feel anything when you read this ?  ("Eugene E." <sad@bankir.ru>)
Responses Re: have you feel anything when you read this ?
List pgsql-sql
> the problem is: you'll get this four byte sequence '\000' _instead_ of  
> NUL-byte anyway.


http://www.postgresql.org/docs/8.1/interactive/datatype-binary.html says :

"A binary string is a sequence of octets (or bytes). Binary strings are  
distinguished from character strings by two characteristics: First, binary  
strings specifically allow storing octets of value zero and other  
"non-printable" octets (usually, octets outside the range 32 to 126).  
Character strings disallow zero octets, and also disallow any other octet  
values and sequences of octet values that are invalid according to the  
database's selected character set encoding. Second, operations on binary  
strings process the actual bytes, whereas the processing of character  
strings depends on locale settings. In short, binary strings are  
appropriate for storing data that the programmer thinks of as "raw bytes",  
whereas character strings are appropriate for storing text."

That's the whole point of escaping, so that data generated by COPY, for  
instance, does not include any funky characters, including the \0 (zero)  
character, so that you can use any standard tool on it, including grep...

I LIKE the fact that TEXT refuses invalid characters. It helps find bugs  
in my applications, like when I forget to process some 8-bit string before  
inserting it in my database which uses UNICODE. I definitely prefer an  
error than finding a month later that half my data has had all its  
accented characters screwed up.

in psql, you have to use the escaped syntax :

SELECT length('\\000'::BYTEA), length('\\001'::BYTEA),  
length('\\000'::TEXT), length('\\001'::TEXT); length | length | length | length
--------+--------+--------+--------      1 |      1 |      4 |      4

Your client library should take care of escaping and de-escaping. Here, in  
python :

>>> cursor.execute( "SELECT %s,%s::BYTEA", ("this is a normal string",  
>>> psycopg2.Binary( ">\x00< this is a string with a zero byte" ),) )
>>> r = cursor.fetchone()
>>> print r
['this is a normal string', <read-only buffer for 0x2aaaaab32210, size 37,  
offset 0 at 0x2aaaace27c70>]
>>> print str(r[1])
> < this is a string with a zero byte
>>> print repr(str(r[1]))
'>\x00< this is a string with a zero byte'
>>> ord(r[1][1])
0


Note : \x00 is python's escaping for the null byte


pgsql-sql by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: have you feel anything when you read this ?
Next
From: Richard Huxton
Date:
Subject: Re: update before drop causes OID problems in transaction?