Thread: Lead and tail quotes with \pset fieldsep

Lead and tail quotes with \pset fieldsep

From
"Bath, David"
Date:
Folks

If I want psql to generate CSV files fully-double-quoted I can
use pset as follows
psql> \pset fieldsep ","
However this does not put a quote before the first field and
after the last, so each row comes out as
1234","blahblah","sdfgsg","foo","bar

Is there a way to use psql to give rows like
1234","blahblah","sdfgsg","foo","bar
or is there another tool I should use.

COPY doesn't deal with views and restrictions well so it is
not an option.

Thanks in advance
-- 
David T. Bath
dave.bath@unix.net



Re: Lead and tail quotes with \pset fieldsep

From
Bruce Momjian
Date:
Bath, David wrote:
> Folks
> 
> If I want psql to generate CSV files fully-double-quoted I can
> use pset as follows
> psql> \pset fieldsep ","
> However this does not put a quote before the first field and
> after the last, so each row comes out as
> 1234","blahblah","sdfgsg","foo","bar
> 
> Is there a way to use psql to give rows like
> 1234","blahblah","sdfgsg","foo","bar
> or is there another tool I should use.
> 
> COPY doesn't deal with views and restrictions well so it is
> not an option.

Just do:
CREATE TEMP TABLE x AS SELECT * FROM view;COPY ... x

Sorry that is the only way to dump CVS properly.

--  Bruce Momjian   http://candle.pha.pa.us SRA OSS, Inc.   http://www.sraoss.com
 + If your life is a hard drive, Christ can be your backup. +


PostgreSQL Handling of Special Characters

From
"Christian Paul B. Cosinas"
Date:
Hi to all,

I am currently using PostgreSQL 8.0.3
My Database uses SQL_ASCII encoding.

I have a program in Visual Basic that connects to PostgreSQL using ODBC
Connection through File DSN.

I insert a text in one fields of my table. 
For example, I insert a special character to that field. 
Let's say a character of 150 ASCII code. Which looks like a hypen.

When I retrieve the value of that field it gives me a question mark
character instead of that  150 ASCII code character.

What could be the possible reason of this?
Is there a setting in the database configuration to understand that such
special characters?

Please..I really need an answer to this..

Thanks You.


I choose Polesoft Lockspam to fight spam, and you?
http://www.polesoft.com/refer.html    



Re: PostgreSQL Handling of Special Characters

From
"Markus Bertheau"
Date:
2006/3/20, Christian Paul B. Cosinas <cpc@cybees.com>:

> Let's say a character of 150 ASCII code. Which looks like a hypen.
>
> When I retrieve the value of that field it gives me a question mark
> character instead of that  150 ASCII code character.
>
> What could be the possible reason of this?

Perhaps the ODBC driver thinks SQL_ASCII means ASCII and therefore
discards all bytes > 127. On PostgreSQL SQL_ASCII really means
SQL_ANYTHING, so to speak. Try to use for the database the encoding
you really use.

Markus Bertheau


Re: PostgreSQL Handling of Special Characters

From
"Christian Paul B. Cosinas"
Date:
Hi Markus,

Can you please elaborate more on this.
I'm really lost.
Thank You So Much.

-----Original Message-----
From: Markus Bertheau [mailto:mbertheau.pg@googlemail.com]
Sent: Sunday, March 19, 2006 8:09 PM
To: Christian Paul B. Cosinas
Cc: pgsql-sql@postgresql.org
Subject: Re: [SQL] PostgreSQL Handling of Special Characters

2006/3/20, Christian Paul B. Cosinas <cpc@cybees.com>:

> Let's say a character of 150 ASCII code. Which looks like a hypen.
>
> When I retrieve the value of that field it gives me a question mark 
> character instead of that  150 ASCII code character.
>
> What could be the possible reason of this?

Perhaps the ODBC driver thinks SQL_ASCII means ASCII and therefore discards
all bytes > 127. On PostgreSQL SQL_ASCII really means SQL_ANYTHING, so to
speak. Try to use for the database the encoding you really use.

Markus Bertheau


I choose Polesoft Lockspam to fight spam, and you?
http://www.polesoft.com/refer.html    



Re: PostgreSQL Handling of Special Characters

From
"Eugene E."
Date:
Christian Paul B. Cosinas wrote:
> Hi Markus,
> 
> Can you please elaborate more on this.
> I'm really lost.

Be sure that postgresql ITSELF is handling all chars transparently 
except ZEROES.

Look for the error in the media layer


Re: PostgreSQL Handling of Special Characters

From
Peter Eisentraut
Date:
Christian Paul B. Cosinas wrote:
> My Database uses SQL_ASCII encoding.

Do yourself a favor and use something else.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: PostgreSQL Handling of Special Characters

From
PFC
Date:
>> My Database uses SQL_ASCII encoding.
I just received an email with all accented characters destroyed. UNICODE
should be the default for anything in 2006.


have you feel anything when you read this ?

From
"Eugene E."
Date:
http://dev.mysql.com/doc/refman/5.0/en/news-5-0-19.html
--- cut ---
mysql no longer terminates data value display when it encounters a NUL 
byte. Instead, it displays NUL bytes as spaces. (Bug #16859)
--- cut ---


Re: have you feel anything when you read this ?

From
PFC
Date:
> have you feel anything when you read this ?
Business as usual...
It's more fun to grep "crash" on this page, which gets about 27 results...


Re: have you feel anything when you read this ?

From
"Eugene E."
Date:
PFC wrote:
> 
>> have you feel anything when you read this ?
> 
> 
>     Business as usual...
> 
>     It's more fun to grep "crash" on this page, which gets about 27 
> results...

i am not trying to fight against or for any brandname: not Mesql nor 
postgres.

just sed 's/MySQL/SomeDBMS/g' and concentrate on the fact:
that even so stupid DBMS handling NULs properly. :-)


Re: have you feel anything when you read this ?

From
Peter Eisentraut
Date:
Eugene E. wrote:
> PFC wrote:
> >> have you feel anything when you read this ?
> >
> >     Business as usual...
> >
> >     It's more fun to grep "crash" on this page, which gets about 27
> > results...
>
> i am not trying to fight against or for any brandname: not Mesql nor
> postgres.
>
> just sed 's/MySQL/SomeDBMS/g' and concentrate on the fact:
> that even so stupid DBMS handling NULs properly. :-)

So printing a space is "properly"?  Curious ...

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: have you feel anything when you read this ?

From
"Eugene E."
Date:
Peter Eisentraut wrote:
> Eugene E. wrote:
> 
>>PFC wrote:
>>
>>>>have you feel anything when you read this ?
>>>
>>>    Business as usual...
>>>
>>>    It's more fun to grep "crash" on this page, which gets about 27
>>>results...
>>
>>i am not trying to fight against or for any brandname: not Mesql nor
>>postgres.
>>
>>just sed 's/MySQL/SomeDBMS/g' and concentrate on the fact:
>>that even so stupid DBMS handling NULs properly. :-)
> 
> 
> So printing a space is "properly"?  Curious ...
> 

you may decide to print something else, aint'you ?
BUT
if they print them then they at least OUTPUT them.


Re: have you feel anything when you read this ?

From
Peter Eisentraut
Date:
Eugene E. wrote:
> you may decide to print something else, aint'you ?
> BUT
> if they print them then they at least OUTPUT them.

I'm not sure what you are getting at here.  The only data type in 
PostgreSQL that has a notion of null bytes is bytea, and bytea prints 
out null bytes in unambigious form.  Note that printing out a space 
will lose the null byte on restore, so that solution does not seem 
satisfactory.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: have you feel anything when you read this ?

From
"Eugene E."
Date:
Peter Eisentraut wrote:
> Eugene E. wrote:
> 
>>you may decide to print something else, aint'you ?
>>BUT
>>if they print them then they at least OUTPUT them.
> 
> 
> I'm not sure what you are getting at here.  The only data type in 
> PostgreSQL that has a notion of null bytes is bytea, and bytea prints 
> out null bytes in unambigious form.

the bytea does not output NULs at all.
don't mock me.


>  Note that printing out a space 
> will lose the null byte on restore,

ok, if you or they or me miscall OUTPUT "the printing"
then "print" NUL-byte itself to preserve it on restore.




Re: have you feel anything when you read this ?

From
Peter Eisentraut
Date:
Eugene E. wrote:
> the bytea does not output NULs at all.
> don't mock me.

peter=# create table test (a bytea);
CREATE TABLE
peter=# insert into test values ('a\\000b');
INSERT 0 1
peter=# select * from test;  a
--------a\000b

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: have you feel anything when you read this ?

From
Achilleus Mantzios
Date:
O Peter Eisentraut έγραψε στις Mar 20, 2006 :

> Eugene E. wrote:
> > the bytea does not output NULs at all.
> > don't mock me.
> 
> peter=# create table test (a bytea);
> CREATE TABLE
> peter=# insert into test values ('a\\000b');
> INSERT 0 1
> peter=# select * from test;
>    a
> --------
>  a\000b

Just did

dynacom=# SELECT '\150\145\154\154\157'::text;text
-------hello
(1 row)

dynacom=#
dynacom=#
dynacom=# SELECT '\000\150\145\154\154\157'::text;text
------

(1 row)

dynacom=#         

Oops!

> 
> 

-- 
-Achilleus



Re: have you feel anything when you read this ?

From
Alvaro Herrera
Date:
Achilleus Mantzios wrote:

> dynacom=# SELECT '\000\150\145\154\154\157'::text;
>  text
> ------
> 
> (1 row)
> 
> dynacom=#         
> 
> Oops!

text is not bytea.

alvherre=# SELECT $$\000\150\145\154\154\157$$::bytea;  bytea   
-----------\000hello
(1 fila)

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: have you feel anything when you read this ?

From
"Eugene E."
Date:
Peter Eisentraut wrote:
> Eugene E. wrote:
> 
>>the bytea does not output NULs at all.
>>don't mock me.
> 
> 
> peter=# create table test (a bytea);
> CREATE TABLE
> peter=# insert into test values ('a\\000b');
> INSERT 0 1
> peter=# select * from test;
>    a
> --------
>  a\000b

are you kidding ?
where is NUL-byte in "a\000b" ???



Re: have you feel anything when you read this ?

From
Achilleus Mantzios
Date:
O Alvaro Herrera έγραψε στις Mar 20, 2006 :

> Achilleus Mantzios wrote:
> 
> > dynacom=# SELECT '\000\150\145\154\154\157'::text;
> >  text
> > ------
> > 
> > (1 row)
> > 
> > dynacom=#         
> > 
> > Oops!
> 
> text is not bytea.
> 
> alvherre=# SELECT $$\000\150\145\154\154\157$$::bytea;
>    bytea   
> -----------
>  \000hello
> (1 fila)

Sure, but we are trying to reproduce the mysql phaenomenon right? :)

> 
> 

-- 
-Achilleus



Re: have you feel anything when you read this ?

From
Alvaro Herrera
Date:
Achilleus Mantzios wrote:
> O Alvaro Herrera ?????? ???? Mar 20, 2006 :
> 
> > text is not bytea.
> > 
> > alvherre=# SELECT $$\000\150\145\154\154\157$$::bytea;
> >    bytea   
> > -----------
> >  \000hello
> > (1 fila)
> 
> Sure, but we are trying to reproduce the mysql phaenomenon right? :)

I don't really know what we are doing in this thread.  This Eugene E.
seems to be only trolling and the rest of us are feeding him.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: have you feel anything when you read this ?

From
Achilleus Mantzios
Date:
O Eugene E. έγραψε στις Mar 20, 2006 :

> Peter Eisentraut wrote:
> > Eugene E. wrote:
> > 
> >>the bytea does not output NULs at all.
> >>don't mock me.
> > 
> > 
> > peter=# create table test (a bytea);
> > CREATE TABLE
> > peter=# insert into test values ('a\\000b');
> > INSERT 0 1
> > peter=# select * from test;
> >    a
> > --------
> >  a\000b
> 
> are you kidding ?
> where is NUL-byte in "a\000b" ???

Null byte is a byte of value zero, 
and allow me to say that the \000 in "a\000b" is exactly this.
A NULL value is commonly used in C to terminate a pointer's data.


> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
> 
>                http://www.postgresql.org/docs/faq
> 

-- 
-Achilleus



Re: have you feel anything when you read this ?

From
"Eugene E."
Date:
Alvaro Herrera wrote:
> Achilleus Mantzios wrote:
> 
> 
>>dynacom=# SELECT '\000\150\145\154\154\157'::text;
>> text
>>------
>>
>>(1 row)
>>
>>dynacom=#         
>>
>>Oops!
> 
> 
> text is not bytea.

source says:

typedef text varlena;
typedef bytea varlena;

:-)


Re: have you feel anything when you read this ?

From
Alvaro Herrera
Date:
Eugene E. wrote:
> Alvaro Herrera wrote:

> >text is not bytea.
> 
> source says:
> 
> typedef text varlena;
> typedef bytea varlena;

This means that as far as the C type system is concerned, both bytea and
text are treated as "struct varlena".  It doesn't mean that they are
processed by the same input/output functions, which they aren't.

NUL bytes are preserved in bytea, and used as terminators in text.  I
don't find this surprising at all, do you?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: have you feel anything when you read this ?

From
"Eugene E."
Date:
Achilleus Mantzios wrote:

>>>   a
>>>--------
>>> a\000b
>>
>>are you kidding ?
>>where is NUL-byte in "a\000b" ???
> 
> 
> Null byte is a byte of value zero, 
> and allow me to say that the \000 in "a\000b" is exactly this.

if ("\0"=="\\000")  printf("congratulations!!!");


NOTE:
I am not care about a _display_ NUL-byte on a screen !
weither it'll be "space" or "some escape sequence"

the problem is: you'll get this four byte sequence '\000' _instead_ of 
NUL-byte anyway.


Re: have you feel anything when you read this ?

From
"Eugene E."
Date:
Alvaro Herrera wrote:
> Eugene E. wrote:
> 
>>Alvaro Herrera wrote:
> 
> 
>>>text is not bytea.
>>
>>source says:
>>
>>typedef text varlena;
>>typedef bytea varlena;
> 
> 
> This means that as far as the C type system is concerned, both bytea and
> text are treated as "struct varlena".  It doesn't mean that they are
> processed by the same input/output functions, which they aren't.
> 
> NUL bytes are preserved in bytea, and used as terminators in text.  I
> don't find this surprising at all, do you?

I do found surprising it.
since both (text and bytea) I/O functions has CSTRING arguments and 
resut type. - this only means a user should perform some unescaping on 
the bytea value he got. THE SAME THING he should do with a string value 
if he decide to use type text and to escape NUL-bytes before input.
then what a difference bitween those types except strlen() ?


Re: have you feel anything when you read this ?

From
Peter Eisentraut
Date:
Eugene E. wrote:
> the problem is: you'll get this four byte sequence '\000' _instead_
> of NUL-byte anyway.

What you seem to be missing is that PostgreSQL data can be represented 
in textual and in binary form.  What you in psql is the textual form.  
If you want the binary form you need to select it.  Then you can pass 
the exact bytes back and forth.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: have you feel anything when you read this ?

From
Peter Eisentraut
Date:
Eugene E. wrote:
> input. then what a difference bitween those types except strlen() ?

bytea does not consider character set encodings and locales, and it 
handles null bytes.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: have you feel anything when you read this ?

From
PFC
Date:
> the problem is: you'll get this four byte sequence '\000' _instead_ of  
> NUL-byte anyway.


http://www.postgresql.org/docs/8.1/interactive/datatype-binary.html says :

"A binary string is a sequence of octets (or bytes). Binary strings are  
distinguished from character strings by two characteristics: First, binary  
strings specifically allow storing octets of value zero and other  
"non-printable" octets (usually, octets outside the range 32 to 126).  
Character strings disallow zero octets, and also disallow any other octet  
values and sequences of octet values that are invalid according to the  
database's selected character set encoding. Second, operations on binary  
strings process the actual bytes, whereas the processing of character  
strings depends on locale settings. In short, binary strings are  
appropriate for storing data that the programmer thinks of as "raw bytes",  
whereas character strings are appropriate for storing text."

That's the whole point of escaping, so that data generated by COPY, for  
instance, does not include any funky characters, including the \0 (zero)  
character, so that you can use any standard tool on it, including grep...

I LIKE the fact that TEXT refuses invalid characters. It helps find bugs  
in my applications, like when I forget to process some 8-bit string before  
inserting it in my database which uses UNICODE. I definitely prefer an  
error than finding a month later that half my data has had all its  
accented characters screwed up.

in psql, you have to use the escaped syntax :

SELECT length('\\000'::BYTEA), length('\\001'::BYTEA),  
length('\\000'::TEXT), length('\\001'::TEXT); length | length | length | length
--------+--------+--------+--------      1 |      1 |      4 |      4

Your client library should take care of escaping and de-escaping. Here, in  
python :

>>> cursor.execute( "SELECT %s,%s::BYTEA", ("this is a normal string",  
>>> psycopg2.Binary( ">\x00< this is a string with a zero byte" ),) )
>>> r = cursor.fetchone()
>>> print r
['this is a normal string', <read-only buffer for 0x2aaaaab32210, size 37,  
offset 0 at 0x2aaaace27c70>]
>>> print str(r[1])
> < this is a string with a zero byte
>>> print repr(str(r[1]))
'>\x00< this is a string with a zero byte'
>>> ord(r[1][1])
0


Note : \x00 is python's escaping for the null byte


Re: have you feel anything when you read this ?

From
"Eugene E."
Date:
I wrote:

>> the problem is: you'll get this four byte sequence '\000' _instead_ 
>> of  NUL-byte anyway.

You wrote:

> Your client library should take care of escaping and de-escaping. 

We both agree as you see.

Then i am asking:
WHY should a client take care of de-escaping ? Why not to get his data 
unchanged ?

If i forced to de-escape when using bytea then i may use text with the 
same escaping/de-escaping AS WELL.


Re: have you feel anything when you read this ?

From
Rod Taylor
Date:
On Mon, 2006-03-20 at 17:53 +0300, Eugene E. wrote:
> I wrote:
> 
> >> the problem is: you'll get this four byte sequence '\000' _instead_ 
> >> of  NUL-byte anyway.
> 
> You wrote:
> 
> > Your client library should take care of escaping and de-escaping. 
> 
> We both agree as you see.
> 
> Then i am asking:
> WHY should a client take care of de-escaping ? Why not to get his data 
> unchanged ?

Request the data be delivered to you in binary format instead of text
format (see  PQexecParams, PQexecPrepared, etc).

When you request text format it requires some escaping to occur for
obvious reasons, but you don't need to worry about the length of the
data.

When you request binary escaping is not required but you do need to
worry about the length of the data.

Pick the method that suits you best.
-- 



Re: have you feel anything when you read this ?

From
Scott Marlowe
Date:
On Mon, 2006-03-20 at 02:06, Eugene E. wrote:
> http://dev.mysql.com/doc/refman/5.0/en/news-5-0-19.html
> --- cut ---
> mysql no longer terminates data value display when it encounters a NUL 
> byte. Instead, it displays NUL bytes as spaces. (Bug #16859)
> --- cut ---

Everyone here realizes that this is a mysql CLIENT bug, not server side.

i.e. if you're writing an application and request that binary text
string, you'll get it with nuls in it, just like you put in.

Now, I'm not sure that turning nulls into spaces is the best way to
handle this in the client.  In fact, I'm sure it's not.  But this is not
a server bug, it's a client bug.

Just FYI.


Re: have you feel anything when you read this ?

From
PFC
Date:
> I wrote:
>
>>> the problem is: you'll get this four byte sequence '\000' _instead_  
>>> of  NUL-byte anyway.
>
> You wrote:
>
>> Your client library should take care of escaping and de-escaping.
>
> We both agree as you see.
>
> Then i am asking:
> WHY should a client take care of de-escaping ? Why not to get his data  
> unchanged ?
I can understand why you say that for something as simple as a BYTEA, but  
if the value to be passed to the client is an ARRAY of geometric types or  
something, you gonna need an open, platform-agnostic exchange format  
between the way postgres internally represents it and the way the client  
represents it (in my case, a python list containing instances of python  
classes representing boxes, etc, it'll be different for every language).
Exporting data from postgres in binary is only useful to C programmers  
who can import the required struct definitions, and you still have to  
manage the format, it's just that you walk struct's instead of unescaping  
\'s


Re: have you feel anything when you read this ?

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> text is not bytea.

Indeed.  I wonder whether we shouldn't tweak the SQL string literal
parser to reject \000, because AFAICS that isn't going to do anything
useful for any datatype, and it leads to what are at best questionable
results.  (bytea's processing of \000 happens somewhere further
downstream, and wouldn't be affected.)
        regards, tom lane


Re: have you feel anything when you read this ?

From
"Eugene E."
Date:
PFC wrote:
> 
>> I wrote:
>>
>>>> the problem is: you'll get this four byte sequence '\000' _instead_  
>>>> of  NUL-byte anyway.
>>
>>
>> You wrote:
>>
>>> Your client library should take care of escaping and de-escaping.
>>
>>
>> We both agree as you see.
>>
>> Then i am asking:
>> WHY should a client take care of de-escaping ? Why not to get his 
>> data  unchanged ?
> 
> 
>     I can understand why you say that for something as simple as a 
> BYTEA, but  if the value to be passed to the client is an ARRAY of 
> geometric types or  something

Who said "array" ? I just want to restore _one byte_ from bytea storage. 
that's all.

>     Exporting data from postgres in binary is only useful to C 
> programmers

Serious judgment ! Extremely seriuos...
nonetheless C programmers could not do this.




Re: have you feel anything when you read this ?

From
"Eugene E."
Date:
Scott Marlowe wrote:
> On Mon, 2006-03-20 at 02:06, Eugene E. wrote:
> 
>>http://dev.mysql.com/doc/refman/5.0/en/news-5-0-19.html
>>--- cut ---
>>mysql no longer terminates data value display when it encounters a NUL 
>>byte. Instead, it displays NUL bytes as spaces. (Bug #16859)
>>--- cut ---
> 
> 
> Everyone here realizes that this is a mysql CLIENT bug, not server side.
> 
> i.e. if you're writing an application and request that binary text
> string, you'll get it with nuls in it, just like you put in.
> 
> Now, I'm not sure that turning nulls into spaces is the best way to
> handle this in the client.  In fact, I'm sure it's not.  But this is not
> a server bug, it's a client bug.

I was not sure about MySQL, thank you for your explaination.
This ensures me that even MySQL server handles NUL-bytes properly
regardless to client problems.



Re: have you feel anything when you read this ?

From
Bruno Wolff III
Date:
On Mon, Mar 20, 2006 at 17:40:03 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
> Indeed.  I wonder whether we shouldn't tweak the SQL string literal
> parser to reject \000, because AFAICS that isn't going to do anything
> useful for any datatype, and it leads to what are at best questionable
> results.  (bytea's processing of \000 happens somewhere further
> downstream, and wouldn't be affected.)

I think that makes sense. That character is effectively not allowed in text, so
it shouldn't be accepted as input.


Re: have you feel anything when you read this ?

From
"Eugene E."
Date:
Peter Eisentraut wrote:
> Eugene E. wrote:
> 
>>the problem is: you'll get this four byte sequence '\000' _instead_
>>of NUL-byte anyway.
> 
> 
> What you seem to be missing is that PostgreSQL data can be represented 
> in textual and in binary form.  What you in psql is the textual form.  
> If you want the binary form you need to select it.  Then you can pass 
> the exact bytes back and forth.

your sentence is not true.
I can not select exact bytes even if i use BYTEA type

the folloiwing tiny C-program shows this pretty clear

#include <stdlib.h>
#include <stdio.h>
#include "libpq-fe.h"



int
main (void) {  PGconn   * conn;  PGresult * res;  char     * val;  char     *   l;  int        len;


  conn = PQconnectdb("user=scott password=tiger dbname=test_db");
  PQexec(conn, "CREATE TABLE t (a BYTEA)");  PQexec(conn, "INSERT INTO t VALUES ('ab\\\\000cd')");

  res = PQexec(conn, "SELECT a,length(a) FROM t");  val = PQgetvalue(res,0,0);  l   = PQgetvalue(res,0,1);  len =
PQgetlength(res,0,0); printf("what_we_retrive='%s' its_value_length=%i but 
 
orig_length=%s\n",val,len,l);
  PQclear(res);  PQfinish(conn);
  return 0; }


Re: have you feel anything when you read this ?

From
Stephan Szabo
Date:
On Fri, 31 Mar 2006, Eugene E. wrote:

> Peter Eisentraut wrote:
> > Eugene E. wrote:
> >
> >>the problem is: you'll get this four byte sequence '\000' _instead_
> >>of NUL-byte anyway.
> >
> >
> > What you seem to be missing is that PostgreSQL data can be represented
> > in textual and in binary form.  What you in psql is the textual form.
> > If you want the binary form you need to select it.  Then you can pass
> > the exact bytes back and forth.
>
> your sentence is not true.
> I can not select exact bytes even if i use BYTEA type

No, that is still using the textual form.  If you use PQexecParams and set
the last argument to show you want binary data, you should get binary
data.


Re: have you feel anything when you read this ?

From
"Eugene E."
Date:
Stephan Szabo wrote:
> On Fri, 31 Mar 2006, Eugene E. wrote:
> 
> 
>>Peter Eisentraut wrote:
>>
>>>Eugene E. wrote:
>>>
>>>
>>>>the problem is: you'll get this four byte sequence '\000' _instead_
>>>>of NUL-byte anyway.
>>>
>>>
>>>What you seem to be missing is that PostgreSQL data can be represented
>>>in textual and in binary form.  What you in psql is the textual form.
>>>If you want the binary form you need to select it.  Then you can pass
>>>the exact bytes back and forth.
>>
>>your sentence is not true.
>>I can not select exact bytes even if i use BYTEA type
> 
> 
> No, that is still using the textual form.  If you use PQexecParams and set
> the last argument to show you want binary data, you should get binary
> data.

Documentation says:
===
PQexecParams
    Submits a command to the server and waits for the result, with the 
ability to pass parameters separately from the SQL command text.
===

How should i use this func to change so-called "textual form" of a 
select-result to so-called "binary form" ?


Re: have you feel anything when you read this ?

From
Michael Glaesemann
Date:
On Apr 4, 2006, at 19:30 , Eugene E. wrote:

> Stephan Szabo wrote:
>> On Fri, 31 Mar 2006, Eugene E. wrote:
>>> Peter Eisentraut wrote:
>>>
>> No, that is still using the textual form.  If you use PQexecParams  
>> and set
>> the last argument to show you want binary data, you should get binary
>> data.
>
> Documentation says:
> ===
> PQexecParams
>
>     Submits a command to the server and waits for the result, with  
> the ability to pass parameters separately from the SQL command text.
> ===
>
> How should i use this func to change so-called "textual form" of a  
> select-result to so-called "binary form" ?

It looks like you quoted

http://www.postgresql.org/docs/8.1/interactive/libpq-exec.html#AEN25123

If you read the rest of the explanation of PQexecParams, I think  
you'll find the answer you're looking for. (Please note I don't have  
any experience using libpq. I'm just trying to help you find an  
answer to your question.)

The usage block includes the parameters PQexecParams accepts.

> PGresult *PQexecParams(PGconn *conn,
>                        const char *command,
>                        int nParams,
>                        const Oid *paramTypes,
>                        const char * const *paramValues,
>                        const int *paramLengths,
>                        const int *paramFormats,
>                        int resultFormat);

That last parameter (resultFormat) looks promising.

The next sentence gives query result formats:

> PQexecParams is like PQexec, but offers additional functionality:  
> parameter values can be specified separately from the command  
> string proper, and query results can be requested in either text or  
> binary format.

And in the next paragraph:

> ... resultFormat is zero to obtain results in text format, or one  
> to obtain results in binary format. ...

Hope this helps.

Michael Glaesemann
grzm myrealbox com





Re: have you feel anything when you read this ?

From
Stephan Szabo
Date:
On Tue, 4 Apr 2006, Eugene E. wrote:

> Stephan Szabo wrote:
> > On Fri, 31 Mar 2006, Eugene E. wrote:
> >
> >
> >>Peter Eisentraut wrote:
> >>
> >>>Eugene E. wrote:
> >>>
> >>>
> >>>>the problem is: you'll get this four byte sequence '\000' _instead_
> >>>>of NUL-byte anyway.
> >>>
> >>>
> >>>What you seem to be missing is that PostgreSQL data can be represented
> >>>in textual and in binary form.  What you in psql is the textual form.
> >>>If you want the binary form you need to select it.  Then you can pass
> >>>the exact bytes back and forth.
> >>
> >>your sentence is not true.
> >>I can not select exact bytes even if i use BYTEA type
> >
> >
> > No, that is still using the textual form.  If you use PQexecParams and set
> > the last argument to show you want binary data, you should get binary
> > data.
>
> Documentation says:
> ===
> PQexecParams
>
>      Submits a command to the server and waits for the result, with the
> ability to pass parameters separately from the SQL command text.
> ===
>
> How should i use this func to change so-called "textual form" of a
> select-result to so-called "binary form" ?

From the 8.1 docs (although I believe this applies back to 7.4):

PQexecParams
   Submits a command to the server and waits for the result, with the
ability to pass parameters separately from the SQL command text.
   PGresult *PQexecParams(PGconn *conn,                          const char *command,                          int
nParams,                         const Oid *paramTypes,                          const char * const *paramValues,
                  const int *paramLengths,                          const int *paramFormats,
intresultFormat);
 
   PQexecParams is like PQexec, but offers additional functionality:
parameter values can be specified separately from the command string
proper, and query results can be requested in either text or binary
format. PQexecParams is supported only in protocol 3.0 and later
connections; it will fail when using protocol 2.0.
   If parameters are used, they are referred to in the command string as
$1, $2, etc. nParams is the number of parameters supplied; it is the
length of the arrays paramTypes[], paramValues[], paramLengths[], and
paramFormats[]. (The array pointers may be NULL when nParams is zero.)
paramTypes[] specifies, by OID, the data types to be assigned to the
parameter symbols. If paramTypes is NULL, or any particular element in the
array is zero, the server assigns a data type to the parameter symbol in
the same way it would do for an untyped literal string. paramValues[]
specifies the actual values of the parameters. A null pointer in this
array means the corresponding parameter is null; otherwise the pointer
points to a zero-terminated text string (for text format) or binary data
in the format expected by the server (for binary format). paramLengths[]
specifies the actual data lengths of binary-format parameters. It is
ignored for null parameters and text-format parameters. The array pointer
may be null when there are no binary parameters. paramFormats[] specifies
whether parameters are text (put a zero in the array) or binary (put a one
in the array). If the array pointer is null then all parameters are
presumed to be text. resultFormat is zero to obtain results in text
format, or one to obtain results in binary format. (There is not currently
a provision to obtain different result columns in different formats,
although that is possible in the underlying protocol.)

---

Note the last argument to the function, and the last couple of sentences
in the above describe how to use resultFormat. It'd be nice if we could
get an interface which allowed mixing, but that's secondary to can we get
binary data or not.

Here's a similar app to the one you sent which for me seemingly gives the
binary data:


#include <stdlib.h>
#include <stdio.h>
#include "libpq-fe.h"



int
main (void) {  PGconn   * conn;  PGresult * res;  char     * val;  int        i;  int        len;


  conn = PQconnectdb("user=sszabo password=a dbname=sszabo");
  PQexec(conn, "CREATE TABLE t (a BYTEA)");  PQexec(conn, "INSERT INTO t VALUES ('ab\\\\000cd')");

  res = PQexecParams(conn, "SELECT a FROM t", 0, NULL, NULL,   NULL, NULL, 1);  val = PQgetvalue(res,0,0);  len =
PQgetlength(res,0,0); printf("what_we_retrive='%s' its_value_length=%i\n",val,len);
 
  for (i=0; i < len; ++i) {   printf("Position %d is %d (%c)\n", i, val[i], val[i]);  }
  PQclear(res);  PQfinish(conn);
  return 0; }



Re: have you feel anything when you read this ?

From
"Eugene E."
Date:
Stephan Szabo wrote:
> On Tue, 4 Apr 2006, Eugene E. wrote:
> 
> 
>>Stephan Szabo wrote:
>>
>>>On Fri, 31 Mar 2006, Eugene E. wrote:
>>>
>>>
>>>
>>>>Peter Eisentraut wrote:
>>>>
>>>>
>>>>>Eugene E. wrote:
>>>>>
>>>>>
>>>>>
>>>>>>the problem is: you'll get this four byte sequence '\000' _instead_
>>>>>>of NUL-byte anyway.
>>>>>
>>>>>
>>>>>What you seem to be missing is that PostgreSQL data can be represented
>>>>>in textual and in binary form.  What you in psql is the textual form.
>>>>>If you want the binary form you need to select it.  Then you can pass
>>>>>the exact bytes back and forth.
>>>>
>>>>your sentence is not true.
>>>>I can not select exact bytes even if i use BYTEA type
>>>
>>>
>>>No, that is still using the textual form.  If you use PQexecParams and set
>>>the last argument to show you want binary data, you should get binary
>>>data.

ok
then i am using PQexecParams

the following tiny program shows a wonderful lameness...

#include <stdlib.h>
#include <stdio.h>
#include "libpq-fe.h"



int
main (void) {  PGconn   * conn;  PGresult * res;  char     * val;  char     *   l;  int        len;


  conn = PQconnectdb("user=scott password=tiger name=test_db");
  PQexec(conn, "CREATE TABLE t (a BYTEA)");  PQexec(conn, "INSERT INTO t VALUES ('ab\\\\000cd')");

  res = PQexecParams(conn, "SELECT a,length(a) FROM t", 0, NULL, NULL, 
NULL,  NULL, 1);  val = PQgetvalue(res,0,0);  l   = PQgetvalue(res,0,1);  len = PQgetlength(res,0,0);
printf("what_we_retrive='screened'its_value_length=%i but 
 
orig_length=%s\n",len,l);
  PQclear(res);  PQfinish(conn);
  return 0; }


Re: have you feel anything when you read this ?

From
Stephan Szabo
Date:
On Wed, 5 Apr 2006, Eugene E. wrote:

> Stephan Szabo wrote:
> > On Tue, 4 Apr 2006, Eugene E. wrote:
> >
> >
> >>Stephan Szabo wrote:
> >>
> >>>On Fri, 31 Mar 2006, Eugene E. wrote:
> >>>
> >>>
> >>>
> >>>>Peter Eisentraut wrote:
> >>>>
> >>>>
> >>>>>Eugene E. wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>>>the problem is: you'll get this four byte sequence '\000' _instead_
> >>>>>>of NUL-byte anyway.
> >>>>>
> >>>>>
> >>>>>What you seem to be missing is that PostgreSQL data can be represented
> >>>>>in textual and in binary form.  What you in psql is the textual form.
> >>>>>If you want the binary form you need to select it.  Then you can pass
> >>>>>the exact bytes back and forth.
> >>>>
> >>>>your sentence is not true.
> >>>>I can not select exact bytes even if i use BYTEA type
> >>>
> >>>
> >>>No, that is still using the textual form.  If you use PQexecParams and set
> >>>the last argument to show you want binary data, you should get binary
> >>>data.
>
> ok
> then i am using PQexecParams
>
> the following tiny program shows a wonderful lameness...

What lameness? The fact that you're trying to use a binary (network order
maybe) integer as a string? That's not the fault of PQexecParams but
of the code calling it.

It'd be nice for ease of use to be able to say, give me this
column (the integer) as a string and this column (the bytea) as binary,
especially as the commentary implies that the protocol supports it. But as
I said before, that's a separate discussion from whether or not Peter's
claim that it's possible to get binary data is false.


Re: have you feel anything when you read this ?

From
"Eugene E."
Date:
Stephan Szabo wrote:
> On Wed, 5 Apr 2006, Eugene E. wrote:
> 
> 
>>Stephan Szabo wrote:
>>
>>>On Tue, 4 Apr 2006, Eugene E. wrote:
>>>
>>>
>>>
>>>>Stephan Szabo wrote:
>>>>
>>>>
>>>>>On Fri, 31 Mar 2006, Eugene E. wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>Peter Eisentraut wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>>Eugene E. wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>the problem is: you'll get this four byte sequence '\000' _instead_
>>>>>>>>of NUL-byte anyway.
>>>>>>>
>>>>>>>
>>>>>>>What you seem to be missing is that PostgreSQL data can be represented
>>>>>>>in textual and in binary form.  What you in psql is the textual form.
>>>>>>>If you want the binary form you need to select it.  Then you can pass
>>>>>>>the exact bytes back and forth.
>>>>>>
>>>>>>your sentence is not true.
>>>>>>I can not select exact bytes even if i use BYTEA type
>>>>>
>>>>>
>>>>>No, that is still using the textual form.  If you use PQexecParams and set
>>>>>the last argument to show you want binary data, you should get binary
>>>>>data.
>>
>>ok
>>then i am using PQexecParams
>>
>>the following tiny program shows a wonderful lameness...
> 
> 
> What lameness? The fact that you're trying to use a binary (network order
> maybe) integer as a string? That's not the fault of PQexecParams but
> of the code calling it.

You're right !
That's is not a fault of PQexecParams at all. That's the fault of its 
design. (I pretty know why an integer has been not displayed, but why 
they designed this function that way ? i do not know)

> It'd be nice for ease of use to be able to say, give me this
> column (the integer) as a string and this column (the bytea) as binary,

I always want to retrieve TEXT (and mostly INT) in TEXTUAL-FORM, and 
BYTEA always in BINARY-FORM. (at least by defaul)

WHY SHOULD I RETRIVE A SINGLE BYTEA FIELD IN A SEPARATE QUERY ???

Look: if i define a field as of type BYTEA, doesn't it mean the field 
tends to store binary data ? If does, then WHY they (by default) convert 
its value to TEXTUAL-FORM which is not needed by default -- i already 
inform the server: "i want some BINARY to in and out" when i have 
choosed the type BYTEA


Re: have you feel anything when you read this ?

From
Stephan Szabo
Date:
On Thu, 6 Apr 2006, Eugene E. wrote:

> Stephan Szabo wrote:
> > On Wed, 5 Apr 2006, Eugene E. wrote:
> >
> >
> >>Stephan Szabo wrote:
> >>
> >>>On Tue, 4 Apr 2006, Eugene E. wrote:
> >>>
> >>>
> >>>
> >>>>Stephan Szabo wrote:
> >>>>
> >>>>
> >>>>>On Fri, 31 Mar 2006, Eugene E. wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>>Peter Eisentraut wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>Eugene E. wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>>the problem is: you'll get this four byte sequence '\000' _instead_
> >>>>>>>>of NUL-byte anyway.
> >>>>>>>
> >>>>>>>
> >>>>>>>What you seem to be missing is that PostgreSQL data can be represented
> >>>>>>>in textual and in binary form.  What you in psql is the textual form.
> >>>>>>>If you want the binary form you need to select it.  Then you can pass
> >>>>>>>the exact bytes back and forth.
> >>>>>>
> >>>>>>your sentence is not true.
> >>>>>>I can not select exact bytes even if i use BYTEA type
> >>>>>
> >>>>>
> >>>>>No, that is still using the textual form.  If you use PQexecParams and set
> >>>>>the last argument to show you want binary data, you should get binary
> >>>>>data.
> >>
> >>ok
> >>then i am using PQexecParams
> >>
> >>the following tiny program shows a wonderful lameness...
> >
> >
> > What lameness? The fact that you're trying to use a binary (network order
> > maybe) integer as a string? That's not the fault of PQexecParams but
> > of the code calling it.
>
> You're right !
> That's is not a fault of PQexecParams at all. That's the fault of its
> design.

No, I'd argue in this case that it was the fault of a programmer not
paying enough attention/not thinking through what the api doc says.

>(I pretty know why an integer has been not displayed, but why
> they designed this function that way ? i do not know)

What would you expect it to do given a single result format argument?

If you want to propose a new function (set of functions) that have
different behavior, make a coherent proposal. Statements like it should
do X because I want it to aren't coherent proposals.  Expect to get asked
why bytea is special -- why should integer be passed as a string given
that you may often want to do operations on the value which expect it as
an actual number not the string representation of a number.


Re: have you feel anything when you read this ?

From
"Eugene E."
Date:
Praescriptum:
If my english is ugly and something is written unclear, please 
complaint, and i'll try to rephrase.
anyway i am trying to be understood.


I said
>> WHY SHOULD I RETRIVE A SINGLE BYTEA FIELD IN A SEPARATE QUERY ???


Stephan Szabo wrote:

> What would you expect it to do given a single result format argument?
> 
> If you want to propose a new function (set of functions) that have
> different behavior, make a coherent proposal.> Statements like it should> do X because I want it to aren't coherent
proposals.

AFAIK, they convert each value before put it to a result set.
I propose to do the following convertion to the textual-form for bytea 
values:
X->X where X is byte [0..255]

>   Expect to get asked
> why bytea is special

_Because each type is special._

And at the same time they made bytea MORE special than any other type.
Look:
every type has many representations for its values, some are obvious 
some are more usefull, some are less useful.
they define very useful and obvious representations for all the types
but BYTEA.
(They call those representations "textual-form".)
and the input of a value demands escaping (we all undersdand why)
and for each type the following equality is TRUE:

some_data == OUTPUT(INPUT(ESCAPE(some_data)))

but for the BYTEA this equality is FALSE !
Why BYTEA is so special ?

every value of every type is expected to be given to a client UNCHANGED.
I expect a value of BYTEA to be unchaged too.


> why should integer be passed as a string given

because it is not causing problems, as well as if it be passed in any 
other common form.

And i ask you:
why integer is actually passed as a string (decimal notation) ?
why not to define your own unique more_sofisticated representation ?
(as for bytea is defined.)


And finally
Why so special textual-form defined for bytea ?
Why not to leave every byte unchanged, since user knows what kind of 
data he got.


P.S.
changing a format of a whole result-set is not a solution for a 
field-type-dependent problem.


Re: have you feel anything when you read this ?

From
Stephan Szabo
Date:
On Thu, 6 Apr 2006, Eugene E. wrote:

> Praescriptum:
> If my english is ugly and something is written unclear, please
> complaint, and i'll try to rephrase.
> anyway i am trying to be understood.
>
>
> I said
>
>  >> WHY SHOULD I RETRIVE A SINGLE BYTEA FIELD IN A SEPARATE QUERY ???

I didn't answer this because I didn't feel that it moved the argument
forward, but...

If you meant that you must retrieve them in a separate query, you're
incorrect, since you *could* use the binary form for the others. I can't
understand if you don't realize that there is one for all these various
types, or that you just don't wish to use it (for example, I believe using
%d on ntohl(value from pqgetvalue) or something similar will print your
integer).

If you are arguing that you don't *wish* to do use that binary form for
the other values, I don't see how that's relevant until you've proven the
rest of the argument (*).

> Stephan Szabo wrote:
>
> > What would you expect it to do given a single result format argument?
> >
> > If you want to propose a new function (set of functions) that have
> > different behavior, make a coherent proposal.
>  > Statements like it should
>  > do X because I want it to aren't coherent proposals.
>
> AFAIK, they convert each value before put it to a result set.
> I propose to do the following convertion to the textual-form for bytea
> values:
> X->X where X is byte [0..255]

Okay, now pass that to strcmp or a %s format. AFAIK, the "textual-form" of
values is meant to be a c-string. "ab\0cd\0" is not a c-string containing
ab\0cd, it's a c-string containing ab.

> >   Expect to get asked
> > why bytea is special
>
> _Because each type is special._
>
> And at the same time they made bytea MORE special than any other type.

I don't think that it's appreciably more special.

> Look:
> every type has many representations for its values, some are obvious
> some are more usefull, some are less useful.
> they define very useful and obvious representations for all the types
> but BYTEA.

There are two representations of (at least most) types. There's a binary
format and a textual format.

> (They call those representations "textual-form".)

I think I don't exactly agree with this description, but I'm unclear
exactly what you're saying.  Are you saying that textual-form is the
useful representation, or are you saying that textual-form is the
representation and it is useful?

> and the input of a value demands escaping (we all undersdand why)
> and for each type the following equality is TRUE:
>
> some_data == OUTPUT(INPUT(ESCAPE(some_data)))
>
> but for the BYTEA this equality is FALSE !
> Why BYTEA is so special ?
>
> every value of every type is expected to be given to a client UNCHANGED.

This is already false AFAICS. Leading or trailing spaces on a string
containing integer get trimmed during the input for example, the string
format of date comes back in a particular but other input formats are
supported.  I don't think the above equality is valid for textual
representation.

In addition, input could be binary and output textual or the other way
around, in some_data is different on both sides. There's no reason that
you can't be passing an integer that way.

> I expect a value of BYTEA to be unchaged too.

I think (as above) that your perception of the problem isn't correct.

> > why should integer be passed as a string given
>
> because it is not causing problems, as well as if it be passed in any
> other common form.
> And i ask you:
> why integer is actually passed as a string (decimal notation) ?

It's not always. It can be, just as bytea can be passed as a string
needing escaping, however it can be passed as effectively a binary blob
containing an integer value (in network order I believe) just as bytea can
be passed as a binary blob.

> why not to define your own unique more_sofisticated representation ?
> (as for bytea is defined.)

AFAICS, there is one, the binary format for integer.

> And finally
> Why so special textual-form defined for bytea ?
> Why not to leave every byte unchanged, since user knows what kind of
> data he got.

I think this is mostly answered by the above with a little bit of
connecting the dots.

> P.S.
> changing a format of a whole result-set is not a solution for a
> field-type-dependent problem.

Since we're still arguing about whether it's a field-type-dependent
problem or a field-use-dependent problem, I can't really argue this point
since it assumes the former and I don't believe that's been shown yet.

(*) Yes, it might be nice to have something that did it for you.  Having
one is not, in my mind, a requirement for the API but instead something to
make it easier.


Re: have you feel anything when you read this ?

From
Stephan Szabo
Date:
On Thu, 6 Apr 2006, Stephan Szabo wrote:

> On Thu, 6 Apr 2006, Eugene E. wrote:
>
> If you meant that you must retrieve them in a separate query, you're
> incorrect, since you *could* use the binary form for the others. I can't
> understand if you don't realize that there is one for all these various
> types, or that you just don't wish to use it (for example, I believe using
> %d on ntohl(value from pqgetvalue) or something similar will print your
> integer).

For notes sake, the example code with binary retrieval in the docs seems
to haventohl(*((uint32_t *) iptr))
for iptr being the result of PQgetvalue on an integer column.


Re: have you feel anything when you read this ?

From
"Eugene E."
Date:
Stephan Szabo wrote:
>>>What would you expect it to do given a single result format argument?
>>>
>>>If you want to propose a new function (set of functions) that have
>>>different behavior, make a coherent proposal.
>>
>> > Statements like it should
>> > do X because I want it to aren't coherent proposals.
>>
>>AFAIK, they convert each value before put it to a result set.
>>I propose to do the following convertion to the textual-form for bytea
>>values:
>>X->X where X is byte [0..255]
> 
> 
> Okay, now pass that to strcmp or a %s format. AFAIK, the "textual-form" of
> values is meant to be a c-string. "ab\0cd\0" is not a c-string containing
> ab\0cd, it's a c-string containing ab.

WHY strcmp ?! do you really think the user is a fool ?
if the user declared something "binary", he obviously knows what he has 
done.

WHY c-string ? the user only wants to get PGresult structure.
Since this structure provides a length of each value, you have no need 
in c-string. Why do think the user needs it ?

"textual-form" is just a name of actually existent convertion rule.
i am not trying to find out a philosophy here.



> I think I don't exactly agree with this description, but I'm unclear
> exactly what you're saying.  Are you saying that textual-form is the
> useful representation, or are you saying that textual-form is the
> representation and it is useful?

the actual representasion of most types is pretty useful.




Re: have you feel anything when you read this ?

From
Markus Schaber
Date:
Hi, Stephan & Eugene,

Stephan Szabo wrote:

> This is already false AFAICS. Leading or trailing spaces on a string
> containing integer get trimmed during the input for example, the string
> format of date comes back in a particular but other input formats are
> supported.  I don't think the above equality is valid for textual
> representation.

It is even true that the server-internal storage format can be distinct
from both the textual and binary representation (aka canonical rep.).

This is e. G. how PostGIS handles their geometries. PostGIS geometries
have even more representations, available via conversion functions.

And for some unicode strings, it even happens that their textual
representation is different depending on the client encoding.

>>I expect a value of BYTEA to be unchaged too.
> I think (as above) that your perception of the problem isn't correct.

I agree. The value of the BYTEA is unchanged, it is just a different
representation of the BYTEA that allows handling its contents as text,
in non-binary safe environments.

Imagine having the text representation as simply HEXing the BYTEA
contents - it still is an unchanged value.

>>why not to define your own unique more_sofisticated representation ?
>>(as for bytea is defined.)
> AFAICS, there is one, the binary format for integer.

Exactly.

AFAICS, all built-in data types have both a text and binary
representation, as well as most extension types.


HTH,
Markus
-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org


Re: have you feel anything when you read this ?

From
Stephan Szabo
Date:
On Thu, 6 Apr 2006, Eugene E. wrote:

> Stephan Szabo wrote:
> >>>What would you expect it to do given a single result format argument?
> >>>
> >>>If you want to propose a new function (set of functions) that have
> >>>different behavior, make a coherent proposal.
> >>
> >> > Statements like it should
> >> > do X because I want it to aren't coherent proposals.
> >>
> >>AFAIK, they convert each value before put it to a result set.
> >>I propose to do the following convertion to the textual-form for bytea
> >>values:
> >>X->X where X is byte [0..255]
> >
> >
> > Okay, now pass that to strcmp or a %s format. AFAIK, the "textual-form" of
> > values is meant to be a c-string. "ab\0cd\0" is not a c-string containing
> > ab\0cd, it's a c-string containing ab.
>
> WHY strcmp ?! do you really think the user is a fool ?
> if the user declared something "binary", he obviously knows what he has
> done.
>
> WHY c-string ? the user only wants to get PGresult structure.
> Since this structure provides a length of each value, you have no need
> in c-string. Why do think the user needs it ?
>
> "textual-form" is just a name of actually existent convertion rule.
> i am not trying to find out a philosophy here.

Then, honestly, nothing anyone can say will help, because you're not
willing to actually hold a conversation on the topic.


Re: have you feel anything when you read this ?

From
Markus Schaber
Date:
Hi, Eugene,

Eugene E. wrote:

>> Okay, now pass that to strcmp or a %s format. AFAIK, the
>> "textual-form" of
>> values is meant to be a c-string. "ab\0cd\0" is not a c-string containing
>> ab\0cd, it's a c-string containing ab.
> WHY strcmp ?! do you really think the user is a fool ?
> if the user declared something "binary", he obviously knows what he has
> done.

But when the user requests the canonical _text_ representation of a byte
area data type, why do you consider him declaring it "binary"?

> WHY c-string ? the user only wants to get PGresult structure.

And he does request the _text_ represenation of the datatypes in this
structure.

> Since this structure provides a length of each value, you have no need
> in c-string. Why do think the user needs it ?

A user that does not have a need in C-Strings can fetch the binary
representation, getting higher efficency for all datatypes.

> "textual-form" is just a name of actually existent convertion rule.
> i am not trying to find out a philosophy here.

There is no philosophy but orthogonality.

There's a textual and a binary form of datatypes. For varchar, byta,
int4, float, PostGIS geometries etc...

>> I think I don't exactly agree with this description, but I'm unclear
>> exactly what you're saying.  Are you saying that textual-form is the
>> useful representation, or are you saying that textual-form is the
>> representation and it is useful?
> the actual representasion of most types is pretty useful.

The text representation is pretty useful for human readers for _most_
datatypes, the binary representation is much easier to parse for programs.

So use the binary representation for everything if you don't want to
display the data to the user directly.

One could speculate that the textual representation is just a little
help for "generic" tools like pg_dump, pgadmin or psql that display data
to the user without having any knowledge of the specific datatypes (and
without the possibility to have such knowledge).

Markus
-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org


Re: have you feel anything when you read this ?

From
"Eugene E."
Date:
Markus Schaber wrote:
> Hi, Eugene,
> 
> Eugene E. wrote:
> 
> 
>>>Okay, now pass that to strcmp or a %s format. AFAIK, the
>>>"textual-form" of
>>>values is meant to be a c-string. "ab\0cd\0" is not a c-string containing
>>>ab\0cd, it's a c-string containing ab.
>>
>>WHY strcmp ?! do you really think the user is a fool ?
>>if the user declared something "binary", he obviously knows what he has
>>done.
> 
> 
> But when the user requests the canonical _text_ representation of a byte
> area data type, why do you consider him declaring it "binary"?

he did not request this representation. it is _by_default_
if you wish to provide it by request, please do it.

i ask you to provide minimal convertion by default, mentioned the user 
wants his data unchanged.
and let the user interpret his own data himself.

>>Since this structure provides a length of each value, you have no need
>>in c-string. Why do think the user needs it ?
> 
> A user that does not have a need in C-Strings can fetch the binary
> representation, getting higher efficency for all datatypes.

and lose the pretty good representation of all other columns in the same 
request.

>>"textual-form" is just a name of actually existent convertion rule.
>>i am not trying to find out a philosophy here.
> 
> There is no philosophy but orthogonality.
> 
> There's a textual and a binary form of datatypes. For varchar, byta,
> int4, float, PostGIS geometries etc...

good. i ask you to slightly change "textual" representation of bytea.

> The text representation is pretty useful for human readers for _most_
> datatypes, the binary representation is much easier to parse for programs.

You are right.
but
Who said that i can not display something ?
i thougth, human-readability of some data depends completely on how 
CLIENT-SIDE interpret it.
server do not know and should not know
what data is human readable or printable... etc.


> So use the binary representation for everything if you don't want to
> display the data to the user directly.

The problem we discuss is not about displaing or printig at all.

Some applications want "textual-form" -- most applications
but not only to display
and in the _same_ query the same applications want bytea...





Re: have you feel anything when you read this ?

From
Markus Schaber
Date:
Hi, Eugene,

Eugene E. wrote:

> he did not request this representation. it is _by_default_

He used a function that provided it by default. He could use the other
function that allows him to select which representation he wants.

> if you wish to provide it by request, please do it.

I cannot provide anything, because I'm not a PostgreSQL developer. To be
honest, I can provide the user with nice Java Objects, because I'm the
maintainer of the PostGIS Java extension, but that's all.

>> A user that does not have a need in C-Strings can fetch the binary
>> representation, getting higher efficency for all datatypes.
> and lose the pretty good representation of all other columns in the same
> request.

It is not "pretty good". It is human readable, but it are C-Strings with
some text. It uses much more CPU power on both sides, Date Formats can
be misleading, and PostGIS even reverted to a binary like format for
their canonical text representation because the users complained about
coordinate drift due to rounding errors in the floating point
input/output routines, so pg_dump and restoring the database changed the
data.

So I cannot see that the textual from is superior for _any_ of the
datatypes. IMHO, its sole purpose is to have a generic way to represent
the data for generic tools as pg_dump -F p, pgadmin3, psql and such,
which cannot know the (possibly user-defined) datatypes in advance.

>> There is no philosophy but orthogonality.
>> There's a textual and a binary form of datatypes. For varchar, byta,
>> int4, float, PostGIS geometries etc...
> good. i ask you to slightly change "textual" representation of bytea.

This will achieve at least the following:

- It will break all libraries, tools and applications that rely on the
current behaviour.

- It will break the guarantee for generic applications that the text
representation of every datatype can be handled as text.

- It will break pg_dump -F p (which is the default for pg_dump by the
way), thus making it _impossible_ to have "plaintext" dumps of a
database, with no easy way of reinventing this behaviour. Those dumps
are to be edited with text editors, which don't cope easily with
nullbytes and other waste...

>> The text representation is pretty useful for human readers for _most_
>> datatypes, the binary representation is much easier to parse for
>> programs.
> You are right.
> but
> Who said that i can not display something ?
> i thougth, human-readability of some data depends completely on how
> CLIENT-SIDE interpret it.
> server do not know and should not know
> what data is human readable or printable... etc.

So what you say is that the canonical text representation should be
abandoned completely. Fine.

The problem is that all generic applications, that don't know about the
concrete datatypes, will get impossible. psql, pg_dump, pgadmin and others.

Different from "normal" applications which can have their specific
datatypes hardwired in the code or whatever, those applications cannot
be taught about how to present the data to an user in a generic way, if
there's nothing in the backend.

Users and extensions can invent new datatypes as they want, how do you
expect the authors of pgadmin or psql to cope with proprietary in-house
datatypes of a certain PostgreSQL user?

>> So use the binary representation for everything if you don't want to
>> display the data to the user directly.
> The problem we discuss is not about displaing or printig at all.
> Some applications want "textual-form" -- most applications
> but not only to display
> and in the _same_ query the same applications want bytea...

Why do you try so hard to resist understanding the whole point?

Those applications _get_ bytea. They just get a Cstring-safe
representation of it. It's just like you have to put "quotes" around and
\escapes into a string in your program sources if you use any of the
weird characters.

You have the decision between text and binary format for your query. As
libpq is a low level API, it does not abstract you from this difference.

You can you use a higher level API that abstracts over the whole issue
and gives you nice Objects (like the jdbc library), then you don't have
to cope with those representations at all.

It also may make sense to provide an extension for lipq that lets you
select binary and textual representation column-wise (which might need a
protocol extension, I don't have the specs in mind).

But it absolutely does not make any sense to break the whole concept of
text representations by making it binary for a single datatype.

HTH,
Markus
-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org