Thread: Using UTF strings in pg8.3 - storing hexadecimal values in bytea columns
I have this issue: postgres=# select E'\xc5\x53\x94\x96\x83\x29'; ERROR: invalid byte sequence for encoding "UTF8": 0xc553 HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding". postgres=# show client_encoding ; client_encoding ----------------- UTF8 (1 row) postgres=# show server_encoding ; server_encoding ----------------- UTF8 (1 row) postgres=# select version(); version ------------------------------------------------------------------------------------------------ PostgreSQL 8.3.5 on i686-pc-linux-gnu,compiled by GCC gcc (GCC) 4.2.4 (Ubuntu 4.2.4-1ubuntu3) (1 row) On postgres 8.2 this worked: postgres=# select version(); version ---------------------------------------------------------------------------------------------------------------- PostgreSQL8.2.4 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21) (1 row) postgres=# select E'\xc5\x53\x94\x96\x83\x29'; ?column? ---------- S) (1 row) postgres=# show client_encoding ; client_encoding ----------------- UTF8 (1 row) postgres=# show server_encoding ; server_encoding ----------------- UTF8 (1 row) I'm using the above mentioned string to store data into bytea column. I did pg_dump of the database on postgres 8.2, and then tried to restore it on postgres 8.3, and I got this error. The actuall line that produces error is like this: INSERT INTO vpn_payins_bitfield (vpn_id, payload_pattern, encription, encription_key, charset, amount_width, shop_width, counter_width) VALUES (3, E'\\W*(\\w+)(?:\\W+(.*))?', 'RC4', E'\xc5\x53\x94\x96\x83\x29'::bytea, 'ABCDEGHIKLMOPTWX', 16, 8, 16); The error is: ERROR: invalid byte sequence for encoding "UTF8": 0xc553 Now, I see that I can type: "SELECT E'\xFF'" in pg8.2, but can't do that in pg8.3. So, my question is, how do I specify hexadecimal value of C5 to be stored in bytea column, in an INSERT statement? Mike
Mario Splivalo <mario.splivalo@megafon.hr> writes: > I have this issue: > postgres=# select E'\xc5\x53\x94\x96\x83\x29'; > ERROR: invalid byte sequence for encoding "UTF8": 0xc553 This is expected since the string is not valid as text. > I'm using the above mentioned string to store data into bytea column. I > did pg_dump of the database on postgres 8.2, and then tried to restore > it on postgres 8.3, and I got this error. The actuall line that produces > error is like this: > INSERT INTO vpn_payins_bitfield (vpn_id, payload_pattern, encription, > encription_key, charset, amount_width, shop_width, counter_width) VALUES > (3, E'\\W*(\\w+)(?:\\W+(.*))?', 'RC4', > E'\xc5\x53\x94\x96\x83\x29'::bytea, 'ABCDEGHIKLMOPTWX', 16, 8, 16); Exactly what version of pg_dump are you using? What I get from pg_dump doesn't look like that. Bytea fields with -D look more like this: INSERT INTO foo (f1) VALUES ('\\305S\\224\\226\\203)'); regards, tom lane
Re: Using UTF strings in pg8.3 - storing hexadecimal values in bytea columns
From
Mario Splivalo
Date:
Tom Lane wrote: >> I'm using the above mentioned string to store data into bytea column. I >> did pg_dump of the database on postgres 8.2, and then tried to restore >> it on postgres 8.3, and I got this error. The actuall line that produces >> error is like this: > >> INSERT INTO vpn_payins_bitfield (vpn_id, payload_pattern, encription, >> encription_key, charset, amount_width, shop_width, counter_width) VALUES >> (3, E'\\W*(\\w+)(?:\\W+(.*))?', 'RC4', >> E'\xc5\x53\x94\x96\x83\x29'::bytea, 'ABCDEGHIKLMOPTWX', 16, 8, 16); > > Exactly what version of pg_dump are you using? What I get from pg_dump > doesn't look like that. Bytea fields with -D look more like this: > > INSERT INTO foo (f1) VALUES ('\\305S\\224\\226\\203)'); Yes, I mistakenly used pg8.2 pg_dump, when I use pg3.8 dump I get what you get. Btw, what is that S after 305? 305 octal is C5 hexadecimal. How do I enter hexadecimal C5 without UTF encoding errors? Mike
Mario Splivalo <mario.splivalo@megafon.hr> writes: > Tom Lane wrote: >> Exactly what version of pg_dump are you using? What I get from pg_dump >> doesn't look like that. Bytea fields with -D look more like this: >> >> INSERT INTO foo (f1) VALUES ('\\305S\\224\\226\\203)'); > Yes, I mistakenly used pg8.2 pg_dump, when I use pg3.8 dump I get what > you get. I was quoting the output of 8.2.latest pg_dump. Maybe you have a very old subrelease? But no version of pg_dump would've put an explicit cast to bytea in there. > Btw, what is that S after 305? Hex 53 is 'S' I believe. > 305 octal is C5 hexadecimal. How > do I enter hexadecimal C5 without UTF encoding errors? bytea only supports octal, so \\305 is the way to do it. The way you were doing it was guaranteed to fail on corner cases such as \0 and \ itself, because you were converting at the string-literal stage not byteain(). regards, tom lane
Re: Using UTF strings in pg8.3 - storing hexadecimal values in bytea columns
From
Richard Huxton
Date:
Mario Splivalo wrote: > Richard Huxton wrote: >> Mario Splivalo wrote: >>> I have this issue: >>> >>> postgres=# select E'\xc5\x53\x94\x96\x83\x29'; >>> ERROR: invalid byte sequence for encoding "UTF8": 0xc553 >> >> I think you want to be using octal escapes. That's text you're >> generating above. >> >> CREATE TABLE bytea_test (b bytea); >> INSERT INTO bytea_test (b) VALUES (E'\\305\\123\\224\\226'); >> SELECT * FROM bytea_test; >> b >> --------------- >> \305S\224\226 >> (1 row) > > That's true, but I'd still like to use hexadecimal notation. Manual > states that I could say '\xC5', but then I get those encoding errors. I think you're reading the "text" rather than "bytea" part of the manual. 4.1.2.1. String Constants "...and \xhexdigits, where hexdigits represents a hexadecimal byte value. (It is your responsibility that the byte sequences you create are valid characters in the server character set encoding.)" No mention of hex in the bytea section of the manual. -- Richard Huxton Archonet Ltd
Richard Huxton <dev@archonet.com> writes: > Mario Splivalo wrote: >> That's true, but I'd still like to use hexadecimal notation. You could use decode(): regression=# select decode('c5a4', 'hex'); decode ----------\305\244 (1 row) regards, tom lane
Re: Using UTF strings in pg8.3 - storing hexadecimal values in bytea columns
From
Mario Splivalo
Date:
Richard Huxton wrote: > Mario Splivalo wrote: >> Richard Huxton wrote: >>> Mario Splivalo wrote: >>>> I have this issue: >>>> >>>> postgres=# select E'\xc5\x53\x94\x96\x83\x29'; >>>> ERROR: invalid byte sequence for encoding "UTF8": 0xc553 >>> I think you want to be using octal escapes. That's text you're >>> generating above. >>> >>> CREATE TABLE bytea_test (b bytea); >>> INSERT INTO bytea_test (b) VALUES (E'\\305\\123\\224\\226'); >>> SELECT * FROM bytea_test; >>> b >>> --------------- >>> \305S\224\226 >>> (1 row) >> That's true, but I'd still like to use hexadecimal notation. Manual >> states that I could say '\xC5', but then I get those encoding errors. > > I think you're reading the "text" rather than "bytea" part of the manual. > > 4.1.2.1. String Constants > "...and \xhexdigits, where hexdigits represents a hexadecimal byte > value. (It is your responsibility that the byte sequences you create are > valid characters in the server character set encoding.)" > > No mention of hex in the bytea section of the manual. > Hm, you're right. I guess that part of manual confuses me. Nevertheless, I'd still like to be able to enter hexadecimal values to the bytea fields. As Tom Lane suggested, I can do this: test1=# select decode('C5', 'hex'); decode -------- \305 (1 row) But, if I want it other way around, I get the error: test1=# select envode(E'\305', 'hex'); ERROR: invalid byte sequence for encoding "UTF8": 0xc5 HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding". test1=# Is there a way to circumvent encoding when dealing with binary data? Or am I completely confused here and have no clue what am I talking about? Mike
Re: Using UTF strings in pg8.3 - storing hexadecimal values in bytea columns
From
Mario Splivalo
Date:
Tom Lane wrote: > Mario Splivalo <mario.splivalo@megafon.hr> writes: >> Tom Lane wrote: >>> Exactly what version of pg_dump are you using? What I get from pg_dump >>> doesn't look like that. Bytea fields with -D look more like this: >>> >>> INSERT INTO foo (f1) VALUES ('\\305S\\224\\226\\203)'); > >> Yes, I mistakenly used pg8.2 pg_dump, when I use pg3.8 dump I get what >> you get. > > I was quoting the output of 8.2.latest pg_dump. Maybe you have a very > old subrelease? But no version of pg_dump would've put an explicit > cast to bytea in there. mike@som:~$ pg_dump -V pg_dump (PostgreSQL) 8.2.4 mike@som:~$ Since I need to have my servers running 24/7 with NO downtime I seldom choose to upgrade minor versions, unless there is a major bug that affects me. This upgrade from 8.2 to 8.3 is planned, and I have liberty of having 3-4 hours of downtime. > >> Btw, what is that S after 305? > > Hex 53 is 'S' I believe. Still don't get it :) If I have hexadecimal value of C5, that is octal 305, and I don't get where that S came from. > >> 305 octal is C5 hexadecimal. How >> do I enter hexadecimal C5 without UTF encoding errors? > > bytea only supports octal, so \\305 is the way to do it. The way you > were doing it was guaranteed to fail on corner cases such as \0 and \ > itself, because you were converting at the string-literal stage not > byteain(). Ok, that makes sense. Since I just store that data into the database, maybe I could store them as strings (varchars), and then do the conversion on the client side (java). Mike
Re: Using UTF strings in pg8.3 - storing hexadecimal values in bytea columns
From
Mario Splivalo
Date:
Mario Splivalo wrote: > But, if I want it other way around, I get the error: > > test1=# select envode(E'\305', 'hex'); > ERROR: invalid byte sequence for encoding "UTF8": 0xc5 > HINT: This error can also happen if the byte sequence does not match > the encoding expected by the server, which is controlled by > "client_encoding". > test1=# > > Is there a way to circumvent encoding when dealing with binary data? Or > am I completely confused here and have no clue what am I talking about? > Now, this works: test1=# select encode(decode('C5', 'hex'), 'hex'); encode -------- c5 (1 row) I see that my client encoding has a problem interpreting '\xC5': test1=# select E'\xC5'; ERROR: invalid byte sequence for encoding "UTF8": 0xc5 HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding". So the only way is to use decode function, right? Mike
Hi all, This might be a stupid question, but I wasn't able to find any information on it, so here it goes: Oracle knows a view which contains information about indexed columns (all_ind_cols), is there something similar availablein Postgres? I want to be able to create an full overview of each column with the following information: - index name - table name the index is on - column name the index is on - column position in the index - sort order Any help would be much appreciated :) Regards, Bart van Houdt Syfact International B.V. Database developer
Bart van Houdt wrote: > Hi all, > > This might be a stupid question, but I wasn't able to find any information on it, so here it goes: > Oracle knows a view which contains information about indexed columns (all_ind_cols), is there something similar availablein Postgres? > > I want to be able to create an full overview of each column with the following information: > - index name > - table name the index is on > - column name the index is on > - column position in the index > - sort order > > Any help would be much appreciated :) > You should check this, for instance: http://www.postgresql.org/docs/8.3/interactive/monitoring-stats.html#MONITORING-STATS-VIEWS There is, also, information_schema in postgres, as well as pg_catalog, you may wish to check those. Mike
Cool! Thx :D Bart van Houdt Syfact International B.V. Database developer -----Original Message----- From: Mario Splivalo [mailto:mario.splivalo@megafon.hr] Sent: dinsdag 11 november 2008 15:03 To: pgsql-sql@postgresql.org Cc: Bart van Houdt Subject: Re: [SQL] Postgres version of all_ind_cols Bart van Houdt wrote: > Hi all, > > This might be a stupid question, but I wasn't able to find any information on it, so here it goes: > Oracle knows a view which contains information about indexed columns (all_ind_cols), is there something similar availablein Postgres? > > I want to be able to create an full overview of each column with the following information: > - index name > - table name the index is on > - column name the index is on > - column position in the index > - sort order > > Any help would be much appreciated :) > You should check this, for instance: http://www.postgresql.org/docs/8.3/interactive/monitoring-stats.html#MONITORING-STATS-VIEWS There is, also, information_schema in postgres, as well as pg_catalog, you may wish to check those. Mike
am Tue, dem 11.11.2008, um 14:57:20 +0100 mailte Bart van Houdt folgendes: > Hi all, Please, don't hijack other threads. > > This might be a stupid question, but I wasn't able to find any information on it, so here it goes: > Oracle knows a view which contains information about indexed columns (all_ind_cols), is there something similar availablein Postgres? > > I want to be able to create an full overview of each column with the following information: > - index name > - table name the index is on > - column name the index is on > - column position in the index > - sort order > > Any help would be much appreciated :) Take a look at: http://www.postgresql.org/docs/current/interactive/catalog-pg-index.html Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net