Re: Toast, Text, blob bytea Huh? - Mailing list pgsql-hackers
From | Joe Conway |
---|---|
Subject | Re: Toast, Text, blob bytea Huh? |
Date | |
Msg-id | 02a101c12c0e$37856d60$48d210ac@jecw2k1 Whole thread Raw |
In response to | Toast, Text, blob bytea Huh? (jason.ory@ndchealth.com) |
Responses |
Re: Toast, Text, blob bytea Huh?
|
List | pgsql-hackers |
> I'm trying my best to convert from MySQL to PgSQL but I cant get a good > clear answer about > certian issures.Mainly TEXT, TOAST,BLOB , BYTEA etc. > It was an easy task in mysql but everything in the archives about , text , > toast and bytea is just > confusing me with postgresql. I have Bruces's book and I've searched the Quick glossary: TEXT is a datatype which stores character data of unspecified length (up to the max value of a 4 byte integer in length, although I've seen comments indicating that the practical limit is closer to 1 GB -- not sure why). TEXT is not intended to hold arbitrary binary data. If you want to store binary in a text column, encode it to hex or base64 or something first. TOAST is an internal database concept. If I understand it correctly, it refers to a combination of compression and out-of-line storage for large length values of a charater datatype. This happens transparently to you. BLOB is otherwise known as LO or Large Object datatype in PostgreSQL. These are always stored out-of-line, I don't believe they are compressed, and they have their own special access methods (for dealing with data a "chunk" at a time). BYTEA is very similar to TEXT, except that it is intended for binary data. I recently posted a PHP function which escapes binary in order to allow inserting it into a bytea column (on the pgsql-general list). At a minimum there are 4 characters which must be escaped. They are ACSII 0, 10, 39, and 92. They must reach PostgreSQL looking like \\000, \\012, \\047, and \\134 respectively (actually 39 could be \' and 92 could be \\\\, but it is simpler to be consistent). > THIS IS WHAT I CANT SEEM TO FIGURE OUT IN POSTGRESQL > 1. I cant get a clear answer on what kind of data type to use for my large > text string? TEXT, ???, ??? or something about TOAST > I have seen in the e-mail archive but cant find any documentaion? So, you can use TEXT if you encode to hex or base64 in your app first, or you bytea if you escape as I described above in your app. Or you can use the LO functions to manipulate large objects (see http://www.postgresql.org/idocs/index.php?lo-interfaces.html). > > 2. I've written my own escape method ,("cant find one for Pgsql") , BUT i > don't know what > to escape and not to escape. So it keeps failing. I cand find any docs. on > what to escape either? See above. > > > SUMMARY > What is the best datatype to use, for large raw text and/or binary if i > choose? > Once I know this,how Im a supposed to escape my string and get it through > the parser correctly so i can retrieve it correctly? If you use TEXT, you will have to decode the hex/base64 back into binary in your app. Similarly, if you use bytea, although stored as binary, the data is returned with "unprintable" values escaped as octals*. Your app will have to decode the octals back into binary. *NOTE to hackers: is there a good reason for this? ISTM that the client should be responsible for any encoding needed when bytea is returned. The server should return bytea as straight varlena. If you use LO, you have to use the interface functions instead of standard SQL. Hope this helps, -- Joe
pgsql-hackers by date: