Re: bytea - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: bytea
Msg-id 200009300235.WAA03083@candle.pha.pa.us
List pgsql-hackers
This brings up some good issues for the 7.2 release.  Will large objects
become just an API on top of toast, or should they remain as a separate
physical storage format?


> At 08:30 PM 3/15/00 -0500, Bruce Momjian wrote:
> 
> >Yes, we should keep it.  I see now it is for purely binary data, while
> >text is for null-terminated strings.
> 
> donb=# create table foo (b bytea);
> CREATE
> donb=# insert into foo values('ab\0cd');
> INSERT 107497 1
> donb=# select * from foo;
>  b  
> ----
>  ab
> (1 row)
> 
> donb=# 
> 
> Thus my comment "maybe they should be made to work" :)
> 
> I don't know what's actually inside attr b, but the "cd" is at least
> dropped on output.
> 
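The session above looks like what happens when a value is read as a NUL-terminated C string. Here is a minimal Python sketch of that effect using ctypes. It assumes the truncation comes from C string handling; the thread does not say where the "cd" is actually lost, and the variable names are only illustrative.

```python
import ctypes

# The same five payload bytes as in the INSERT above: 'a', 'b', NUL, 'c', 'd'.
payload = b"ab\0cd"

# create_string_buffer copies the bytes and appends a terminating NUL.
buf = ctypes.create_string_buffer(payload)

# Read as a NUL-terminated C string: stops at the first NUL byte.
as_c_string = buf.value

# Read with an explicit length: every byte survives.
as_raw_bytes = buf.raw[:len(payload)]

print(as_c_string)
print(as_raw_bytes)
```

Any routine that carries an explicit length sees all five bytes, while a routine that relies on the terminator sees only the first two.
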
> For the BLOB hack I did for our toolkit I did the equivalent of
> uuencoding the input, which costs a predictable 4/3 expansion of
> the binary data (this is a segmented type, all done outside PG
> via SQL, triggers, and AOLserver driver magic but lets us stuff
> binary data such as photos etc, and pg_dump/restore them).
> 
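The 4/3 figure follows from encoding every 3 input bytes as 4 printable characters. A rough sketch with Python's binascii module is below. It assumes classic uuencode framing (45 input bytes per line, plus a length character and a newline), which may differ from what the toolkit actually does; the helper name uu_encoded_size is hypothetical.

```python
import binascii

UU_LINE_BYTES = 45  # binascii.b2a_uu accepts at most 45 input bytes per call


def uu_encoded_size(payload: bytes) -> int:
    """Total size of the payload after uuencoding it line by line."""
    total = 0
    for start in range(0, len(payload), UU_LINE_BYTES):
        chunk = payload[start:start + UU_LINE_BYTES]
        # Each line is: 1 length character + 4 characters per 3 bytes + newline.
        total += len(binascii.b2a_uu(chunk))
    return total


raw = bytes(4500)                  # 4500 bytes of stand-in binary data
encoded = uu_encoded_size(raw)     # 100 lines of 62 characters each
print(len(raw), encoded, encoded / len(raw))
```

The data characters alone account for exactly 4/3 of the input; the per-line length character and newline push the total slightly above that.
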
> If TOAST weren't on the way, I'd sit down and do a proper BLOB.
> As I explained to the folks on our web toolkit team, lo is
> tantalizingly close to being useful for folks like us, without
> actually being useful.
> 
> BLOBs should sit atop TOAST, though, and perhaps specialized I/O
> routines for a BLOB type could be made.  Those for bytea could
> be changed, too, at risk of breaking existing code?  But since
> bytea really acts like text, perhaps there is no existing code
> that couldn't just operate on text instead, so there
> could be freedom to change it?
> 
> For real binary data, uuencoded strings are a better choice for
> a printable output form than the text+\nnn form (since a high
> proportion of bytes will be emitted in the lengthy \nnn form).
> 
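To put rough numbers on that comparison, the sketch below sizes both printable forms for one block containing every byte value once. The escape rule here (printable ASCII kept as-is, backslash doubled, everything else as a four-character \nnn octal escape) is an assumed approximation of the text+\nnn form, not the actual bytea output routine, and both helper names are hypothetical.

```python
import binascii


def octal_escaped_size(payload: bytes) -> int:
    """Size of the payload in an approximate text+\\nnn form (assumed rule)."""
    size = 0
    for b in payload:
        if b == 0x5C:              # backslash is doubled
            size += 2
        elif 0x20 <= b <= 0x7E:    # other printable ASCII is emitted as-is
            size += 1
        else:                      # everything else becomes \nnn
            size += len(f"\\{b:03o}")
    return size


def uu_encoded_size(payload: bytes) -> int:
    """Size of the payload after uuencoding it in 45-byte lines."""
    return sum(
        len(binascii.b2a_uu(payload[i:i + 45]))
        for i in range(0, len(payload), 45)
    )


sample = bytes(range(256))         # every byte value exactly once
escaped = octal_escaped_size(sample)
uuencoded = uu_encoded_size(sample)
print(len(sample), escaped, uuencoded)
```

For data spread evenly over all byte values, well over half the bytes take the four-character escape, so the escaped form comes out at roughly twice the size of the uuencoded one. Mostly-ASCII data would tip the comparison the other way.
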
> But normally with BLOB one would like a way to just stuff a file
> or data in a buffer into it, etc, much like current lo.  The printable
> dump of data is mostly useful for pg_dump, IMO - a binary backup would
> remove the need for such a hack, too.
> 
> Standard BLOBs provide a way to stuff segments into the db...
> 
> BLOBs, as done by TOAST or my current segmented table hack used in
> our toolkit, only require a single table (or a single table per
> underlying user table in the case of TOAST) so don't clutter the
> way lo does.
> 
> But lo allows each binary object to be 2GB in length.
> 
> So they kind of fit different needs.  lo seems fine for those who
> need really huge objects, and probably not a bazillion (since each
> generates a file + index).  My hack, or TOAST which will be similar
> in table usage (both being segmented types in common tables), is
> good for binary data of moderately large size not to exceed 2GB
> in aggregate.
> 
> Of course, with 64-bit systems on the horizon, the 2GB aggregate
> limit will slowly begin to disappear, too.  'Til then, providing
> a "real BLOB" while retaining lo for those who need single REALLY
> huge data objects would seem best.
> 
> 
> 
> 
> 
> - Don Baccus, Portland OR <dhogaza@pacifier.com>
>   Nature photos, on-line guides, Pacific Northwest
>   Rare Bird Alert Service and other goodies at
>   http://donb.photo.net.
> 


-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 

