Re: bytea - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: bytea |
Date | |
Msg-id | 200009300235.WAA03083@candle.pha.pa.us |
List | pgsql-hackers |
This brings up some good issues for the 7.2 release. Will large objects
become just an API on top of TOAST, or should they remain as a separate
physical storage format?

> At 08:30 PM 3/15/00 -0500, Bruce Momjian wrote:
>
> >Yes, we should keep it. I see now it is for purely binary data, while
> >text is for null-terminated strings.
>
> donb=# create table foo (b bytea);
> CREATE
> donb=# insert into foo values('ab\0cd');
> INSERT 107497 1
> donb=# select * from foo;
>  b
> ----
>  ab
> (1 row)
>
> donb=#
>
> Thus my comment "maybe they should be made to work" :)
>
> I don't know what's actually inside attr b, but the "cd" is at least
> dropped on output.
>
> For the BLOB hack I did for our toolkit I did the equivalent of
> uuencoding the input, which costs a predictable 4/3 expansion of
> the binary data (this is a segmented type, all done outside PG
> via SQL, triggers, and AOLserver driver magic but lets us stuff
> binary data such as photos etc, and pg_dump/restore them).
>
> If TOAST weren't on the way, I'd sit down and do a proper BLOB,
> as I explained to the folks on our web toolkit team lo is
> tantalizingly close to being useful for folks like us, without
> actually being useful.
>
> BLOBs should sit atop TOAST, though, and perhaps specialized I/O
> routines for a BLOB type could be made. Those for bytea could
> be changed, too, at risk of breaking existing code? But since
> bytea really acts like text perhaps there is no real existing code
> that exists that couldn't just operate on text instead, so there
> could be freedom to change it?
>
> For real binary data, uuencoded strings are a better choice for
> a printable output form than the text+\nnn form (since a high
> proportion of bytes will be emitted in the lengthy \nnn form).
>
> But normally with BLOB one would like a way to just stuff a file
> or data in a buffer into it, etc, much like current lo. The printable
> dump of data is mostly useful for pg_dump, IMO - a binary backup would
> remove the need for such a hack, too.
>
> Standard BLOBs provide a way to stuff segments into the db...
>
> BLOBs, as done by TOAST or my current segmented table hack used in
> our toolkit, only require a single table (or a single table per
> underlying user table in the case of TOAST) so don't clutter the
> way lo does.
>
> But lo allows each binary object to be 2GB in length.
>
> So they kind of fit different needs. lo seems fine for those who
> need really huge objects, and probably not a bazillion (since each
> generates a file + index). My hack, or TOAST which will be similar
> in table usage (both being segmented types in common tables), is
> good for binary data of moderately large size not to exceed 2GB
> in aggregate.
>
> Of course, with 64-bit systems on the horizon, the 2GB aggregate
> limit will slowly begin to disappear, too. 'Til then, providing
> a "real BLOB" while retaining lo for those who need single REALLY
> huge data objects would seem best.
>
>
>
> - Don Baccus, Portland OR <dhogaza@pacifier.com>
>   Nature photos, on-line guides, Pacific Northwest
>   Rare Bird Alert Service and other goodies at
>   http://donb.photo.net.
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
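
As a rough check on the size argument quoted above (uuencoding's 4/3 expansion versus the text+\nnn form), here is a minimal Python sketch. It is not PostgreSQL code. `escape_octal` is a hypothetical helper that assumes printable ASCII is emitted as-is, a backslash is doubled, and every other byte becomes a four-character \nnn octal escape. base64 stands in for uuencoding, since both pack 3 bytes into 4 printable characters (real uuencode output adds a little per-line overhead on top).

```python
import base64


def escape_octal(data):
    """Render bytes in an assumed text+\\nnn form (illustrative only)."""
    out = []
    for b in data:
        if b == 0x5C:
            # Assumption: a literal backslash is doubled.
            out.append("\\\\")
        elif 0x20 <= b <= 0x7E:
            # Assumption: printable ASCII passes through unchanged.
            out.append(chr(b))
        else:
            # Everything else costs four characters: backslash + 3 octal digits.
            out.append("\\%03o" % b)
    return "".join(out)


def encoded_sizes(data):
    """Return (escaped length, base64 length) for the same input."""
    return len(escape_octal(data)), len(base64.b64encode(data))


# 768 bytes in which every byte value occurs equally often,
# a crude stand-in for compressed image data.
sample = bytes(range(256)) * 3
escaped_len, b64_len = encoded_sizes(sample)

print("raw bytes:   ", len(sample))
print("\\nnn escaped:", escaped_len, "(%.2fx)" % (escaped_len / len(sample)))
print("base64:      ", b64_len, "(%.2fx)" % (b64_len / len(sample)))
```

On this evenly distributed sample the escaped form comes to 2220 characters against 1024 for base64, so roughly 2.9x versus the fixed 4/3. Text-heavy input would favour the escaped form instead, which is why the comparison only matters for "real binary data".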