Re: Large objects [BLOB] again - general howto - Mailing list pgsql-general

From Randolf Richardson, DevNet SysOp 29
Subject Re: Large objects [BLOB] again - general howto
Date
Msg-id Xns943C5778D7F72rr8xca@200.46.204.72
Whole thread Raw
In response to array iteration?  (CSN <cool_screen_name90001@yahoo.com>)
Responses Re: Large objects [BLOB] again - general howto  (Jeremiah Jahn <jeremiah@cs.earlham.edu>)
Re: Large objects [BLOB] again - general howto  (Jeremiah Jahn <jeremiah@goodinassociates.com>)
List pgsql-general
> Someone asks about performance of Large objects [LO] in PgSql [PG].  It
> was interesting for me, because I didn't work yet with them yet and I
> will have to soon. I tried search the web, doc and mailinglists, but I
> didn't found any adequate reply. I would be happy, of someone, who have
> experimence with them (or who know PG internals :) can make it clean. I
> think I am not alone, who is interested in this topic.

        You're certainly not alone.  I haven't done this yet, but will need to
in the near future with a web-based database project I'm in the midst of
planning at the moment which will involve allowing users to upload PNG, GIF
and JPeG images and having them displayed in a sort of "picture album" type
layout.

> In past time, where row data was limited by single tupple (8K-32K), LO
> was really needed, if you wanted to store more than 8K per row. Older
> implementation issues are described in documentation.

        I'm somewhat concerned about this, but not because of the large object
support (which I understand is handled by not storing the object data
directly in the row, thus circumventing this limitation altogether), rather
I think about scalability with large rows that have a lot of columns that
sum up to more than 32,768 bytes in size.

> Today, there is 1G limit per row. Is there reason to use LO instead of
> storing simple 'text' into database table ? (let's ignore 2G limit on

        That depends on the needs of your application.  If all you're storing
is text data, then the choice is yours.  If you're storing binary data,
then simple text probably won't be appropriate.

> LO, which can make LO interesting in some cases :)Documentation says it
> is obsolvete. But documentation also doesn't tell anything about
> performance, suggestion about using, etc.
>
>     By 'access' to binary file, I understand:
>
>         - reading (and then working with its contents on server side)
>         - sending (read and send as-is to client, without workaround)
>         - updating (from contents which I get from somewhere)
>
>     LO means storing file using large objects,
>     column means storing file as 'text' column (standard way).

        I don't know about this.  Hopefully someone else who has experience in
this area will jump in here and clear this matter up.

> Let's say I am having 'file' - smaller (tenhts of Kb) - let's say some
> JPG image (photo of beer :) The file is 'readed' as is and passed to
> client (browser, app, whatever). In this case, would be probably better
> to store file as normal column, because it is easier to access it. This
> is one of typical use. Every time I will write to that image, SQL parser
> need to parse all datas, allocate proper memory, .... I think, real
> differnece will not be too big, since picture having just tenths of Kb.

        If you can ensure that the image size will never exceed the width of
your column, then you'll be okay, but note that image sizes can vary widely
depending on a number of factors in addition to image dimensions, one of
which that is often overlooked is the possibility of poorly designed
compression algorithms in the applications used to generate the JPeG image
or the user selecting minimal compression because they want a better
quality image (I do this sometimes in the web development I do, but only
when it's appropriate).

        Remember, however, that storing binary data in a column intended for
text might cause some problems down the road (even if it doesn't now) since
it will contain unexpected data.  You probably should look into the text
encoding systems out there to see if there could be some potential problems
with, for example, your data being misinterpreted as Chinese text.

> But let's have the same picture, but in pretty fine resolution for
> printing. File size will be megabytes and more [let's say 10Mb].
> Ofcourse, file should be stored in DB because of data integrity. In this
> case, we will need pretty much memory while reading/updating. All the
> time. And if I will get many similar requests - and they will go
> directly to DB, not throw sime 'local cache', it can raise problem.

        If you can't trust your underlying file system to store files
reliably, then you really shouldn't store your database their either.

        If you're concerned about the transactional side of things, then your
concern is definitely warranted.

[sNip -- other questions I don't know the answers for, but would like to]
> I wish new PgSql 7.4 documentation (as far I know it was not yet
> released) contain small chapter / paragraph at LO about this. If someone
> is new to PG (like me), there is no enough information about 'using' LO.

        I'd like to see this too.

> Thank you for corrections and comments about LO/columns binary storing.
[sNip]

        Thanks for asking.  I've been meaning to ask about some of this
myself, but you've saved me the trouble.  =)

--
Randolf Richardson - rr@8x.ca
Inter-Corporate Computer & Network Services, Inc.
Vancouver, British Columbia, Canada
http://www.8x.ca/

This message originated from within a secure, reliable,
high-performance network ... a Novell NetWare network.


pgsql-general by date:

Previous
From: "Randolf Richardson, DevNet SysOp 29"
Date:
Subject: Re: Picture with Postgres and Delphi
Next
From: "Randolf Richardson, DevNet SysOp 29"
Date:
Subject: Re: Humor me: Postgresql vs. MySql (esp. licensing)