Thread: Images in Database

Re: Images in Database

From

Aarni Ruuhimäki

Date:

26 March 2004, 19:02:26

Hi,

Possible, and this is called BLOB data type. Binary large object.

Some people use it with the possible cost in db size, the images as data in
the db.

Some prefer to store images as paths ie. text instead.

BR,

Aarni

On Friday 19 March 2004 14:06, you wrote:
> Hi all,
>
> Is it possible to have an image object in a database, like a gif, bmp,
> jpeg, no matter which format?
>
> How? Couldn't find its datatype on the docs.
>
> Cheers,

--
-------------------------------------------------
Aarni Ruuhimäki | Megative Tmi | KYMI.com
Pääsintie 26 | 45100 Kouvola | FINLAND
www.kymi.com | cfm.kymi.com
aarni.ruuhimaki@kymi.com | info@kymi.com
+358-5-3755 035 | +358-50-4910 037
-------------------------------------------------
This is a bugfree broadcast to you from a linux system.

Re: Images in Database

From

"M. Bastin"

Date:

27 March 2004, 08:52:20

You have 2 possibilities, the bytea data type or the large object
mechanism.  The bytea is generally the best because it is the most
flexible.  For a discussion comparing the two you can read the
relevant part of the pgSQL4RB manual which you can get here:
<http://aliacta.com/download>

Marc

>Possible, and this is called BLOB data type. Binary large object.

>  > Is it possible to have an image object in a database, like a gif, bmp,
>  > jpeg, no matter which format?

Re: Images in Database

From

"M. Bastin"

Date:

27 March 2004, 15:11:56

At 12:34 PM -0600 3/27/04, Reshat Sabiq wrote:
>I think unless the DB provides some image-searching capabilities,
>it's better to store them as paths to save the space. A lot of
>storage will be wasted otherwise. Isn't that so?

PostgreSQL has no limit on storage except for the hard disk's limit.
You shouldn't worry about that.

The trouble with paths is that you must be able to resolve them from
any client that connects to your database.   It's also harder to move
the entire database afterwards if you must, or to otherwise
reorganize your directories.

Having everything in your database is much much cleaner and will save
you a lot of headache when you modify your solution in a next
programming round.

Cheers,

Marc

Re: Images in Database

From

"Rod K"

Date:

27 March 2004, 16:03:59

I couldn't agree more.  I used to subscribe to the notion that there wasn't
a benefit to storing images in the DB.  After some heartache, I've had to
eat my words.

We have a solution where we receive hourly updates from an external source.
The updates include CSV files that are parsed into the DB and tar'd jpgs.
The original procedure called for the images to be stored in the filesystem,
and that worked fine for awhile.  Unfortunately, it didn't scale very well.
Now, we have multiple clients using the same data/images on their websites.
For now, all sites are served off the same server so a symbolic link to the
directory where the images exist for each site would work, but we'll most
probably not have all the sites on one web server as we grow.  Now, we're
talking about over 4GB of images and growing.  Maintaining multiple copies
would be a nightmare.  Moving the images to the RDBMS was the only scalable
solution.

Furthermore, having the images in the DB means they get backed up with the
DB.  Since the web site pages/scripts/etc are very static, we can do with
less frequent backups of that system now that the images are gone from
there.

> -----Original Message-----
> From: pgsql-novice-owner@postgresql.org
> [mailto:pgsql-novice-owner@postgresql.org]On Behalf Of M. Bastin
> Sent: Saturday, March 27, 2004 2:12 PM
> To: Reshat Sabiq
> Cc: pgsql-novice@postgresql.org
> Subject: Re: [NOVICE] Images in Database
>
>
> At 12:34 PM -0600 3/27/04, Reshat Sabiq wrote:
> >I think unless the DB provides some image-searching capabilities,
> >it's better to store them as paths to save the space. A lot of
> >storage will be wasted otherwise. Isn't that so?
>
> PostgreSQL has no limit on storage except for the hard disk's limit.
> You shouldn't worry about that.
>
> The trouble with paths is that you must be able to resolve them from
> any client that connects to your database.   It's also harder to move
> the entire database afterwards if you must, or to otherwise
> reorganize your directories.
>
> Having everything in your database is much much cleaner and will save
> you a lot of headache when you modify your solution in a next
> programming round.
>
> Cheers,
>
> Marc
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>
>

Re: Images in Database

From

Reshat Sabiq

Date:

27 March 2004, 19:33:16

Valid point. Maintenance might be easier (although i don't think you can dump and move images easily to another DB; you'd probably have to do some kind of direct connection to another DB to move them). When in a filesystem, you could bzip everything, and move them easily. There could also be an URL-prefix field for each client, followed by a URL suffix field for each image. Thus maintenance most of the time would be as easy as changing the prefix.
However, no matter how small an image is, it takes the same amount of BLOB space, doesn't it. This, IMHO, means a lot of wasted storage. Not sure if that also affects performance to some small degree. Storage is cheap, but still it costs money.

So, still it looks to me storing multi-media w/o additional benefits isn't quite worthwhile. But if there was something like find image LIKE another image, then i'd change my opinion in a sec. :)
But then again, i'm not against it. I just think it doesn't buy much, and wastes storage space.

-- 
Sincerely,
Reshat.
---
If you see my certificate with this message, you should be able to send me encrypted e-mail. 
Please consult your e-mail client for details if you would like to do that.

Rod K wrote:

I couldn't agree more.  I used to subscribe to the notion that there wasn't
a benefit to storing images in the DB.  After some heartache, I've had to
eat my words.

We have a solution where we receive hourly updates from an external source.
The updates include CSV files that are parsed into the DB and tar'd jpgs.
The original procedure called for the images to be stored in the filesystem,
and that worked fine for awhile.  Unfortunately, it didn't scale very well.
Now, we have multiple clients using the same data/images on their websites.
For now, all sites are served off the same server so a symbolic link to the
directory where the images exist for each site would work, but we'll most
probably not have all the sites on one web server as we grow.  Now, we're
talking about over 4GB of images and growing.  Maintaining multiple copies
would be a nightmare.  Moving the images to the RDBMS was the only scalable
solution.

Furthermore, having the images in the DB means they get backed up with the
DB.  Since the web site pages/scripts/etc are very static, we can do with
less frequent backups of that system now that the images are gone from
there.

-----Original Message-----
From: pgsql-novice-owner@postgresql.org
[mailto:pgsql-novice-owner@postgresql.org]On Behalf Of M. Bastin
Sent: Saturday, March 27, 2004 2:12 PM
To: Reshat Sabiq
Cc: pgsql-novice@postgresql.org
Subject: Re: [NOVICE] Images in Database


At 12:34 PM -0600 3/27/04, Reshat Sabiq wrote:

I think unless the DB provides some image-searching capabilities,
it's better to store them as paths to save the space. A lot of
storage will be wasted otherwise. Isn't that so?

PostgreSQL has no limit on storage except for the hard disk's limit.
You shouldn't worry about that.

The trouble with paths is that you must be able to resolve them from
any client that connects to your database.   It's also harder to move
the entire database afterwards if you must, or to otherwise
reorganize your directories.

Having everything in your database is much much cleaner and will save
you a lot of headache when you modify your solution in a next
programming round.

Cheers,

Marc

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

Attachment

smime.p7s

Re: Images in Database

From

"Rod K"

Date:

27 March 2004, 21:03:30

Yes, they can be dumped just fine.

I also don't understand what information you have that leads you to make the claim that space is wasted. I guess it is due to your assumption that a BLOB is used. A bytea type takes 4 extra bytes than the actual size of the data. Furthermore, the field becomes part of a bigger file and therefore will have less space loss due to differences between block size and file size. Since that difference is a maxium per file, the fewer filesystem files you have, the less space is lost.

As far as image searching, there are utilities that compare image files already. If you need this for your app, I'm sure the procedure could be duplicated in the RDBMS.

-----Original Message-----
From: pgsql-novice-owner@postgresql.org [mailto:pgsql-novice-owner@postgresql.org]On Behalf Of Reshat Sabiq
Sent: Saturday, March 27, 2004 6:33 PM
To: Rod K
Cc: M. Bastin; pgsql-novice@postgresql.org
Subject: Re: [NOVICE] Images in Database

Valid point. Maintenance might be easier (although i don't think you can dump and move images easily to another DB; you'd probably have to do some kind of direct connection to another DB to move them). When in a filesystem, you could bzip everything, and move them easily. There could also be an URL-prefix field for each client, followed by a URL suffix field for each image. Thus maintenance most of the time would be as easy as changing the prefix.
However, no matter how small an image is, it takes the same amount of BLOB space, doesn't it. This, IMHO, means a lot of wasted storage. Not sure if that also affects performance to some small degree. Storage is cheap, but still it costs money.

So, still it looks to me storing multi-media w/o additional benefits isn't quite worthwhile. But if there was something like find image LIKE another image, then i'd change my opinion in a sec. :)
But then again, i'm not against it. I just think it doesn't buy much, and wastes storage space.
-- 
Sincerely,
Reshat.

Re: Images in Database

From

Tom Lane

Date:

27 March 2004, 21:16:32

"Rod K" <rod@23net.net> writes:
> I also don't understand what information you have that leads you to make the
> claim that space is wasted.  I guess it is due to your assumption that a
> BLOB is used.  A bytea type takes 4 extra bytes than the actual size of the
> data.

Even when using BLOBs, there's not very significant overhead ---
certainly much less than the overhead of putting a small image into a
file, which is probably going to occupy at least 1K on disk.  I would
guess that the separate-files approach would be the most bulky of the
three choices.

            regards, tom lane

Re: Images in Database

From

"M. Bastin"

Date:

27 March 2004, 21:21:24

At 5:33 PM -0600 3/27/04, Reshat Sabiq wrote:
>Valid point. Maintenance might be easier (although i don't think you
>can dump and move images easily to another DB;

Yes you can with COPY or pg_dump.

>  you'd probably have to do some kind of direct connection to another
>DB to move them). When in a filesystem, you could bzip everything,
>and move them easily. There could also be an URL-prefix field for
>each client, followed by a URL suffix field for each image. Thus
>maintenance most of the time would be as easy as changing the prefix.

Like you say: "most of the time" and of course you'll run into the
other situations sooner or later...

>However, no matter how small an image is, it takes the same amount
>of BLOB space, doesn't it. This, IMHO, means a lot of wasted storage.

There is no wasted storage.  You'll use up as much disk space whether
you store separate files or whether you put those bytes inside the
database.  There's no minimum size for bytea data.

>  Not sure if that also affects performance to some small degree.

Nope, your searches are done on indexes (which don't contain the image data).

Marc

Re: Images in Database

From

Reshat Sabiq

Date:

28 March 2004, 13:46:48

I guess i had wrong assumptions. Since the following points raised are indeed true (although i'm not sure about x.a's), then yes storing them in a DB will not waste storage, probably even save some:
1. Images use a bytea type, which is not fixed size and is proportional (and close) to the image size. I see that it uses 4 bytes more than the value size (apparently to store the size in them).
1.a. Even a BLOB type, is not fixed size, and is proportional to the value stored.
2. Regular DB dump utilities can be used to transfer databases containing bytea fields. I see that it uses byte escapes to accomplish that.
2.a. I assume the same is true for BLOBs, although apparently it's a little more difficult.

Then the only reasons i can see to not store images in a DB, and the following:
i. If the same images also need to be accessed from a file system, such as by a web server, as regular URLs.
ii. If there are values that exceed the type limit (1GB for bytea); this is unlikely.
iii. I'm also seeing that BLOBs have some security concerns, which probably can be worked around, but an alternative would be to rely on the file system for permissions, etc.

Of these 3, only i. appears to be a realistic concern for images. It's a lot easier to list a URL on an [X][HT]ML page. But fat clients would probably be fine either way.

Thanks for correcting me. Learning never stops. :)

-- 
Sincerely,
Reshat.
---
If you see my certificate with this message, you should be able to send me encrypted e-mail. 
Please consult your e-mail client for details if you would like to do that.

Rod K wrote:

Yes, they can be dumped just fine.

As far as image searching, there are utilities that compare image files already. If you need this for your app, I'm sure the procedure could be duplicated in the RDBMS.

-----Original Message-----
From: pgsql-novice-owner@postgresql.org [mailto:pgsql-novice-owner@postgresql.org]On Behalf Of Reshat Sabiq
Sent: Saturday, March 27, 2004 6:33 PM
To: Rod K
Cc: M. Bastin; pgsql-novice@postgresql.org
Subject: Re: [NOVICE] Images in Database

Valid point. Maintenance might be easier (although i don't think you can dump and move images easily to another DB; you'd probably have to do some kind of direct connection to another DB to move them). When in a filesystem, you could bzip everything, and move them easily. There could also be an URL-prefix field for each client, followed by a URL suffix field for each image. Thus maintenance most of the time would be as easy as changing the prefix.
However, no matter how small an image is, it takes the same amount of BLOB space, doesn't it. This, IMHO, means a lot of wasted storage. Not sure if that also affects performance to some small degree. Storage is cheap, but still it costs money.

So, still it looks to me storing multi-media w/o additional benefits isn't quite worthwhile. But if there was something like find image LIKE another image, then i'd change my opinion in a sec. :)
But then again, i'm not against it. I just think it doesn't buy much, and wastes storage space.
-- 
Sincerely,
Reshat.   

Attachment

smime.p7s

Re: Images in Database

From

"M. Bastin"

Date:

28 March 2004, 19:36:03

>Then the only reasons i can see to not store images in a DB, and the
>following:
>...
>ii. If there are values that exceed the type limit (1GB for bytea);
>this is unlikely.
>iii. I'm also seeing that BLOBs have some security concerns, which
>probably can be worked around, but an alternative would be to rely
>on the file system for permissions, etc.

ii isn't really a concern since you can store your bytea data in
several chunks over several related records.  Personally I use chunks
of 500k to minimize RAM consumption when escaping.

For the same reason I don't recommend using the large object
mechanism since it is so easily imitated with bytea in chunks spread
over several related records, which avoids the security concerns--and
has a few more advantages.

Both of these remarks are in the pgSQL4RB manual (aliacta.com) in the
discussion on binary data.  Didn't I start out on this thread by
recommending you to read that?  ;-)  (Well, the long way works too.)

Marc

Re: Images in Database

From

Oliver Fromme

Date:

29 March 2004, 04:01:49

Reshat Sabiq wrote:
 > [...]
 > Then the only reasons i can see to not store images in a DB, and the
 > following:
 > i. If the same images also need to be accessed from a file system, such
 > as by a web server, as regular URLs.
 > [...]
 > Of these 3, only i. appears to be a realistic concern for images. It's a
 > lot easier to list a URL on an [X][HT]ML page.

That's not really a problem.  You can write a trivial CGI
program in your favourite language (Python, PHP, Perl or
whatever, even /bin/sh) that fetches an image from the DB,
so you can write URLs like http://.../imgfetch.cgi?foo.jpg.

You're even a lot more flexible that way, because you can
fetch images by various criteria, not just by filename.

Best regards
   Oliver

PS:  I've trimmed the "Cc:" list a bit.

--
Oliver Fromme, secnetix GmbH & Co KG, Oettingenstr. 2, 80538 München
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.

"Unix gives you just enough rope to hang yourself --
and then a couple of more feet, just to be sure."
        -- Eric Allman

Re: Images in Database

From

Reshat Sabiq

Date:

30 March 2004, 00:05:11


Oliver Fromme wrote:

>Reshat Sabiq wrote:
> > [...]
> > Then the only reasons i can see to not store images in a DB, and the
> > following:
> > i. If the same images also need to be accessed from a file system, such
> > as by a web server, as regular URLs.
> > [...]
> > Of these 3, only i. appears to be a realistic concern for images. It's a
> > lot easier to list a URL on an [X][HT]ML page.
>
>That's not really a problem.  You can write a trivial CGI
>program in your favourite language (Python, PHP, Perl or
>whatever, even /bin/sh) that fetches an image from the DB,
>so you can write URLs like http://.../imgfetch.cgi?foo.jpg.
>
>You're even a lot more flexible that way, because you can
>fetch images by various criteria, not just by filename.
>
>Best regards
>   Oliver
>
>PS:  I've trimmed the "Cc:" list a bit.
>
>
>
Thanks. Good point. However, this would has a performance penalty. I'd
prefer not to use this approach for images accessed frequently,
especially if they are large, and even more so if the DB is not on the
same box as the web server.
But in general, you guys are right, most of the time storing images in
the DB is better. IMHO, there are however exceptions to that approach,
where referring to the file system is better.

--
Sincerely,
Reshat.
---
My email address has changed to sabiq@csociety.org. Current one, sabiq@purdue.edu, will be discontinued sometime after
April5, 2004. 
---
If you see my certificate with this message, you should be able to send me encrypted e-mail.
Please consult your e-mail client for details if you would like to do that.

Attachment

smime.p7s

Re: Images in Database

From

Oliver Fromme

Date:

30 March 2004, 10:50:43

Reshat Sabiq wrote:
 > Oliver Fromme wrote:
 > > That's not really a problem.  You can write a trivial CGI
 > > program in your favourite language (Python, PHP, Perl or
 > > whatever, even /bin/sh) that fetches an image from the DB,
 > > so you can write URLs like http://.../imgfetch.cgi?foo.jpg.
 > >
 > > You're even a lot more flexible that way, because you can
 > > fetch images by various criteria, not just by filename.
 >
 > Thanks. Good point. However, this would has a performance penalty.

No, not necessarily.  If you do it right, then it can be as
fast as filesystem access, or even faster.

You should take into account that a filesystem (like UFS)
is some kind of a "database", too.  It is accessed via
path names, which can be more or less efficient than
indices inside a "real" database (such as PostgreSQL).

Apart from that, if you require the performance, then you
can put reverse proxies in front of your web servers to
accelerate them, or let your CGI cache the images.  That's
common practice.

 > I'd
 > prefer not to use this approach for images accessed frequently,
 > especially if they are large, and even more so if the DB is not on the
 > same box as the web server.

Nope.  If they're accesses frequently, a reverse proxy will
help, and also your CGI can easily cache such images (also
take into account that a database performs some caching on
its own).

It shouldn't matter at all whether they're large or not.
The database operation which takes the most time is to find
the actual location of the object (image).  That's also
true for a filesystem.  Once the object is located, it can
be fetched at whatever speed the hardware allows.

It also doesn't make a big difference whether the DB is on
the same physical machine as the web server.  On seriously
big installations, the storage is on a separate file server
anyway (e.g. a NetApp Filer or some other RAID box).  The
network overhead is small if you use gigabit ethernet,
which is standard nowadays.

 > But in general, you guys are right, most of the time storing images in
 > the DB is better. IMHO, there are however exceptions to that approach,
 > where referring to the file system is better.

Well, everything has advantages as well as disadvantages,
and everyone has to make his own decision.  But in this
particular case I think the disadvantages (of not storing
images in the DB itself) are negligible, or can be worked
around without problems.

Just my two cents, and YMMV.  :-)

Best regards
   Oliver

--
Oliver Fromme, secnetix GmbH & Co KG, Oettingenstr. 2, 80538 München
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.

Passwords are like underwear.  You don't share them,
you don't hang them on your monitor or under your keyboard,
you don't email them, or put them on a web site,
and you must change them very often.