Thread: not really SQL but I need info on BLOBs
Hi, I am starting a new project where I need to store a lot of tiff images. So the question beckons... which is better (not correct)? store a blob in the database and let the db manage the data or store the image in a directory and store the path to the data in the db. What are your opinions? Is one method better than the other or are there simply different trade offs? If I store the tiff on the computer and only store the path in the database it would be easier to add more disk space to the computer without any changes to the db. Also, there would be 'virtually' unlimited storage space (only limited by the OS) However, it looks like it would be easier to create a connection to the database to access the blobs. I would love to hear not just what others have done but why they did it? Thanks for listening... BTW, I really like the zeroconf (rendezvous) support in postgresql. Ted __________________________________ Do you Yahoo!? Win a $20,000 Career Makeover at Yahoo! HotJobs http://hotjobs.sweepstakes.yahoo.com/careermakeover
I suggest to store the images on the disc and write in the db only the path. I am using the same solution. The only problem is that I have to make the backups separately(in my case). Also this helps more in accessing the images, you don't have to access for each image the database, you just have to get the file path. Think also at the dump size of the database (with blobs). (if you have 100GB of photos..the dump will be also that big) I never tied to put the database on 2 disks, I read somewhere that is possible but I never tried it. In my situation I don't need this, because the size of the DB is not quite that big. But for the photos we will use more discs (we are not using 2 yet) and there is no difference in this case how many discs you use, because in the DB you store only the path. I see it this way more simple, maybe other have other opinions. Best regards, Andy. ----- Original Message ----- From: "Theodore Petrosky" <tedpet5@yahoo.com> To: <pgsql-sql@postgresql.org> Sent: Wednesday, May 05, 2004 6:10 PM Subject: [SQL] not really SQL but I need info on BLOBs > Hi, > > I am starting a new project where I need to store a > lot of tiff images. So the question beckons... which > is better (not correct)? > store a blob in the database and let the db manage the > data or > store the image in a directory and store the path to > the data in the db. > > What are your opinions? Is one method better than the > other or are there simply different trade offs? > > If I store the tiff on the computer and only store the > path in the database it would be easier to add more > disk space to the computer without any changes to the > db. Also, there would be 'virtually' unlimited storage > space (only limited by the OS) > > However, it looks like it would be easier to create a > connection to the database to access the blobs. > > I would love to hear not just what others have done > but why they did it? > > Thanks for listening... > > BTW, I really like the zeroconf (rendezvous) support > in postgresql. > > Ted > > > > > __________________________________ > Do you Yahoo!? > Win a $20,000 Career Makeover at Yahoo! HotJobs > http://hotjobs.sweepstakes.yahoo.com/careermakeover > > ---------------------------(end of broadcast)--------------------------- > TIP 8: explain analyze is your friend >
Thanks for the reply. Are there (in your opinion) reasons why you would choose to store the images in the db? My feelings are that it comes down to access. If I store the images in the db, my only access to these images is to retrieve the image do some work on it and reupload it. If I store the image as a file and link to it, I can edit the image to my hearts content by mounting the volume and opening it in an image editor and saving. there is nothing more to do. archiving and backups are easier also. my tape system talks to my server and backs up the images. I believe it would be easier to recover from a problem. in your solution, how do you retrieve your image? do you need to mount the volume where the images live? I guess I am asking "If your user/client queries the db for an image, they may want to view it? How do you send them the data?" I am working on a project where the IT department is pushing really hard to have all the images in the db. I don't know what the agenda is. I am hopeful to come up with reasons either why this is good or not good. Thanks, Ted --- Andrei Bintintan <klodoma@ar-sd.net> wrote: > I suggest to store the images on the disc and write > in the db only the path. > I am using the same solution. The only problem is > that I have to make the > backups separately(in my case). Also this helps more > in accessing the > images, you don't have to access for each image the > database, you just have > to get the file path. > > Think also at the dump size of the database (with > blobs). (if you have 100GB > of photos..the dump will be also that big) > > I never tied to put the database on 2 disks, I read > somewhere that is > possible but I never tried it. In my situation I > don't need this, because > the size of the DB is not quite that big. But for > the photos we will use > more discs (we are not using 2 yet) and there is no > difference in this case > how many discs you use, because in the DB you store > only the path. > > I see it this way more simple, maybe other have > other opinions. > > Best regards, > Andy. > > ----- Original Message ----- > From: "Theodore Petrosky" <tedpet5@yahoo.com> > To: <pgsql-sql@postgresql.org> > Sent: Wednesday, May 05, 2004 6:10 PM > Subject: [SQL] not really SQL but I need info on > BLOBs > > > > Hi, > > > > I am starting a new project where I need to store > a > > lot of tiff images. So the question beckons... > which > > is better (not correct)? > > store a blob in the database and let the db manage > the > > data or > > store the image in a directory and store the path > to > > the data in the db. > > > > What are your opinions? Is one method better than > the > > other or are there simply different trade offs? > > > > If I store the tiff on the computer and only store > the > > path in the database it would be easier to add > more > > disk space to the computer without any changes to > the > > db. Also, there would be 'virtually' unlimited > storage > > space (only limited by the OS) > > > > However, it looks like it would be easier to > create a > > connection to the database to access the blobs. > > > > I would love to hear not just what others have > done > > but why they did it? > > > > Thanks for listening... > > > > BTW, I really like the zeroconf (rendezvous) > support > > in postgresql. > > > > Ted > > > > > > > > > > __________________________________ > > Do you Yahoo!? > > Win a $20,000 Career Makeover at Yahoo! HotJobs > > > http://hotjobs.sweepstakes.yahoo.com/careermakeover > > > > ---------------------------(end of > broadcast)--------------------------- > > TIP 8: explain analyze is your friend > > > > > > ---------------------------(end of > broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please > send an appropriate > subscribe-nomail command to > majordomo@postgresql.org so that your > message can get through to the mailing list cleanly __________________________________ Do you Yahoo!? Win a $20,000 Career Makeover at Yahoo! HotJobs http://hotjobs.sweepstakes.yahoo.com/careermakeover
On Thu, May 06, 2004 at 04:46:22AM -0700, Theodore Petrosky wrote: > Thanks for the reply. Are there (in your opinion) > reasons why you would choose to store the images in > the db? Transactional integrity. If there's a risk that people are going to be deleting, &c. these images, then you can end up with references in the database to files that don't exist, because the filesystem operations can't be made subject to the transactions of the database. A -- Andrew Sullivan | ajs@crankycanuck.ca
Now of corse it depends on the application that you are making, where you put the files. In my situation it is not so complicated, because we're not working on the images, so from this point of view they are "readonly". Out clients that are asking for photos have webaccess to a folder where we put the the photos and we just give the right link, or they receive the photos in pdf documents. I have the images on the same server as the database, so this is not a problem to access them. But also if you have to make a mount on other system I see no problem to this. Only if you will have to write the image file back, then you will have to give write access to that file. In our situation the image files are "web accessed" so there is NO way that we give write access to that folder(I mean for the clients). Now honestly the size of the DB makes me to worry at the point when you put the images in the database. You cannot make simple dumps. I have at the moment 12 GB of photos and the backup script for these takes quite some time. Now there is another risk when you put the photos on the disk... there is the possibility that you have the link in the DB but you don't have the photo, or the file is corrupted. This can happen. It is up to you to decide. I don't know if there are also some performance losses or not, but from what I can imagine to read an image from DB takes more time as to read it from a file. I know a lot of people that are storing the images on disk and not in the database. If you store the images in the DB the handling becomes more complicated. You will have always have to read the image and create a temporary file in order to work with it. In file cases you just copy it. I think it is more flexible the storage on the disk. But... search the forums, or hope that somedoby else answers also. Best regards, Andy. ----- Original Message ----- From: "Theodore Petrosky" <tedpet5@yahoo.com> To: "Andrei Bintintan" <klodoma@ar-sd.net>; <pgsql-sql@postgresql.org> Sent: Thursday, May 06, 2004 2:46 PM Subject: Re: [SQL] not really SQL but I need info on BLOBs > Thanks for the reply. Are there (in your opinion) > reasons why you would choose to store the images in > the db? My feelings are that it comes down to access. > > If I store the images in the db, my only access to > these images is to retrieve the image do some work on > it and reupload it. If I store the image as a file and > link to it, I can edit the image to my hearts content > by mounting the volume and opening it in an image > editor and saving. there is nothing more to do. > > archiving and backups are easier also. my tape system > talks to my server and backs up the images. I believe > it would be easier to recover from a problem. > > in your solution, how do you retrieve your image? do > you need to mount the volume where the images live? I > guess I am asking "If your user/client queries the db > for an image, they may want to view it? How do you > send them the data?" > > I am working on a project where the IT department is > pushing really hard to have all the images in the db. > I don't know what the agenda is. I am hopeful to come > up with reasons either why this is good or not good. > > Thanks, > > Ted > > --- Andrei Bintintan <klodoma@ar-sd.net> wrote: > > I suggest to store the images on the disc and write > > in the db only the path. > > I am using the same solution. The only problem is > > that I have to make the > > backups separately(in my case). Also this helps more > > in accessing the > > images, you don't have to access for each image the > > database, you just have > > to get the file path. > > > > Think also at the dump size of the database (with > > blobs). (if you have 100GB > > of photos..the dump will be also that big) > > > > I never tied to put the database on 2 disks, I read > > somewhere that is > > possible but I never tried it. In my situation I > > don't need this, because > > the size of the DB is not quite that big. But for > > the photos we will use > > more discs (we are not using 2 yet) and there is no > > difference in this case > > how many discs you use, because in the DB you store > > only the path. > > > > I see it this way more simple, maybe other have > > other opinions. > > > > Best regards, > > Andy. > > > > ----- Original Message ----- > > From: "Theodore Petrosky" <tedpet5@yahoo.com> > > To: <pgsql-sql@postgresql.org> > > Sent: Wednesday, May 05, 2004 6:10 PM > > Subject: [SQL] not really SQL but I need info on > > BLOBs > > > > > > > Hi, > > > > > > I am starting a new project where I need to store > > a > > > lot of tiff images. So the question beckons... > > which > > > is better (not correct)? > > > store a blob in the database and let the db manage > > the > > > data or > > > store the image in a directory and store the path > > to > > > the data in the db. > > > > > > What are your opinions? Is one method better than > > the > > > other or are there simply different trade offs? > > > > > > If I store the tiff on the computer and only store > > the > > > path in the database it would be easier to add > > more > > > disk space to the computer without any changes to > > the > > > db. Also, there would be 'virtually' unlimited > > storage > > > space (only limited by the OS) > > > > > > However, it looks like it would be easier to > > create a > > > connection to the database to access the blobs. > > > > > > I would love to hear not just what others have > > done > > > but why they did it? > > > > > > Thanks for listening... > > > > > > BTW, I really like the zeroconf (rendezvous) > > support > > > in postgresql. > > > > > > Ted > > > > > > > > > > > > > > > __________________________________ > > > Do you Yahoo!? > > > Win a $20,000 Career Makeover at Yahoo! HotJobs > > > > > http://hotjobs.sweepstakes.yahoo.com/careermakeover > > > > > > ---------------------------(end of > > broadcast)--------------------------- > > > TIP 8: explain analyze is your friend > > > > > > > > > > > ---------------------------(end of > > broadcast)--------------------------- > > TIP 3: if posting/reading through Usenet, please > > send an appropriate > > subscribe-nomail command to > > majordomo@postgresql.org so that your > > message can get through to the mailing list cleanly > > > > > __________________________________ > Do you Yahoo!? > Win a $20,000 Career Makeover at Yahoo! HotJobs > http://hotjobs.sweepstakes.yahoo.com/careermakeover >
In the same vein ... We are at present using BLOBs to store images (but not very many) and generated output (mostly HTML and XML). The data being stored doesn't require BLOB use because of size; it's mostly the binary nature of the data. Lack of satisfaction with the BLOB support (generally difficult to dump and reload the database; strange problems such as OIDs exceeding 2G causing some interface code to break; very difficult to search the BLOB for specific strings) has led us to consider whether we might be better off with a different solution. The one being considered is to convert the BLOBs to BYTEA fields. Does anyone have any experience with this approach? Is there anything in upcoming versions of Pg that would make BLOBs more useful for us?
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Theodore Petrosky wrote: | I am working on a project where the IT department is | pushing really hard to have all the images in the db. | I don't know what the agenda is. I am hopeful to come | up with reasons either why this is good or not good. We have never used BLOB support for images, however one additional feature of using BLOBs vs files on a disk would be having easier setup if multiple servers/clients need to access those images. In the case of using a filesystem you would either have to provide access to the images over a network filesystem or replication, whereas using the DB for storage does not require anything more than access to the DB.. We are using filesystem replication for situations like this (multiple webservers needing access to exactly the same images), however there are multiple other options for this situation. Of course needing to do editing on the images aswell provides additional hurdles. Bottom line seems to me to be that if this is all to be located on one server there is no good reason to not use filesystem storage for images, whereas if you need multiple servers and generally a more complex setup you should weigh the pros to the cons.. Regards - -- Denis -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.2-nr2 (Windows XP) iD8DBQFAnieIvsCA6eRGOOARAiwYAKCeaMfnq35nGoQRixKAsec/+k4kwwCdHy91 EyIqpTqWbZimUFdOjaFdpbI= =Uzfm -----END PGP SIGNATURE-----
Denis, Interesting, have you had any experience using a SAN for the images? Do you know anyone that has successfully used a SAN to store images that are accessable to multiple servers? I do not, however the last place I worked was in the beginning stages of putting in a SAN. As it was explained, the raid that would store the images would be seen as 'local' to as many servers (running different client services) as the SAN could support. Thanks for the reply. Ted --- Denis Braekhus <denis@startsiden.no> wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Theodore Petrosky wrote: > | I am working on a project where the IT department > is > | pushing really hard to have all the images in the > db. > | I don't know what the agenda is. I am hopeful to > come > | up with reasons either why this is good or not > good. > > We have never used BLOB support for images, however > one additional > feature of using BLOBs vs files on a disk would be > having easier setup > if multiple servers/clients need to access those > images. > In the case of using a filesystem you would either > have to provide > access to the images over a network filesystem or > replication, whereas > using the DB for storage does not require anything > more than access to > the DB.. > > We are using filesystem replication for situations > like this (multiple > webservers needing access to exactly the same > images), however there are > multiple other options for this situation. Of course > needing to do > editing on the images aswell provides additional > hurdles. > > Bottom line seems to me to be that if this is all to > be located on one > server there is no good reason to not use filesystem > storage for images, > whereas if you need multiple servers and generally a > more complex setup > you should weigh the pros to the cons.. > > Regards > - -- > Denis > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.2.2-nr2 (Windows XP) > > iD8DBQFAnieIvsCA6eRGOOARAiwYAKCeaMfnq35nGoQRixKAsec/+k4kwwCdHy91 > EyIqpTqWbZimUFdOjaFdpbI= > =Uzfm > -----END PGP SIGNATURE----- __________________________________ Do you Yahoo!? Win a $20,000 Career Makeover at Yahoo! HotJobs http://hotjobs.sweepstakes.yahoo.com/careermakeover
I did something similar- a web application for online employer picture directory along with dynamic graphical output of the desk locations throughout the company (using the GD package). I kept the pictures in a separate directory and used links in the database. I set up a DB trigger that took care of the respective picture if an employer left. It worked great and was fantastic from a performance perspective. The only reason I could come up with keeping images in the DB as BLOBs is if you developed a specialized function that would do actual image analysis on the pictures themselves. For example, flagging certain images after weeding out faces of terrorists in a picture of a crowd of random people. But what do I know, I don't work at the Department of Homeland Security :-) Jared