Thread: Filesystem vs. Postgres for images
Hello,

I am working on a web portal with ads; we have about 200 000 of them. Every ad has its own directory, named after its ID, which holds 5 subdirectories with 5 images in various sizes.

The filesystem is too slow, but I don't know whether performance will improve if I store these images in Postgres.

Second question: what kind of hardware do I need for storing them in the DB? Right now I have an Intel(R) Pentium(R) 4 CPU 1.70GHz with 512MB RAM and a 120GB HDD.

thanx for advices...

miso
Hello,

> The filesystem is too slow, but I don't know whether performance will
> improve if I store these images in Postgres.

But Postgres also stores its data on the filesystem. Maybe use a better FS such as XFS (XFS is very nice and performs well); IMHO other filesystems like reiser have some version problems ;) So storing images in Postgres as blobs is IMHO not as fast as XFS, but you may have to run some performance tests yourself.

You could also keep an index table for your images, with paths and filenames; if you haven't done that so far, it should speed up your file search. Also, have a lot of RAM, at least 1GB for such a big portal, and SCSI is a good idea. If you want to boost it further, take a dual-processor system with SCSI RAID and a lot of RAM, which also costs a lot of money :(( Another tip is to activate a proxy or httpd cache, compression, and other caching facilities in PHP, Java, etc.

Sorry for my broken English.

volker
Hi,

> imho other filesystems like reiser have some version-problems ;)

Oh please. Reiser is as unstable as Postgres is slow; in other words, both suffer from prejudice that stopped being true loooong ago. ;-)

With large directories, ext2/3 performs extremely badly (as in the original post), so this guy will be better off with anything but ext2/3. That's why I switched from ext2 to reiser ~2 years ago (without any problems since).

Mit freundlichem Gruß / With kind regards
Holger Klawitter
--
lists <at> klawitter <dot> de
On Tue, 2004-04-13 at 01:44, Michal Hlavac wrote:

> The filesystem is too slow, but I don't know whether performance will
> improve if I store these images in Postgres.

Consider breaking your directories up, i.e.:

/ads/(ID % 1000)/ID

I use that for a system with several million images; it works great. I really don't think putting them in the database will do anything positive for you. :)
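The modulo layout above can be sketched as a small path helper (a hypothetical illustration; the base directory and fan-out of 1000 are taken from the post):

```python
import os

def ad_dir(base, ad_id, fanout=1000):
    # Bucket each ad under base/(id % fanout)/id so the top level
    # never holds more than `fanout` subdirectories; each bucket then
    # holds roughly total/fanout ads (about 200 for 200,000 ads).
    return os.path.join(base, str(ad_id % fanout), str(ad_id))

# e.g. ad 123456 lands in <base>/456/123456
```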
There has got to be some sort of standard way to do this. We have the same problem where I work: terabytes of images, but the question is still sort of around "BLOBs or files?" Our final decision was to use the file system. We found that you didn't really gain anything by storing the images in the DB, other than having one place to get the data from. The file system approach is much easier to back up, because each image can be archived separately as well as browsed by 3rd-party tools.

-jj-

--
Jeremiah Jahn <jeremiah@cs.earlham.edu>
Hello,

No standard way that I know of :). We tend to use BLOBs because we can have associated tables with metadata about the images that can be searched, etc. Of course you could do that with the filesystem as well, but we find blobs easier. I will say we tend to use BLOBs or bytea.

J

Jeremiah Jahn wrote:

> Our final decision was to use the file system. We found that you didn't
> really gain anything by storing the images in the DB, other than having
> one place to get the data from.

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com PostgreSQL Replicator -- production quality replication for PostgreSQL
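The blob-plus-metadata pattern Joshua describes above can be sketched like this (SQLite is used only to keep the example self-contained and runnable; in Postgres the `data` column would be bytea or a large object, and all table and column names here are made up):

```python
import sqlite3

def demo():
    db = sqlite3.connect(":memory:")
    # One row per stored image: searchable metadata columns sit next
    # to the blob, so lookups can filter without touching the bytes.
    db.execute("""CREATE TABLE images (
        id       INTEGER PRIMARY KEY,
        ad_id    INTEGER NOT NULL,
        size_tag TEXT NOT NULL,   -- e.g. 'thumb', 'full'
        data     BLOB NOT NULL)""")
    db.execute("INSERT INTO images (ad_id, size_tag, data) VALUES (?, ?, ?)",
               (42, "thumb", b"\x89PNG..."))
    # Metadata query first; the blob is fetched only for the hit.
    row = db.execute("SELECT id FROM images WHERE ad_id=? AND size_tag=?",
                     (42, "thumb")).fetchone()
    return db.execute("SELECT data FROM images WHERE id=?",
                      (row[0],)).fetchone()[0]
```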
I tried the bytea types, but the parsing done by the system on insert etc. was so bad that it made it unusable for me. Our solution is to keep all of the metadata in the DB, plus an ID, and then a web service that gets the image from the FS.

On Tue, 2004-04-13 at 09:05, Joshua D. Drake wrote:

> We tend to use BLOBs because we can have associated tables with
> metadata about the images that can be searched etc.

--
Jeremiah Jahn <jeremiah@cs.earlham.edu>
On Apr 13, 2004, at 9:40 AM, Jeremiah Jahn wrote:

> Our final decision was to use the file system. We found that you didn't
> really gain anything by storing the images in the DB, other than having
> one place to get the data from.

This is a pretty classic performance-modeling problem. While it wasn't images, I worked on a system that had several million small files (5-100K) that needed to be stored. When storing them in the FS, the bottleneck was a couple of things (and the bottleneck is similar in PostgreSQL):

1. Directory name lookups do not scale well, so keep the number of files in a directory to a manageable number (100-500).

2. Retrieval time is limited not by disk bandwidth but by I/O seek performance. More spindles = more concurrent I/O in flight. Also, this is where SCSI takes a massive lead with tagged command queuing.

In our case, we ended up using a three-tier directory structure so that we could manage the number of files per directory, and then, because load was relatively even across the top 20 "directories", we split them onto 5 spindle pairs (i.e. RAID-1). This is a place where RAID-5 is your enemy; RAID-1, when implemented with read balancing, is a substantial performance increase.

Hope this helps. Some of these things apply to PostgreSQL, except that until there's better manageability of TABLESPACEs, and the ability to split tables across multiple spaces, it's going to be hard to hit those numbers. This is a place where the "big databases" are better. But then, that's the top 5% of installs. Tradeoffs.

Chris
--
| Christopher Petrilli
| petrilli (at) amber.org
Michal Hlavac wrote:

> The filesystem is too slow, but I don't know whether performance will
> improve if I store these images in Postgres.

Hello Miso.

I used to have the same problem with web hosting and storing/sorting/retrieving images for a banner exchange and for user sites/pages. My tests were done on FreeBSD 4.5 with PostgreSQL 7.3.2 (or something) and with MySQL (I do not remember its version), and we found that storing only the filenames in the database and fetching the actual binary data from the filesystem gave a real performance gain.

--
Best regards,
Anton Nikiforov
> In cases of large directories ext2/3 perform extremely bad (as in the
> original post) So this guy will be better off with anything but ext2/3.
> That's why I switched from ext2 to reiser ~2 years ago (without any
> problems since).

I use reiserfs too. Large directories (hundreds of thousands of files) do not slow down file retrieval, and I have never had any problems with stability.

/Mattias
On Tue, 13 Apr 2004, Christopher Petrilli wrote:

> In our case, we ended up using a three-tier directory structure, so
> that we could manage the number of files per directory, and then
> because load was relatively even across the top 20 "directories", we
> split them onto 5 spindle-pairs (i.e. RAID-1). This is a place where
> RAID-5 is your enemy. RAID-1, when implemented with read-balancing, is
> a substantial performance increase.

Please explain why RAID 5 is so bad here. I would think that on a not very heavily updated fs, RAID-5 would be the functional equivalent of a RAID 0 array with one fewer disk, wouldn't it? Or is RAID 0 also a bad idea (other than its unreliability) because it puts any given piece of data on only one spindle, unlike RAID-1, which puts it on many?

In that case, >2 drive RAID 1 setups might be a huge win. The Linux kernel certainly supports them, and I think some RAID cards do too.

Just wondering.
> The filesystem is too slow, but I don't know whether performance will
> improve if I store these images in Postgres.

Certainly the problem you are experiencing is because you have 200,000 directories, and directory lookups are not scaling well. I had a look at this a few weeks ago for an email storage application.

Using a filesystem with better directory lookup performance (xfs? reiserfs? jfs?) is one obvious solution, as is storing the data in the database. If you want to use files on an ext2/3 filesystem, you need to break the directories up into a hierarchy.

I did some web research trying to find numbers for how many entries you can get away with in an ext2/3 directory before the lookup time starts to bite. I didn't find very much useful data; the best answer I got was "between 100 and 1000". Since my identifiers are decimal numbers, I had a choice of breaking them up into groups of two or three digits (i.e. 12/34/56/78 or 012/345/678). I went for groups of two and it works well; certainly this is not the limiting factor in the system as a whole. Looking back, I wonder if I should have gone for groups of three. Is the lookup time a function of the number of entries in the directory, or of the size (in bytes) of the directory? Since the directory names are short in this scheme, I get more directory entries per disk block.

One other thing to mention: have you turned off access time (atime) logging for the filesystem? (man mount)

--Phil.
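Phil's groups-of-two scheme can be sketched as a tiny helper (a hypothetical illustration; the 8-digit width and group size of 2 match his 12/34/56/78 example, and each directory level then holds at most 10**group entries):

```python
def shard_path(ident, width=8, group=2):
    # Zero-pad the decimal id, then split it into fixed-size digit
    # groups: 12345678 -> "12/34/56/78". With group=2 each level
    # holds at most 100 entries, well inside the "100 to 1000" range.
    s = str(ident).zfill(width)
    parts = [s[i:i + group] for i in range(0, len(s), group)]
    return "/".join(parts)
```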
On Apr 13, 2004, at 11:27 AM, scott.marlowe wrote:

> Please explain why RAID 5 is so bad here. I would think that on a not
> very heavily updated fs, RAID-5 would be the functional equivalent of a
> RAID 0 array with one fewer disk, wouldn't it?

The issue comes down to read and write strategies. If your files are bigger than the stripe size and begin to involve multiple drives, then the rotational latency of each drive comes into play. This is often hidden by caching during those wonderful comparison reviews, but when you're talking about near-random access to more information than could fit in the cache, you have to face the rotational issues of the drives. Since the spindles are not locked together, they drift apart in position, and you often end up with worst-case latency in the drive subsystem. Mirroring doesn't face this, especially when you can distribute the READs across all the drives.
For example, if you ran triplex RAID-1, meaning 3 copies of the data (which is often done in large environments so that you can take one copy offline for a backup while keeping 2 copies online), then you can basically handle 3 reads for the cost of 1, increasing the number of read ops you can handle. This doesn't work with RAID-0 or RAID-5.

Chris
--
| Christopher Petrilli
| petrilli (at) amber.org
scott.marlowe wrote:

> Please explain why RAID 5 is so bad here.

Hello All. I'll try to explain the RAID scheme.

First of all, head movement takes 99% of the total retrieval time when you want a small block of data (actually one FS block): moving the HDD's heads takes something like 4-20ms, while reading the block itself (actually a whole cylinder will hit the disk's cache) takes 1000 times less.

Now to the RAIDs. It is true that the only RAID level that increases write speed is RAID 0; that is why database developers recommend RAID 0+1 or RAID 10 (they are different, but that is not the topic here). So if you need write speed, you know the way. Almost all RAID levels give you a read performance gain.
Whether RAID 5 is slower than RAID 1 (or anything else) is a matter of disk subsystem planning and configuration. If your FS block size is 4K, then all disk I/O from the OS point of view is in 4K blocks, while the RAID may be configured with a stripe size of 4, 8, 16, 32, 64, or 128K. Imagine two situations:

1. The stripe size is 4K and we have 3 disks in RAID 5. The controller reads data stripe by stripe, so it can fetch 2 data blocks at a time (the third disk stores the redundancy information). The situation is much like RAID 1 with 2 disks.

2. The stripe size is 128K and we have 3 disks in RAID 5. The controller reads the whole stripe even though you asked for only 4K (and you did, because of the FS request size), so 124K hits the cache but is useless. On the other hand, if your files are comparable in size to the stripe or much larger, read performance improves dramatically (as with video files, which are laid out contiguously on disk and read block by block).

So, if you have some time, "play" with your RAID 5 and you will see the differences when you change the FS block size or the RAID's stripe size. But you will see that a single disk always writes data faster than RAID 5.

If you are talking about software RAID (supported by the kernel), it will always be slower than a hardware one (you lose at least 30% of your system bus and CPU power to the internal RAID 5 parity computation). With RAID 0/1 it is not so dramatic, but remember that the kernel's RAID 1 support is there for redundancy, not for I/O throughput, and software RAID does not decrease your system downtime.

--
Best regards,
Anton Nikiforov
Hi,

Is the file system approach really easier and faster? What if you need to protect the image data, e.g. you don't want users to just download the pictures directly from your website?

-a

Jeremiah Jahn wrote:

> Our final decision was to use the file system. We found that you didn't
> really gain anything by storing the images in the DB, other than having
> one place to get the data from.
Your code is retrieving the file from the file system; it doesn't have to be accessible from the web server at all. Our current design uses a JDBC connection to the database for the metadata (digital signature, path, name, file type, etc.) and a SOAP call to the same server (though it doesn't have to be the same one) to retrieve/store the image data.

-jj-

On Wed, 2004-04-14 at 08:15, Alex wrote:

> Is the file system approach really easier and faster? What if you need
> to protect the image data, e.g. you don't want users to just download
> the pictures directly from your website?

--
Jeremiah Jahn <jeremiah@cs.earlham.edu>
On Wed, Apr 14, 2004 at 10:15:51PM +0900, Alex wrote:

> Is the file system approach really easier and faster? What if you need
> to protect the image data, e.g. you don't want users to just download
> the pictures directly from your website?

It can be much faster, if implemented correctly, to put the large files directly on the filesystem. It makes it a little harder to cluster, but it can significantly reduce DB overhead.

There's no issue with users downloading images directly, as you normally wouldn't mount them directly into the URL namespace. Instead, the URL would point to a script that looks up the image in the database and checks permissions. If the user is allowed to load the image, the script closes its connection to the database and starts shoveling bytes from the filesystem to the HTTP connection. Most decent web application platforms have some amount of support for this sort of thing built in.

That has a number of other advantages too: it can take a long time for a user to download a large file, and you really don't want the thread handling them to tie up a database connection for all that time. If you're on a platform that supports nice things like sendfile(2), you can even have the kernel do almost all the work.

Cheers,
Steve
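Steve's lookup-then-stream idea can be sketched in a framework-neutral way (a hypothetical illustration: `lookup` stands in for the database query, and the names, chunk size, and status codes are all made up; a real handler would use the web platform's own streaming or sendfile support):

```python
CHUNK = 64 * 1024  # stream in 64K pieces rather than slurping the file

def serve_image(lookup, image_id, user, out):
    # lookup(image_id, user) -> (path_on_disk, allowed) is the stand-in
    # for the metadata/permission query against the database.
    path, allowed = lookup(image_id, user)
    if not allowed:
        return 403
    # All DB work is finished before streaming begins, so the
    # connection can go back to the pool while bytes are shoveled
    # from the filesystem to the (possibly slow) client.
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                break
            out.write(chunk)
    return 200
```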
hello all,

if somebody is interested in a script using blobs on pgsql and php, leave me a mail at kalle@erdtrabant.de. This little script can upload files to the filesystem or directly into the db, release files from the db to the filesystem, and store files from the filesystem into the database. It's tested with php 4.3 and postgres 7.1.

greetings,
volker
hi all,

maybe somewhat off-topic, but here it is:

http://www.erdtrabant.de/index.php?i=500200104

A little php script to demonstrate how to store files as blobs in postgres (tested with v7.1), as a base for testing etc. Not very beautiful script style, but usable. Feel free to download it, change what you want, and use it wherever you want.

thanks for looking
volker
On Tue, 13 Apr 2004 12:56:09 +0200, Holger Klawitter wrote:

> Oh please. Reiser is as unstable as postgres is slow - in other words,
> both have to suffer prejudice which used to be true loooong ago. ;-)

True.

> In cases of large directories ext2/3 perform extremely bad (as in the
> original post) So this guy will be better off with anything but ext2/3.

"Oh please."

- umount /dev/xxx
- tune2fs -O dir_index /dev/xxx
- fsck -f -D /dev/xxx
- mount /dev/xxx

> That's why I switched from ext2 to reiser ~2 years ago (without any
> problems since).

Time marches on. :-)

Holger