Thread: using Postgres to store many small files
I am currently working on a Java web application in which we are making use of the JDBC driver for Postgres 7.4.1. Part of our application allows the administrators to manage a large number of small images, most of them not exceeding 5KB. There is about a gigabyte of these small files. We're currently storing the files on disk and the other information about the file in the database (historical reasons that I won't complain about here).

I recently discovered the Hibernate project and was pleasantly surprised how simple it was to store an image in Postgres as a bytea using Hibernate's BLOB support. I'm wondering if Postgres would have any problem handling all of our files if we were to put them into Postgres as bytea data. And how well would Postgres scale as the number of files increased?

Our Java application and Postgres are currently running on the same machine, a dual Xeon 2.6GHz with 1GB of RAM. We are currently not working this machine very hard at all.

Thanks,
-M@
On Thursday 04 March 2004 01:03, Matthew Hixson wrote:
[snip]
> I recently discovered the Hibernate project and was pleasantly
> surprised how simple it was to store an image in Postgres as a bytea
> using Hibernate's BLOB support. I'm wondering if Postgres would have
> any problem handling all of our files if we were to put them into
> Postgres as bytea data. And how well would Postgres scale as the
> number of files increased?

PG itself cares nothing whether the data is text or bytea - it won't be able to compress the data much presumably (if they are GIF/JPEG). The only issue I can think of is that you will have to access these images through PG rather than the filesystem - worth checking there aren't any little utilities relying on that.

> Our Java application and Postgres are currently running on the same
> machine, a dual Xeon 2.6GHz with 1GB of RAM. We are currently not
> working this machine very hard at all.

More RAM might be an idea - it's not expensive. Also - consider whether this will have an impact on your backup plans.

--
Richard Huxton
Archonet Ltd
I am assuming these are mugshots displayed on web pages? Put them in Postgres by all means, I would, but push them out to the file system as well so the webserver can return them directly, saving database access CPU cycles.

Intercept 404s with a JSP that checks whether an image was being asked for. If it was, and the JSP can find it in the DB, push it out to the client and write it to disk for the next time. Or run a nightly script that checks for missing files.

Bas.
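
A minimal sketch of the disk-first, database-fallback idea described above, in Java since that is what the application uses. Everything named here is hypothetical and not from the thread: the ImageCache class, the ImageSource interface and the fetch method are made up for illustration, and ImageSource stands in for whatever JDBC or Hibernate query would actually read the bytea column. It assumes Java 8 or later.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Optional;

/** Hypothetical helper: serve an image from disk, falling back to the database. */
class ImageCache {

    /** Hypothetical lookup; a real one would SELECT the bytea column. */
    interface ImageSource {
        Optional<byte[]> load(String name) throws IOException;
    }

    private final Path cacheDir;
    private final ImageSource source;

    ImageCache(Path cacheDir, ImageSource source) {
        this.cacheDir = cacheDir.toAbsolutePath().normalize();
        this.source = source;
    }

    /** Returns the image bytes, or empty if neither disk nor database has them. */
    Optional<byte[]> fetch(String name) throws IOException {
        Path target = cacheDir.resolve(name).normalize();
        // Reject names such as "../x.gif" that would escape the cache directory.
        if (!target.startsWith(cacheDir) || target.equals(cacheDir)) {
            return Optional.empty();
        }
        // Fast path: the file is already on disk, so the database is not touched.
        if (Files.isRegularFile(target)) {
            return Optional.of(Files.readAllBytes(target));
        }
        // Slow path (the "404" case): ask the database, then write the file
        // out so the web server can return it directly next time.
        Optional<byte[]> fromDb = source.load(name);
        if (fromDb.isPresent()) {
            Files.createDirectories(target.getParent());
            // Write to a temp file and move it into place, so a concurrent
            // reader does not see a half-written image.
            Path tmp = Files.createTempFile(target.getParent(), "img", ".tmp");
            Files.write(tmp, fromDb.get());
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return fromDb;
    }
}
```

In a servlet or JSP error handler, fetch would be called with the requested file name; an empty result means a genuine 404. Deleting a file from the cache directory is then harmless, since the next request restores it from the database.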