Re: What is the best way to storage music files in Postgresql - Mailing list pgsql-performance
From | Peter Koczan |
---|---|
Subject | Re: What is the best way to storage music files in Postgresql |
Date | |
Msg-id | 4544e0330803171101g680ffa5bj8ac32908a9f3dfb5@mail.gmail.com Whole thread Raw |
In response to | Re: What is the best way to storage music files in Postgresql (Craig Ringer <craig@postnewspapers.com.au>) |
Responses |
Re: What is the best way to storage music files in Postgresql
|
List | pgsql-performance |
> > I am going to embarkon building a music library using apache, > > postgresql and php. What is the best way to store the music files? > > Your options are either to use a BLOB within the database or to store > paths to normal files in the file system in the database. I suspect > using normal files will make backup and management a great deal easier > than using in-database BLOBs, so personally I'd do it that way. I discussed something like this with some co-workers recently, and here's what I had to say. Not all of these apply to the original message, but they are things to consider when marrying a database to a file storage system. Storing the files in the database as BLOBs: Pros: - The files can always be seen by the database system as long as it's up (there's no dependence on an external file system). - There is one set of locking mechanisms, meaning that the file operations can be atomic with the database operations. - There is one set of permissions to deal with. Cons: - There is almost no way to access files outside of the database. If the database goes down, you are screwed. - If you don't make good use of tablespaces and put blobs on a separate disk system, the disk could thrash going between data and blobs, affecting performance. - There are stricter limits for PostgreSQL blobs (1 GB size limits, I've read). Storing files externally, storing pathnames in the database: Pros: - You can access and manage files from outside the database and possibly using different interfaces. - There's a lot less to store directly in the database. - You can use existing file-system permissions, mechanisms, and limits. Cons: - You are dealing with two storage systems and two different locking systems which are unlikely to play nice with each other. Transactions are not guaranteed to be atomic (e.g. a database rollback will not rollback a file system operation, a commit will not guarantee that data in a file will stay). - The file system has to be seen by the database system and any remote clients that wish to use your application, meaning that a networked FS is likely to be used (depending on how many clients you have and how you like to separate services), with all the fun that comes from administering one of those. Note that this one in particular really only applies to enterprise-level installations, not smaller installations like the original poster's. - If you don't put files on a separate disk-system or networked FS, you can get poor performance from the disk thrashing between the database and the files. There are a couple main points: 1. The favorite answer in computing, "it depends", applies here. What you decide depends on your storage system, your service and installation policies, and how important fully atomic transactions are to you. 2. If you want optimal performance out of either of these basic models, you should make proper use of separate disk systems. I have no idea which one is faster (it depends, I'm sure) nor do I have much of an idea of how to benchmark this properly. Peter
pgsql-performance by date: