Thread: Storing large files in multiple schemas: BLOB or BYTEA
>Yeah, a pg_dump mode that dumped everything but large objects would be
>nice.

There is the -b option for pg_dump, which controls whether large objects
are dumped or not. The problem is that with -b it dumps all large objects
regardless of which schema you requested with -n. Otherwise it works fine.

>I'm now wondering about the idea of implementing a pg_dump option that
>dumped large objects into a directory tree like
>  lobs/[loid]/[lob_md5]
>and wrote out a restore script that loaded them using `lo_import`.
>
>During dumping, temporary copies could be written to something like
>lobs/[loid]/.tmp.nnnn with the md5 being calculated on the fly as the
>byte stream is read. If the dumped file had the same md5 as the existing
>one it'd just delete the tempfile; otherwise the tempfile would be
>renamed to the calculated md5.
>
>That way incremental backup systems could manage the dumped LOB tree
>without quite the same horrible degree of duplication as is currently
>faced when using lo in the database with pg_dump.
>
>A last_modified timestamp on `pg_largeobject_metadata` would be even
>better, allowing the cost of reading and discarding rarely-changed large
>objects to be avoided.

Definitely an interesting idea for incremental backups.
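The tempfile-plus-md5 scheme quoted above could be sketched roughly as follows. This is only an illustration of the proposal, not anything pg_dump actually does; the helper name `dump_lob` and its arguments are my own invention:

```python
import hashlib
import os
import tempfile

def dump_lob(lob_dir, loid, byte_stream):
    """Write one large object into <lob_dir>/<loid>/<md5>, computing
    the md5 on the fly while the byte stream is read."""
    target_dir = os.path.join(lob_dir, str(loid))
    os.makedirs(target_dir, exist_ok=True)
    digest = hashlib.md5()
    # Temporary copy, analogous to lobs/[loid]/.tmp.nnnn in the proposal.
    fd, tmp_path = tempfile.mkstemp(prefix=".tmp.", dir=target_dir)
    with os.fdopen(fd, "wb") as tmp:
        for chunk in byte_stream:
            digest.update(chunk)
            tmp.write(chunk)
    final_path = os.path.join(target_dir, digest.hexdigest())
    if os.path.exists(final_path):
        # Unchanged since the last dump: discard the temp copy, so an
        # incremental backup tool sees the old file untouched.
        os.unlink(tmp_path)
    else:
        os.rename(tmp_path, final_path)
    return final_path
```

Dumping the same object twice leaves a single file behind, which is exactly what lets rsync-style incremental backups skip it.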
>I believe the general consensus around here is to not do that, if you can avoid it. File systems are much better equipped to handle files of that magnitude, especially when it comes to retrieving them, scanning through their contents, or really, any access pattern aside from simple storage.
>
>You're better off storing the blob on disk somewhere and storing a row that refers to its location. Either key pieces for a naming scheme or the full path.
>
>This is especially true if you mean to later access that data with PHP.
>
>--
>Shaun Thomas
Using files stored outside the database creates all sorts of problems. For starters, you lose ACID guarantees. I would prefer to keep them in the database. We did a lot of experiments with large objects and they really worked fine (we stored hundreds of LOBs ranging from a few MB up to 1 GB). Postgres does a really good job with large objects. If it were not for the pg_dump problem, I would not hesitate to use LOBs.