Re: Database versus filesystem for storing images - Mailing list pgsql-general
From | Andrew Chernow |
---|---|
Subject | Re: Database versus filesystem for storing images |
Date | |
Msg-id | 459EB9D8.1000401@esilo.com Whole thread Raw |
In response to | Re: Database versus filesystem for storing images (Jorge Godoy <jgodoy@gmail.com>) |
Responses |
Re: Database versus filesystem for storing images
Re: Database versus filesystem for storing images |
List | pgsql-general |
>> Don't store your images in the database. Store them on the filesystem and >> store their path in the database I 100% agree. Use the database as a lookup into the filesystem. Don't load the database up with terabytes of non-searchable binary data? not sure how that would help you? Here is one idea: Have a 64-bit sequence that you use to generate an image_id (becomes file name). Hash that id out over a 3-level deep directory structure that allows 4096 entries per directory. Should give you 64-bit worth of files. CREATE OR REPLACE FUNCTION get_image_path(image_id BIGINT) RETURNS TEXT AS $$ DECLARE -- 40963, avoids "integer out of range" dir1_val BIGINT := 68719476736; dir1 BIGINT; dir2 BIGINT; dir3 BIGINT; path TEXT; BEGIN dir1 := (image_id / dir1_val) % 4096; dir2 := (image_id / (4096 * 4096)) % 4096; dir3 := (image_id / 4096) % 4096; RETURN '/BASE_PATH/' || dir1 || '/' || dir2 || '/' || dir3 || '/' || image_id; END; $$ LANGUAGE PLPGSQL; test=# select get_image_path(200399322222); get_image_path ------------------------------------- /BASE_PATH/2/3752/2991/200399322222 (1 row) >I mean, how do you handle integrity with data > outside the database? You don't, the file system handles integrity of the stored data. Although, one must careful to avoid db and fs orphans. Meaning, a record with no corresponding file or a file with no corresponging record. Always write()/insert an image file to the system within a transaction, including writing the image out to the fs. Make sure to unlink any paritally written image files. >>How do you plan your backup routine In regards to backup, backup the files one-by-one. Grab the lastest image file refs from the database and start backing up those images. Each successfully backed up image should be followed by inserting that file's database record into a remote db server. If anything fails, cleanup the partial image file (to avoid orphaned data) and rollout the transaction. just one idea. i'm sure there are other ways of doing it. point is, this is completely possible to do reliably. andrew Jorge Godoy wrote: > John McCawley <nospam@hardgeus.com> writes: > >> Don't store your images in the database. Store them on the filesystem and >> store their path in the database. Anyone that tells you otherwise is a stark >> raving madman :) >> >> My system is very heavily used, and our pg_dump is only a few gigs. Meanwhile >> our images/documents storage is well over a hundred gigs. I'd hate to think >> that I'd have to dump and restore 100 gigs every time I wanted to dump the >> newest data to the development database. > > How do you plan your backup routine and how do you guarantee that on a failure > all needed data is restored? I mean, how do you handle integrity with data > outside the database? >
pgsql-general by date: