Thread: Question re large objects
(Hopefully I'm addressing this to the right list... This is my first time with BLOBs anywhere.) I'm working on a web page template system (PHP/PostgreSQL) which will (in part, other stuff is done) allow users to upload a gif or jpeg via a form on another site. The image will then slot into a predetermined place on their template website. I've considered the alternatives (store files in a directory or in the db), and I think storing the image files in the database is the way to go. Unless anybody can think of why I should just upload the files to a directory on the server instead of storing them as BLOBs. I can upload a binary file just fine, and retrieve it from the database just fine. The problem is that the file (using pg_loreadall) just plunks itself into my test page as text/plain. How do I get a mime type header associated with this data so my browser will recognize it as a gif or jpeg? Do I have to export the file to a web-accessible directory (owned by web server user, I know) and then use an <img> tag? Or is there a way to get my browser to recognize that the BLOB is in fact an image when PHP pulls it out of the PostgreSQL database? Be gentle if I've asked a stupid question... :-) -------------------------------------------------------- Christopher Wood, Community Access Technical Coordinator Western Valley Development Authority Nova Scotia's Smart Community Box 251 Bridgetown, Nova Scotia B0S 1C0 tel: (902) 665-4083/ fax: (902) 665-2878 www.wvda.com www.wvcn.ns.ca --------------------------------------------------------
Point #1: I've heard rumors that BLOBs have been buggy in the past (I'm using 7.0 right now...). Point #2: I would be interested in knowing if you come up with a way to do this. Right now we have a large web site with databases of over 10GB and many image files. We are storing them on a single shared directory on one server as we can't find any way to do what you are saying... Of course, I've never asked anyone for help, either, so kudos for your bravery! :-) Justin Long _________________________________________________________________ Justin Long Network for Strategic Missions 1732 South Park Court Never retreat. Never surrender. Chesapeake, VA 23320, USA Never cut a deal with a dragon. 757-213-2055, ICQ 97269932 http://www.strategicnetwork.org Monday Morning Reality Check: reality-check-subscribe@egroups.com -----Original Message----- From: pgsql-php-owner@postgresql.org [mailto:pgsql-php-owner@postgresql.org]On Behalf Of chriswood@wvda.com Sent: Tuesday, November 28, 2000 12:26 PM To: pgsql-php@postgresql.org Subject: [PHP] Question re large objects (Hopefully I'm addressing this to the right list... This is my first time with BLOBs anywhere.) I'm working on a web page template system (PHP/PostgreSQL) which will (in part, other stuff is done) allow users to upload a gif or jpeg via a form on another site. The image will then slot into a predetermined place on their template website. I've considered the alternatives (store files in a directory or in the db), and I think storing the image files in the database is the way to go. Unless anybody can think of why I should just upload the files to a directory on the server instead of storing them as BLOBs. I can upload a binary file just fine, and retrieve it from the database just fine. The problem is that the file (using pg_loreadall) just plunks itself into my test page as text/plain. How do I get a mime type header associated with this data so my browser will recognize it as a gif or jpeg? Do I have to export the file to a web-accessible directory (owned by web server user, I know) and then use an <img> tag? Or is there a way to get my browser to recognize that the BLOB is in fact an image when PHP pulls it out of the PostgreSQL database? Be gentle if I've asked a stupid question... :-) -------------------------------------------------------- Christopher Wood, Community Access Technical Coordinator Western Valley Development Authority Nova Scotia's Smart Community Box 251 Bridgetown, Nova Scotia B0S 1C0 tel: (902) 665-4083/ fax: (902) 665-2878 www.wvda.com www.wvcn.ns.ca --------------------------------------------------------
> (Hopefully I'm addressing this to the right list... This is my first time > with BLOBs anywhere.) > > I'm working on a web page template system (PHP/PostgreSQL) > which will (in part, other stuff is done) allow users to upload a gif or > jpeg via a form on another site. The image will then slot into a > predetermined place on their template website. > > I've considered the alternatives (store files in a directory or in the > db), and I think storing the image files in the database is the way to > go. Unless anybody can think of why I should just upload the files > to a directory on the server instead of storing them as BLOBs. In 7.0.X there is a limit on the size of tuples, 32k, and some images could easily be bigger than that (though it depends on your images I guess). 7.1 will be out shortly after the first of the year and will take care of this though. Depending on the use of these images, I'd think up a naming scheme and store them as flat-files and perhaps store the image name or something in the database... You could still display a link to the images but the link would have to be to a PHP (or other) script that would grab the image from the database, set the MIME type and display it to the browser -- you couldn't have it in an HTML document without first writing it to a flat file (at least I don't know of a way you could display it in an HTML document) . > I can upload a binary file just fine, and retrieve it from the database > just fine. The problem is that the file (using pg_loreadall) just > plunks itself into my test page as text/plain. How do I get a mime > type header associated with this data so my browser will recognize > it as a gif or jpeg? Do I have to export the file to a web-accessible > directory (owned by web server user, I know) and then use an > <img> tag? Or is there a way to get my browser to recognize that > the BLOB is in fact an image when PHP pulls it out of the > PostgreSQL database? The MIME type is image/jpeg for jpegs and image/gif for gifs. Use PHP's header() function. > Be gentle if I've asked a stupid question... :-) There are no stupid questions, only stupid answers! -Mitch
The tuple limit is the reason why we created a function for writing large text files to a database. Here's the function we use: function inject($id, $text) { // this function splits the article, $text, into 4k chunks and stores it. global $db; $rs = db_exec("SELECT * FROM article WHERE textid LIKE '$id-%'"); // check to see if the text exists if ($rs) { db_exec("DELETE FROM article WHERE textid LIKE '$id-%'"); // delete old text if it does } $row=1; $string=""; while ($text != "") { if (strlen($text) <= 4096) { $string = $text; $text = ""; } else { $x = 4096; while (substr($text,$x,1) != "\n" && $x > 0) { $x = $x - 1; } if ($x==0) { $x = 4096; while (substr($text,$x,1) != " " && $x > 0) { $x = $x - 1; } } $string = substr($text,0,$x); $text = substr($text,$x+1); } $string = trim($string); if ($string!="") { $blockid = $id . "-" . substr("00".$row,-2); $len = strlen($id); $kbid = str_replace("KB","",$id); if (strpos($kbid,":")) { $kbid = substr($kbid,0,strpos($kbid,":")); } if (!$kbid) { $string = addslashes($string); } if ($len > strlen($kbid)) { $sql = "INSERT INTO article (kbid,textid,article) VALUES ('$kbid','$blockid','$string')"; db_exec($sql); } else { db_exec("INSERT INTO article (textid,article) VALUES ('$blockid','$string')"); } } $row++; } } function retrieve($id) { // this function retrieves and concatenates the files. $rs = db_exec("SELECT * from article WHERE textid LIKE '$id-%' ORDER BY textid"); // check to see if text exists if ($rs) { $row=0; $string=""; while ($row < pg_numrows($rs)) { $rec = pg_fetch_object($rs,$row); $string = $string . $rec->article; $row++; } } else { $string=""; } return($string); } It's likely, with a little bit of tweaking, that these functions could be used to store image files? Justin
Sure, you could break the image up into pieces -- you just need to be sure you put every byte right back where you got it in the whole image.. All in all I think storing images for display on web pages in any database is much more trouble than it's worth... A flat file system with a naming scheme should work rather well (I use something like this in several applications to deal with MS-Word documents and the like).. You could even store the image as a flat-file and stick the name and path in the directory, that way you could grab the link right out of the database and display it.. This approach requires you have a directory for each user so that they can have files named the same thing, that or you can put some ID in the file name to identify it's owner (you'll have directories for each user anyway I would assume). Good luck! -Mitch ----- Original Message ----- From: "Justin Long" <justinlong@strategicnetwork.org> To: "Pgsql-Php" <pgsql-php@postgresql.org> Sent: Tuesday, November 28, 2000 9:48 AM Subject: RE: [PHP] Question re large objects > The tuple limit is the reason why we created a function for writing large > text files to a database. Here's the function we use: > > function inject($id, $text) { > // this function splits the article, $text, into 4k chunks and stores it. > global $db; > $rs = db_exec("SELECT * FROM article WHERE textid LIKE '$id-%'"); // check > to see if the text exists > if ($rs) { > db_exec("DELETE FROM article WHERE textid LIKE '$id-%'"); // delete old > text if it does > } > $row=1; > $string=""; > while ($text != "") { > if (strlen($text) <= 4096) { > $string = $text; > $text = ""; > } else { > $x = 4096; > while (substr($text,$x,1) != "\n" && $x > 0) { $x = $x - 1; } > if ($x==0) { > $x = 4096; > while (substr($text,$x,1) != " " && $x > 0) { $x = $x - 1; } > } > $string = substr($text,0,$x); > $text = substr($text,$x+1); > } > $string = trim($string); > if ($string!="") { > $blockid = $id . "-" . substr("00".$row,-2); > $len = strlen($id); > $kbid = str_replace("KB","",$id); > if (strpos($kbid,":")) { $kbid = substr($kbid,0,strpos($kbid,":")); } > if (!$kbid) { $string = addslashes($string); } > if ($len > strlen($kbid)) { > $sql = "INSERT INTO article (kbid,textid,article) VALUES > ('$kbid','$blockid','$string')"; > db_exec($sql); > } else { > db_exec("INSERT INTO article (textid,article) VALUES > ('$blockid','$string')"); > } > } > $row++; > } > } > > function retrieve($id) { > // this function retrieves and concatenates the files. > $rs = db_exec("SELECT * from article WHERE textid LIKE '$id-%' ORDER BY > textid"); // check to see if text exists > if ($rs) { > $row=0; > $string=""; > while ($row < pg_numrows($rs)) { > $rec = pg_fetch_object($rs,$row); > $string = $string . $rec->article; > $row++; > } > } else { > $string=""; > } > return($string); > } > > > It's likely, with a little bit of tweaking, that these functions could be > used to store image files? > > Justin > >
chriswood@wvda.com (chriswood@wvda.com) wrote: > I've considered the alternatives (store files in a directory or in the > db), and I think storing the image files in the database is the way to > go. Unless anybody can think of why I should just upload the files > to a directory on the server instead of storing them as BLOBs. Because you will lose the images when you do a restore from backup. And you will have to restore from backup eventually, count on it. On the other hand, if your service will ever grow to more than one computer doing HTTP service (i.e. load balancing), you will have to have the items in the database or resort to unsightly things like NFS. If you expect your site to get big or be busy, you will have to come up with a way (preferably an abstraction on top of pg_lo* functions) that will preserve the BLOBs in the database so that when you do a reload, you can pump in the BLOBs as well and connect their *new* OIDs with the appropriate table. Because the OIDs will change when you reload. You do have a backup plan, right? > plunks itself into my test page as text/plain. How do I get a mime > type header associated with this data so my browser will recognize > it as a gif or jpeg? Store the content type in the database along with the file data. When you pull it out, use: header("Content-type: $mime"); ,,, (. .) +--ooO-(_)-Ooo------------ --- -- - - - - | Stephen van Egmond http://bang.dhs.org/
-------------------------------------------------------- Christopher Wood, Community Access Technical Coordinator Western Valley Development Authority Nova Scotia's Smart Community Box 251 Bridgetown, Nova Scotia B0S 1C0 tel: (902) 665-4083/ fax: (902) 665-2878 www.wvda.com www.wvcn.ns.ca --------------------------------------------------------
(Sorry about that. This text didn't go through the first time.) Thanks to all for your help. Because of 1) the speed issue and 2) the backup issue, I now think that flat-file storage with the filenames in the database is the way to go (with the directory prefix in a variable in the PHP script). Eric Cestari's way would have been neat, though. :-) -------------------------------------------------------- Christopher Wood, Community Access Technical Coordinator Western Valley Development Authority Nova Scotia's Smart Community Box 251 Bridgetown, Nova Scotia B0S 1C0 tel: (902) 665-4083/ fax: (902) 665-2878 www.wvda.com www.wvcn.ns.ca --------------------------------------------------------
> Because you will lose the images when you do a restore from backup. > And you will have to restore from backup eventually, count on it. I think one should always plan for the worst case scenerio, that's exactly thye you do backups, so you don't lose data.. Why would he lose data if he's preforming backups and restored from that backup.. Unless the backup wasn't preformed or failed in some way, he wouldn't lose any data.. > On the other hand, if your service will ever grow to more than one > computer doing HTTP service (i.e. load balancing), you will have to > have the items in the database or resort to unsightly things like NFS. Again, I always think one should make an application scaleable but having said that, I think what you're mentioning here is a cart before the horse situation. Even saying he needed to load balance I'd never use NFS, ever. Large RAID arrays and such could provide all the storage needed -- especially since we're just talking about images here. > If you expect your site to get big or be busy, you will have to come up > with a way (preferably an abstraction on top of pg_lo* functions) that > will preserve the BLOBs in the database so that when you do a reload, > you can pump in the BLOBs as well and connect their *new* OIDs with the > appropriate table. Because the OIDs will change when you reload. > > You do have a backup plan, right? I'd suggest that you don't use OIDs as binding record IDs, make another integer field for that. There is an option to pg_dump to preserve OIDs even if you do. -Mitch
Mitch Vincent (mitch@venux.net) wrote: > > Because you will lose the images when you do a restore from backup. > > And you will have to restore from backup eventually, count on it. > > I think one should always plan for the worst case scenerio, that's exactly > thye you do backups, so you don't lose data.. Why would he lose data if he's > preforming backups and restored from that backup.. BLOBs are not dumped from pgsql. This might be because there's no valid SQL to create BLOBs, and since pgsql dumps are supposed to be SQL, it just doesn't work. > Again, I always think one should make an application scaleable but having > said that, I think what you're mentioning here is a cart before the horse > situation. Even saying he needed to load balance I'd never use NFS, ever. > Large RAID arrays and such could provide all the storage needed -- > especially since we're just talking about images here. I'm referring to multiple serving machines due to CPU or local disk capacity. > I'd suggest that you don't use OIDs as binding record IDs, make another > integer field for that. There is an option to pg_dump to preserve OIDs even > if you do. I don't think you understand large objects. When you create one, you get an OID. When you want to retrieve it, you hand it the OID. End of story. And, again, they are dumped by pg_dump.
> BLOBs are not dumped from pgsql. OK, that makes sense -- I do a file system backup though, in addition to pg_dump.. > This might be because there's no valid SQL to create BLOBs, and since > pgsql dumps are supposed to be SQL, it just doesn't work. > I don't think you understand large objects. When you create one, you > get an OID. When you want to retrieve it, you hand it the OID. > End of story. And, again, they are dumped by pg_dump. You're right, I didn't understand large objects (never had any reason to use them), I thought PG implemented BLOBs as a data type like *gulp* MySQL *gulp*. I shouldn't have opened my mouth until I educated myself on them... It seems that large objects are quite useless if you can't dump the data (since as you point out, it's not SQL) though I guess a filesystem backup would still work.. They sound like a disaster waiting to happen, glad I haven't needed them... An additional note (now that I read about Large Objects) it seems they're broken up into different tuples so I guess they're total size isn't limited by the BLCKSZ eh'? Oh well, I still like flat-file storage for Christopher's project, even more so now... Thanks for pointing out my ignorance on this subject, I learned something. -Mitch
I'm in the (slow) process of writing a tutorial for this sort of thing (ie how to do it etc). This has probably been discussed before, but I'm looking for some positives & negatives for storing files in the database (I think you can use a switch on dump which dumps EVERYTHING, but I could be wrong), and some positives & negatives for outside the database in a seperate directory. Can someone send me some pointers & experiences :) (It doesn't have to go back to the list if no-one else is interested). Thanks & Regards, ------------------------ Chris Smith http://www.squiz.net
Well, as I said before I really just don't like the idea of storing binary files or even large text files in a RDBMS (which was reinforced by my learning a little about large objects).. I have written several applications for the recruiting industry and have stored text resumes in the database (with a MS-Word doc counterpart as a flat-file) and found that works rather well. The only reason that I stored the resumes in the database was so I could search them with ease.. The tuple size being BLCKSZ hindered that a bit but I've been limping by with the BLCKSZ increased to 32k, oddly enough there is a thread on -HACKERS on why that might not be such a great idea. Regardless -- this tuple size problem will be squished with TOAST in 7.1 within the next few months. Binary files have a tendency to be much larger than ASCII text files and generally you're not going to perform any searching on them, at least not with SQL, so I'm left seeing only problems and overhead associated with storing binary files in the database... I'm sure there are some situations where storing them in the database might be the best option but I haven't run into that situation yet.. With the MS-Word resumes, I store them in a common directory (they can only have 1 in the application I'm speaking of ) and they're named with their ID, which is taken from the ID of the applicants record in the database. That's easy enough to do and it would be just a little more work to be able to store multiple resumes in the same directory for the same person as the ID is always unique (adding a -1 -2 etc to the filename would be a quick way to store multiples).. Just my opinion and as we've seen, I can be and often am --- wrong! :-) -Mitch ----- Original Message ----- From: "Chris" <csmith@squiz.net> To: "Pgsql-Php" <pgsql-php@postgresql.org> Sent: Tuesday, November 28, 2000 2:02 PM Subject: RE: [PHP] Question re large objects > I'm in the (slow) process of writing a tutorial for this sort of thing (ie > how to do it etc). > This has probably been discussed before, but I'm looking for some positives > & negatives for storing files in the database (I think you can use a switch > on dump which dumps EVERYTHING, but I could be wrong), and some positives & > negatives for outside the database in a seperate directory. > Can someone send me some pointers & experiences :) (It doesn't have to go > back to the list if no-one else is interested). > Thanks & Regards, > ------------------------ > Chris Smith > http://www.squiz.net > >
Chris (csmith@squiz.net) wrote: > I'm in the (slow) process of writing a tutorial for this sort of thing (ie > how to do it etc). > This has probably been discussed before, but I'm looking for some positives > & negatives for storing files in the database (I think you can use a switch > on dump which dumps EVERYTHING, but I could be wrong), and some positives & > negatives for outside the database in a seperate directory. > Can someone send me some pointers & experiences :) (It doesn't have to go > back to the list if no-one else is interested). Pros to using large objects: - makes your system scalable (i.e. many HTTP servers) - clean Cons: - not currently dumpable - user interface is a tiny bit tricky (possibly due to gaps in the php docs) - introduces performance bottlenecks if you have many files coming out of the RDBMS, as opposed to using flat files. pg_dump will not dump large objects under any circumstances. Read the manual. All of the above cons can be eliminated with some amount of work. You can avoid unpleasant surprises by building an abstraction on top of lowrite and loread, e.g. pg_associate('Product', 'product_id', $product_id, 'product_image', $image_filename); pg_find_association('Product', 'product_id', $product_id, 'product_image'); would be for this table: CREATE TABLE Product ( product_id (...) primary key, product_image oid ... ); You can implement associate() to store the file and association information somewhere in the filesystem, and make up a restore script that will pump this saved information back into the db.
Greetings, chriswood! At 28.11.2000, 11:00, you wrote: cwc> I'm working on a web page template system (PHP/PostgreSQL) cwc> which will (in part, other stuff is done) allow users to upload a gif or cwc> jpeg via a form on another site. The image will then slot into a cwc> predetermined place on their template website. cwc> I've considered the alternatives (store files in a directory or in the cwc> db), and I think storing the image files in the database is the way to cwc> go. Unless anybody can think of why I should just upload the files cwc> to a directory on the server instead of storing them as BLOBs. Well, there is one (huge) reason - speed. Besides, current BLOB support in pgsql is not too good: 1) You can't pg_dump BLOBs, 2) For _each_ BLOB _two_ files are created in database dir. So, my advice would be to store images in filesystem and to store paths to them in DB. -- Yours, Alexey V. Borzov, Webmaster of RDW
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I´ve found some material, but I have not implemented on a large scale yet. http://www.php.net/manual/features.file-upload.php and http://www.phpbuilder.com/ Best Regards. **************************************************** Aristeu Gil Alves Junior<arisjr@terra.com.br> IT Analyst Porto Alegre/RS - Brasil **************************************************** "Communications without intelligence is noise; Intelligence without communications is irrelevant." Gen. Alfred. M. Gray, USMC - ----- Original Message ----- From: Chris <csmith@squiz.net> To: Pgsql-Php <pgsql-php@postgresql.org> Sent: Tuesday, November 28, 2000 8:02 PM Subject: RE: [PHP] Question re large objects >I'm in the (slow) process of writing a tutorial for this sort of >thing (ie how to do it etc). >This has probably been discussed before, but I'm looking for some >positives & negatives for storing files in the database (I think >you can use a switch on dump which dumps EVERYTHING, but I could be >wrong), and some positives & negatives for outside the database in >a seperate directory. >Can someone send me some pointers & experiences :) (It doesn't have >to go back to the list if no-one else is interested). >Thanks & Regards, >------------------------ >Chris Smith >http://www.squiz.net > -----BEGIN PGP SIGNATURE----- Version: PGPfreeware 6.5.8 for non-commercial use <http://www.pgp.com> iQA/AwUBOiTmLGHwwXgCu34uEQK7ywCdGsbbMnVyn5Y78w8DHtD+gN6A9PEAmwfW BgcyyCgTZpQ82R3PATFiWGyp =6PBF -----END PGP SIGNATURE-----