Thread: array iteration?
Is it possible to iterate over an array in plpgsql? Something like:

function insert_stuff (rel_ids int[])
...

foreach rel_ids as id
    insert into table (rel_id, val)
    values (id, 5);
Yep.

http://archives.postgresql.org/pgsql-general/2003-11/msg00852.php

On Sat, 2003-11-22 at 12:44, CSN wrote:
> Is it possible to iterate over an array in plpgsql?
> Something like:
>
> function insert_stuff (rel_ids int[])
> ...
>
> foreach rel_ids as id
>     insert into table (rel_id, val)
>     values (id, 5);
Hi there :)

Someone asked about the performance of large objects [LO] in PgSql [PG]. It was interesting for me, because I haven't worked with them yet and will have to soon. I tried searching the web, the docs and the mailing lists, but I didn't find any adequate reply. I would be happy if someone who has experience with them (or who knows PG internals :) could clear it up. I think I am not alone in being interested in this topic.

In the past, when row data was limited to a single tuple (8K-32K), LO was really needed if you wanted to store more than 8K per row. The older implementation issues are described in the documentation. Today there is a 1G limit per row. Is there still a reason to use LO instead of storing a simple 'text' value in a database table? (Let's ignore the 2G limit on LO, which can make LO interesting in some cases :) The documentation says it is obsolete, but it also doesn't say anything about performance, suggestions on usage, etc.

By 'access' to a binary file, I mean:

- reading (and then working with its contents on the server side)
- sending (read and sent as-is to the client, without any workaround)
- updating (from contents which I get from somewhere)

'LO' means storing the file using large objects; 'column' means storing the file in a 'text' column (the standard way).

Let's say I have a 'file' - a smaller one (tens of KB) - say a JPG image (photo of beer :). The file is read as-is and passed to the client (browser, app, whatever). In this case it would probably be better to store the file as a normal column, because it is easier to access. This is one typical use. Every time I write to that image, the SQL parser needs to parse all the data, allocate the proper memory, .... I think the real difference will not be too big, since the picture is just tens of KB.

But let's take the same picture in a resolution fine enough for printing. The file size will be megabytes or more [let's say 10MB]. Of course, the file should be stored in the DB because of data integrity. In this case we will need quite a lot of memory while reading/updating - all the time. And if I get many similar requests - and they go directly to the DB, not through some 'local cache' - it can become a problem.

Facts about using a 'text' column as a binary file [am I right ???]:

- the time the SQL parser needs to parse the statement will be significant (proportional to its size)
- the memory the SQL parser needs will be at least equal to the file size
- while fetching data, the file will be held in memory while the result is sent
- when I receive the result, PG will 'release' the SQL result memory, but the file stays on 'my' side (still eating memory)
- while updating the file, we need at least 2*file_size of memory (one copy for my generated SQL statement, a second for PgSql's internal processing - at least)
- the char/text types are not 'binary safe', so I need to convert the data (more memory/CPU)

On the other side, the LO facts [am I right ???]:

- the file data does not go through the SQL parser (no additional CPU/time requirements)
- PG uses the pg_largeobject table to store the files 'directly'
- the data is transferred in blocks - there is no need for extra memory
- because the functions are binary safe, no additional conversions are needed
- when I am updating the file, it doesn't need to be in memory - it can be copied through a buffer from a locally stored file, and PG also doesn't need to hold its whole contents in memory (I don't count buffers/tuples/...)

Please let me know where I am right and where I am wrong. If PG in the end accesses LOs through standard SQL, then everything I wrote is false and there is really no reason to use LO. Then I will miss that feature.....
So, I hope it works the way I assume :) My questions:

1. How expensive is the parser when analyzing a multi-megabyte statement? I mean CPU/memory compared with LO?
2. How does PG work with LO internally - through SQL statements or directly?
3. When I access an LO, is it really accessed through 'buffers', or does PgSql somehow prepare the file 'as-is' and then access it?

If things work as I assume (I don't know where to verify this right now, except the sources and the mailing list, so I picked the faster way :), my conclusions:

a. For smaller files the difference in CPU/memory usage is small.
   LO: not really needed in this case.
   column: wins, because the access is consistent and portable.

b. Files whose contents are needed 'as is' (full size) and are processed on the server side: also not a big difference.
   LO: will probably save some additional CPU, but only while updating contents.
   column: wins; the CPU difference matters only while updating.

c. Larger files, not processed on the server side (typically just sent to the client).
   LO: wins; saves a LOT of memory while reading, and also memory and CPU time while updating.
   column: loses; it eats redundant CPU and memory, and the disadvantages of LO are smaller than the resources saved.

d. Larger files, processed on the server via the local filesystem - like [c].

I wish the new PgSql 7.4 documentation (as far as I know it has not been released yet) contained a small chapter/paragraph about this under LO. If someone is new to PG (like me), there is not enough information about 'using' LO.

Thank you for corrections and comments about LO vs. column binary storage. I hope this email belongs on the general list. If not, I am sorry - let me know and I will repost.

Best regards,
Lada 'Ray' Lostak
Unreal64 Develop group
http://www.orcave.com
http://www.unreal64.net

--------------------------------------------------------------------------
In the 1960s you needed the power of two C64s to get a rocket to the moon.
Now you need a machine which is a vast number of times more powerful just
to run the most popular GUI.
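For concreteness, the two approaches being compared look roughly like this
in SQL. This is only a sketch: the table, column and file names are
invented, and lo_import() as used here runs on the server, reads the
server's filesystem, and needs superuser rights (client interfaces normally
use lo_creat()/lo_write(), or psql's \lo_import).

    -- "column" approach: the image bytes live in an ordinary column
    -- (bytea rather than text, so the data is binary safe)
    create table photo_col (
        id     serial primary key,
        name   text,
        image  bytea
    );

    -- "LO" approach: the row stores only an oid referring to a
    -- large object kept in pg_largeobject
    create table photo_lo (
        id     serial primary key,
        name   text,
        image  oid
    );

    -- lo_import() reads a file on the *server* and returns the new oid
    insert into photo_lo (name, image)
        values ('beer.jpg', lo_import('/tmp/beer.jpg'));

    -- large objects are not removed automatically; unlink before
    -- deleting the referencing row, or the object is orphaned
    select lo_unlink(image) from photo_lo where name = 'beer.jpg';
    delete from photo_lo where name = 'beer.jpg';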
Ben <bench@silentmedia.com> writes:
> On Sat, 2003-11-22 at 12:44, CSN wrote:
>> Is it possible to iterate over an array in plpgsql?

> Yep.
> http://archives.postgresql.org/pgsql-general/2003-11/msg00852.php

The cited example is pretty iffy since it assumes that the valid array
entries are all > 0. In recent PG versions you can use the array_upper
and array_lower functions instead:

    for i in array_lower(a,1) .. array_upper(a,1) loop
        -- do something with a[i]
    end loop;

			regards, tom lane
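Putting that loop together with the pseudocode from the original question,
a complete function might look something like the following. This is an
untested sketch; the target table name "rel_val" is invented, since the
original just said "insert into table":

    create or replace function insert_stuff(int[]) returns integer as '
    declare
        rel_ids alias for $1;
        n integer := 0;
    begin
        -- array_lower/array_upper give the actual bounds, so this also
        -- works for arrays whose subscripts do not start at 1
        for i in array_lower(rel_ids, 1) .. array_upper(rel_ids, 1) loop
            insert into rel_val (rel_id, val) values (rel_ids[i], 5);
            n := n + 1;
        end loop;
        return n;
    end;
    ' language plpgsql;

    select insert_stuff(array[1,2,3]);

Note that array_lower() and array_upper() return NULL for an empty array,
so a production version would want to check for that before entering the
loop.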
Hey, that's cool. When did those functions arrive?

On Sun, 23 Nov 2003, Tom Lane wrote:

> Ben <bench@silentmedia.com> writes:
> > On Sat, 2003-11-22 at 12:44, CSN wrote:
> >> Is it possible to iterate over an array in plpgsql?
>
> > Yep.
> > http://archives.postgresql.org/pgsql-general/2003-11/msg00852.php
>
> The cited example is pretty iffy since it assumes that the valid array
> entries are all > 0. In recent PG versions you can use the array_upper
> and array_lower functions instead:
>
>     for i in array_lower(a,1) .. array_upper(a,1) loop
>         -- do something with a[i]
>     end loop;
>
> 			regards, tom lane
> Someone asked about the performance of large objects [LO] in PgSql [PG].
> It was interesting for me, because I haven't worked with them yet and
> will have to soon. I tried searching the web, the docs and the mailing
> lists, but I didn't find any adequate reply. I would be happy if someone
> who has experience with them (or who knows PG internals :) could clear
> it up. I think I am not alone in being interested in this topic.

You're certainly not alone. I haven't done this yet, but will need to in
the near future with a web-based database project I'm in the midst of
planning at the moment which will involve allowing users to upload PNG, GIF
and JPeG images and having them displayed in a sort of "picture album" type
layout.

> In the past, when row data was limited to a single tuple (8K-32K), LO
> was really needed if you wanted to store more than 8K per row. The older
> implementation issues are described in the documentation.

I'm somewhat concerned about this, but not because of the large object
support (which I understand is handled by not storing the object data
directly in the row, thus circumventing this limitation altogether), rather
I think about scalability with large rows that have a lot of columns that
sum up to more than 32,768 bytes in size.

> Today there is a 1G limit per row. Is there still a reason to use LO
> instead of storing a simple 'text' value in a database table? (Let's
> ignore the 2G limit on

That depends on the needs of your application. If all you're storing is
text data, then the choice is yours. If you're storing binary data, then
simple text probably won't be appropriate.

> LO, which can make LO interesting in some cases :) The documentation
> says it is obsolete, but it also doesn't say anything about performance,
> suggestions on usage, etc.
>
> By 'access' to a binary file, I mean:
>
> - reading (and then working with its contents on the server side)
> - sending (read and sent as-is to the client, without any workaround)
> - updating (from contents which I get from somewhere)
>
> 'LO' means storing the file using large objects; 'column' means storing
> the file in a 'text' column (the standard way).

I don't know about this. Hopefully someone else who has experience in this
area will jump in here and clear this matter up.

> Let's say I have a 'file' - a smaller one (tens of KB) - say a JPG image
> (photo of beer :). The file is read as-is and passed to the client
> (browser, app, whatever). In this case it would probably be better to
> store the file as a normal column, because it is easier to access. This
> is one typical use. Every time I write to that image, the SQL parser
> needs to parse all the data, allocate the proper memory, .... I think
> the real difference will not be too big, since the picture is just tens
> of KB.

If you can ensure that the image size will never exceed the width of your
column, then you'll be okay, but note that image sizes can vary widely
depending on a number of factors in addition to image dimensions, one of
which that is often overlooked is the possibility of poorly designed
compression algorithms in the applications used to generate the JPeG image,
or the user selecting minimal compression because they want a better
quality image (I do this sometimes in the web development I do, but only
when it's appropriate).

Remember, however, that storing binary data in a column intended for text
might cause some problems down the road (even if it doesn't now) since it
will contain unexpected data. You probably should look into the text
encoding systems out there to see if there could be some potential problems
with, for example, your data being misinterpreted as Chinese text.

> But let's take the same picture in a resolution fine enough for
> printing. The file size will be megabytes or more [let's say 10MB]. Of
> course, the file should be stored in the DB because of data integrity.
> In this case we will need quite a lot of memory while reading/updating -
> all the time. And if I get many similar requests - and they go directly
> to the DB, not through some 'local cache' - it can become a problem.

If you can't trust your underlying file system to store files reliably,
then you really shouldn't store your database there either.

If you're concerned about the transactional side of things, then your
concern is definitely warranted.

[sNip -- other questions I don't know the answers for, but would like to]

> I wish the new PgSql 7.4 documentation (as far as I know it has not been
> released yet) contained a small chapter/paragraph about this under LO.
> If someone is new to PG (like me), there is not enough information about
> 'using' LO.

I'd like to see this too.

> Thank you for corrections and comments about LO vs. column binary
> storage.
[sNip]

Thanks for asking. I've been meaning to ask about some of this myself, but
you've saved me the trouble. =)

-- 
Randolf Richardson - rr@8x.ca
Inter-Corporate Computer & Network Services, Inc.
Vancouver, British Columbia, Canada
http://www.8x.ca/

This message originated from within a secure, reliable, high-performance
network ... a Novell NetWare network.
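To illustrate that last point: one way to sidestep the character-set
problem entirely is to use a bytea column instead of text, with explicit
encoding on the way in and out. A minimal sketch - the table name and the
handful of JPEG header bytes are invented for illustration:

    create table pics (name text, image bytea);

    -- on the way in, supply the bytes in an explicit encoding
    insert into pics (name, image)
        values ('beer.jpg', decode('ffd8ffe000104a464946', 'hex'));

    -- on the way out, ask for a well-defined text representation
    select encode(image, 'base64') from pics where name = 'beer.jpg';

In practice client libraries also offer binary-safe escaping for bytea
parameters, but encode()/decode() make the round trip explicit and keep
the data from ever being treated as character text.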
Ben wrote:
> Hey, that's cool. When did those functions arrive?

I think 7.4. We had a lot of nice array stuff added.

> On Sun, 23 Nov 2003, Tom Lane wrote:
>
> > Ben <bench@silentmedia.com> writes:
> > > On Sat, 2003-11-22 at 12:44, CSN wrote:
> > >> Is it possible to iterate over an array in plpgsql?
> >
> > > Yep.
> > > http://archives.postgresql.org/pgsql-general/2003-11/msg00852.php
> >
> > The cited example is pretty iffy since it assumes that the valid array
> > entries are all > 0. In recent PG versions you can use the array_upper
> > and array_lower functions instead:
> >
> >     for i in array_lower(a,1) .. array_upper(a,1) loop
> >         -- do something with a[i]
> >     end loop;
> >
> > 			regards, tom lane

-- 
Bruce Momjian                        |  http://candle.pha.pa.us
pgman@candle.pha.pa.us               |  (610) 359-1001
+  If your life is a hard drive,     |  13 Roberts Road
+  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Here's a quick list of my experiences with BLOBs and such.

Performance is just fine, I get about 1M hits a month and haven't had any
problems. Use a BLOB if you don't need to search through the data. The
main reason being that bytea and text types are parsed. To explain, your
entire SQL statement has to be parsed to make sure it's valid, and this
includes your 100 meg JPEG. Slow, REAL slow.

I have found that it is best to have a separate connection for BLOBs and
one for everything else. Mind you, this is with Java, but the autocommit
settings on the connection don't appear to be thread safe, so in high
traffic you can accidentally cut off a transfer, or stop one before it
ever starts.

-jj-

On Sun, 2003-11-23 at 10:48, Randolf Richardson, DevNet SysOp 29 wrote:
[snip -- full quote of the previous message]

-- 
"You can't make a program without broken egos."
 -- Jeremiah Jahn <jeremiah@cs.earlham.edu>
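To show what "not going through the parser" buys you on the access side:
large objects are read and written through a file-like interface in
fixed-size chunks, so neither the statement text nor server memory has to
hold the whole file at once. A rough, untested plpgsql sketch using the
built-in lo_open()/loread()/lo_close() functions - the function name and
the 8 KB chunk size are arbitrary, and 262144 is the INV_READ open flag:

    create or replace function lo_byte_count(oid) returns integer as '
    declare
        loid  alias for $1;
        fd    integer;
        chunk bytea;
        total integer := 0;
    begin
        -- large objects must be opened, read and closed inside
        -- a single transaction
        fd := lo_open(loid, 262144);    -- 262144 = INV_READ
        loop
            chunk := loread(fd, 8192);  -- read 8 KB at a time
            exit when octet_length(chunk) = 0;
            total := total + octet_length(chunk);
        end loop;
        perform lo_close(fd);
        return total;
    end;
    ' language plpgsql;

Client-side interfaces (libpq's lo_read()/lo_write(), the JDBC LargeObject
API, etc.) expose the same open/read/write/close pattern, which is what
lets you stream a 10MB file through a small buffer instead of building a
10MB SQL statement.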
[sNip]
> Here's a quick list of my experiences with BLOBs and such.
>
> Performance is just fine, I get about 1M hits a month and haven't had
> any problems. Use a BLOB if you don't need to search through the data.
> The main reason being that bytea and text types are parsed. To explain,
> your entire SQL statement has to be parsed to make sure it's valid, and
> this includes your 100 meg JPEG. Slow, REAL slow.

It appears that BLOBs are very well justified then.

> I have found that it is best to have a separate connection for BLOBs
> and one for everything else. Mind you, this is with Java, but the
> autocommit settings on the connection don't appear to be thread safe, so
> in high traffic you can accidentally cut off a transfer, or stop one
> before it ever starts.

What is your reason for using a separate connection?

-- 
Randolf Richardson - rr@8x.ca
Vancouver, British Columbia, Canada

Please do not eMail me directly when responding to my postings in the
newsgroups.
On 25/11/2003 21:55 Jeremiah Jahn wrote:
> [snip]
> I have found that it is best to have a separate connection for BLOBs
> and one for everything else. Mind you, this is with Java, but the
> autocommit settings on the connection don't appear to be thread safe, so
> in high traffic you can accidentally cut off a transfer, or stop one
> before it ever starts.

How could a connection ever have the kind of "thread-safe" behavior you
seem to believe it should have?

-- 
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for the Smaller Business |
| Computer Consultants         | http://www.thomas-micro-systems-ltd.co.uk   |
+------------------------------+---------------------------------------------+
Thanks! This is exactly what I wanted to know when I first asked the
question, and it is the only response that seems to make sense. Does
anyone else have experience with this?

rg

<<<<<<<<<<<<<<<<<<<<<
Here's a quick list of my experiences with BLOBs and such.

Performance is just fine, I get about 1M hits a month and haven't had any
problems. Use a BLOB if you don't need to search through the data. The
main reason being that bytea and text types are parsed. To explain, your
entire SQL statement has to be parsed to make sure it's valid, and this
includes your 100 meg JPEG. Slow, REAL slow.

I have found that it is best to have a separate connection for BLOBs and
one for everything else. Mind you, this is with Java, but the autocommit
settings on the connection don't appear to be thread safe, so in high
traffic you can accidentally cut off a transfer, or stop one before it
ever starts.

-jj-
>>>>>>>>>>>>>>>>>>>>>