Thread: mysql load_file() function
Hi all,
Does anyone know if PostgreSQL has a function that works like MySQL's load_file()?
I need it for uploading big files... I run into a memory limit when I try to send a query with the file data embedded in it, and I received this response to a question about it on the php-general mailing list:
"load_file() is mysql internal function. the file is read by mysql
server, so it completely bypass php and also client libraries. you need
to have file priviledge."
Or is there another way to run big queries without hitting the memory limit?
Another recommendation I received was to just upload the files to the file system... but then I'd lose the integrity of the database (though I could add safety checks to verify that everything is as it should be), so I'd prefer to find a way to insert the data into the DB.
My server is as follows:
1. PHP 4.3.9
2. DB - PostgreSQL 7.4
3. Apache 1.3.26
Thanks in advance,
Ben-Nes Yonatan
On Fri, Feb 04, 2005 at 09:27:08AM +0200, Ben-Nes Yonatan wrote:
> Hi all,
>
> Does anyone know if PostgreSQL has a function that works like
> MySQL's load_file()?

I am not quite sure what load_file() does, but check the COPY command
and the analogous \copy in psql. As with many other PostgreSQL
commands, COPY has a man page in lower case:

    man copy

> I need it for uploading big files... I run into a memory limit when I
> try to send a query with the file data embedded in it, and I received
> this response to a question about it on the php-general mailing list:
>
> "load_file() is a MySQL internal function. The file is read by the MySQL
> server, so it completely bypasses PHP and also the client libraries.
> You need to have the FILE privilege."
>
> Or is there another way to run big queries without hitting the memory
> limit?
>
> Another recommendation I received was to just upload the files to the
> file system... but then I'd lose the integrity of the database (though
> I could add safety checks to verify that everything is as it should be),
> so I'd prefer to find a way to insert the data into the DB.
>
> My server is as follows:
> 1. PHP 4.3.9
> 2. DB - PostgreSQL 7.4

Consider getting PostgreSQL 8.0.1 (or whatever the latest is when you
get this message ;)

> 3. Apache 1.3.26
>
> Thanks in advance,
> Ben-Nes Yonatan

HTH :)

Cheers,
D
--
David Fetter   david@fetter.org   http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!
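For readers unfamiliar with COPY, here is a minimal sketch of both forms David mentions; the table items(id, name), the file path, and the tab-separated layout are placeholders, not anything from this thread:

    -- Server-side COPY: the file must be readable by the PostgreSQL server process.
    COPY items (id, name) FROM '/tmp/items.tsv';

    -- Client-side equivalent from psql: the file is read by psql itself,
    -- so no server file-system access (or special privilege) is needed.
    \copy items (id, name) from '/tmp/items.tsv'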
On Fri, Feb 04, 2005, David Fetter wrote:
> I am not quite sure what load_file() does, but check the COPY command
> and the analogous \copy in psql. As with many other PostgreSQL
> commands, COPY has a man page in lower case:
>
>     man copy

First, thanks for your answer David, but I'm afraid I still have a problem
with this solution... I'm not trying to upload a big file containing data
that is supposed to be split into plenty of rows; I want to upload one big
file (wav, ppt, whatever...) into one column of one row. Maybe I could use
COPY to upload the rows to a temp table and then build the file in the
primary table from there, so that only PostgreSQL is working in its own
environment... but that seems "too heavy". Isn't there a more elegant
solution?

> Consider getting PostgreSQL 8.0.1 (or whatever the latest is when you
> get this message ;)

Soon :)

> HTH :)
>
> Remember to vote!

What's HTH? And what should I vote for? ;)

Thanks a lot,
Yonatan
Hello,

We have a table cm_quotastates which has exactly 4624564 rows, 25 columns,
and 9 indexes. Out of these, our code retrieves 75262 rows and modifies just
one column in each row, but writing these changes back to the database takes
a significant amount of time (around 20 minutes). We tried the following
with the update:

(a) updating after removing all 9 indexes associated with the table
(b) updating the 75K rows in batches
(c) running VACUUM ANALYZE on the table before the update

but none of these help, and the update still takes the same amount of time.
Is there anything else that can be done so that the update takes less time?
Also, where can I find info about how a PostgreSQL UPDATE actually works?

Thanks,
Venkatesh
On Friday 04 Feb 2005 7:04 pm, Ben-Nes Yonatan wrote:
> First, thanks for your answer David, but I'm afraid I still have a problem
> with this solution... I'm not trying to upload a big file containing data
> that is supposed to be split into plenty of rows; I want to upload one big
> file (wav, ppt, whatever...) into one column of one row.

Maybe this will help:

http://www.postgresql.org/docs/8.0/static/datatype-binary.html

> What's HTH? And what should I vote for? ;)

HTH is "Hope this/that helps".

Shridhar
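The page above describes the bytea type. A minimal sketch of that route; the table and column names are placeholders, not anything from this thread:

    CREATE TABLE files (
        id   serial PRIMARY KEY,
        name text NOT NULL,
        data bytea NOT NULL
    );

    -- decode() turns a hex string into bytea; in a real application the
    -- client library escapes the raw file contents for you
    -- (for example pg_escape_bytea() on the PHP side).
    INSERT INTO files (name, data)
    VALUES ('tiny.bin', decode('deadbeef', 'hex'));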
On Feb 4, 2005, at 8:34 AM, Ben-Nes Yonatan wrote:
> I'm not trying to upload a big file containing data that is supposed to
> be split into plenty of rows; I want to upload one big file (wav, ppt,
> whatever...) into one column of one row... Isn't there a more elegant
> solution?

Try looking here -- again, if I understand what you are trying to do:

http://www.postgresql.org/docs/8.0/interactive/largeobjects.html

Large objects in PostgreSQL can be files, images, etc. There are several
functions for manipulating them (loading, etc.) available in PostgreSQL.

Sean
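A minimal sketch of the large-object route Sean is pointing at, assuming a hypothetical documents table that stores each object's OID; note that the server-side lo_import() reads the file on the database server and typically requires superuser rights:

    CREATE TABLE documents (
        id       serial PRIMARY KEY,
        name     text NOT NULL,
        file_oid oid
    );

    -- Server-side import: the path is a file on the database server.
    INSERT INTO documents (name, file_oid)
    VALUES ('talk.ppt', lo_import('/tmp/talk.ppt'));

    -- Client-side alternative from psql (psql reads the local file and
    -- prints the OID of the new large object, which you then store):
    -- \lo_import '/tmp/talk.ppt'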
Venkatesh Babu <venkatbabukr@yahoo.com> writes:
> We have a table cm_quotastates which has exactly 4624564 rows, 25 columns,
> and 9 indexes. Out of these, our code retrieves 75262 rows and modifies
> just one column in each row, but writing these changes back to the
> database takes a significant amount of time (around 20 minutes).

Any triggers or foreign keys on that table?  Also, what PG version is
this?  Are you doing this in a single UPDATE command, or 75262 separate
commands?

			regards, tom lane
Hi,

There aren't any triggers, but there are 75262 update statements. The
problem is that we have a datatype called "Collection"; we fetch the data
rows into it, modify the data, and call Collection.save(). This save method
generates one update statement per record present in it.

Thanks,
Venkatesh

--- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Any triggers or foreign keys on that table?  Also, what PG version is
> this?  Are you doing this in a single UPDATE command, or 75262 separate
> commands?
Hi,

There aren't any foreign keys, and we are currently using PostgreSQL
version 7.4.

--- Venkatesh Babu <venkatbabukr@yahoo.com> wrote:
> There aren't any triggers, but there are 75262 update statements. The
> problem is that we have a datatype called "Collection"; we fetch the data
> rows into it, modify the data, and call Collection.save(). This save
> method generates one update statement per record present in it.
Venkatesh Babu <venkatbabukr@yahoo.com> writes:
> There aren't any triggers, but there are 75262 update statements. The
> problem is that we have a datatype called "Collection"; we fetch the data
> rows into it, modify the data, and call Collection.save(). This save
> method generates one update statement per record present in it.

Are you running in autocommit mode? You might try disabling autocommit and
committing once at the end of the transaction. Postgres runs faster when
not committing more than necessary.

-- 
greg
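Concretely, what Greg suggests amounts to wrapping the whole batch in one transaction so there is a single commit at the end; a sketch with placeholder column and key names, since the real schema wasn't posted:

    BEGIN;
    -- one statement per modified row, all inside the same transaction
    UPDATE cm_quotastates SET quota_used = 10 WHERE quota_id = 1;
    UPDATE cm_quotastates SET quota_used = 25 WHERE quota_id = 2;
    -- ... and so on for the remaining rows ...
    COMMIT;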
Venkatesh Babu <venkatbabukr@yahoo.com> writes:
> There aren't any triggers, but there are 75262 update statements. The
> problem is that we have a datatype called "Collection"; we fetch the data
> rows into it, modify the data, and call Collection.save(). This save
> method generates one update statement per record present in it.

Well, that's going to be dog-slow in any case compared to putting the logic
on the server side, but a couple of things you could possibly do: make sure
all of this is in one transaction block (a commit per row updated is a lot
of overhead) and use a prepared statement for the UPDATE to get you out from
under the repeated parse/plan overhead. Check the UPDATE's plan, too, and
make sure it's an indexscan on the primary key rather than anything less
efficient.

			regards, tom lane
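A sketch of the prepared-statement and plan-check parts of Tom's advice, again with placeholder column names (PREPARE/EXECUTE and EXPLAIN are available in 7.4):

    -- Parse and plan the UPDATE once, then execute it with different values.
    PREPARE upd_quota (numeric, integer) AS
        UPDATE cm_quotastates SET quota_used = $1 WHERE quota_id = $2;

    BEGIN;
    EXECUTE upd_quota(10, 1);
    EXECUTE upd_quota(25, 2);
    -- ... one EXECUTE per modified row ...
    COMMIT;

    -- Check that the WHERE clause uses an index scan on the key
    -- rather than a sequential scan over all 4.6 million rows:
    EXPLAIN UPDATE cm_quotastates SET quota_used = 10 WHERE quota_id = 1;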
Oops! venkatbabukr@yahoo.com (Venkatesh Babu) was seen spray-painting on a wall:
> There aren't any triggers, but there are 75262 update statements. The
> problem is that we have a datatype called "Collection"; we fetch the data
> rows into it, modify the data, and call Collection.save(). This save
> method generates one update statement per record present in it.

Is that "Collection" in your application the entire table?

If it is, then you can get a nice win thus:

--> Delete from stbl;        --- One fast statement
--> COPY stbl from stdin;    --- Another fast statement
row 1 data
row 2 data
row 3 data
...
row 75262 data
\.

That update would be REALLY fast!

Even if it isn't, consider trying the following transaction:

BEGIN;
select * into temp table stbl_12341 from stbl limit 0;
-- Note that 12341 is the PID of your process, so that should be
-- pretty unique
copy stbl_12341 from stdin;   -- Load your 75262 rows in one fell swoop
row 1 data
row 2 data
...
row 75262 data
\.
-- Now, delete from stbl all the rows that are in the replacement table...
delete from stbl where pkey in (select pkey from stbl_12341);
insert into stbl (select * from stbl_12341);
COMMIT;

Both approaches will be WAY faster than doing the processing row by row.
-- 
let name="cbbrowne" and tld="gmail.com" in name ^ "@" ^ tld;;
http://linuxfinances.info/info/postgresql.html
"Waving away a cloud of smoke, I look up, and am blinded by a bright, white
light. It's God. No, not Richard Stallman, or Linus Torvalds, but God. In a
booming voice, He says: "THIS IS A SIGN. USE LINUX, THE FREE Unix SYSTEM FOR
THE 386." -- Matt Welsh
> Try looking here -- again, if I understand what you are trying to do:
>
> http://www.postgresql.org/docs/8.0/interactive/largeobjects.html
>
> Large objects in PostgreSQL can be files, images, etc. There are several
> functions for manipulating them (loading, etc.) available in PostgreSQL.
>
> Sean

Hi again David and Sean, and thanks for your help.

Sean, you were correct!! Hooray for you!! You don't know how much time I
spent on that problem (on other mailing lists too, besides everything else).

I don't understand why the manual doesn't say anything about the better
memory behaviour that the large objects interface offers with languages
like PHP... it only says that you can store up to 2GB, versus the 1GB that
TOAST can hold, which is already more than enough for 99.99% of projects.

Anyway, thanks a lot again to both of you,
Ben-Nes Yonatan
How are you updating these tables? Do you use UPDATE ... WHERE ...?

On Fri, 4 Feb 2005 05:57:32 -0800 (PST), Venkatesh Babu
<venkatbabukr@yahoo.com> wrote:
> We have a table cm_quotastates which has exactly 4624564 rows, 25 columns,
> and 9 indexes. Out of these, our code retrieves 75262 rows and modifies
> just one column in each row, but writing these changes back to the
> database takes a significant amount of time (around 20 minutes).
You mention you use one update statement per record. This may be too basic,
but anyway, it just happened to me... do you use a WHERE clause in your
UPDATE statement, and if so, is the column you use to filter indexed?

javier

On Sat, 5 Feb 2005 03:14:52 -0800 (PST), Venkatesh Babu
<venkatbabukr@yahoo.com> wrote:
> There aren't any foreign keys, and we are currently using PostgreSQL
> version 7.4.
Hello,

Thanks for providing the info. I tried disabling autocommit as suggested by
Mr. Greg Stark: I issued the command "set autocommit to off", but got the
following error message:

ERROR:  SET AUTOCOMMIT TO OFF is no longer supported

Also, I can't implement the suggestions of Mr. Christopher Browne, because
I'm not working with the database directly. There is an abstraction layer
built over the database, which provides an interface between application
objects and the data tables corresponding to those objects. Our application
is developed over this abstraction layer. In fact, we are using the
"Collection" datatype provided by this layer. A Collection is similar to a
Java Vector in that it can store any kind of persistable object; it also
implements the save method (which updates the tables corresponding to each
object present in the collection), hence one update statement is generated
per object in the collection. All I can do is play with the indexes on the
tables or change PostgreSQL settings. I hope the problem is clearer now.

Also, the suggestions of Mr. Tom Lane on transaction blocks, prepared
statements, and the index on the primary key have been taken care of. I
forgot to mention that even though I deleted all the other indexes, I made
sure the index on the primary key was not deleted.

To give more background: we've migrated the database from DB2 to PostgreSQL.
Things were fine in DB2. Is this migration having any effect on the poor
performance of updates (i.e. is this problem happening due to some improper
migration)?

Thanks,
Venkatesh

--- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Well, that's going to be dog-slow in any case compared to putting the
> logic on the server side, but a couple of things you could possibly do:
> make sure all of this is in one transaction block (a commit per row
> updated is a lot of overhead) and use a prepared statement for the UPDATE
> to get you out from under the repeated parse/plan overhead. Check the
> UPDATE's plan, too, and make sure it's an indexscan on the primary key
> rather than anything less efficient.
Hi,

The WHERE clause is used in the update statements, and the column present
in the WHERE clause is indexed... but the updates are still slow.

Thanks,
Venkatesh

--- guegue <guegue@gmail.com> wrote:
> You mention you use one update statement per record. This may be too
> basic, but anyway, it just happened to me... do you use a WHERE clause
> in your UPDATE statement, and if so, is the column you use to filter
> indexed?
Venkatesh Babu <venkatbabukr@yahoo.com> writes:
> Thanks for providing the info. I tried disabling autocommit as suggested
> by Mr. Greg Stark: I issued the command "set autocommit to off", but got
> the following error message:
>
> ERROR:  SET AUTOCOMMIT TO OFF is no longer supported

Autocommit is handled by the drivers now.

> Also, I can't implement the suggestions of Mr. Christopher Browne, because
> I'm not working with the database directly. There is an abstraction layer
> built over the database, which provides an interface between application
> objects and the data tables corresponding to those objects. [...]
> A Collection is similar to a Java Vector in that it can store any kind of
> persistable object; it also implements the save method (which updates the
> tables corresponding to each object present in the collection), hence one
> update statement is generated per object in the collection.

Sounds like Hibernate -- is that what you're using? Make sure you use your
mapping library's transaction mechanism to execute the save() inside a
transaction and you may get some speedup.

-Doug