Thread: Batch Operations
Hi, I'm currently working on a project that requires batch operations, e.g. batch insert/update. The database I'm using is PostgreSQL. However, I cannot find any documentation for batch operations on PostgreSQL. Does anyone know how to do this, or if it is possible? Thanks in advance.

Regards,
Rahul
On Fri, Aug 08, 2003 at 15:32:05 +0530, Rahul_Iyer <rahul_iyer@persistent.co.in> wrote:

> im currently working on a project that requires batch operations - eg. Batch
> insert/update etc. The database im using is PostgreSQL. However, i cannot
> find any documentation for batch opeartions on PostgreSQL. Does anyone know
> how to do this, or if it is possible?

Usually you just use psql to run a script with the commands in it.
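[Editor's note: a minimal sketch of the psql-script approach Bruno suggests: collect the statements in a file, wrap them in one transaction, and feed the file to psql. The table, columns, and database name here are invented for illustration, and real code should escape values properly or use COPY.]

```python
# Sketch: build a batch SQL script and run it with psql.
# The table 'people' and database 'mydb' are hypothetical.
rows = [("alice", 30), ("bob", 25)]

statements = ["BEGIN;"]
for name, age in rows:
    # Illustrative only -- real code must escape/quote values safely.
    statements.append(f"INSERT INTO people (name, age) VALUES ('{name}', {age});")
statements.append("COMMIT;")

with open("batch.sql", "w") as f:
    f.write("\n".join(statements) + "\n")

# Then, from the shell:
#   psql -f batch.sql mydb
```

Wrapping the batch in BEGIN/COMMIT makes the whole script one transaction, so a failure part-way through doesn't leave half the batch applied.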
On Fri, Aug 08, 2003 at 03:32:05PM +0530, Rahul_Iyer wrote:

> im currently working on a project that requires batch operations - eg. Batch
> insert/update etc. The database im using is PostgreSQL. However, i cannot
> find any documentation for batch opeartions on PostgreSQL. Does anyone know
> how to do this, or if it is possible?

If I understand you correctly, batch inserts are available via COPY. I'm not sure what you mean by "batch updates" -- can you elaborate?

-Neil
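[Editor's note: a hedged sketch of the COPY route Neil mentions. COPY's default text format is one row per line with tab-separated columns, so bulk data can be staged in a plain file and loaded with a single statement. The table name and file name are invented.]

```python
# Sketch: stage rows in COPY's default text format (tab-delimited,
# one row per line). 'people' and people.tsv are hypothetical names.
rows = [("alice", 30), ("bob", 25)]

with open("people.tsv", "w") as f:
    for name, age in rows:
        f.write(f"{name}\t{age}\n")

# Then, inside psql, load the whole file in one statement.
# (\copy runs client-side; server-side COPY needs superuser rights.)
copy_cmd = "\\copy people (name, age) FROM 'people.tsv'"
print(copy_cmd)
```

For large batches this is typically far faster than one INSERT per row, since the server parses a single statement and streams the data.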
In an attempt to throw the authorities off his trail, bruno@wolff.to (Bruno Wolff III) transmitted:

> On Fri, Aug 08, 2003 at 15:32:05 +0530, Rahul_Iyer <rahul_iyer@persistent.co.in> wrote:
>> hi, im currently working on a project that requires batch
>> operations - eg. Batch insert/update etc. The database im using is
>> PostgreSQL. However, i cannot find any documentation for batch
>> opeartions on PostgreSQL. Does anyone know how to do this, or if it
>> is possible?
>
> Usually you just use psql to run a script with the commands in it.

Ah, but I rather think he's looking for some place to queue up sets of operations inside the database, or inside some sort of "middleware" application.

In SAP's R/3 application, automated batch processes are set up so that the updates they are doing are stored "serialized," so that you can try, and possibly retry, them.

It's almost exactly analogous to the notion of taking the set of URLs and FORM POSTs involved in navigating through a web application and storing them somewhere so that you could rerun the data entry.

To do things that way requires quite a lot of application infrastructure; the right answer probably is, if the input is a set of SQL statements, to stick them in a file and load the file.

If this is the sort of thing someone needed to do and redo, and they had a lot of these files, it might make sense to adopt a naming convention for files in a directory, perhaps paralleling Maildir, and have a daemon that watches that directory and processes whatever gets "spooled."

If you have a lot of reporting processes to throw at a "reports server," this approach may make a lot of sense. If the DB server is a 4-CPU box, then you surely don't want to flood it by running 28 reports at once. Spooling the requests allows them to be processed two or three or four at a time, and the other 25 will wait until their predecessors are complete.
-- output = reverse("moc.enworbbc" "@" "enworbbc") http://www3.sympatico.ca/cbbrowne/linuxxian.html Signs of a Klingon Programmer - 8. "Debugging? Klingons do not debug. Our software does not coddle the weak. Bugs are good for building character in the user."
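[Editor's note: the spool-directory idea above can be sketched in a few lines: scan a directory for queued files and hand them to a small worker pool, so only a few run at once and the rest wait. Everything here is hypothetical: the worker count, the file naming, and the process() body, which a real daemon would replace with feeding the file to psql or a reporting framework.]

```python
# Sketch of a Maildir-style spool: files dropped into a directory are
# processed at most MAX_WORKERS at a time; the rest wait their turn.
import os
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 3  # e.g. don't run 28 reports at once on a 4-CPU box

def process(path):
    # Placeholder job: read the spooled file and mark it done by
    # renaming it, so a rescan won't pick it up again.
    with open(path) as f:
        data = f.read()
    os.rename(path, path + ".done")
    return len(data)

def drain_spool(spool_dir):
    # Collect pending files (skip ones already marked done) and
    # process them through a bounded worker pool.
    jobs = [os.path.join(spool_dir, name)
            for name in sorted(os.listdir(spool_dir))
            if not name.endswith(".done")]
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        return list(pool.map(process, jobs))
```

A real daemon would loop, sleep, and rescan; error handling and the retry bookkeeping (the "serialized updates" idea above) are left out of this sketch.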
Christopher Browne wrote:

> If you have a lot of reporting processes to throw at a "reports
> server," this approach may make a lot of sense. If the DB server is a
> 4 CPU box, then you surely don't want to flood it by running 28
> reports at once. Spooling the requests allows them to be processed
> two or three or four at a time, and the other 25 will wait until the
> predecessors are complete.

This sounds like the "Scheduled Jobs" task thread that was discussed May 13th. The essence was that it would be nice to have, but nobody seems to be working on it at the moment.

Regards,
Andreas
After takin a swig o' Arrakan spice grog, pgadmin@pse-consulting.de (Andreas Pflug) commented:

> This sounds like that "Scheduled Jobs" task thread that was
> discussed May 13th. The essence was that it would be nice to have,
> but nobody seems to work on this at the moment.

What I didn't point out was that this is the sort of thing that wouldn't be at the "database" level, but rather would be suitably implemented as part of some form of "application server."

Consider: there may be automated processes of various kinds:

-> Processes that do statically defined updates (e.g. purging out old data);
-> Processes that read some data and transform it on a regular basis;
-> Processes that generate reports on a timed basis.

The first sort of process might be amenable to being done inside the database. But the second one isn't; it requires having some code that reads the input. The third one also isn't; it requires having some sort of "reporting framework," including such things as [Defining Parameters], [Formatting Output], and [Spooling Output], none of which are particularly amenable to being handled in the database engine.

I wouldn't contemplate pushing all that into PostgreSQL; it's appropriate to implement it alongside, in some form of "application server."

I should have been clearer about intent there; this is all "good stuff," but not something I'd expect to see added in 7.5, or necessarily ever.

-- 
let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];;
http://www.ntlug.org/~cbbrowne/lsf.html
Sex is the mathematics urge sublimated. -- M. C. Reed.
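[Editor's note: the "application server" side of this could start very small: a registry of named jobs with next-run times, checked periodically in a loop. A hedged sketch; the job names, intervals, and the idea of keying jobs by name are all invented here, not anything the thread specifies.]

```python
# Sketch of a minimal timed-job runner living outside the database:
# each job has a callable, an interval, and a next-run time;
# run_due() fires whatever is overdue and reschedules it.
import time

class Scheduler:
    def __init__(self):
        self.jobs = {}  # name -> (callable, interval_secs, next_run)

    def add(self, name, func, interval, now=None):
        # First run is due immediately (at 'now').
        now = time.time() if now is None else now
        self.jobs[name] = (func, interval, now)

    def run_due(self, now=None):
        now = time.time() if now is None else now
        ran = []
        for name, (func, interval, next_run) in list(self.jobs.items()):
            if now >= next_run:
                func()
                self.jobs[name] = (func, interval, now + interval)
                ran.append(name)
        return ran
```

A daemon would call run_due() in a sleep loop; jobs like "purge old data" would hand SQL to psql, while report jobs would invoke whatever reporting framework handles parameters, formatting, and output spooling.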