Thread: Batch Operations

Batch Operations

From
"Rahul_Iyer"
Date:
hi,
im currently working on a project that requires batch operations - eg. Batch
insert/update etc. The database im using is PostgreSQL. However, i cannot
find any documentation for batch opeartions on PostgreSQL. Does anyone know
how to do this, or if it is possible?

Thanx in advance
regards
rahul



Re: Batch Operations

From
Bruno Wolff III
Date:
On Fri, Aug 08, 2003 at 15:32:05 +0530, Rahul_Iyer <rahul_iyer@persistent.co.in> wrote:
> hi,
> im currently working on a project that requires batch operations - eg. Batch
> insert/update etc. The database im using is PostgreSQL. However, i cannot
> find any documentation for batch opeartions on PostgreSQL. Does anyone know
> how to do this, or if it is possible?

Usually you just use psql to run a script with the commands in it.


Re: Batch Operations

From
Neil Conway
Date:
On Fri, Aug 08, 2003 at 03:32:05PM +0530, Rahul_Iyer wrote:
> im currently working on a project that requires batch operations - eg. Batch
> insert/update etc. The database im using is PostgreSQL. However, i cannot
> find any documentation for batch opeartions on PostgreSQL. Does anyone know
> how to do this, or if it is possible?

If I understand you correctly, batch inserts are available via COPY. I'm
not sure what you mean by "batch updates" -- can you elaborate?

-Neil



Re: Batch Operations

From
Christopher Browne
Date:
In an attempt to throw the authorities off his trail, bruno@wolff.to (Bruno Wolff III) transmitted:
> On Fri, Aug 08, 2003 at 15:32:05 +0530,
>   Rahul_Iyer <rahul_iyer@persistent.co.in> wrote:
>> hi, im currently working on a project that requires batch
>> operations - eg. Batch insert/update etc. The database im using is
>> PostgreSQL. However, i cannot find any documentation for batch
>> opeartions on PostgreSQL. Does anyone know how to do this, or if it
>> is possible?
>
> Usually you just use psql to run a script with the commands in it.

Ah, but I rather think he's looking for some place to queue up sets of
operations inside the database, or inside some sort of "middleware"
application.

In SAP's R/3 application, automated batch processes are set up so that
the updates that they are doing are stored "serialized" so that you
can try and possibly retry them.

It's almost exactly analagous to the notion of taking the set of URLs
and FORM POSTs involved in navigating through a web application and
storing them somewhere so that you could rerun data entry.

To do things that way requires having quite a lot of application
infrastructure; the right answer probably is, if the input is a set of
SQL statements, to stick them in a file, and load the file.

If this is the sort of thing someone needed to do and redo, and they
had a lot of these files, it might make sense to make up a naming
convention for files in a directory, perhaps parallelling Maildir, and
have a daemon that looks there and processes whatever gets "spooled."

If you have a lot of reporting processes to throw at a "reports
server," this approach may make a lot of sense.  If the DB server is a
4 CPU box, then you surely don't want to flood it by running 28
reports at once.  Spooling the requests allows them to be processed
two or three or four at a time, and the other 25 will wait until the
predecessors are complete.
-- 
output = reverse("moc.enworbbc" "@" "enworbbc")
http://www3.sympatico.ca/cbbrowne/linuxxian.html
Signs of  a  Klingon  Programmer -  8.  "Debugging?   Klingons do  not
debug.  Our software  does not   coddle the  weak. Bugs  are  good for
building character in the user."


Re: Batch Operations

From
Andreas Pflug
Date:
Christopher Browne wrote:

>In an attempt to throw the authorities off his trail, bruno@wolff.to (Bruno Wolff III) transmitted:
>  
>
>>On Fri, Aug 08, 2003 at 15:32:05 +0530,
>>  Rahul_Iyer <rahul_iyer@persistent.co.in> wrote:
>>    
>>
>>>hi, im currently working on a project that requires batch
>>>operations - eg. Batch insert/update etc. The database im using is
>>>PostgreSQL. However, i cannot find any documentation for batch
>>>opeartions on PostgreSQL. Does anyone know how to do this, or if it
>>>is possible?
>>>      
>>>
>>Usually you just use psql to run a script with the commands in it.
>>    
>>
>
>Ah, but I rather think he's looking for some place to queue up sets of
>operations inside the database, or inside some sort of "middleware"
>application.
>
>In SAP's R/3 application, automated batch processes are set up so that
>the updates that they are doing are stored "serialized" so that you
>can try and possibly retry them.
>
>It's almost exactly analagous to the notion of taking the set of URLs
>and FORM POSTs involved in navigating through a web application and
>storing them somewhere so that you could rerun data entry.
>
>To do things that way requires having quite a lot of application
>infrastructure; the right answer probably is, if the input is a set of
>SQL statements, to stick them in a file, and load the file.
>
>If this is the sort of thing someone needed to do and redo, and they
>had a lot of these files, it might make sense to make up a naming
>convention for files in a directory, perhaps parallelling Maildir, and
>have a daemon that looks there and processes whatever gets "spooled."
>
>If you have a lot of reporting processes to throw at a "reports
>server," this approach may make a lot of sense.  If the DB server is a
>4 CPU box, then you surely don't want to flood it by running 28
>reports at once.  Spooling the requests allows them to be processed
>two or three or four at a time, and the other 25 will wait until the
>predecessors are complete.
>  
>
This sounds like that "Scheduled Jobs" task thread that was discussed 
May13th. The essence was that it would be nice to have, but nobody seems 
to work on this at the moment.

Regards,
Andreas



Re: Batch Operations

From
Christopher Browne
Date:
After takin a swig o' Arrakan spice grog, pgadmin@pse-consulting.de (Andreas Pflug) commented...:
> This sounds like that "Scheduled Jobs" task thread that was
> discussed May13th. The essence was that it would be nice to have,
> but nobody seems to work on this at the moment.

What I didn't point out was that this is the sort of thing that
wouldn't be at the "database" level, but rather would be suitably
implemented as part of some form of "application server."

Consider: There may be automated processes of various kinds:

-> Processes that do statically defined updates (e.g. - purge out  old data);

-> Processes that read some data and transform it on a regular basis;

-> Processes that generate reports on a timed basis.

The first sort of process might be amenable to being done inside the
database.

But the second one isn't; it requires having some code that reads the
input.  The third one also isn't; it requires having some sort of
"reporting framework," including such things as [Defining Parameters],
[Formatting Output], and [Spooling Output], none of which are
particularly amenable to being handled in the database engine.

I wouldn't contemplate pushing that all into PostgreSQL; it's
appropriate to implement alongside in some form of "application
server."

I should have been more clear about intent there; this is all "good
stuff," but not things I'd expect to see added in 7.5 or necessarily
ever.
-- 
let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];;
http://www.ntlug.org/~cbbrowne/lsf.html
Sex is the mathematics urge sublimated.
-- M. C. Reed.