On 10/05/2015 05:02 AM, Begin Daniel wrote:
> In order to process a large amount of data I need to run a procedure
> using parallel batch processes.
> The query I run in each process is expected to ...
>
> 1- select a bunch of id (500) in a table (wait4processing) containing
> the list of all records to process
> 2- remove selected records from wait4processing table in order to
> eliminate duplicate processing
> 3- run the procedure (build_contributions_clusters) over the range of
> selected ids
>
> --The query I use:
> With ids as( delete from wait4processing where id in( select id from
> wait4processing limit 500) returning id)
> select build_contributions_clusters(min(id),max(id)) from ids;
>
> The query runs properly if I send it sequentially (wait for the
> completion of the query before sening it again) but it does'nt work
> when sending multiple occurrences in parallel. Seems from the results I
> got that the first query received by the server runs properly but the
> following ones try to process the same first 500 records even if deleted
> - the build_contributions_clusters procedure receive NULL values instead
> of the following 500 records.
>
> Since I am almost certain it is the expected behavior, I would like to
> like to understand why, and I would also appreciate to
> see alternative queries to do the job.
See here:
http://www.postgresql.org/docs/9.4/interactive/transaction-iso.html
"13.2.1. Read Committed Isolation Level
Read Committed is the default isolation level in PostgreSQL. When a
transaction uses this isolation level, a SELECT query (without a FOR
UPDATE/SHARE clause) sees only data committed before the query began; it
never sees either uncommitted data or changes committed during query
execution by concurrent transactions. In effect, a SELECT query sees a
snapshot of the database as of the instant the query begins to run.
However, SELECT does see the effects of previous updates executed within
its own transaction, even though they are not yet committed. Also note
that two successive SELECT commands can see different data, even though
they are within a single transaction, if other transactions commit
changes after the first SELECT starts and before the second SELECT starts."
>
> Thanks :-)
--
Adrian Klaver
adrian.klaver@aklaver.com