Re: Unexpected query result - Mailing list pgsql-general

From dinesh kumar
Subject Re: Unexpected query result
Date
Msg-id CALnrH7qZr4C0wv+CHZcxajnimNvN=JsrmbtsAg=xD6BwoM+_7w@mail.gmail.com
Whole thread Raw
In response to Unexpected query result  (Begin Daniel <jfd553@hotmail.com>)
List pgsql-general
Hi,

On Mon, Oct 5, 2015 at 5:02 AM, Begin Daniel <jfd553@hotmail.com> wrote:
In order to process a large amount of data I need to run a procedure using parallel batch processes.  
The query I run in each process is expected to ...


It seems, you are trying to achieve the same, what we did. Find my blog entry, which may help you in this scenarios.

1- select a bunch of id (500) in a table (wait4processing 
) containing the list of all records to process
2- remove selected records from wait4processing table in order to eliminate duplicate processing
3- run the procedure (build_contributions_clusters) over the range of selected ids 

--The query I use:
With ids as( delete from wait4processing where id in( select id from wait4processing limit 500)  returning id) 
select build_contributions_clusters(min(id),max(id)) from ids;

The query runs properly if I send it sequentially (wait for the completion of the query before sening it again) but it does'nt work when sending multiple occurrences in parallel.  Seems from the results I got that the first query received by the server runs properly but the following ones try to process the same first 500 records even if deleted - the build_contributions_clusters procedure receive NULL values instead of the following 500 records.

Since I am almost certain it is the expected behavior, I would like to like to understand why, and I would also appreciate to see alternative queries to do the job.

Thanks :-)



--

pgsql-general by date:

Previous
From: jimbosworth
Date:
Subject: Re: Best way to sync table DML between databases
Next
From: Kevin Grittner
Date:
Subject: Re: md5(large_object_id)