Re: Bulk processing & deletion - Mailing list pgsql-general

From Ivan Voras
Subject Re: Bulk processing & deletion
Date
Msg-id CAF-QHFWLvmpNsP_UvG3WAVob7n6Ps_gCYW+pPMnNQBx37ApZow@mail.gmail.com
Whole thread Raw
In response to Re: Bulk processing & deletion  (Steve Crawford <scrawford@pinpointresearch.com>)
List pgsql-general
On 13 October 2011 20:08, Steve Crawford <scrawford@pinpointresearch.com> wrote:
> On 10/13/2011 05:20 AM, Ivan Voras wrote:
>>
>> Hello,
>>
>> I have a table with a large number of records (millions), on which the
>> following should be performed:
>>
>>        1. Retrieve a set of records by a SELECT query with a WHERE
>> condition
>>        2. Process these in the application
>>        3. Delete them from the table
>>
> Without knowing a bit more, it is difficult to say. A couple questions:
>
> 1. Are there conflicting processes - i.e. could multiple applications be in
> contention to process the same set of records?

No, only one bulk processor.

> 2. Is the processing "all or none" or could individual records fail? If so,
> how do you deal with reprocessing or returning those to the main table.

All or none; the nature of thing is that there can be no fatal failures.

> Depending on the nature of your app, it might be feasible to reorder the
> actions to move the records to be processed into a "processing" table and
> delete them from that table as the records are processed by the application.
>
> You could move the records into the processing table with:
>
> with foo as (delete from main_table where your_where_clause returning a,b,c)
> insert into processing_table (a,b,c) select a,b,c from foo;
>
> In this case I would not recommend temporary or unlogged tables for the
> processing table as that becomes the only source of the data once deleted
> from the master table.

Ok, thanks (to everyone)!

pgsql-general by date:

Previous
From: Mark Keisler
Date:
Subject: Re: Monitoring Replication
Next
From: Mark Keisler
Date:
Subject: Re: [ADMIN] Trying to use binary replication - from tutorial