Delete/update with limit - Mailing list pgsql-general
From: Csaba Nagy
Subject: Delete/update with limit
Date:
Msg-id: 1185206218.20153.69.camel@coppola.muc.ecircle.de
List: pgsql-general
Hi all,

This subject was touched on a few times in the past, and I looked into the archives... The result is invariably key developers saying such a feature is unsafe because the result is unpredictable, while the people requesting it say that is OK, it is expected... but there was no compelling use case for it. OK, I have one...

We have quite a few processes here which collect user input and put it in "batch" tables, which in turn are processed regularly to update other tables with summary data, reject invalid records, etc. The insertions are unpredictable, they can happen at any time and any of them in parallel, they are user input... and they must be very fast, it's our user experience at stake.

The batch processing is done by a single periodical process. Now we have made a few attempts at making this parallelism safe enough that we don't lose any of the concurrent input while we do the processing step, while still keeping the overhead on the table minimal. The end result was a scheme where the batch processor deletes from the table and a delete trigger puts the deleted rows into a temporary table; the processor can then do anything it pleases with that private data without interfering with the inserts (the processing is actually quite complex on occasion).

This works fine in terms of correctness, however it turns out to be a problem with high bursts of incoming data, or when the processor is not running for a while and a lot of data accumulates... Then we have lots of data to process at once, which leads to long running transactions (the whole thing runs in one transaction) and, worse, connection timeouts.

On other databases it is possible to limit the delete to a maximum number of rows to be deleted. This way we can limit the size of one batch with minimal overhead... In Postgres we're currently not chunking, because the code to do it is simply overly contorted and inefficient compared to the other DBs we use. At least all the solutions we could think of to do the chunking in a safe way while the inserts are running in parallel, without disturbing them, have invariably resulted in overly complicated code compared to the simple delete with limit + delete trigger solution we have for the other DBs.

Now I don't put too much hope in convincing anybody that a limit on the delete/update commands has valid usage scenarios, but then can anybody help me find a good solution to chunk-wise process such a buffer table, where insert speed is the highest priority (thus no indexes, the minimum of fields), where batch processing should still work fine with a big table size, while not impacting the inserts at all, and finish in a short time to avoid long running transactions? Because I can't really think of one... other than our scheme with the delete with limit + trigger + private temp table thing.

Cheers,
Csaba.
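For reference, a minimal sketch of the trigger + temp table scheme described above; table, column, function and trigger names are made up for illustration:

-- Buffer table: insert-only from the user-facing side, no indexes,
-- minimum of fields, so concurrent inserts stay as fast as possible.
CREATE TABLE batch_input (
    user_id integer,
    payload text
);

-- Private work table for the batch processor's session.
-- Note: it must exist in the session that performs the delete,
-- since the trigger below writes into it.
CREATE TEMP TABLE batch_work (LIKE batch_input);

-- Delete trigger: every row the processor deletes is copied into the
-- temporary work table, where it can be processed at leisure without
-- interfering with concurrent inserts into batch_input.
CREATE OR REPLACE FUNCTION move_deleted_row() RETURNS trigger AS $$
BEGIN
    INSERT INTO batch_work VALUES (OLD.user_id, OLD.payload);
    RETURN OLD;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER batch_input_on_delete
    AFTER DELETE ON batch_input
    FOR EACH ROW EXECUTE PROCEDURE move_deleted_row();

With a hypothetical DELETE ... LIMIT the processor could simply delete at most N rows per round and work on batch_work. Lacking that, one commonly suggested (if clumsier) workaround is to restrict the delete through a ctid subquery; whether the planner handles the ctid join efficiently is another matter, which is roughly the complaint above:

-- Consume at most one chunk per transaction; the trigger moves the
-- deleted rows into batch_work, which is then processed and cleared.
-- Repeat until the delete affects zero rows.
DELETE FROM batch_input
 WHERE ctid IN (SELECT ctid FROM batch_input LIMIT 1000);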