Thread: Feature suggestions: "dead letter"-savepoint.

Feature suggestions: "dead letter"-savepoint.

From
Terje Elde
Date:
Hi all,

I’d like to pitch the idea of supporting “dead letter”-savepoints, similar to the way you have “dead letter”-exchanges
inmessage-queue systems, etc.  The idea is basically that a client can publish a message, but in such a away that it
onlyever actually gets published if the client dies, goes away suddenly, etc.  That allows for some nice logic, from
simplyannouncing “Daemon X just died”, to more advanced logic, simplifying self-healing clusters and what not. 

Different name would be “last will”.

The use-cases for PostgreSQL would be a bit different, so I’d like to draw up an example of where this could be very
(imho)useful. 


A very common usecase for PostgreSQL goes something like this:

1. Worker picks up a row.
2. Worker does something non-idempotent with an external service, where it’s important that it only gets done at most
once. Charge a credit card, send an email/sms, launch a rocket, and so on. 
3. Worker marks the row as successful, or failed.

But you need only one worker to pick up the task, so you expand to lock the row, and implement the first point as
somethinglike: 

SELECT * FROM targets WHERE status=‘scheduled’ FOR UPDATE SKIP LOCKED LIMIT 1;

That’ll give you a nice lock on the line, yet allow other workers to pick up other targets.

But what if there’s a bug making a call to the external service?  Most of the time, you’ll trap the error and set
statusto something sane, but what if there’s a crash-bug in the SDK implementing it, or some other situation where
thingsgo very bad?  The rocket might be fired, then the client dies, lock is released, another worker picks up the
task,and repeats the process ad nausium. 

Okay, so you’ll just update the row to say you’re trying to send it.  You set status=‘in-flight’ or some other status
that’llprevent the SELECT in other workers from picking up the task, and you commit that, so other workers won’t pick
upthe row if you die.  In the process though, you also loose the lock on the row.  You still want the row to be tied to
youspecifically, so you add a unique tag to the row, that later needs to be removed, so there’s more housekeeping. 

This is pretty basic, it works, and it works well.  However, it could (imho) be improved, both in terms of developer
comfort,and also more efficient.  The need for that extra commit – before doing the actual work – could also be
avoided,potentially reducing the number of transaction by half. 

Typically the flow would be something like:

BEGIN;
SELECT id FROM targets WHERE status=‘scheduled’ FOR UPDATE SKIP LOCKED LIMIT 1;
UPDATE targets SET status=‘in-flight’ WHERE id =%(id);
COMMIT;
— Do the work.
BEGIN;
UPDATE targets SET status=‘completed’ WHERE id = %(id); — or status=‘failed-foo’, if it fails for reason foo
COMMIT;


What I’m suggesting would be something along the lines of;

BEGIN;
SELECT id FROM targets WHERE status=‘scheduled’ FOR UPDATE SKIP LOCKED LIMIT 1;
UPDATE targets SET status=‘failed-unknown’ WHERE id =%(id);
SAVEPOINT deadletter ON FAILURE COMMIT;
— Do the work.
UPDATE targets SET status=‘completed’ WHERE id = %(id); — or status=‘failed-foo'
COMMIT;

Or, unless re-setting the columns changed before the savepoint, roll back to one prior to it.


The basic idea is to be able to say “If I go belly up, I want this stuff to happen”.  Depending on different needs,
couldbe made persistent once the savepoint is taken, but for a lot of cases that wouldn’t really be needed.  There’s
someroom for variation as well, such as having it support only dropped connections, or also support turning errors
and/orrollbacks into rollback to and commits of the savepoint.  Ideally configurable at the point the snapshot it
taken,to easily support pr. snapshot variation. 


I did for a few moments wonder if prepared transactions would be a better place for something like this.  It could
allowfor named independent transactions, but there’s a fairly big mismatch between the two concepts.  It also wouldn’t
betoo hard to use multiple named savepoints for effectively the same logic for most cases.  One advantage of prepared
transactionsis that it could perhaps also cover the case of a postgresql child dying, but that’s not exactly a common
problem. A huge dealbreaker though, is that the prepared transaction would very likely keep conflicting locks with the
workto be done. 




I have to admit, I’m not sure if this is a big ask or not, but I’m hopeful that it’s not.  In part because so much is
alreadythere.  My hope is that it wouldn’t take a lot to turn an error into what is effectively pretty close to
“ROLLBACKTO deadletter; COMMIT;”, combined with extending savepoints to include information about which failures they
should“catch”, and the routing to use those.  There could be a host of issues I’m not aware of though. 

Terje Elde




Re: Feature suggestions: "dead letter"-savepoint.

From
Marko Tiikkaja
Date:
On 2016-06-23 12:34, Terje Elde wrote:
> Typically the flow would be something like:
>
> BEGIN;
> SELECT id FROM targets WHERE status=‘scheduled’ FOR UPDATE SKIP LOCKED LIMIT 1;
> UPDATE targets SET status=‘in-flight’ WHERE id =%(id);
> COMMIT;
> — Do the work.
> BEGIN;
> UPDATE targets SET status=‘completed’ WHERE id = %(id); — or status=‘failed-foo’, if it fails for reason foo
> COMMIT;
>
>
> What I’m suggesting would be something along the lines of;
>
> BEGIN;
> SELECT id FROM targets WHERE status=‘scheduled’ FOR UPDATE SKIP LOCKED LIMIT 1;
> UPDATE targets SET status=‘failed-unknown’ WHERE id =%(id);
> SAVEPOINT deadletter ON FAILURE COMMIT;
> — Do the work.
> UPDATE targets SET status=‘completed’ WHERE id = %(id); — or status=‘failed-foo'
> COMMIT;

Comparing these two; how is the latter any better?  It's the same number 
of commands, except it's holding a transaction open for longer, it's 
using a non-standard concept and it's arguably more complex.


.m



Re: Feature suggestions: "dead letter"-savepoint.

From
Terje Elde
Date:
> On 23 Jun 2016, at 11:50, Marko Tiikkaja <marko@joh.to> wrote:
>
> Comparing these two; how is the latter any better?  It's the same number of commands, except it's holding a
transactionopen for longer, it's using a non-standard concept and it's arguably more complex. 

Same number of commands, but half the number of transactions/commits.  It’s holding the transaction open for longer,
butnot necessarily very long.  It’s also not holding any contested locks (in the example at least). 

It is arguably more complex, but also arguably (imho anyway) simpler.  It gives a generic pattern of saying “I want
thisto happen if things go bad”.  With the upper example, you spend time setting up a system or pattern specifically
forthe use-case.  With patterns such as using a column to mark ownership of a row to your process, you’d need that
extracolumn.  With a large number of rows, and a low probability of actually needing to update the row, that could also
resultin significant update-traffic, that could be avoided.  Similar if you’d need to keep a lock on another table as
well,you’d loose the lock between the first and second transaction.  You could work around that with using
ownership-typecolumns on that table as well, but then you have update-traffic there too.  Not to mention that for any
useof ownership-type columns, you run the risk of having to do housekeeping on them, versus the second example, where
you’resure that all locks get cleaned up. 

I entirely agree though, that this does in no way allow you to solve any new problems, that you can’t solve today.

Part of the reason I like the idea, is that it lowers the threshold for and work involved in dropping in “If what I’m
aboutto do goes badly, I want X to happen”.  Yeah, it’s often comparable to simply do X first, and revert or alter it
later,but that easily leads to twice the commits, loss of locks, having to work around that, possibly introducing even
moreupdate-traffic in the process.  I just see a potential to both avoid that, and gain some developer-comfort at the
sametime, so I figured I should pitch the idea. 

One could easily argue that session-level advisory locks could often be used to avoid the issues of loosing locks
acrossthe two transactions, and it wouldn’t be wrong.  It just moves the question back to complexity though, of which
therewould be more (imho). 


Terje Elde




Re: Feature suggestions: "dead letter"-savepoint.

From
Craig Ringer
Date:
On 23 June 2016 at 17:34, Terje Elde <terje@elde.net> wrote:
 

But what if there’s a bug making a call to the external service?  Most of the time, you’ll trap the error and set status to something sane, but what if there’s a crash-bug in the SDK implementing it, or some other situation where things go very bad?  The rocket might be fired, then the client dies, lock is released, another worker picks up the task, and repeats the process ad nausium.

Okay, so you’ll just update the row to say you’re trying to send it.  You set status=‘in-flight’ or some other status that’ll prevent the SELECT in other workers from picking up the task, and you commit that, so other workers won’t pick up the row if you die.  In the process though, you also loose the lock on the row.  You still want the row to be tied to you specifically, so you add a unique tag to the row, that later needs to be removed, so there’s more housekeeping.

It sounds like you're trying to re-invent distributed commit. Don't do that. It's hard. Use 2PC and an external transaction co-ordinator that can do in-doubt transaction resolution. On Java you want JTA. On Windows you want MSDTC. In C you want to run screaming, um, I mean use XA.
 
The basic idea is to be able to say “If I go belly up, I want this stuff to happen”.

This only papers over the problem rather weakly. What if the PostgreSQL backend dies undexpectedly, not just the client? You've still got a problem. Unless you're thinking of something that'd write to WAL then do the work in some kind of critical section where if we fail we panic the server and it gets done during WAL redo, or something like that.
 
Depending on different needs, could be made persistent once the savepoint is taken, but for a lot of cases that wouldn’t really be needed.

It's starting to sound a lot like 2PC, you know.
 
There’s some room for variation as well, such as having it support only dropped connections, or also support turning errors and/or rollbacks into rollback to and commits of the savepoint.  Ideally configurable at the point the snapshot it taken, to easily support pr. snapshot variation.

Very like 2PC.
 
I did for a few moments wonder if prepared transactions would be a better place for something like this.  It could allow for named independent transactions, but there’s a fairly big mismatch between the two concepts.  It also wouldn’t be too hard to use multiple named savepoints for effectively the same logic for most cases.  One advantage of prepared transactions is that it could perhaps also cover the case of a postgresql child dying, but that’s not exactly a common problem.  A huge dealbreaker though, is that the prepared transaction would very likely keep conflicting locks with the work to be done.

Actually, that's a benefit. It means you hold a lock on the row you're working on until your coordinator determines whether the action was actually performed or not, and commits or rolls back the 2PC prepared xact. In the mean time nobody else tries to grab that same row and work on it.
 
Now, what I do think we need is to give the client the ability to determine whether one of its xacts actually committed or not when it lost the session after dispatching COMMIT but before getting a confirmation from the server and persistently storing that knowledge. Right now if you want that you have to do full 2PC. You shouldn't need to, you should be able to get the xid when the xact is assigned it and store it somewhere locally. Then later, if you're unsure if that xid committed or not due to a client crash etc, you should be able to do some kind of SELECT pg_xact_is_committed(xid)    to find out. Right now this is possible to write with a pretty simple extension, but adds an extra roundtrip for a SELECT txid_current() call (unless you pipeline it). I'd prefer that the server just tell you when an xid is assigned. And yes, I think xid is the right identifier for this; it's short, simple, and while it wraps around it takes long enough to do so that it's very well suited for this job. 


--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Feature suggestions: "dead letter"-savepoint.

From
"Tsunakawa, Takayuki"
Date:

From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Craig Ringer
Now, what I do think we need is to give the client the ability to determine whether one of its xacts actually committed or not when it lost the session after dispatching COMMIT but before getting a confirmation from the server and persistently storing that knowledge. Right now if you want that you have to do full 2PC. You shouldn't need to, you should be able to get the xid when the xact is assigned it and store it somewhere locally. Then later, if you're unsure if that xid committed or not due to a client crash etc, you should be able to do some kind of SELECT pg_xact_is_committed(xid)    to find out. Right now this is possible to write with a pretty simple extension, but adds an extra roundtrip for a SELECT txid_current() call (unless you pipeline it). I'd prefer that the server just tell you when an xid is assigned. And yes, I think xid is the right identifier for this; it's short, simple, and while it wraps around it takes long enough to do so that it's very well suited for this job. 


 

This is interesting.  Oracle provides Transaction Guard for this.  Our customers also sometimes encounter the trouble of duplicate records in the database when, when their apps get disconnected during commit and reconnect to insert the same record again.

 

Regards

Takayuki Tsunakawa