Re: Feature freeze date for 8.1 - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Feature freeze date for 8.1
Date
Msg-id 18511.1115046459@sss.pgh.pa.us
In response to Re: Feature freeze date for 8.1  (Oliver Jowett <oliver@opencloud.com>)
Responses Re: Feature freeze date for 8.1  (Oliver Jowett <oliver@opencloud.com>)
List pgsql-hackers
Oliver Jowett <oliver@opencloud.com> writes:
> The scenario I need to deal with is this:

> There are multiple nodes, network-separated, participating in a cluster.
> One node is selected to talk to a particular postgresql instance (call
> this node A).

> A starts a transaction and grabs some locks in the course of that
> transaction. Then A falls off the network before committing because of a
> hardware or network failure. A's connection might be completely idle
> when this happens.

> The cluster liveness machinery notices that A is dead and selects a new
> node to talk to postgresql (call this node B). B resumes the work that A
> was doing prior to failure.

> B has to wait for any locks held by A to be released before it can make
> any progress.

> Without some sort of tunable timeout, it could take a very long time (2+
> hours by default on Linux) before A's connection finally times out and
> releases the locks.

Wouldn't it be reasonable to expect the "cluster liveness machinery" to
notify the database server's kernel that connections to A are now dead?
I find it really unconvincing to suppose that the above problem should
be solved at the database level.
        regards, tom lane
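
For reference, the "2+ hours" figure quoted above is consistent with the Linux TCP keepalive defaults: 7200 seconds of idle time before the first probe, then 9 probes sent 75 seconds apart. The sketch below is not taken from the thread. It only illustrates what a per-socket "tunable timeout" could look like on Linux, assuming the standard SO_KEEPALIVE, TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT socket options. The function names and the example values (60, 10, 5) are hypothetical.

```python
import socket


def keepalive_worst_case_seconds(idle, interval, count):
    """Approximate time until an idle, dead peer is detected.

    idle:     seconds of inactivity before the first probe is sent
    interval: seconds between unanswered probes
    count:    number of unanswered probes before the connection is dropped
    """
    return idle + interval * count


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive for one socket and tighten its timers.

    The three TCP_KEEP* options are Linux-specific, so each one is set
    only if this platform's socket module exposes it. The defaults here
    are illustrative values, not recommendations.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    for name, value in (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", count),
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


if __name__ == "__main__":
    # Linux system-wide defaults: 7200 s idle, 75 s interval, 9 probes.
    print(keepalive_worst_case_seconds(7200, 75, 9))
    # The illustrative per-socket values used above.
    print(keepalive_worst_case_seconds(60, 10, 5))

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    s.close()
```

With the system defaults the worst case is a little over two hours. With the illustrative values, an idle connection to a vanished peer would be dropped after roughly two minutes, at which point the server side could release whatever that session held.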

