Thread: auto-reconnect: temp schemas, sequences, transactions
Hi,

If I understand correctly, Tom's reply in:
http://archives.postgresql.org/pgsql-general/2007-06/msg01668.php
suggests that temp schemas are kept when a session gets disconnected because connections get automatically re-established with the same backend id, and if this succeeds the old temp tables get picked up by the new connection as if there was no disconnection at all.

However, it seems that the same does not happen for currval of sequences? This is quite inconvenient, I must say (in some situations our re-established connection is basically useless... and it does not even "know" about it...).

For my information: what happens if the disconnect (due to another client process dying) happened during a transaction? After reconnect, are we again in the middle of the previous transaction? (Or does something else happen, e.g. the transaction continues until "commit" and only then the session actually disconnects?) I'm anxiously assuming that the next queries after reconnection will not happily be executed outside of a transaction, but I haven't found much (recent) info on the subject.

Best,
~Marek

=# select version();
                                        version
---------------------------------------------------------------------------------------
 PostgreSQL 8.4.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.4, 64-bit
(1 row)
Marek Więckowski <wieckom@foxi.nl> writes:
> If I understand correctly, Tom's reply in:
> http://archives.postgresql.org/pgsql-general/2007-06/msg01668.php suggests
> that temp schemas are kept when a session gets disconnected because
> connections get automatically re-established with the same backend id, and if
> this succeeds the old temp tables get picked up by the new connection as if
> there was no disconnection at all.

Uh, no, surely not. The schema itself is re-used if it exists, but all the contained tables get flushed by the new session (if for some reason the old session failed to do that, as it would in case of a crash). Re-use of the schema object is just a minor implementation optimization --- there's no expectation that temp tables would ever survive into another session.

			regards, tom lane
Hi,

On Monday 02 May 2011 16:43:54 Tom Lane wrote:
> Marek Wieckowski <wieckom@foxi.nl> writes:
> > If I understand correctly, Tom's reply in:
> > http://archives.postgresql.org/pgsql-general/2007-06/msg01668.php
> > suggests that temp schemas are kept when a session gets disconnected
> > because connections get automatically re-established with the same
> > backend id, and if this succeeds the old temp tables get picked up by the
> > new connection as if there was no disconnection at all.
>
> Uh, no, surely not. The schema itself is re-used if it exists, but all
> the contained tables get flushed by the new session (if for some reason
> the old session failed to do that, as it would in case of a crash).

Clear. Thanks for your answer.

But what happens with a db transaction upon a disconnect? If I have (say, in C++ code or a script):

begin;
query1;
query2;
query3;
query4;
query5;
commit;

(with possibly some extra C++ or script code in between queries), and somewhere around the time query2 is being executed some other backend crashes; the session gets disconnected and automatically reconnected: what would happen to the next queries executed by the external code (query3, query4 and so on)? They would not be executed outside of a db transaction, would they? I would hope that they all keep failing up until the next commit/rollback or something similar...

Best,
~Marek
Marek Więckowski <wieckom@foxi.nl> writes:
> But what happens with a db transaction upon a disconnect? If I have (say,
> in C++ code or a script):
> begin;
> query1;
> query2;
> query3;
> query4;
> query5;
> commit;
> (with possibly some extra C++ or script code in between queries), and
> somewhere around the time query2 is being executed some other backend
> crashes; the session gets disconnected and automatically reconnected: what
> would happen to the next queries executed by the external code (query3,
> query4 and so on)? They would not be executed outside of a db transaction,
> would they? I would hope that they all keep failing up until the next
> commit/rollback or something similar...

Well, there will be no memory on the server side of any uncompleted queries. If the client-side logic tries to re-issue these queries after re-connecting, it would be up to that logic to be careful about what to reissue or not. Possibly this is a question for the author of your client library.

			regards, tom lane
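Tom's point - that after a disconnect the server remembers nothing, so the client must decide what to reissue - amounts to a retry-the-whole-transaction policy. Below is a toy sketch of that policy in Python; every name in it (ConnectionLost, FakeConn, run_transaction) is invented for illustration and nothing here touches libpq. The key idea is that on a lost connection the client replays the entire begin..commit block from the top, so the outcome stays all-or-nothing.

```python
class ConnectionLost(Exception):
    """Raised when the simulated backend dies mid-transaction."""
    pass

class FakeConn:
    """Toy server session: buffers increments, applies them atomically."""
    def __init__(self, store, crash_after=None):
        self.store = store            # shared dict simulating the database
        self.pending = []             # statements buffered in the open txn
        self.crash_after = crash_after
        self.executed = 0

    def execute(self, delta):
        if self.crash_after is not None and self.executed >= self.crash_after:
            raise ConnectionLost      # backend died; nothing was committed
        self.pending.append(delta)
        self.executed += 1

    def commit(self):
        for d in self.pending:        # all-or-nothing
            self.store["value"] += d
        self.pending = []

def run_transaction(connect, deltas, max_attempts=3):
    """Reissue the whole transaction from the top after a disconnect."""
    for attempt in range(max_attempts):
        conn = connect(attempt)
        try:
            for d in deltas:
                conn.execute(d)
            conn.commit()
            return True
        except ConnectionLost:
            continue                  # retry everything, not just the failure
    return False
```

With a connection that crashes on the first attempt and survives the second, the value still ends up increased by the full amount, because the failed attempt committed nothing.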
On Monday 02 May 2011 17:32:26 Tom Lane wrote:
> If the client-side logic tries to re-issue these queries
> after re-connecting, it would be up to that logic to be careful about
> what to reissue or not. Possibly this is a question for the author
> of your client library.

I see. So I have two use cases: 1) "my" client library; 2) psql :p

Let's look briefly at psql, shall we? (I would look at what needs changing in "my" client library once I understand what you consider a proper solution for psql.)

If a script is executed in psql there is no easy way to catch that psql has reconnected in the middle of it... The simplest example that could do a lot of damage would be a simple script executed in psql (by e.g. \i file_name or "psql -f"):

begin;
update test1 set value = value+1;
update test1 set value = value+10;
update test1 set value = value+100;
commit;

Obviously the intention is that all three queries succeed (values would be increased by 111) or all three fail (values not changed). See the attached "orig" output: if the backend dies just after the first query, then the next query triggers a reconnect and psql does not execute it, but... the third update query gets happily executed outside of a transaction. So the result is that in this case value was increased by 100 - a really unexpected (and possibly very, very bad) result. I think it should be considered a problem which should be solved in psql. What follows is my rough solution => it does solve the problem, although there is probably a much easier way to do it.

Now, the problem is that psql (as well as any other client program/library):
* would have a hard time keeping track of whether its session is in a transaction;
* libpq already does this for us, in conn->xactStatus;
* but (because of the check that conn->status == CONNECTION_OK in PQtransactionStatus()) there is no way to get to the value of conn->xactStatus once the connection is dead (i.e. to the last trans status of a now-disconnected connection)...

You will probably have a much better idea, but what I did (see attached patch) is I removed the part of PQtransactionStatus(const PGconn *conn) which says that "if conn->status != CONNECTION_OK then returned value is PQTRANS_UNKNOWN". Thus the meaning of the function PQtransactionStatus() changes slightly (instead of "trans status of an active connection" it would now mean "trans status of an active connection, or last trans status of a disconnected connection"), but the API and binary compatibility of libpq are preserved.

Anyway, after this change I am able to get the pre-disconnect trans status in psql just before reconnecting. And if we were in a transaction, then immediately after reconnecting I create a trans-in-error (again, see the psql part of the attached patch; BTW: how does one trigger an error in a way nicer than "select 1/0", preferably with a message? I.e., is there some libpq equivalent of "raise exception"?).

See the "new" output in the attachment: the result is that, in the example I gave at the beginning of this mail, if there is a disconnect after the first update, then the second query triggers a reconnect, but we are in trans-in-error, so all subsequent queries also fail => it is as if a proper db transaction was rolled back. I think this is a much, much better behaviour for psql, isn't it?

PS: It would be more straightforward to change PQreset() in a similar way instead of changing psql (PQreset() has direct access to conn->xactStatus), but of course PQreset() is a part of the public API of libpq; client code could in principle execute PQreset() from within a db transaction, and the expectation would be that after the call you get a "clean" new session. Still, maybe a not-bad idea for the future would be to keep PQreset() as it is and add e.g. a PQreconnect() which would do basically the same, but in the case where the old connection was in a transaction it would right away create a new trans-in-error in the new session? Well, just an idea => it would lead to even less handling in the client programs (like psql => it would just call PQreconnect() and not have to issue "begin" + create an error).

Best,
~Marek
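The failure mode described in the mail above - the third update running outside any transaction after a silent reconnect - can be reproduced with a toy model. Everything below is an invented Python simulation (no libpq, no psql; ToyServer, ToySession and the rest are illustrative names only), but it follows the same sequence of events: begin, one buffered update, a backend crash, a silent reconnect that skips the failed statement, and a later update that lands in autocommit mode.

```python
class ToyServer:
    """Holds one value that the script's updates are supposed to change atomically."""
    def __init__(self):
        self.value = 0

class ToySession:
    def __init__(self, server):
        self.server = server
        self.in_txn = False
        self.pending = []     # increments buffered inside an open transaction
        self.alive = True

    def run(self, stmt):
        if not self.alive:
            raise ConnectionError("connection lost")
        if stmt == "begin":
            self.in_txn = True
        elif stmt == "commit":
            for d in self.pending:
                self.server.value += d
            self.pending = []
            self.in_txn = False
        else:                 # stmt is an increment amount
            if self.in_txn:
                self.pending.append(stmt)
            else:
                self.server.value += stmt   # autocommit: applied immediately

def run_script_with_autoreconnect(server, script, crash_before_stmt):
    """Naive auto-reconnect: on failure, silently open a new session and keep
    going with the NEXT statement - the behaviour being criticised above."""
    session = ToySession(server)
    for i, stmt in enumerate(script):
        if i == crash_before_stmt:
            session.alive = False           # backend crash
        try:
            session.run(stmt)
        except ConnectionError:
            session = ToySession(server)    # reconnect; txn state is gone
            # the failed statement itself is skipped, as in the report
    return server.value
```

Running the script ["begin", 1, 10, 100, "commit"] with a crash before the second update reproduces the reported outcome: the +1 is lost with the old transaction, the +10 is skipped, and the +100 is applied in autocommit mode, so the value ends up increased by 100 instead of 111 or 0.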
Attachment
Marek Więckowski <wieckom@foxi.nl> writes:
> If there is a script executed in psql there is no easy way to catch that psql
> has reconnected in the middle of it...

As far as psql goes, it should certainly abandon executing any script file if it loses the connection. I rather thought it did already. See the bit just above where you propose patching:

    if (!pset.cur_cmd_interactive)
    {
        psql_error("connection to server was lost\n");
        exit(EXIT_BADCONN);
    }

			regards, tom lane
On Wednesday 04 May 2011 18:04:16 Tom Lane wrote:
> Marek Wieckowski <wieckom@foxi.nl> writes:
> > If there is a script executed in psql there is no easy way to catch that
> > psql has reconnected in the middle of it...
>
> As far as psql goes, it should certainly abandon executing any script
> file if it loses the connection. I rather thought it did already.
> See the bit just above where you propose patching:
>
>     if (!pset.cur_cmd_interactive)
>     {
>         psql_error("connection to server was lost\n");
>         exit(EXIT_BADCONN);
>     }

Hmm, indeed, I had missed it. Clear for psql. And yes, I agree that for psql it is better to exit. Thanks for the answer.

But for the library which I'm using, simply exiting/aborting is not an option (and this is why I was looking into this in the first place). There is a danger that client programs will continue issuing queries while believing that they are in a transaction... They do expect db errors and rolled-back transactions, but not that their begin-commit section would be executed only partially. Solving this at the level of "my" library would solve it once and for good (and the alternative sounds more complex: it would require exposing extra info to the programs using this library, adding handling of the reconnect situation in each of these programs, etc.).

In my head, it wraps up to the following structure in the library (which gives access to libpq functionality):

1. anytime we use the db connection we would check if the connection is OK;
2. if yes, we would ask for PQtransactionStatus() and keep a copy of the returned status;
3. if not, we would try to reconnect, BUT based on (2.) we would know whether we were in a transaction before the connection was lost;
4. if we were in a transaction before the disconnect, then immediately after reconnecting we would create a trans-in-error.

Does the above make sense to you? Any points of attention?

Also, would you have any suggestion for how to create a trans-in-error in a way nicer than "begin; select 1/0;", preferably with a message? In other words, is there some libpq equivalent of "raise exception"?

Best,
~Marek
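Steps 1-4 above can be sketched with a toy connection object. The names below (Wrapper, ToyConn, force_error_transaction) are all hypothetical, and the PQTRANS_* strings only mimic libpq's status constants in spirit; on a real connection, step 4 might issue something like "begin; select 1/0;" as the mail suggests. This is a sketch of the scheme as proposed here, not a recommendation.

```python
PQTRANS_IDLE, PQTRANS_INTRANS, PQTRANS_INERROR = "idle", "intrans", "inerror"

class ToyConn:
    """Stand-in for a libpq connection; tracks only transaction status."""
    def __init__(self):
        self.alive = True
        self._status = PQTRANS_IDLE

    def ok(self):
        return self.alive

    def status(self):
        return self._status

    def force_error_transaction(self):
        # on a real connection: open a transaction and make it fail
        self._status = PQTRANS_INERROR

    def run(self, stmt):
        if self._status == PQTRANS_INERROR and stmt != "rollback":
            raise RuntimeError("current transaction is aborted")
        if stmt == "begin":
            self._status = PQTRANS_INTRANS
        elif stmt in ("commit", "rollback"):
            self._status = PQTRANS_IDLE
        return "ok"

class Wrapper:
    """Library layer implementing steps 1-4 from the mail."""
    def __init__(self, connect):
        self.connect = connect
        self.conn = connect()
        self.last_status = PQTRANS_IDLE            # step 2: cached copy

    def exec_(self, stmt):
        if not self.conn.ok():                     # step 1: connection check
            was_in_txn = self.last_status == PQTRANS_INTRANS   # step 3
            self.conn = self.connect()             # reconnect
            if was_in_txn:                         # step 4: poison new session
                self.conn.force_error_transaction()
        result = self.conn.run(stmt)
        self.last_status = self.conn.status()
        return result
```

The effect is that after a disconnect mid-transaction, every statement fails until the application issues a rollback - mimicking an ordinary aborted transaction instead of silently dropping into autocommit.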
Marek Więckowski <wieckom@foxi.nl> writes:
> But for the library which I'm using, simply exiting/aborting is not an option
> (and this is why I was looking into this in the first place). There is a
> danger that client programs will continue issuing queries while believing that
> they are in a transaction... They do expect db errors and rolled back
> transactions, but not that their begin-commit section would be executed only
> partially. Solving this on the level of "my" library would solve it once for
> good (and the alternative sounds more complex: it would require exposing extra
> info to the programs using this library, and add handling of reconnect
> situation in each of these programs etc.).

Well, I think it's foolish to imagine that a client library should try to do transparent reconnection: it's somewhere between difficult and impossible to keep track of all the server-side state that the application might be relying on, above and beyond the immediate problem of an unfinished transaction. It's almost always better to punt the problem back to the application, and let it decide whether to try again or just curl up and die.

If you have server restarts occurring often enough that this seems useful to work on, then I submit that you have problems you ought to be fixing on the server side instead.

			regards, tom lane
On Wed, May 04, 2011 at 07:03:31PM +0200, Marek Więckowski wrote:
> (and this is why I was looking into this in the first place). There is a
> danger that client programs will continue issuing queries while believing that
> they are in a transaction... They do expect db errors and rolled back
> transactions, but not that their begin-commit section would be executed only
> partially.

I don't understand. If they are prepared for errors, then on any error they have to roll back to the savepoint, or to the beginning of the transaction if there's no savepoint. What is this "partial execution" of which you speak? Nothing is partially executed: if the transaction rolls back, everything is lost.

> good (and the alternative sounds more complex: it would require exposing extra
> info to the programs using this library, and add handling of reconnect
> situation in each of these programs etc.).

[. . .]

> 4. if we were in a trans before disconnect, then immediately after
> reconnecting we would create a trans-in-error.

You're going to have to expose the trans-in-error to the client anyway. What's the difference between that and exposing the program to needing to fix its connection?

A

--
Andrew Sullivan
ajs@crankycanuck.ca
Hi,

Just to sum things up:

On Wednesday 04 May 2011 19:21:42 Tom Lane wrote:
> Well, I think it's foolish to imagine that a client library should try
> to do transparent reconnection: it's somewhere between difficult and
> impossible to keep track of all the server-side state that the
> application might be relying on, above and beyond the immediate problem
> of an unfinished transaction.

After sleeping on it - I now agree 100%. (A simple example would be savepoints... The idea of trying to create a "trans in error" was silly, I must say.)

> It's almost always better to punt the problem back to the application,
> and let it decide whether to try again or just curl up and die.

Yes. I dug into it a bit more and I have found the magic place where the library which I'm using did a silent reconnection in the background. I now think this is the place that is wrong - if the connection is not re-established, applications have a chance to notice that something went wrong and react appropriately (do a proper clean-up, or reconnect, or abort, etc.).

> If you have server restarts occurring often enough that this seems
> useful to work on, then I submit that you have problems you ought to be
> fixing on the server side instead.

Agreed. For your information, it does not happen that often, but when it did (once in two years...) it was scary enough to trigger an investigation.

Tom, thank you very much for your help!

Best,
~Marek