Thread: libpq error codes

libpq error codes

From
Denis Perchine
Date:
Hello all,

I am trying to add automatic reconnection to my app,
and I have run into the following problem:

When I execute a query, I get:

query: 1024: 'select count(*) from pg_class'
ResStatus: PGRES_TUPLES_OK
Status: 0

ResStatus is the result of PQresultStatus, Status is the result of PQstatus.

If I shut down postgres between queries, I get:

query: 1024: 'select count(*) from pg_class'
ResStatus: PGRES_FATAL_ERROR
Status: 0
except: pqReadData() --  read() failed: errno=32
<non-ASCII strerror() text, unreadable in this archive>

query: 1024: 'select count(*) from pg_class'
FATAL 1:  The system is shutting down
NOTICE:  AbortTransaction and not in in-progress state
Status: 1
except: pqReadData() -- backend closed the channel unexpectedly.
        This probably means the backend terminated abnormally
        before or while processing the request.

Please note that Status is 0 in the first case: there is no longer any backend on the
other side, yet Status still reports OK. That's bad... And for the second query, PQexec
just returns NULL.

The problem is that I cannot reliably distinguish between SQL errors (or incorrect
SQL usage) and situations where the connection is lost and I should try to reconnect.

Any ideas on how this can be implemented?
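
[Editorial sketch] The distinction asked about here can be framed as a small decision function. The following is only a sketch under stated assumptions: the type query_outcome and the function classify_query are hypothetical names, not part of libpq. The three inputs are meant to be filled from real libpq calls (PQstatus, PQexec, PQresultStatus) as noted in the comments. It does not link against libpq so that it stays self-contained.

```c
#include <stdbool.h>

/* Hypothetical outcome categories for illustration; not part of libpq. */
typedef enum {
    QUERY_OK,
    QUERY_SQL_ERROR,
    QUERY_CONNECTION_LOST
} query_outcome;

/*
 * Inputs are assumed to be derived by the caller as follows:
 *   conn_bad:     PQstatus(conn) == CONNECTION_BAD after the query
 *   result_null:  PQexec(conn, sql) returned NULL
 *   result_fatal: PQresultStatus(res) == PGRES_FATAL_ERROR
 */
query_outcome classify_query(bool conn_bad, bool result_null, bool result_fatal)
{
    if (conn_bad || result_null)
        return QUERY_CONNECTION_LOST;  /* candidate for PQreset() and a retry */
    if (result_fatal)
        return QUERY_SQL_ERROR;        /* report to the caller, do not reconnect */
    return QUERY_OK;
}
```

Note that this sketch would classify the first failure shown above (PGRES_FATAL_ERROR with Status 0) as an SQL error, which is exactly the misclassification described in this message. A NULL result from PQexec can also indicate an out-of-memory condition, so treating it as a lost connection is an assumption as well.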

--
Sincerely Yours,
Denis Perchine

----------------------------------
E-Mail: dyp@perchine.com
HomePage: http://www.perchine.com/dyp/
FidoNet: 2:5000/120.5
----------------------------------

Re: libpq error codes

From
Tom Lane
Date:
Denis Perchine <dyp@perchine.com> writes:
> If I shut down postgres between queries, I get:

> query: 1024: 'select count(*) from pg_class'
> ResStatus: PGRES_FATAL_ERROR
> Status: 0
> except: pqReadData() --  read() failed: errno=32
> <non-ASCII strerror() text, unreadable in this archive>

What version are you running, and are you sure you are using libpq
correctly?  Using psql I see

regression=# select count(*) from pg_class;
 count
-------
   260
(1 row)

< in another window, kill postgres backend >

regression=# select count(*) from pg_class;
FATAL 1:  The system is shutting down
NOTICE:  AbortTransaction and not in in-progress state
pqReadData() -- backend closed the channel unexpectedly.
        This probably means the backend terminated abnormally
        before or while processing the request.
regression=#

which looks pretty reasonable.

I should also point out that in the current system, normal shutdown
(via pg_ctl stop or 'kill' on the postmaster) produces no such result
because extant backends are allowed to finish their sessions normally.

            regards, tom lane

Re: libpq error codes

From
Denis Perchine
Date:
> > If I shut down postgres between queries, I get:
>
> > query: 1024: 'select count(*) from pg_class'
> > ResStatus: PGRES_FATAL_ERROR
> > Status: 0
> > except: pqReadData() --  read() failed: errno=32
> > <non-ASCII strerror() text, unreadable in this archive>
>
> What version are you running, and are you sure you are using libpq
> correctly?  Using psql I see

7.0.2.

And you use a pipe, but I use sockets. If I just do psql -d db, everything is as you said,
but if I do psql -d db -h localhost, the picture is as follows:

db=> select count(*) from pg_class;
 count
-------
 28531
(1 row)

db=> select count(*) from pg_class;
pqReadData() --  read() failed: errno=32
<non-ASCII strerror() text, unreadable in this archive>
db=> select count(*) from pg_class;
FATAL 1:  The system is shutting down
NOTICE:  AbortTransaction and not in in-progress state
pqReadData() -- backend closed the channel unexpectedly.
        This probably means the backend terminated abnormally
        before or while processing the request.

--
Sincerely Yours,
Denis Perchine


Re: libpq error codes

From
Tom Lane
Date:
Denis Perchine <dyp@perchine.com> writes:
> And you use a pipe, but I use sockets. If I just do psql -d db,
> everything is as you said, but if I do psql -d db -h localhost, the
> picture is as follows:

Works the same for me with either pipe or socket connection.  I think
something must be broken on your platform --- what platform are you
using, anyway?

> db=> select count(*) from pg_class;
> pqReadData() --  read() failed: errno=32
> <non-ASCII strerror() text, unreadable in this archive>

The two obvious questions about this are (a) what is errno 32 on
your system and (b) why is your strerror() yielding garbage instead
of an appropriate error message?

On my system errno 32 is EPIPE, but surely read() should never
return EPIPE.

            regards, tom lane

Re: libpq error codes

From
Denis Perchine
Date:
> Works the same for me with either pipe or socket connection.  I think
> something must be broken on your platform --- what platform are you
> using, anyway?

Linux. I've tested this with 2.2.15pre7 & 2.4.0test1-ac22-riel kernels.

> > db=> select count(*) from pg_class;
> > pqReadData() --  read() failed: errno=32
> > <non-ASCII strerror() text, unreadable in this archive>
>
> The two obvious questions about this are (a) what is errno 32 on
> your system and (b) why is your strerror() yielding garbage instead
> of an appropriate error message?

a. It's "Broken pipe".
b. Sorry, it's the text of (a), in Russian.
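
[Editorial sketch] The two questions above (what errno 32 means, and why the message text looked like garbage) both come down to strerror() and the message locale. The following is a minimal sketch, assuming a POSIX system where LC_MESSAGES is available. The function names describe_errno and describe_errno_c_locale are hypothetical, and this is not how libpq formats its messages.

```c
#include <errno.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Format "errno=<n> (<message>)" into buf using the current locale's text. */
int describe_errno(int err, char *buf, size_t buflen)
{
    return snprintf(buf, buflen, "errno=%d (%s)", err, strerror(err));
}

/*
 * Same, but first switch message text to the untranslated "C" locale so the
 * output stays plain ASCII. Note that setlocale() changes the locale for the
 * whole process, which may not be acceptable in a real application.
 */
int describe_errno_c_locale(int err, char *buf, size_t buflen)
{
    setlocale(LC_MESSAGES, "C");
    return describe_errno(err, buf, buflen);
}
```

Calling describe_errno_c_locale(EPIPE, buf, sizeof buf) yields the English "Broken pipe" text. On Linux, EPIPE has the value 32, matching the errno reported in this thread.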

> On my system errno 32 is EPIPE, but surely read() should never
> return EPIPE.

That's right... But what should read() return if the connection is closed by the other side?

--
Sincerely Yours,
Denis Perchine


Re: libpq error codes

From
Tom Lane
Date:
Denis Perchine <dyp@perchine.com> writes:
>>>> db=> select count(*) from pg_class;
>>>> pqReadData() --  read() failed: errno=32
>>>> <non-ASCII strerror() text, unreadable in this archive>
>>
>> The two obvious questions about this are (a) what is errno 32 on
>> your system and (b) why is your strerror() yielding garbage instead
>> of an appropriate error message?

> a. It's "Broken pipe".
> b. Sorry, it's the text of (a), in Russian.

Duh, should've figured it was something like that.  Didn't show up
as anything much on my display...

>> On my system errno 32 is EPIPE, but surely read() should never
>> return EPIPE.

> That's right... But what should read() return if the connection is
> closed by the other side?

EOF, no more and no less.  It is not for the kernel to decide whether
the connection closure represents an application-level error or not.
Sounds like someone has managed to blow this badly in recent Linux TCP
stacks.  Care to file a kernel bug report?
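
[Editorial sketch] The expected behaviour described above, read() returning 0 (EOF) once the peer has closed, can be observed with a small POSIX program fragment. This is only a sketch: it uses a local AF_UNIX socket pair rather than a TCP connection to a backend, so it illustrates the expected semantics and does not reproduce the kernel behaviour reported in this thread. The function name read_after_peer_close is hypothetical.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a connected AF_UNIX stream socket pair, close one end, and read
 * from the other. Returns whatever read() returned, or -2 if the socket
 * pair could not be created.
 */
ssize_t read_after_peer_close(void)
{
    int fds[2];
    char buf[16];
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -2;

    close(fds[1]);                      /* the peer goes away cleanly */
    n = read(fds[0], buf, sizeof buf);  /* a clean close reads as EOF */
    close(fds[0]);
    return n;
}
```

With the peer closed and no data pending, read() on the surviving end returns 0, so the function returns 0.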

In the meantime, it's probably reasonable for libpq to treat EPIPE from
read() the same as EOF --- if I recall correctly, it already tests for
ECONNRESET instead of EOF from kernels that have that variety of
braindamage, so adding a defense against this variety is fair game.
If you look in src/interfaces/libpq/fe-misc.c the places to fix should
be obvious (but note there are two or three of them, not just one).
Please try it out and submit a patch after you've verified it fixes
your problem.
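
[Editorial sketch] The defensive handling suggested above, treating EPIPE from read() the same as EOF alongside ECONNRESET, could look roughly like the following. This is not the code in fe-misc.c and not the patch that was eventually written. The names read_outcome and classify_read are hypothetical, and the retry cases (EINTR, EAGAIN, EWOULDBLOCK) are an assumption added to make the sketch complete.

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical categories for the result of one read() call. */
typedef enum {
    READ_DATA,    /* nread bytes are available */
    READ_RETRY,   /* transient condition, call read() again later */
    READ_CLOSED,  /* peer closed the connection */
    READ_ERROR    /* any other failure */
} read_outcome;

/*
 * nread is the return value of read(); err is the errno value saved
 * immediately after the call (only meaningful when nread < 0).
 */
read_outcome classify_read(ssize_t nread, int err)
{
    if (nread > 0)
        return READ_DATA;
    if (nread == 0)
        return READ_CLOSED;             /* plain EOF */
    if (err == EINTR || err == EAGAIN || err == EWOULDBLOCK)
        return READ_RETRY;
    if (err == ECONNRESET || err == EPIPE)
        return READ_CLOSED;             /* kernels that report closure as an error */
    return READ_ERROR;
}
```

A caller would save errno right after read() and pass both values in, then report a closed connection for READ_CLOSED regardless of whether the kernel signalled it as EOF, ECONNRESET, or EPIPE.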

            regards, tom lane