Help debugging a hung postgresql client connection - Mailing list pgsql-general

From Venkatraju T.V.
Subject Help debugging a hung postgresql client connection
Date
Msg-id 1177341415.409948.299590@n59g2000hsh.googlegroups.com
Responses Re: Help debugging a hung postgresql client connection  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Hi,

We have a Python application that spawns multiple threads that write to
the database. We are seeing an intermittent problem where one of the
threads will "block" seemingly forever. We saw a similar problem twice
with the same signature in a C++ based program as well (but that
seems harder to reproduce).

This is PostgreSQL 8.1.8-1 on a Fedora Core 5 system (x86_64 arch).
The PostgreSQL library used is the Python pgdb library.
The backtrace for the client program is:
#0  0x000000321e6c3086 in poll () from /lib64/libc.so.6
#1  0x0000003379c0f775 in pqSocketCheck (conn=0x61f070, forRead=1, forWrite=0, end_time=-1) at fe-misc.c:1039
#2  0x0000003379c0f870 in pqWaitTimed (forRead=Variable "forRead" is not available.
) at fe-misc.c:913
#3  0x0000003379c0e792 in PQgetResult (conn=0x61f070) at fe-exec.c:1186
#4  0x0000003379c0e86e in PQexecFinish (conn=0x61f070) at fe-exec.c:1415
#5  0x00002aaaae91b341 in pgsource_execute (self=0x2aaaaea4da50, args=Variable "args" is not available.
) at pgmodule.c:534
#6  0x00000032283954a0 in PyEval_EvalFrame () from /usr/lib64/libpython2.4.so.1.0

The process is connected to the database on the local machine. The
connection information in ps says "idle in transaction".

The backtrace for the postgresql backend process at that time:
(gdb) bt
#0  0x000000321e6cc5e5 in recv () from /lib64/libc.so.6
#1  0x00000000005004f6 in secure_read (port=0xa0e0f0, ptr=0x7dc8c0, len=8192) at /usr/include/bits/socket2.h:35
#2  0x0000000000506114 in pq_recvbuf () at pqcomm.c:697
#3  0x0000000000506537 in pq_getbyte () at pqcomm.c:738
#4  0x00000000005688b7 in PostgresMain (argc=4, argv=0x9ef7d8, username=0x9ef7a0 "user") at postgres.c:289
#5  0x00000000005422fb in ServerLoop () at postmaster.c:2865
#6  0x0000000000543234 in PostmasterMain (argc=5, argv=0x9ec590) at postmaster.c:941
#7  0x0000000000507ebe in main (argc=5, argv=0x1a) at main.c:265
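
Taken together, the two traces show each side waiting to read from its socket: the client in poll() with forRead=1 and end_time=-1 (no timeout), the backend in recv(). The following is only a minimal Python sketch of that wait pattern, not the libpq code itself. The socketpair, the helper name wait_readable and the 100 ms timeout are stand-ins chosen for illustration, and select.poll assumes a Unix-like system.

```python
import select
import socket


def wait_readable(sock, timeout_ms):
    """Return True if sock becomes readable within timeout_ms, else False.

    Passing timeout_ms=None would block until data arrives, which is the
    kind of unbounded wait that end_time=-1 asks for in the client trace.
    """
    poller = select.poll()
    poller.register(sock, select.POLLIN)
    return bool(poller.poll(timeout_ms))


# Hypothetical stand-ins for the client and backend ends of one connection.
client_end, backend_end = socket.socketpair()

# Nothing has been sent in either direction, so both waits time out.
client_ready = wait_readable(client_end, 100)
backend_ready = wait_readable(backend_end, 100)

# As soon as one side writes, the other side's wait returns.
backend_end.sendall(b"Z")
client_ready_after_send = wait_readable(client_end, 100)

client_end.close()
backend_end.close()
```

With a finite timeout the stalled wait becomes observable instead of hanging the thread, which may help when collecting more information the next time the problem occurs.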

There does not appear to be any deadlock:
select * from pg_locks;
   locktype    | database | relation | page | tuple | transactionid | classid | objid | objsubid | transaction |  pid  |      mode       | granted
---------------+----------+----------+------+-------+---------------+---------+-------+----------+-------------+-------+-----------------+---------
 transactionid |          |          |      |       |      74260015 |         |       |          |    74260015 | 16855 | ExclusiveLock   | t
 relation      |   217529 |     1247 |      |       |               |         |       |          |    74260015 | 16855 | AccessShareLock | t
 transactionid |          |          |      |       |      74531807 |         |       |          |    74531807 | 10377 | ExclusiveLock   | t
 relation      |   217529 |    10342 |      |       |               |         |       |          |    74695161 | 22791 | AccessShareLock | t
 transactionid |          |          |      |       |      74695161 |         |       |          |    74695161 | 22791 | ExclusiveLock   | t
 relation      |   217529 |   218724 |      |       |               |         |       |          |    74260015 | 16855 | AccessShareLock | t
(6 rows)
\q
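
The "no deadlock" reading rests on every pg_locks row above having granted = t; a backend waiting on a lock would show a row with granted = f. As a rough sketch of that check, the snippet below copies the locktype, pid, mode and granted values from the six rows by hand. The names LOCK_ROWS, waiting_pids and locks_by_pid are made up for this illustration and are not part of any PostgreSQL or pgdb API.

```python
# (locktype, pid, mode, granted) copied from the six pg_locks rows above.
LOCK_ROWS = [
    ("transactionid", 16855, "ExclusiveLock", True),
    ("relation", 16855, "AccessShareLock", True),
    ("transactionid", 10377, "ExclusiveLock", True),
    ("relation", 22791, "AccessShareLock", True),
    ("transactionid", 22791, "ExclusiveLock", True),
    ("relation", 16855, "AccessShareLock", True),
]


def waiting_pids(rows):
    """Return the sorted pids that have at least one ungranted lock request."""
    return sorted({pid for _, pid, _, granted in rows if not granted})


def locks_by_pid(rows):
    """Count how many lock rows each backend pid appears in."""
    counts = {}
    for _, pid, _, _ in rows:
        counts[pid] = counts.get(pid, 0) + 1
    return counts


print(waiting_pids(LOCK_ROWS))
print(locks_by_pid(LOCK_ROWS))
```

An empty list from waiting_pids matches the observation that no backend is blocked on a lock in this snapshot.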

libpq is built with thread-safety as far as I can tell -
#define ENABLE_THREAD_SAFETY 1
in /usr/include/pg_config_x86_64.h

Any suggestions for what else I can try to narrow down the problem? I
found a couple of similar problems in the archives, but no solution
that I could apply.

Please let me know if there is any other information that can be
collected to help debug this and I will collect it the next time the
problem is seen. Any suggestions appreciated.

Thanks in advance,
Venkat

