Re: client waits for end of update operation and server proc is idle - Mailing list pgsql-general

From Matthias Apitz
Subject Re: client waits for end of update operation and server proc is idle
Date
Msg-id YIVYich/4L12jk3a@c720-r368166.fritz.box
Whole thread Raw
In response to Re: client waits for end of update operation and server proc is idle  (Matthias Apitz <guru@unixarea.de>)
Responses Re: client waits for end of update operation and server proc is idle  (Matthias Apitz <guru@unixarea.de>)
Re: client waits for end of update operation and server proc is idle  (Matthias Apitz <guru@unixarea.de>)
List pgsql-general
El día viernes, abril 23, 2021 a las 11:32:02a. m. +0200, Matthias Apitz escribió:

> El día viernes, abril 23, 2021 a las 10:48:24a. m. +0200, Laurenz Albe escribió:
> 
> > > The serverlog has around this time (sorry for German):
> > > 
> > > 2021-04-23 05:55:23.591 CEST [2317] LOG:  unvollständige Message vom Client
> > > 2021-04-23 05:55:23.593 CEST [2317] FEHLER:  Speicher aufgebraucht
> > > 2021-04-23 05:55:23.593 CEST [2317] DETAIL:  Kann Zeichenkettenpuffer mit 0 Bytes nicht um 1380986158 Bytes
vergrößern.
> > > 2021-04-23 05:55:23.593 CEST [2317] LOG:  konnte Daten nicht an den Client senden: Datenübergabe unterbrochen
(brokenpipe)
 
> > > 2021-04-23 05:55:23.593 CEST [2317] FATAL:  Verbindung zum Client wurde verloren
> > > 2021-04-23 05:57:20.339 CEST [27190] LOG:  unerwartetes EOF auf Client-Verbindung mit einer offenen Transaktion
> > > 2021-04-23 05:59:17.419 CEST [26827] LOG:  unerwartetes EOF auf Client-Verbindung mit einer offenen Transaktion
> > > 
> > > What else could be checked for this? Any hints about this issue?
> > 
> > It seems pretty clear that both the client and the server are waiting for each other.
> 
> Yes, they are both alive as processes. Both are running on the same
> Linux server, i.e. no real network between. And any loss of data would
> have caused perhaps the end of one (and then the other) process.
> I will try to gather a TCPDUMP for this.
> 

I hit the same situation again. The ESQL/C client sent an UPDATE to the
PostgreSQL server:

# gdb /opt/lib/sisis/catserver/bin/INDEX 31864
...
#0  0x00007f688bb955c4 in poll () from /lib64/libc.so.6
#1  0x00007f688a82622f in pqSocketPoll () from /usr/local/sisis-pap/pgsql/lib/libpq.so.5
#2  0x00007f688a8260ff in pqSocketCheck () from /usr/local/sisis-pap/pgsql/lib/libpq.so.5
#3  0x00007f688a825f83 in pqWaitTimed () from /usr/local/sisis-pap/pgsql/lib/libpq.so.5
#4  0x00007f688a825f55 in pqWait () from /usr/local/sisis-pap/pgsql/lib/libpq.so.5
#5  0x00007f688a821d3a in PQgetResult () from /usr/local/sisis-pap/pgsql/lib/libpq.so.5
#6  0x00007f688a822478 in PQexecFinish () from /usr/local/sisis-pap/pgsql/lib/libpq.so.5
#7  0x00007f688a82212e in PQexecParams () from /usr/local/sisis-pap/pgsql/lib/libpq.so.5
#8  0x00007f688b792216 in ecpg_execute ()
   from /usr/local/sisis-pap/pgsql/lib/libecpg.so.6
#9  0x00007f688b7935de in ecpg_do () from /usr/local/sisis-pap/pgsql/lib/libecpg.so.6
#10 0x00007f688b7936ef in ECPGdo () from /usr/local/sisis-pap/pgsql/lib/libecpg.so.6
#11 0x00007f688dd831c6 in update_record () at per_kette.pgc:1968
#12 0x00007f688dd7ed16 in per_kette (zugriff=10, scroll=-20000, lock=-20000, key=-20000, 
    sto=-20000, p_daten=0x7ffef0499950, sel_anw=0x0, where_anw=0x0, p_btw_daten=0x0, 
    order_by=0x0, auf_ab=0x0, group_by=0x0, having=0x0, into_temp=0x0, 
    count=0x7ffef0498670) at per_kette.pgc:497
#13 0x00007f688dcf770f in DB_rewr (tabmodul=0x7f688dd7e7ca <per_kette>, 
    p_daten=0x7ffef0499950) at dbcall.pgc:3274
#14 0x00007f688ea58821 in BKDeskInsert (setid=FSTAB_Personen, kette=..., 
    katkey=168952979) at ./BKDesk.c:1447


and the Pos server sits idle:

# select * from pg_stat_activity where client_port = 52288;

 datid  | datname |  pid  | leader_pid | usesysid | usename |        application_name         | client_addr |
client_hostname| client_port |          backend_start          |           xact_start            |
query_start          |          state_change           | wait_event_type | wait_event |        state        |
backend_xid| backend_xmin |        query        |  backend_type  
 

--------+---------+-------+------------+----------+---------+---------------------------------+-------------+-----------------+-------------+---------------------------------+---------------------------------+---------------------------------+---------------------------------+-----------------+------------+---------------------+-------------+--------------+---------------------+----------------
 951941 | sisis   | 31894 |            |    16384 | sisis   | SunRise DBCALL V7.2 (pid=31864) | 127.0.0.1   |
     |       52288 | 25.04.2021 10:57:15.922137 CEST | 25.04.2021 10:57:15.968081 CEST | 25.04.2021 10:57:16.051256
CEST| 25.04.2021 10:57:16.051273 CEST | Client          | ClientRead | idle in transaction |    18314796 |     18314796
|fetch per_kette_seq | client backend
 


But this time I was collecting all TCP traffic of all clients to and
from the port 5432. This TCPDUMP command writes all traffic as binary
data to files of 100.000.000 bytes in size (-C 100) and appends a number
when a new file is needed:

# /usr/sbin/tcpdump -n -i lo -C 100 -w /var/spool/sisis/tcp5432. port 5432 &

Later one can look into the files for the local port 52288 of the
connection with:

# tcpdump -n -r tcp5432.2 -Xx -s1024 port 52288 > INDEX.52288.txt


10:57:15.921783 IP 127.0.0.1.52288 > 127.0.0.1.5432: Flags [S], seq 3652783347, win 65495, options [mss 65495,sackOK,TS
val3424445579 ecr 0,nop,wscale 7], length 0
 
    0x0000:  4500 003c b720 4000 4006 8599 7f00 0001  E..<..@.@.......
    0x0010:  7f00 0001 cc40 1538 d9b9 0cf3 0000 0000  .....@.8........
    0x0020:  a002 ffd7 fe30 0000 0204 ffd7 0402 080a  .....0..........
    0x0030:  cc1c e48b 0000 0000 0103 0307            ............
10:57:15.921792 IP 127.0.0.1.5432 > 127.0.0.1.52288: Flags [S.], seq 4241480135, ack 3652783348, win 65483, options
[mss65495,sackOK,TS val 3424445579 ecr 3424445579,nop,wscale 7], length 0
 
    0x0000:  4500 003c 0000 4000 4006 3cba 7f00 0001  E..<..@.@.<.....
    0x0010:  7f00 0001 1538 cc40 fccf d9c7 d9b9 0cf4  .....8.@........
    0x0020:  a012 ffcb fe30 0000 0204 ffd7 0402 080a  .....0..........
    0x0030:  cc1c e48b cc1c e48b 0103 0307            ............
10:57:15.921799 IP 127.0.0.1.52288 > 127.0.0.1.5432: Flags [.], ack 1, win 512, options [nop,nop,TS val 3424445579 ecr
3424445579],length 0
 
    0x0000:  4500 0034 b721 4000 4006 85a0 7f00 0001  E..4.!@.@.......
    0x0010:  7f00 0001 cc40 1538 d9b9 0cf4 fccf d9c8  .....@.8........
    0x0020:  8010 0200 fe28 0000 0101 080a cc1c e48b  .....(..........
    0x0030:  cc1c e48b                                ....



And here you see the last command (the UPDATE) sent to the Pos server:

10:57:16.051388 IP 127.0.0.1.52288 > 127.0.0.1.5432: Flags [P.], seq 14407:14812, ack 120304, win 512, options
[nop,nop,TSval 3424445708 ecr 3424445708], length 405
 
    0x0000:  4500 01c9 b84f 4000 4006 82dd 7f00 0001  E....O@.@.......
    0x0010:  7f00 0001 cc40 1538 d9b9 453a fcd1 afb7  .....@.8..E:....
    0x0020:  8018 0200 ffbd 0000 0101 080a cc1c e50c  ................
    0x0030:  cc1c e50c 5000 0000 7e00 7570 6461 7465  ....P...~.update
    0x0040:  2070 6572 5f6b 6574 7465 2073 6574 2064  .per_kette.set.d
    0x0050:  6573 6b6e 7220 3d20 2431 2020 2c20 6672  esknr.=.$1..,.fr
    0x0060:  6569 203d 2024 3220 202c 206d 6f64 6966  ei.=.$2..,.modif
    0x0070:  203d 2024 3320 202c 2061 6e7a 203d 2024  .=.$3..,.anz.=.$
    0x0080:  3420 202c 206b 6174 6c69 7374 203d 2024  4..,.katlist.=.$
    0x0090:  3520 2077 6865 7265 2063 7572 7265 6e74  5..where.current
    0x00a0:  206f 6620 6863 5f70 6572 5f6b 6574 7465  .of.hc_per_kette
    0x00b0:  0000 0042 0000 00ff 0000 0000 0005 0000  ...B............
    0x00c0:  0007 3133 3834 3131 3800 0000 0130 0000  ..1384118....0..
    0x00d0:  0001 3000 0000 0232 3600 0000 d25c 7865  ..0....26....\xe
    0x00e0:  6539 6530 3130 3031 6561 6630 3130 3035  e9e01001eaf01005
    0x00f0:  3761 6630 3130 3030 3563 6530 3130 3038  7af010005ce01008
    0x0100:  3264 3230 3130 3064 3466 3830 3130 3038  2d20100d4f801008
    0x0110:  6530 6130 3230 3061 6633 3430 3230 3032  e0a0200af3402002
    0x0120:  3533 6130 3230 3061 3538 6530 3230 3061  53a0200a58e0200a
    0x0130:  6538 6530 3230 3034 3461 3630 3230 3033  e8e020044a602003
    0x0140:  6561 3730 3230 3032 3066 3130 3330 3032  ea7020020f103002
    0x0150:  3466 3130 3330 3035 3431 3730 3430 3061  4f1030054170400a
    0x0160:  3034 3330 3430 3033 6636 3730 3430 3062  04304003f670400b
    0x0170:  6138 3930 3430 3033 6265 6430 3030 3030  a8904003bed00000
    0x0180:  3534 3930 3130 3039 3235 3230 3130 3037  5490100925201007
    0x0190:  3435 3430 3130 3038 3336 3030 3130 3038  4540100836001008
    0x01a0:  3238 3330 3130 3039 3330 3431 3230 6100  28301009304120a.
    0x01b0:  0100 0044 0000 0006 5000 4500 0000 0900  ...D....P.E.....
    0x01c0:  0000 0000 5300 0000 04                   ....S....
10:57:16.051395 IP 127.0.0.1.5432 > 127.0.0.1.52288: Flags [.], ack 14812, win 512, options [nop,nop,TS val 3424445708
ecr3424445708], length 0
 
    0x0000:  4500 0034 453c 4000 4006 f785 7f00 0001  E..4E<@.@.......
    0x0010:  7f00 0001 1538 cc40 fcd1 afb7 d9b9 46cf  .....8.@......F.
    0x0020:  8010 0200 fe28 0000 0101 080a cc1c e50c  .....(..........
    0x0030:  cc1c e50c                                ....

The send buffer is complete, the bytea string \x ....00 is terminated
by a NULL byte. TCP ACK'ed the package and then no further reaction of
the Pos server on this connection. It still sits there if someone needs
more information about this connection.

From the collected data, I tend to say: this must be a bug in the Pos
server...

Any ideas?

    matthias
-- 
Matthias Apitz, ✉ guru@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub



pgsql-general by date:

Previous
From: Mohan Radhakrishnan
Date:
Subject: Re: BRIN index on timestamptz
Next
From: Malik Rumi
Date:
Subject: impact of version upgrade on fts