Thread: Re: Intermittent errors when fetching cursor rows on PostgreSQL 16
On 12/17/24 08:30, Enrico Schenone wrote: > Good day. > My name is Enrico Schenone, from Genoa, Italy. > I'm a software achitect working at Cleis Tech - Genoa - Italy - > http://gruppocleis.it > Me and my team are using PostgreSQL v12 to v16 on Debian 10-12 and > Ubuntu Server 22.04 LTS with no-cluster configuration. > Our applications are developed with 4Js Genero platform (classified as > LCAP) - https://4js.com > > I whish to report an issue where I can't say if it happens at server or > client side (or both as well). This: "unexpected EOF on client connection " makes me believe this is on client side. To be clear the client is running on Ubuntu Server 22.04, correct? Have you looked at the OS system log for relevant entries at the time the error occurs? If so what are they? This only happens in production environment, is there anything in it that is materially different from where you ran the test below? > Hoping you can help me or address to someone who can do it. > Thanks in advance. > Enrico > -- > > *Enrico Schenone* > Software Architect > -- Adrian Klaver adrian.klaver@aklaver.com
On 12/18/24 23:52, Enrico Schenone wrote: > Good day, Adrian. > First of all I thank-you for you answer. > At the time the error occurs, dozens of other SQL sessions are active & > running on DB server, and no-one is reporting any error at-all (not only > fetch errors). > This happens sometimes also in system with no (or low) stress situations. > > One of things I don't understand is why at client side I get the XX001 > error on the FETCH (normally the first fetch) while at server side I > heve no error related to the fetch forward ? Where are you fetching the client error messages from? > Another is why in the meantime no other client application report an > error, considering that there may be several parallel instances of the > same client application ? > And finally why after seconds or minutes the same process newly > instantiated works with no more errors ? Answers to this and the below is going to need the client code. > > I can suppose that the client closes the connection once got the XX001 > error, but I can't say why it receives this error while it is not > reported at server side and not block i/o error is reported. > Is it a false positive or what ? > > Four Js support said <We use the standard C API provided by the DB > vendor. In the case of PostgreSQL, we use the C API client - > https://www.postgresql.org/docs/current/libpq.html > > > At client side I have installed the following PostgreSQL packages ... > > postgresql-client-16:amd64/jammy-pgdg 16.5-1.pgdg22.04+1 > upgradeable to 16.6-1.pgdg22.04+1 > postgresql-client-common:all/jammy-pgdg 262.pgdg22.04+1 upgradeable > to 267.pgdg22.04+1 > > Best regards. > Enrico > > > *Enrico Schenone* > Software Architect > > *Cleis Tech s.r.l.* - www.gruppocleis.it > Sede di Genova, Via Paolo Emilio Bensa, 2 - 16124 Genova, ITALY > Tel: +39-0104071400 Fax: +39-0104073276 > Mobile: +39-320 7709352 > E-mail: eschenone@cleistech.it > > <https://gruppocleis.it><https://ibm.biz/BdqAJh> > > <https://ibm.biz/BdqAJh> > <https://ibm.biz/BdqAJh> > Il 19/12/24 00:11, Adrian Klaver ha scritto: >> On 12/17/24 08:30, Enrico Schenone wrote: >>> Good day. >>> My name is Enrico Schenone, from Genoa, Italy. >>> I'm a software achitect working at Cleis Tech - Genoa - Italy - >>> http://gruppocleis.it >>> Me and my team are using PostgreSQL v12 to v16 on Debian 10-12 and >>> Ubuntu Server 22.04 LTS with no-cluster configuration. >>> Our applications are developed with 4Js Genero platform (classified >>> as LCAP) - https://4js.com >>> >>> I whish to report an issue where I can't say if it happens at server >>> or client side (or both as well). >> >> This: >> >> "unexpected EOF on client connection " >> >> makes me believe this is on client side. >> >> To be clear the client is running on Ubuntu Server 22.04, correct? >> >> Have you looked at the OS system log for relevant entries at the time >> the error occurs? >> >> If so what are they? >> >> This only happens in production environment, is there anything in it >> that is materially different from where you ran the test below? >> >> >>> Hoping you can help me or address to someone who can do it. >>> Thanks in advance. >>> Enrico >>> -- >>> >>> *Enrico Schenone* >>> Software Architect >>> >> > -- Adrian Klaver adrian.klaver@aklaver.com
On 12/19/24 10:11, Enrico Schenone wrote: > Good day, Adrian. > I get the error inside the program by catching the exception and logging > it with diagnostic info provided by the DVM (a runtime interpreter > similar in concept to a JVM) that embed the PG driver. DVM is this?: https://www.geeksforgeeks.org/what-is-dvmdalvik-virtual-machine/ In other words an Android client? > I can't give you info on what the DVM does at low level, but I can send > you the distinct full session log fragment at server side, where it is > quite easy to understand how the DVM translates the program's SQL > queries end what PostgreSQL does. That might be useful. > May I give you any other info ? Not at the moment. > Do you think it can be useful to include in this thread the 4Js Suppory > guys ? I could see filing an issue and pointing at this thread: https://www.postgresql.org/message-id/446423eb-4a4e-4135-bbb8-4d0e5c7aac3b%40cleistech.it > > Thanks again and best regards. > Enrico > > -- Adrian Klaver adrian.klaver@aklaver.com
On 12/19/24 11:40 AM, Enrico Schenone wrote: > Hello, my answers in line along your message ... > Thanks a lot again. > > Enrico > >> On 12/19/24 10:11, Enrico Schenone wrote: >>> Good day, Adrian. >>> I get the error inside the program by catching the exception and >>> logging it with diagnostic info provided by the DVM (a runtime >>> interpreter similar in concept to a JVM) that embed the PG driver. >> > The 4Js DVM (Dynamic Virtual Machine) is that one > https://4js.com/online_documentation/fjs-gas-manual-html/index.html#gas-topics/c_gas_what_is_dvm.html > >> In other words an Android client? >> > No, it is a runtime interpreter for Linux, Windows, IBM AIX, macOS and > other unix-like OSs. It ensures the portability of 4Js Genero compiled > programs (p-code) on several OS platforms. > 4Js Genero is a Low Code Application Platform. The programming language, > named "BDL - Business Development Language", is an evolution of the > Informix-4gl. > Compiled programs needs a runtime interpreter (DVM) to be executed. > The DVM embeds at low-level the DB drivers provided by several vendors, From previous post you mentioned: "Four Js support said <We use the standard C API provided by the DB vendor. In the case of PostgreSQL, we use the C API client " So are they building their own driver over libpq? > and at BDL high level the application program can easily connect to the > major DBs on the market thanks to its ODI (Open Database Interface). >>> I can't give you info on what the DVM does at low level, but I can >>> send you the distinct full session log fragment at server side, where >>> it is quite easy to understand how the DVM translates the program's >>> SQL queries end what PostgreSQL does. >> >> That might be useful. >> > Please take a look to the attached text file, that is the full failing > session log (filtered from the debug5 PostgreSQL server log). This is where it falls off the rails, but I can't see why?: 2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 LOCATION: ShowTransactionStateRec, xact.c:5510 2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 STATEMENT: fetch forward 50 from cu6 2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 LOG: 00000: statement: fetch forward 50 from cu6 2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 LOCATION: exec_simple_query, postgres.c:1073 2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 DEBUG: 00000: CommitTransaction(1) name: unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid: 0/1/0 2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 LOCATION: ShowTransactionStateRec, xact.c:5510 2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 STATEMENT: fetch forward 50 from cu6 2024-12-16 17:27:14.407 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17908006676054e0.21cb42 LOG: 08006: could not receive data from client: Connessione interrotta dal corrispondente >>> Thanks again and best regards. >>> Enrico -- Adrian Klaver adrian.klaver@aklaver.com
At 19/12/24 22:47, Adrian Klaver wrote:
They wrote ...
<
This seems to match exactly with the error XX001 reported by the client application.
Enrico
I think so.
On 12/19/24 11:40 AM, Enrico Schenone wrote:Hello, my answers in line along your message ...
Thanks a lot again.
EnricoOn 12/19/24 10:11, Enrico Schenone wrote:Good day, Adrian.
I get the error inside the program by catching the exception and logging it with diagnostic info provided by the DVM (a runtime interpreter similar in concept to a JVM) that embed the PG driver.The 4Js DVM (Dynamic Virtual Machine) is that one https://4js.com/online_documentation/fjs-gas-manual-html/index.html#gas-topics/c_gas_what_is_dvm.htmlIn other words an Android client?No, it is a runtime interpreter for Linux, Windows, IBM AIX, macOS and other unix-like OSs. It ensures the portability of 4Js Genero compiled programs (p-code) on several OS platforms.
4Js Genero is a Low Code Application Platform. The programming language, named "BDL - Business Development Language", is an evolution of the Informix-4gl.
Compiled programs needs a runtime interpreter (DVM) to be executed.
The DVM embeds at low-level the DB drivers provided by several vendors,
From previous post you mentioned:
"Four Js support said <We use the standard C API provided by the DB
vendor. In the case of PostgreSQL, we use the C API client "
So are they building their own driver over libpq?
They wrote ...
<
The error “no connection to the server“ is definitively a PostgreSQL error:
./src/interfaces/libpq/fe-exec.c: libpq_append_conn_error(conn, "no connection to the server");
It is not normal that PostgreSQL client can connect to the server, do some SQL with success and then the SQL connection gets dropped at the next SQL statement execution. This is really suspicious.
>Yes, at 2024-12-16 17:27:14.407and at BDL high level the application program can easily connect to the major DBs on the market thanks to its ODI (Open Database Interface).Please take a look to the attached text file, that is the full failing session log (filtered from the debug5 PostgreSQL server log).I can't give you info on what the DVM does at low level, but I can send you the distinct full session log fragment at server side, where it is quite easy to understand how the DVM translates the program's SQL queries end what PostgreSQL does.
That might be useful.
This is where it falls off the rails, but I can't see why?:
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 LOCATION: ShowTransactionStateRec, xact.c:5510
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 STATEMENT: fetch forward 50 from cu6
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 LOG: 00000: statement: fetch forward 50 from cu6
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 LOCATION: exec_simple_query, postgres.c:1073
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 DEBUG: 00000: CommitTransaction(1) name: unnamed; blockState: STARTED; state: INPROGRESS, xid/subid/cid: 0/1/0
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 LOCATION: ShowTransactionStateRec, xact.c:5510
2024-12-16 17:27:14.406 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17900000676054e0.21cb42 STATEMENT: fetch forward 50 from cu6
2024-12-16 17:27:14.407 CET [2214722] cleistech@hh24odds_prod - 192.168.16.17908006676054e0.21cb42 LOG: 08006: could not receive data from client: Connessione interrotta dal corrispondente
This seems to match exactly with the error XX001 reported by the client application.
Best regards.Thanks again and best regards.
Enrico
Enrico
On 12/19/24 23:57, Enrico Schenone wrote: > At 19/12/24 22:47, Adrian Klaver wrote: >> >> >> So are they building their own driver over libpq? > I think so. > They wrote ... > < > > /The error “no connection to the server“ is definitively a PostgreSQL > error:/ > > /||/ > /|./src/interfaces/libpq/fe-exec.c: libpq_append_conn_error(conn, "no > connection to the server");|/ > > /It is not normal that PostgreSQL client can connect to the server, do > some SQL with success and then the SQL connection gets dropped at the > next SQL statement execution. This is really suspicious./ They must work in a perfect world where networks never fail. One that would render the blog below unneeded: https://www.cybertec-postgresql.com/en/tcp-keepalive-for-a-better-postgresql-experience/ > > > > >> > Best regards. > Enrico -- Adrian Klaver adrian.klaver@aklaver.com
On 12/20/24 07:02, Enrico Schenone wrote: > Hi, Adrian. > Today I have collected a tcpdump at client side with communications > between application server and db server while the issue was occurring > one time per second on another program. > I send you two files. > The first one is a zipped tarball (.tgz) containing a text > representation of the tcpdump starting at point where it reports the > declaration of the failing cursor ("cu4" as you can see in the first > line of the file) and subsequent fetch. Consider that the client > application log detected the XX001 error on the first FETCH of the > cursor at 2024-12-20 12:17:35.175 > The second file (zipped tarball .tgz) is too big to be sent as > attachment, so I provide a link where it can be downloaded. It is the > fraction of tcpdump recorded during the program failure (occurred > several times). It is in .pcap format so it is possible to open it with > Wireshark or tcpdump -A -r > Anyone interested can download it at > https://cleislabs.cleistech.it/downloads/tcpdump_out009.pcap.tgz > > Consider that during the dump several different cursor was declared with > the name "cu4", but the one failing is the one of the first line. > Maybe an expert (I'm not so expert) can see if the disconnection is > really made by the client and/or if the data returned by the server are > really corrupted as per XX001 SQLSTATE. This is beyond me, someone else will need to chime in. > > Best regards. > Enrico > > Il 19/12/24 22:47, Adrian Klaver ha scritto: -- Adrian Klaver adrian.klaver@aklaver.com
On 12/20/24 12:05, mark bradley wrote: > I'm getting a strange error message when I try to insert a date using > the view/edit grid in pgadmin. See below. I've tried quotes, no quotes > and various formats. The column type is clearly "date." Don't hijack a thread, start a new one. > > > Mark Brady, Ph.D. > Deputy Chief Data Officer, TRMC > _amazon.com/author/markjbrady <https://amazon.com/author/markjbrady>_ > ------------------------------------------------------------------------ -- Adrian Klaver adrian.klaver@aklaver.com