Re: Error: no connection to the server - Mailing list pgsql-interfaces
From | William B. Clay |
---|---|
Subject | Re: Error: no connection to the server |
Date | |
Msg-id | 76efa9df-ca28-59a6-35c6-2ff58210333c@acm.org Whole thread Raw |
In response to | Error: no connection to the server (Marco Bambini <marco@sqlabs.com>) |
List | pgsql-interfaces |
On 04/25/2016 09:58 AM, Marco Bambini wrote: > I have a multithreaded C client and sometimes I receive the "no connection to the server" error message. > I haven't found any documentation about it and about how to fix this issue. Marco, what interface to PGSQL are you using? I haven't seen this exact message coming from from libpq, so I wonder if it's coming from an intermediate SQL library that is calling PostgreSQL. Presuming your client established a session with the server, then this probably means you've lost network connectivity (link down; server/service down; TCP session abandoned due to an inactivity timer somewhere). If your client is intended to be long-running and reliable, it needs to recognize and recover from such outages. Below is an example of insert-only code that has reliably recovered from the inevitable temporary outages in transatlantic Internet connectivity (not to mention scheduled and unscheduled server bounces) while running 7x24 for several years. Good luck, Bill Clay Modena, Italy ==== code snippets ==== The "lost connection" detection snippet: rc = pselect(maxfds, &rfds, &rfds, &rfds, NULL, &nousr2_smask); /* what woke us up? */ if (rc < -1 || rc > 2 || (rc == -1 && errno != EINTR && errno != EBADF)) { /* unexpected result from pselect(): die immediately */ motion_log(LOG_ERR, 1, "Unexpected pgwrite.cpselect() result=%d; PGSQL writer terminating", rc); pgwcnt.quit = PGSQL_TERM_FORCE; return; /* things are totally broken, give up */ } else if (errno == EBADF) { /* invalid FD means DBMS session died; attempt recovery */ if( pgwrite_init() < 0 ) { /* unrecoverable error while awaiting session recovery; abort */ pgwcnt.quit = PGSQL_TERM_FORCE; return; } else { /* all transactions in flightfailed; don't wait for them to complete */ qelem_last = NULL; } } else if (rc>0) { /* PGSQL operation completion: invite PGSQL client library to */ /* process any pending work back from server */ if (!PQconsumeInput(pgwcnt.conn)) { /* did we lose the DBMS connection? */ if (PQstatus(pgwcnt.conn) != PGSQL_CONNECTION_OK) { /* attempt to open new DBMS session */ if ( pgwrite_init() < 0 ) { /* unrecoverableerror while awaiting session recovery; abort */ pgwcnt.quit = PGSQL_TERM_FORCE; return; } else { /* we lost all transactions in flight; don'texpect them to clear */ qelem_last = NULL; } } else { /* not sure what went wrong, keep trying */ motion_log(LOG_ERR, 0, "PQconsumeInput failed,pgwrite.c: %s.", PQerrorMessage(pgwcnt.conn)); } } else { /* check resultof just-completed operation */ The (re-)connection snippet: /** * * pgwrite_init: connect to PGSQL DBMS and PREPARE configured SQL statements * used for initial connectionand recovery of broken sessions * * on entry: * SIGUSR2 should be masked; we restore it on exit * * pgwcnt.mutexNOT held, except startup, when queue always empty * * returns: -1 on unrecoverable error * 0 on successfulfirst open attempt * +1 on successful open after failed sessions or open attempts * */ int pgwrite_init() { int ret = 0; // return value int retries = 0; // connection attempt count int qeaband = 0; // localcount abandoned queue elements int i, j; int so_val; // getsockopt() parms socklen_t so_len = sizeof(so_val); struct context *cnt0 = cnt[0]; // for easy reference if (pgwcnt.conn) { /* clean up failed connection attempt or broken session */ motion_log(LOG_ERR, 0, "PGSQLdatabase '%s' connection failed: %s", cnt0->conf.pgsql_db, PQerrorMessage(pgwcnt.conn)); pgwcnt.sessloss++; // increment lost session count retries++; PQfinish(pgwcnt.conn); pgwcnt.conn= NULL; } /* PGSQL writer timestamps are UTC, not local time */ setenv("PGTZ", "UTC", 1); pthread_sigmask(SIG_UNBLOCK, &usr2_smask, NULL); /* repeat DBMS session open attempt until success or terminationorder */ /* NB: PGSQL enum value CONNECTION_OK is redefined as PGSQL_CONNECTION_OK in motion.h */ while(PQstatus(pgwcnt.conn)!=PGSQL_CONNECTION_OK && pgwcnt.quit<PGSQL_TERM_REQ) { /* * Open session to DBMS. Note: * 1. ONE PGSQL connection is shared by all video threads; * 2. thus, only MAIN thread PGSQL connectionparameters are used. * 3. Parameter "dbname" may be a PostgreSQL-native connect string. * 4. Connectparms may be blank, null, or contain special * characters (including quotes and backslashes). *5. Changes in PGSQL connection parameters via Web UI after the * first PGSQL session open attempt are IGNORED. * 6. Changes in SQL statement text via Web UI after the first * successful session open are IGNORED. */ char pstring[6]; snprintf(pstring, sizeof(pstring), "%i", cnt0->conf.pgsql_port); if ( !(pgwcnt.conn = PQconnectdbParams( (const char *[]) {"dbname", "host", "user", "password","port", "sslmode", "application_name", "connect_timeout", "keepalives", "keepalives_idle","keepalives_interval", "keepalives_count", NULL}, (const char *[]) {cnt0->conf.pgsql_db, cnt0->conf.pgsql_host, cnt0->conf.pgsql_user, cnt0->conf.pgsql_password, pstring, "disable", "motion", PGSQL_KEEPALIVE_QSECS, "1", PGSQL_KEEPALIVE_QSECS, PGSQL_KEEPALIVE_QSECS, "1", NULL}, 0)) ) { motion_log(LOG_CRIT, 0, "LOGIC ERROR: PQconnectdbParams returned NULL"); ret = -1; } /* * PQconnectdbParams() promptly detects lack of connectivity, but the * obvious recovery API, PQreset(),does not block after a failed * connection attempt (at least on my system), causing an infinite * loop. Thus, this less elegant recovery for a failed connection. */ if (PQstatus(pgwcnt.conn)!=PGSQL_CONNECTION_OK){ /* connection attempt failed; first time? */ if (!retries){ /* allow video threads to proceed , even if no DB connection */ if (pthread_mutex_unlock(&pgwcnt.mutex)){ motion_log(LOG_ERR, 1, "PGSQL mutex unlock 2 failed, pgwrite.c;aborting"); ret = -1; break; } motion_log(LOG_ERR,0, "PGSQL attempting recovery of failed connect attempt: host '%s' database '%s' user '%s': %s", cnt0->conf.pgsql_host, cnt0->conf.pgsql_db, cnt0->conf.pgsql_user, PQerrorMessage(pgwcnt.conn)); pgwcnt.sessloss++; // increment lost session count } PQfinish(pgwcnt.conn); // gotta dothis even for failed connections pgwcnt.conn = NULL; retries ++; /* if too many INSERTsqueued while we retried, reduce queue */ qeaband += pgwrite_qflush(PGSQL_MAX_DISCONN_Q); /*is PGSQL transport via Unix socket? */ if (!getsockopt(PQsocket(pgwcnt.conn), SOL_SOCKET, SO_DOMAIN, &so_val,&so_len) && so_val == PF_LOCAL ) { /* yes, wait a while to retry */ sleep(PGSQL_KEEPALIVE_SECS); } else { /* no; we probably waited for connect_timeout, but just incase ... */ sleep(5); } } } /* we are here when a new DBMS session has opened or termination has been forced */ if (pgwcnt.conn && qeaband) { /* report work abandoned prior to session recovery */ motion_log(LOG_ERR, 0, "PGSQL writer abandoned %d insertspending link recovery", qeaband); qeaband = 0; } /* unless forced termination, PREPARE all configured SQL statements */ if (pgwcnt.conn && pgwcnt.quit<PGSQL_TERM_REQ){ /* set up file descriptor for PGSQL session */ pqfd = PQsocket(pgwcnt.conn); maxfds = pqfd+1; motion_log(LOG_INFO, 0, "PGSQL connected to host '%s' database '%s' as user '%s'", PQhost(pgwcnt.conn), PQdb(pgwcnt.conn),PQuser(pgwcnt.conn));
pgsql-interfaces by date: