On Mon, 2021-08-23 at 15:34 +0800, Kelvin Lau wrote:
> I have been using Python for CRUD operations on the database. I have run into issues with
> long queries (both SELECT and COPY, since the data is fairly large): the connection is dropped
> around the 2~3 hour mark and I have no idea what is wrong. I also don't know how my
> workstation is connected to the server.
That sounds like a problem in your network; probably some ill-configured firewall or
router that drops idle connections.
> But I managed to work around the issue by putting a few parameters in psycopg2:
>
> > conn = psycopg2.connect(host="someserver.hk",
> > port=12345,
> > dbname="ohdsi",
> > user="admin",
> > password="admin1",
> > options="-c search_path="+schema,
> > # it seems the below lines are needed to keep the connection alive.
> > connect_timeout=10,
> > keepalives=1,
> > keepalives_idle=5,
> > keepalives_interval=2,
> > keepalives_count=5)
> It looks like those keepalives* parameters kept the connection alive, so the long queries can run day and night.
>
> The problem now is that I am forced to use R and JDBC for a bunch of code, because a lot of
> the analyses are written in R. The same issue of a long query being dropped around the 2~3 hour
> mark has shown up again with R/JDBC. How can I work around that?
>
> I have tried putting tcpKeepAlive=true in the connection URL, but it seems to have mixed results.
> Do I also have to set tcp_keepalives_interval or tcp_keepalives_count? What are some recommended
> values for these parameters?
You can set "tcp_keepalives_idle" on the database server; then the setting is independent
of the client being used.
I would say that a setting of 5 seconds is way too low. Set it to 600 or so, that would
be 10 minutes.
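As a sketch, the server-side keepalive settings could be adjusted like this (the values shown
are suggestions, not requirements; tune them to your network):

```sql
-- Set TCP keepalive parameters server-side so every client benefits,
-- regardless of driver (psycopg2, JDBC, ...).
ALTER SYSTEM SET tcp_keepalives_idle = 600;     -- first probe after 10 minutes of idle
ALTER SYSTEM SET tcp_keepalives_interval = 60;  -- re-probe every 60 seconds
ALTER SYSTEM SET tcp_keepalives_count = 5;      -- give up after 5 unanswered probes
-- Reload the configuration; these parameters do not require a restart.
SELECT pg_reload_conf();
```

Note that on the client side, the JDBC driver's tcpKeepAlive=true only enables keepalives;
the idle/interval timing then comes from the client operating system, which is another reason
configuring this on the server is the more reliable approach.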
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com