Thread: Could not resolve host name error in psycopg2

Could not resolve host name error in psycopg2

From
derwin theduck
Date:
We have been getting this error intermittently (about once a week) in Django with channels since switching from a local database server to a hosted one:

could not translate host name "timescaledb" to address: Name or service not known

Connecting directly to the server with psycopg2 in the python interpreter works ok:

>>> import psycopg2 as pg
>>> pg.__version__
'2.7.7 (dt dec pq3 ext lo64)'
>>> conn = pg.connect(host='timescaledb', database=*, user=*, password=*)
>>> conn.server_version
110005

The resident database / network expert maintains that everything is ok on the networking side, and that django, once it establishes a connection, should not be attempting to reconnect.

The problem happens not on startup, but in the middle of the day after processing transactions without any issues for several hours.

Any idea how to go about troubleshooting / fixing this? :'(

 (  File "/home/dduck/.local/lib/python3.7/site-packages/django/db/models/query.py", line 276, in __iter__)
 (    self._fetch_all())
 (  File "/home/dduck/.local/lib/python3.7/site-packages/django/db/models/query.py", line 1261, in _fetch_all)
 (    self._result_cache = list(self._iterable_class(self)))
 (  File "/home/dduck/.local/lib/python3.7/site-packages/django/db/models/query.py", line 57, in __iter__)
 (    results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size))
 (  File "/home/dduck/.local/lib/python3.7/site-packages/django/db/models/sql/compiler.py", line 1135, in execute_sql)
 (    cursor = self.connection.cursor())
 (  File "/home/dduck/.local/lib/python3.7/site-packages/django/utils/asyncio.py", line 24, in inner)
 (    return func(*args, **kwargs))
 (  File "/home/dduck/.local/lib/python3.7/site-packages/django/db/backends/base/base.py", line 260, in cursor)
 (    return self._cursor())
 (  File "/home/dduck/.local/lib/python3.7/site-packages/django/db/backends/base/base.py", line 236, in _cursor)
 (    self.ensure_connection())
 (  File "/home/dduck/.local/lib/python3.7/site-packages/django/utils/asyncio.py", line 24, in inner)
 (    return func(*args, **kwargs))
 (  File "/home/dduck/.local/lib/python3.7/site-packages/django/db/backends/base/base.py", line 220, in ensure_connection)
 (    self.connect())
 (  File "/home/dduck/.local/lib/python3.7/site-packages/django/db/utils.py", line 90, in __exit__)
 (    raise dj_exc_value.with_traceback(traceback) from exc_value)
 (  File "/home/dduck/.local/lib/python3.7/site-packages/django/db/backends/base/base.py", line 220, in ensure_connection)
 (    self.connect())
 (  File "/home/dduck/.local/lib/python3.7/site-packages/django/utils/asyncio.py", line 24, in inner)
 (    return func(*args, **kwargs))
 (  File "/home/dduck/.local/lib/python3.7/site-packages/django/db/backends/base/base.py", line 197, in connect)
 (    self.connection = self.get_new_connection(conn_params))
 (  File "/home/dduck/.local/lib/python3.7/site-packages/django/utils/asyncio.py", line 24, in inner)
 (    return func(*args, **kwargs))
 (  File "/home/dduck/.local/lib/python3.7/site-packages/django/db/backends/postgresql/base.py", line 185, in get_new_connection)
 (    connection = Database.connect(**conn_params))
 (  File "/usr/local/lib64/python3.7/site-packages/psycopg2/__init__.py", line 130, in connect)
 (    conn = _connect(dsn, connection_factory=connection_factory, **kwasync))
 (  could not translate host name "timescaledb" to address: Name or service not known)


Re: Could not resolve host name error in psycopg2

From
Adrian Klaver
Date:
On 4/16/20 5:38 PM, derwin theduck wrote:
> We have been getting this error intermittently (about once a week) in 
> Django with channels since switching from a local database server to a 
> hosted one:
> 
> could not translate host name "timescaledb" to address: Name or service 
> not known
> 
> Connecting directly to the server with psycopg2 in the python 
> interpreter works ok:
> 
>  >>> import psycopg2 as pg
>  >>> pg.__version__
> '2.7.7 (dt dec pq3 ext lo64)'
>  >>> conn = pg.connect(host='timescaledb', database=*, user=*, password=*)
>  >>> conn.server_version
> 110005

Well, I'm going to go out on a limb and say there is some sort of DNS 
resolution issue going on.
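
One way to test that guess is to resolve the name on a timer from the application host and log every failure with a timestamp, so the failures can be lined up against the Django errors. The following is only a minimal sketch: the function names, the 30-second interval and the default port 5432 are assumptions for illustration, not anything from the original setup. It relies on socket.getaddrinfo, the same kind of lookup whose failure libpq reports as "could not translate host name".

```python
import socket
import time
from datetime import datetime, timezone


def probe(host, port=5432, resolver=socket.getaddrinfo):
    """Resolve host once and return (ok, detail).

    resolver is injectable so the function can be exercised without a network.
    """
    try:
        infos = resolver(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        # Resolution failed, e.g. "Name or service not known".
        return False, str(exc)
    # Each entry is (family, type, proto, canonname, sockaddr); keep the addresses.
    addresses = sorted({info[4][0] for info in infos})
    return True, ", ".join(addresses)


def watch(host, interval=30.0, rounds=None):
    """Print one timestamped line per probe; rounds=None loops until interrupted."""
    done = 0
    while rounds is None or done < rounds:
        ok, detail = probe(host)
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        print(f"{stamp} {host} {'ok' if ok else 'FAILED'} {detail}")
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
```

Calling watch("timescaledb") from the same machine and user account that runs Django, and leaving it running for a week, should show whether lookups fail at the same moments the application does.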

> 
> The resident database / network expert maintains that everything is ok 
> on the networking side, and that django, once it establishes a 
> connection, should not be attempting to reconnect.

Huh? Leaving open connections is not considered a good thing. In other 
words, a connection should last for as long as it takes to get its task 
done and then it should close.

> 
> The problem happens not on startup, but in the middle of the day after 
> processing transactions without any issues for several hours.
> 
> Any idea how to go about troubleshooting / fixing this? :'(


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Could not resolve host name error in psycopg2

From
derwin theduck
Date:
Thank you, I've changed it to use the server's IP address since, so I'll wait to see if the error happens again.


Re: Could not resolve host name error in psycopg2

From
Paul Förster
Date:
Hi Adrian,

> On 17. Apr, 2020, at 03:00, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
> Huh? Leaving open connections is not considered a good thing. In other words a connection should last for as long as it takes to get its task done and then it should close.

I basically agree on this, but there are two big "but"s:

- recurring monitoring connections flood the logs unless they connect and never disconnect again.

- applications with hundreds or thousands of users may flood the logs, even though a pool may be used. If said pool doesn't keep its connections open most of the time, you will notice that the database cluster is very busy logging connections.

Do you really want that?

Cheers,
Paul


Re: Could not resolve host name error in psycopg2

From
Adrian Klaver
Date:
On 4/17/20 12:02 AM, Paul Förster wrote:
> Hi Adrian,
> 
>> On 17. Apr, 2020, at 03:00, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>>
>> Huh? Leaving open connections is not considered a good thing. In other words a connection should last for as long as it takes to get its task done and then it should close.
> 
> I basically agree on this, but there are two big "but"s:
> 
> - recurring monitoring connections flood the logs unless they connect and never disconnect again.
> 
> - applications with hundreds or thousands of users may flood the logs, even though a pool may be used. If said pool doesn't keep its connections open most of the time, you will notice that the database cluster is very busy logging connections.

But most pools can grow and shrink in response to demand, so at some 
point there are connect/disconnect cycles.
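
To make the grow/shrink point concrete, here is a toy pool. It is a minimal sketch and not the implementation of any real pooling library; ToyPool, FakeConn, min_size and max_size are all made-up names. It keeps at most min_size idle connections and closes the rest on release, so each burst of demand above the idle floor costs fresh connects and disconnects.

```python
class ToyPool:
    """Illustrative pool: keeps up to min_size idle connections, max_size in total."""

    def __init__(self, factory, min_size=1, max_size=4):
        self.factory = factory      # callable that opens a new connection
        self.min_size = min_size
        self.max_size = max_size
        self.idle = []
        self.in_use = 0
        self.connects = 0           # how many connections were opened
        self.disconnects = 0        # how many connections were closed

    def acquire(self):
        if self.idle:
            conn = self.idle.pop()  # reuse an idle connection, no new connect
        elif self.in_use < self.max_size:
            conn = self.factory()   # grow: this is a real connect
            self.connects += 1
        else:
            raise RuntimeError("pool exhausted")
        self.in_use += 1
        return conn

    def release(self, conn):
        self.in_use -= 1
        if len(self.idle) < self.min_size:
            self.idle.append(conn)  # keep it open for the next caller
        else:
            conn.close()            # shrink: this is a real disconnect
            self.disconnects += 1


class FakeConn:
    """Stand-in for a database connection so the sketch runs without a server."""

    def close(self):
        pass


pool = ToyPool(FakeConn, min_size=1, max_size=4)
for _ in range(2):                  # two bursts of three concurrent users
    conns = [pool.acquire() for _ in range(3)]
    for conn in conns:
        pool.release(conn)
print(pool.connects, pool.disconnects)  # prints: 5 4
```

With min_size=1, only one connection survives between bursts, so the second burst reconnects twice. Raising min_size trades idle connections for fewer connect/disconnect cycles.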

> 
> Do you really want that?

No. The issue at hand, though, was the idea that an application (Django in 
this case) would open a connection once and never reconnect. That is 
unrealistic.
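
For reference, how long Django holds a connection is governed by the CONN_MAX_AGE database setting. The fragment below is only a sketch of a settings.py entry: the host name is taken from the error message, and NAME, USER and PASSWORD are placeholders, since the original poster's settings were never shown.

```python
# settings.py fragment (placeholder values, not the original poster's settings)
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "HOST": "timescaledb",    # host name from the error message
        "NAME": "mydb",           # hypothetical database name
        "USER": "myuser",         # hypothetical user
        "PASSWORD": "change-me",  # hypothetical password
        # 0 is Django's default: close the connection at the end of each request,
        # so the next request opens a new one and resolves the host name again.
        # A positive number keeps the connection for that many seconds;
        # None keeps it open indefinitely.
        "CONN_MAX_AGE": 0,
    }
}
```

With the default of 0, a new connection per request is the normal case, which means the host name gets looked up many times a day rather than once at startup.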

> 
> Cheers,
> Paul
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Could not resolve host name error in psycopg2

From
Paul Förster
Date:
Hi Adrian,

> On 17. Apr, 2020, at 16:10, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
> But most pools can grow and shrink in response to demand, so at some point there are connect/disconnect cycles.

Yes, but it makes a difference whether you see occasional growing and shrinking pool behavior or the logs are flooded with connect/disconnect messages from short client connections. A relatively small number of connect/disconnect messages coming from a pool is mostly acceptable, while huge numbers are not.

And as for monitoring applications, I would be a big fan of a parameter like log_exclude_users='user1,user2,...' to list usernames that should not appear in log files.

Cheers,
Paul