Re: DNS vs /etc/hosts - Mailing list pgsql-general

From Michael Fuhr
Subject Re: DNS vs /etc/hosts
Msg-id 20050804223052.GA90539@winnie.fuhr.org
In response to Re: DNS vs /etc/hosts  (Michael Fuhr <mike@fuhr.org>)
Responses Re: DNS vs /etc/hosts  (Michael Fuhr <mike@fuhr.org>)
Re: DNS vs /etc/hosts  (Lowell.Hought@faa.gov)
List pgsql-general
On Thu, Aug 04, 2005 at 04:01:43PM -0500, Lowell.Hought@faa.gov wrote:
> I also performed the trace you suggested.  The results are the same until
> this point, where the time for
> version 8.0 totals 0.025960 and for
>  version 7.2 totals 0.009481

Those differences probably don't matter, but what comes next does.

The 7.2 trace shows a DNS query to 10.32.104.5 for a name that
begins with zmpweb5.dms.ats.agl (the strace output is truncated
after that).  The DNS server responds with a packet of 142 bytes,
after which the process makes a TCP connection to 10.32.104.110:5432,
which is presumably the database server.

The 8.0 trace is different: it appears to make the same DNS query
to 10.32.104.5, but the response it receives is only 98 bytes (was
it in fact the same query?).  The process then makes a DNS query
to 10.32.104.5 for just zmpweb5, and that query times out after 5
seconds.  Then the process sends a query for zmpweb5 to 172.17.46.46,
which refuses the connection, possibly because no DNS server is
running on that machine.  We then see a query for zmpweb5 to
172.17.40.42, and that query times out after 6 seconds.  Then another
query for zmpweb5 to 10.32.104.5 and a 5-second timeout, a query
for zmpweb5 to 172.17.46.46 and a refused connection, and a query
for zmpweb5 to 172.17.40.42 and a 6-second timeout.  We then see
the process read /etc/hosts, but afterwards it makes another DNS
query to 10.32.104.5 for zmpweb5.dms.ats.agl.<truncated>, and this
time we see a 142-byte response, as 7.2 had received on its first
attempt.  Finally we see a TCP connection to 10.32.104.110:5432.
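To put a number on the time lost in the 8.0 trace, here is a rough tally of the resolver attempts described above.  The timeout figures are the ones read from the trace; treating a refused connection as costing about zero seconds is my assumption.

```python
# Failed lookups for the bare name "zmpweb5" in the 8.0 trace, in order:
# (DNS server queried, outcome, approximate seconds spent waiting).
# Refused connections are assumed to fail almost immediately (0 s).
attempts = [
    ("10.32.104.5",  "timeout", 5),
    ("172.17.46.46", "refused", 0),
    ("172.17.40.42", "timeout", 6),
    ("10.32.104.5",  "timeout", 5),
    ("172.17.46.46", "refused", 0),
    ("172.17.40.42", "timeout", 6),
]

# Total time spent waiting before the resolver falls back to /etc/hosts
total_wait = sum(seconds for _server, _outcome, seconds in attempts)
print(f"approximate time lost to failed lookups: {total_wait} s")
```

So the two passes through the three name servers account for roughly 22 seconds before the connection is finally made.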

So why does 8.0 receive a 98-byte response to its first DNS query
when 7.2 received a 142-byte response?  We can tell a little something
about the responses by looking at the data in the strace output,
with the help of RFC 1035 Section 4.1.1.  In octal, the DNS response
headers are:

7.2  \260\5\205\200\0\1\0\1\0\2\0\2
8.0  \30\310\205\200\0\1\0\0\0\1\0\0

The response to 7.2 has an ANCOUNT (number of records in the answer
section) of 1 and an NSCOUNT (number of records in the authority
section) of 2, whereas the response to 8.0 has an ANCOUNT of 0 and
an NSCOUNT of 1.  That disparity is odd if the DNS queries were
indeed the same.
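If you want to check my reading of the headers, here is a small sketch that decodes the 12-octet DNS header laid out in RFC 1035 Section 4.1.1.  The function name and the dictionary keys are my own, and it assumes every octet appears as a backslash-octal escape, as in the two headers above (strace prints printable octets as plain characters, which this doesn't handle).

```python
import struct

def parse_dns_header(octal_escaped):
    """Decode a 12-octet DNS header given as strace-style octal escapes."""
    # Turn "\260\5..." into raw bytes; each token between backslashes is octal.
    raw = bytes(int(tok, 8) for tok in octal_escaped.split("\\") if tok)
    if len(raw) != 12:
        raise ValueError("a DNS header is exactly 12 octets")
    # Six 16-bit big-endian fields: ID, flags, and the four section counts.
    ident, flags, qdcount, ancount, nscount, arcount = struct.unpack("!6H", raw)
    return {
        "id": ident,
        "qr": (flags >> 15) & 1,       # 1 = response
        "opcode": (flags >> 11) & 0xF,
        "aa": (flags >> 10) & 1,       # authoritative answer
        "tc": (flags >> 9) & 1,        # truncated
        "rd": (flags >> 8) & 1,        # recursion desired
        "ra": (flags >> 7) & 1,        # recursion available
        "rcode": flags & 0xF,          # 0 = no error
        "qdcount": qdcount,
        "ancount": ancount,
        "nscount": nscount,
        "arcount": arcount,
    }

h72 = parse_dns_header(r"\260\5\205\200\0\1\0\1\0\2\0\2")
h80 = parse_dns_header(r"\30\310\205\200\0\1\0\0\0\1\0\0")
for label, hdr in (("7.2", h72), ("8.0", h80)):
    print(label, hdr)
```

Both headers decode to an authoritative response (AA set) with RCODE 0, so the server didn't report an error to 8.0; it just returned no answer records.  A "no error" response with an empty answer section is what a server sends when the name exists but has no records of the type requested, which would be consistent with the 8.0 query differing from the 7.2 one, although the header alone can't prove that.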

A few DNS queries with dig might show what's happening, and some
sniffer output of the DNS queries that psql makes might also be
enlightening.  Something like the following ought to do the trick:

tcpdump -s526 -n -vv udp and port 53

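The -s526 option tells tcpdump to capture up to 526 bytes of each
packet, which is more than enough for the 98-byte and 142-byte
responses seen above.  (Strictly speaking, the 512-octet limit on a
UDP DNS message doesn't include the UDP, IP, or layer 2 headers, so
on Ethernet with IPv4 something like -s554 would be needed to be sure
of capturing a maximum-size message.)  It might be interesting to see
the tcpdump output for psql 7.2's DNS queries and then 8.0's DNS
queries (or use ethereal/tethereal or another sniffer if you prefer,
as long as we can see as much of the DNS packets as possible).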
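Separately from the sniffer, it may help to time name resolution outside of psql.  The sketch below is only a rough test harness: the function name is mine, "localhost" is a placeholder for the database server's real hostname, and comparing an IPv4-only lookup with an any-family lookup is a guess at what might differ between the two versions, not something the traces confirm.

```python
import socket
import time

def time_lookup(host, family=socket.AF_UNSPEC, port=5432):
    """Resolve host via the system resolver; return (seconds, addresses)."""
    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
        # Each entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the address string.
        addrs = sorted({info[4][0] for info in infos})
    except socket.gaierror:
        # Resolution failed; report no addresses but still report the time.
        addrs = []
    elapsed = time.monotonic() - start
    return elapsed, addrs

if __name__ == "__main__":
    host = "localhost"  # placeholder: substitute the database server's name
    for label, family in (("IPv4 only", socket.AF_INET),
                          ("any family", socket.AF_UNSPEC)):
        elapsed, addrs = time_lookup(host, family)
        print(f"{label:10s} {elapsed:7.3f} s  {addrs}")
```

If one kind of lookup returns promptly and the other stalls for many seconds, that would narrow down which queries are timing out.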

BTW, some resolver libraries can be configured not to attempt DNS
queries for just "hostname" when "hostname.subdomain.domain" fails.
I seldom find such queries useful and I do occasionally find them
problematic, so if my resolver has such an option then I usually
enable it (e.g., "options no_tld_query" in /etc/resolv.conf on
FreeBSD).

--
Michael Fuhr
