Thread: pg_hba.conf && ident ...

pg_hba.conf && ident ...

From
The Hermit Hacker
Date:
has anyone played with/tested this in v7.0?  I'm investigating the hanging
problem, and it just happened ... when I do an lsof on the process, it
shows these two:

postgres 4969 pgsql    5u  IPv4 0xd4631500      0t0    TCP pgsql.tht.net:5432->smaug.vex.net:61189 (ESTABLISHED)
postgres 4969 pgsql    8u  IPv4 0xd46300c0      0t0    TCP pgsql.tht.net:1046->smaug.vex.net:auth (ESTABLISHED)

it doesn't appear to lock it up every time though ... this time it
*eventually* came back again, but, afterwards, if you do another lsof,
there is one more line with that "can't read inpcb..." error on it ...

in pg_hba.conf, that host has:

host    trends_acctng   216.126.72.30   255.255.255.255 ident sameuser

And it's the only time we have ident being used ...

right now, it's the only theory I have to work with ...
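
For readers unfamiliar with the second connection in the lsof output: it is the ident (RFC 1413) lookup, in which the database server connects to the auth port (113) on the client host and asks which user owns the client end of the database connection. A minimal sketch in C of the query format and of parsing a reply follows. The helper names build_ident_query and parse_ident_reply are hypothetical and are not the functions in hba.c, and skipping blanks before the user name is a simplification made for this sketch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format an RFC 1413 query.
 * remote_port is the port on the client host (where identd runs),
 * local_port is the port on the database server (5432 in the lsof output). */
static int
build_ident_query(char *buf, size_t buflen, int remote_port, int local_port)
{
    int n = snprintf(buf, buflen, "%d,%d\r\n", remote_port, local_port);

    return (n > 0 && (size_t) n < buflen) ? n : -1;
}

static const char *
skip_blanks(const char *p)
{
    while (*p == ' ' || *p == '\t')
        p++;
    return p;
}

/* Hypothetical helper: extract the user name from a reply of the form
 *   "<port> , <port> : USERID : <opsys> : <username>\r\n"
 * Returns 0 on success, -1 on an ERROR reply or malformed input. */
static int
parse_ident_reply(const char *reply, char *user, size_t userlen)
{
    const char *p = strchr(reply, ':');     /* end of the port pair */
    size_t      n;

    if (p == NULL)
        return -1;
    p = skip_blanks(p + 1);
    if (strncmp(p, "USERID", 6) != 0)       /* e.g. "ERROR : NO-USER" */
        return -1;
    p = skip_blanks(p + 6);
    if (*p != ':')
        return -1;
    p = strchr(p + 1, ':');                 /* skip the opsys field */
    if (p == NULL)
        return -1;
    p = skip_blanks(p + 1);
    n = strcspn(p, "\r\n");                 /* user name runs to end of line */
    if (n == 0 || n >= userlen)
        return -1;
    memcpy(user, p, n);
    user[n] = '\0';
    return 0;
}
```

With "ident sameuser" in pg_hba.conf, the name returned by the ident server has to match the database user name requested by the client.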

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: pg_hba.conf && ident ...

From
Tom Lane
Date:
The Hermit Hacker <scrappy@hub.org> writes:
> in pg_hba.conf, that host has:
> host    trends_acctng   216.126.72.30   255.255.255.255 ident sameuser
> And it's the only time we have ident being used ...
> right now, it's the only theory I have to work with ...

Bingo.  All your cores show the thing waiting inside the ident code:

(gdb) bt
#0  0x18263890 in recvfrom () from /usr/lib/libc.so.4
#1  0x1825062b in recv () from /usr/lib/libc.so.4
#2  0x80ad4d0 in ident (remote_ip_addr={s_addr = 508067544}, local_ip_addr={s_addr = 56131288}, remote_port=27631,
local_port=14357,   ident_failed=0xbfbfeeef "�\004\023 \b,\207\024\b\212\217(\030\223���\203￿\204￿|�\n\b�\214+\0304￿P",
  ident_username=0xbfbfeef0 "\004\023 \b,\207\024\b\212\217(\030\223���\203￿\204￿|�\n\b�\214+\0304￿P") at hba.c:635
 
#3  0x80ad912 in authident (raddr=0x82011ac, laddr=0x8201140, postgres_username=0x8201261 "db", auth_arg=0x8201304 "sameuser") at hba.c:869
 
#4  0x80ac5b9 in be_recvauth (port=0x8201000) at auth.c:523
#5  0x80e0c4a in readStartupPacket (arg=0x8201000, len=292, pkt=0x820101c)   at postmaster.c:1214
#6  0x80aeb67 in PacketReceiveFragment (port=0x8201000) at pqpacket.c:102
#7  0x80e08ad in ServerLoop () at postmaster.c:982
#8  0x80e039a in PostmasterMain (argc=13, argv=0xbfbffbc4) at postmaster.c:723
#9  0x80aee43 in main (argc=13, argv=0xbfbffbc4) at main.c:93
#10 0x8063393 in _start ()

Looking at the code, there doesn't seem to be any defense against a
broken ident server --- there is no timeout or anything being used here!
Ugh.  Has it always been like this?

Anyway, I think the immediate fix for you is to stop using ident auth
for that host, at least till we can improve this code...
        regards, tom lane


Re: pg_hba.conf && ident ...

From
The Hermit Hacker
Date:
On Wed, 10 May 2000, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> > in pg_hba.conf, that host has:
> > host    trends_acctng   216.126.72.30   255.255.255.255 ident sameuser
> > And it's the only time we have ident being used ...
> > right now, it's the only theory I have to work with ...
> 
> Bingo.  All your cores show the thing waiting inside the ident code:
> 
> (gdb) bt
> #0  0x18263890 in recvfrom () from /usr/lib/libc.so.4
> #1  0x1825062b in recv () from /usr/lib/libc.so.4
> #2  0x80ad4d0 in ident (remote_ip_addr={s_addr = 508067544}, local_ip_addr={
>       s_addr = 56131288}, remote_port=27631, local_port=14357, 
>     ident_failed=0xbfbfeeef "�\004\023 \b,\207\024\b\212\217(\030\223���\203￿\204￿|�\n\b�\214+\0304￿P", 
>     ident_username=0xbfbfeef0 "\004\023 \b,\207\024\b\212\217(\030\223���\203￿\204￿|�\n\b�\214+\0304￿P") at
>     hba.c:635
> #3  0x80ad912 in authident (raddr=0x82011ac, laddr=0x8201140, 
>     postgres_username=0x8201261 "db", auth_arg=0x8201304 "sameuser")
>     at hba.c:869
> #4  0x80ac5b9 in be_recvauth (port=0x8201000) at auth.c:523
> #5  0x80e0c4a in readStartupPacket (arg=0x8201000, len=292, pkt=0x820101c)
>     at postmaster.c:1214
> #6  0x80aeb67 in PacketReceiveFragment (port=0x8201000) at pqpacket.c:102
> #7  0x80e08ad in ServerLoop () at postmaster.c:982
> #8  0x80e039a in PostmasterMain (argc=13, argv=0xbfbffbc4) at postmaster.c:723
> #9  0x80aee43 in main (argc=13, argv=0xbfbffbc4) at main.c:93
> #10 0x8063393 in _start ()
> 
> Looking at the code, there doesn't seem to be any defense against a
> broken ident server --- there is no timeout or anything being used here!
> Ugh.  Has it always been like this?
> 
> Anyway, I think the immediate fix for you is to stop using ident auth
> for that host, at least till we can improve this code...

Once I started scanning with lsof and saw the auth stuff, I clued in and
we disabled the ident stuff ... looking at your backtrace above, I should
have clued in sooner, as I *saw* the ident on line 2, but didn't *see* it
:(

Thanks ...

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: pg_hba.conf && ident ...

From
Malcolm Beattie
Date:
Tom Lane writes:
> The Hermit Hacker <scrappy@hub.org> writes:
> > in pg_hba.conf, that host has:
> > host    trends_acctng   216.126.72.30   255.255.255.255 ident sameuser
> > And it's the only time we have ident being used ...
> > right now, it's the only theory I have to work with ...
> 
> Bingo.  All your cores show the thing waiting inside the ident code:
[...]
> Looking at the code, there doesn't seem to be any defense against a
> broken ident server --- there is no timeout or anything being used here!
> Ugh.  Has it always been like this?
> 
> Anyway, I think the immediate fix for you is to stop using ident auth
> for that host, at least till we can improve this code...

I came across this problem a year and a half ago. In my case, the
problem was that the client was connecting more than the default limit
of 40 times per minute so inetd was suspending the auth/identd service.
I raised the limit by changing to "nowait.500" and that problem went
away. I'd thought that I'd fixed PostgreSQL itself too but looking
back in my mail logs I can only find my patch which fixes the problem
with sending ident requests from a server with an IP alias. I may have
forgotten to send in the patch (or even to write one) for the "ident
synchronous in postmaster" problem itself. Sorry. I'll look harder.

--Malcolm

-- 
Malcolm Beattie <mbeattie@sable.ox.ac.uk>
Unix Systems Programmer
Oxford University Computing Services


Re: pg_hba.conf && ident ...

From
The Hermit Hacker
Date:
On Wed, 10 May 2000, Jan Wieck wrote:

> Tom Lane wrote:
> > Bingo.  All your cores show the thing waiting inside the ident code:
> >
> > [...]
> >
> > Looking at the code, there doesn't seem to be any defense against a
> > broken ident server --- there is no timeout or anything being used here!
> > Ugh.  Has it always been like this?
> >
> > Anyway, I think the immediate fix for you is to stop using ident auth
> > for that host, at least till we can improve this code...
> 
>     Looks  like  the  entire  communication  with a new client is
>     handled  in   a   nonblocking   manner   via   select(2)   in
>     ServerLoop().  I think the ident lookup belongs to there too,
>     and this improvement isn't something for  a  quick  hack.  It
>     takes a little longer to be well tested.
> 
>     Let's try it for 7.0.1 or 7.0.2. Clearly is a bugfix IMHO.
> 
>     Also  we  might  think about using some kind of timeout after
>     which a new connection should either get rejected or succeeds
>     in  backend  start.  Just  to  prevent  a  bogus  client from
>     creating a forever dangling connection.

Cool, our first DOS :)




Re: pg_hba.conf && ident ...

From
Tom Lane
Date:
Malcolm Beattie <mbeattie@sable.ox.ac.uk> writes:
> I'd thought that I'd fixed PostgreSQL itself too but looking
> back in my mail logs I can only find my patch which fixes the problem
> with sending ident requests from a server with an IP alias. I may have
> forgotten to send in the patch (or even to write one) for the "ident
> synchronous in postmaster" problem itself. Sorry. I'll look harder.

Yes, I see your alias patch in there, but that doesn't have anything to
do with the problem of a nonresponding ident server.  I agree with Jan
that a really good fix would allow the postmaster to return to its outer
event loop while waiting for the ident response.  It'd be a nontrivial
rewrite though... anyone use ident enough to want to tackle it?
        regards, tom lane
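
To illustrate what "return to its outer event loop while waiting for the ident response" could look like, here is a rough sketch in C of the reading phase only. It is not PostgreSQL code: PendingIdent, IdentStatus and pending_ident_poll are hypothetical names, the socket is assumed to have O_NONBLOCK set already, and the connect and send phases would need similar states of their own.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

typedef enum { IDENT_WAITING, IDENT_DONE, IDENT_FAILED } IdentStatus;

/* Hypothetical per-connection state kept while an ident reply is pending. */
typedef struct
{
    int     fd;         /* nonblocking socket to the ident server */
    time_t  deadline;   /* give up once the clock reaches this time */
    size_t  len;        /* bytes collected in buf so far */
    char    buf[128];
} PendingIdent;

/* Called from the select() loop: with readable = 1 when select() reports
 * data on pi->fd, and with readable = 0 on other passes so the deadline is
 * still checked.  Never waits for the peer. */
static IdentStatus
pending_ident_poll(PendingIdent *pi, int readable, time_t now)
{
    if (readable)
    {
        ssize_t n = recv(pi->fd, pi->buf + pi->len,
                         sizeof(pi->buf) - 1 - pi->len, 0);

        if (n > 0)
        {
            pi->len += (size_t) n;
            pi->buf[pi->len] = '\0';
            if (strchr(pi->buf, '\n') != NULL)
                return IDENT_DONE;          /* a full reply line arrived */
            if (pi->len == sizeof(pi->buf) - 1)
                return IDENT_FAILED;        /* reply too long */
        }
        else if (n == 0)
            return IDENT_FAILED;            /* closed before a full line */
        else if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
            return IDENT_FAILED;            /* hard socket error */
    }
    return (now >= pi->deadline) ? IDENT_FAILED : IDENT_WAITING;
}
```

The event loop would add pi->fd to its read set and shorten its select() timeout to the nearest deadline, so a silent ident server costs one rejected connection rather than a stalled postmaster.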


Re: pg_hba.conf && ident ...

From
Malcolm Beattie
Date:
Tom Lane writes:
> Malcolm Beattie <mbeattie@sable.ox.ac.uk> writes:
> > I'd thought that I'd fixed PostgreSQL itself too but looking
> > back in my mail logs I can only find my patch which fixes the problem
> > with sending ident requests from a server with an IP alias. I may have
> > forgotten to send in the patch (or even to write one) for the "ident
> > synchronous in postmaster" problem itself. Sorry. I'll look harder.
> 
> Yes, I see your alias patch in there, but that doesn't have anything to
> do with the problem of a nonresponding ident server.  I agree with Jan
> that a really good fix would allow the postmaster to return to its outer
> event loop while waiting for the ident response.  It'd be a nontrivial
> rewrite though... anyone use ident enough to want to tackle it?

It looks like the whole pg_hba thing isn't really designed to be
asynchronous or event-driven. A cheap and cheerful fix would be to
replace the blocking connect/send/recv in ident() in hba.c with
foo_timeout ones (for foo one of connect/send/recv). Basically, set
O_NONBLOCK on the socket with fcntl and have foo_timeout() do

    ...
    FD_SET(ourfd, &fds);
    tv.tv_sec = TIMEOUT;
    foo(...);
    if (select(ourfd+1, &fds, &fds, 0, &tv) == -1)
        return -1;
    return foo(...);
 
At least you then have an upper bound of about 3*TIMEOUT on how long
the postmaster is busy. It would still be susceptible to a denial of
service attack though. The other option would be an alarm() timeout
which could wrap the entire ident process but doing alarms portably
and safely is weird on some platforms depending on what else is going
on at the time.

--Malcolm

-- 
Malcolm Beattie <mbeattie@sable.ox.ac.uk>
Unix Systems Programmer
Oxford University Computing Services


Re: pg_hba.conf && ident ...

From
Tom Lane
Date:
Malcolm Beattie <mbeattie@sable.ox.ac.uk> writes:
> It looks like the whole pg_hba thing isn't really designed to be
> asynchronous or event-driven.

Nope, the module would need a pretty thorough rewrite ...

> A cheap and cheerful fix would be to
> replace the blocking connect/send/recv in ident() in hba.c with
> foo_timeout ones (for foo one of connect/send/recv).

That was what I was thinking too, unless we find a volunteer to do
the bigger job.  I don't particularly care to spend that much time
on this problem myself.
        regards, tom lane