Thread: pg_hba.conf && ident ...
Has anyone played with or tested this in v7.0?

I'm investigating the hanging problem, and it just happened ... when I do an lsof on the process, it shows these two:

postgres 4969 pgsql 5u IPv4 0xd4631500 0t0 TCP pgsql.tht.net:5432->smaug.vex.net:61189 (ESTABLISHED)
postgres 4969 pgsql 8u IPv4 0xd46300c0 0t0 TCP pgsql.tht.net:1046->smaug.vex.net:auth (ESTABLISHED)

It doesn't appear to lock it up every time though ... this time it *eventually* came back again, but afterwards, if you do another lsof, there is one more line with that "can't read inpcb..." error on it ...

In pg_hba.conf, that host has:

host trends_acctng 216.126.72.30 255.255.255.255 ident sameuser

And it's the only time we have ident being used ... right now, it's the only theory I have to work with ...

Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
The Hermit Hacker <scrappy@hub.org> writes:
> In pg_hba.conf, that host has:
> host trends_acctng 216.126.72.30 255.255.255.255 ident sameuser
> And it's the only time we have ident being used ...
> right now, it's the only theory I have to work with ...

Bingo. All your cores show the thing waiting inside the ident code:

(gdb) bt
#0 0x18263890 in recvfrom () from /usr/lib/libc.so.4
#1 0x1825062b in recv () from /usr/lib/libc.so.4
#2 0x80ad4d0 in ident (remote_ip_addr={s_addr = 508067544}, local_ip_addr={
    s_addr = 56131288}, remote_port=27631, local_port=14357,
    ident_failed=0xbfbfeeef "�\004\023 \b,\207\024\b\212\217(\030\223���\203\204|�\n\b�\214+\0304P",
    ident_username=0xbfbfeef0 "\004\023 \b,\207\024\b\212\217(\030\223���\203\204|�\n\b�\214+\0304P") at hba.c:635
#3 0x80ad912 in authident (raddr=0x82011ac, laddr=0x8201140,
    postgres_username=0x8201261 "db", auth_arg=0x8201304 "sameuser")
    at hba.c:869
#4 0x80ac5b9 in be_recvauth (port=0x8201000) at auth.c:523
#5 0x80e0c4a in readStartupPacket (arg=0x8201000, len=292, pkt=0x820101c)
    at postmaster.c:1214
#6 0x80aeb67 in PacketReceiveFragment (port=0x8201000) at pqpacket.c:102
#7 0x80e08ad in ServerLoop () at postmaster.c:982
#8 0x80e039a in PostmasterMain (argc=13, argv=0xbfbffbc4) at postmaster.c:723
#9 0x80aee43 in main (argc=13, argv=0xbfbffbc4) at main.c:93
#10 0x8063393 in _start ()

Looking at the code, there doesn't seem to be any defense against a
broken ident server --- there is no timeout or anything being used here!
Ugh. Has it always been like this?

Anyway, I think the immediate fix for you is to stop using ident auth
for that host, at least till we can improve this code...

regards, tom lane
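For readers unfamiliar with what the recv() in frame #2 is waiting for, here is a minimal sketch of the ident exchange as RFC 1413 describes it. It is not taken from hba.c; the function names `ident_build_query` and `ident_parse_reply` are made up for illustration. The ports in the backtrace appear to be in network byte order: assuming a little-endian host, 14357 (0x3815) byte-swaps to 5432 and 27631 (0x6BEF) to 61291, so the query for that connection would be "61291,5432" followed by CR LF. If the remote identd never sends a reply line, a plain blocking recv() waits indefinitely.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build an RFC 1413 query line: "<port on the identd host>,<port on the
 * querying host>\r\n". Both ports are expected in host byte order.
 * Returns the query length, or -1 if it does not fit in buf.
 */
static int ident_build_query(char *buf, size_t buflen,
                             unsigned remote_port, unsigned local_port)
{
    int n = snprintf(buf, buflen, "%u,%u\r\n", remote_port, local_port);

    return (n < 0 || (size_t) n >= buflen) ? -1 : n;
}

/*
 * Extract the user name from a reply of the form
 *   "<ports> : USERID : <opsys> : <user>\r\n"
 * Returns 0 on success, -1 for ERROR replies or malformed input.
 * Simplification: blanks before the user name are skipped, although the
 * RFC strictly treats everything after the last colon as the user ID.
 */
static int ident_parse_reply(const char *reply, char *user, size_t userlen)
{
    const char *p = strchr(reply, ':');   /* end of the port pair */
    size_t n;

    if (p == NULL || userlen == 0)
        return -1;
    p++;
    while (*p == ' ' || *p == '\t')
        p++;
    if (strncmp(p, "USERID", 6) != 0)     /* e.g. "ERROR : NO-USER" */
        return -1;
    p = strchr(p, ':');                   /* separator before <opsys> */
    if (p == NULL)
        return -1;
    p = strchr(p + 1, ':');               /* separator before <user> */
    if (p == NULL)
        return -1;
    p++;
    while (*p == ' ' || *p == '\t')
        p++;
    n = strcspn(p, "\r\n");               /* CR LF is not part of the name */
    if (n == 0 || n >= userlen)
        return -1;
    memcpy(user, p, n);
    user[n] = '\0';
    return 0;
}
```

With "ident sameuser" in pg_hba.conf, the name extracted this way would then be compared against the database user name the client asked for.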
On Wed, 10 May 2000, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> > In pg_hba.conf, that host has:
> > host trends_acctng 216.126.72.30 255.255.255.255 ident sameuser
> > And it's the only time we have ident being used ...
> > right now, it's the only theory I have to work with ...
>
> Bingo. All your cores show the thing waiting inside the ident code:
>
> (gdb) bt
> #0 0x18263890 in recvfrom () from /usr/lib/libc.so.4
> #1 0x1825062b in recv () from /usr/lib/libc.so.4
> #2 0x80ad4d0 in ident (remote_ip_addr={s_addr = 508067544}, local_ip_addr={
>     s_addr = 56131288}, remote_port=27631, local_port=14357,
>     ident_failed=0xbfbfeeef "�\004\023 \b,\207\024\b\212\217(\030\223���\203\204|�\n\b�\214+\0304P",
>     ident_username=0xbfbfeef0 "\004\023 \b,\207\024\b\212\217(\030\223���\203\204|�\n\b�\214+\0304P") at hba.c:635
> #3 0x80ad912 in authident (raddr=0x82011ac, laddr=0x8201140,
>     postgres_username=0x8201261 "db", auth_arg=0x8201304 "sameuser")
>     at hba.c:869
> #4 0x80ac5b9 in be_recvauth (port=0x8201000) at auth.c:523
> #5 0x80e0c4a in readStartupPacket (arg=0x8201000, len=292, pkt=0x820101c)
>     at postmaster.c:1214
> #6 0x80aeb67 in PacketReceiveFragment (port=0x8201000) at pqpacket.c:102
> #7 0x80e08ad in ServerLoop () at postmaster.c:982
> #8 0x80e039a in PostmasterMain (argc=13, argv=0xbfbffbc4) at postmaster.c:723
> #9 0x80aee43 in main (argc=13, argv=0xbfbffbc4) at main.c:93
> #10 0x8063393 in _start ()
>
> Looking at the code, there doesn't seem to be any defense against a
> broken ident server --- there is no timeout or anything being used here!
> Ugh. Has it always been like this?
>
> Anyway, I think the immediate fix for you is to stop using ident auth
> for that host, at least till we can improve this code...

Once I started scanning with lsof and saw the auth stuff, I clued in and we disabled the ident stuff ... looking at your backtrace above, I should have clued in sooner, as I *saw* the ident on line 2, but didn't *see* it :(

Thanks ...
Tom Lane writes:
> The Hermit Hacker <scrappy@hub.org> writes:
> > In pg_hba.conf, that host has:
> > host trends_acctng 216.126.72.30 255.255.255.255 ident sameuser
> > And it's the only time we have ident being used ...
> > right now, it's the only theory I have to work with ...
>
> Bingo. All your cores show the thing waiting inside the ident code:
[...]
> Looking at the code, there doesn't seem to be any defense against a
> broken ident server --- there is no timeout or anything being used here!
> Ugh. Has it always been like this?
>
> Anyway, I think the immediate fix for you is to stop using ident auth
> for that host, at least till we can improve this code...

I came across this problem a year and a half ago. In my case, the problem was that the client was connecting more than the default limit of 40 times per minute, so inetd was suspending the auth/identd service. I raised the limit by changing to "nowait.500" and that problem went away.

I'd thought that I'd fixed PostgreSQL itself too, but looking back in my mail logs I can only find my patch which fixes the problem with sending ident requests from a server with an IP alias. I may have forgotten to send in the patch (or even to write one) for the "ident synchronous in postmaster" problem itself. Sorry. I'll look harder.

--Malcolm

--
Malcolm Beattie <mbeattie@sable.ox.ac.uk>
Unix Systems Programmer
Oxford University Computing Services
On Wed, 10 May 2000, Jan Wieck wrote:

> Tom Lane wrote:
> > Bingo. All your cores show the thing waiting inside the ident code:
> >
> > [...]
> >
> > Looking at the code, there doesn't seem to be any defense against a
> > broken ident server --- there is no timeout or anything being used here!
> > Ugh. Has it always been like this?
> >
> > Anyway, I think the immediate fix for you is to stop using ident auth
> > for that host, at least till we can improve this code...
>
> Looks like the entire communication with a new client is
> handled in a nonblocking manner via select(2) in
> ServerLoop(). I think the ident lookup belongs there too,
> and this improvement isn't something for a quick hack. It
> will take a little longer to be well tested.
>
> Let's try it for 7.0.1 or 7.0.2. It's clearly a bugfix IMHO.
>
> Also we might think about using some kind of timeout after
> which a new connection should either get rejected or succeed
> in backend start. Just to prevent a bogus client from
> creating a forever dangling connection.

Cool, our first DoS :)
Malcolm Beattie <mbeattie@sable.ox.ac.uk> writes:
> I'd thought that I'd fixed PostgreSQL itself too but looking
> back in my mail logs I can only find my patch which fixes the problem
> with sending ident requests from a server with an IP alias. I may have
> forgotten to send in the patch (or even to write one) for the "ident
> synchronous in postmaster" problem itself. Sorry. I'll look harder.

Yes, I see your alias patch in there, but that doesn't have anything to do with the problem of a nonresponding ident server.

I agree with Jan that a really good fix would allow the postmaster to return to its outer event loop while waiting for the ident response. It'd be a nontrivial rewrite though... anyone use ident enough to want to tackle it?

regards, tom lane
Tom Lane writes:
> Malcolm Beattie <mbeattie@sable.ox.ac.uk> writes:
> > I'd thought that I'd fixed PostgreSQL itself too but looking
> > back in my mail logs I can only find my patch which fixes the problem
> > with sending ident requests from a server with an IP alias. I may have
> > forgotten to send in the patch (or even to write one) for the "ident
> > synchronous in postmaster" problem itself. Sorry. I'll look harder.
>
> Yes, I see your alias patch in there, but that doesn't have anything to
> do with the problem of a nonresponding ident server. I agree with Jan
> that a really good fix would allow the postmaster to return to its outer
> event loop while waiting for the ident response. It'd be a nontrivial
> rewrite though... anyone use ident enough to want to tackle it?

It looks like the whole pg_hba thing isn't really designed to be asynchronous or event-driven. A cheap and cheerful fix would be to replace the blocking connect/send/recv in ident() in hba.c with foo_timeout ones (for foo one of connect/send/recv). Basically, set O_NONBLOCK on the socket with fcntl and have foo_timeout() do

    ...
    FD_ZERO(&fds);
    FD_SET(ourfd, &fds);
    tv.tv_sec = TIMEOUT;
    tv.tv_usec = 0;
    foo(...);
    if (select(ourfd+1, &fds, &fds, 0, &tv) <= 0)
        return -1;    /* error or timeout */
    return foo(...);

At least you then have an upper bound of about 3*TIMEOUT on how long the postmaster is busy (one TIMEOUT each for connect, send and recv). It would still be susceptible to a denial of service attack though.

The other option would be an alarm() timeout which could wrap the entire ident process, but doing alarms portably and safely is weird on some platforms, depending on what else is going on at the time.

--Malcolm
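The foo_timeout idea above could look roughly like the following for the recv case. This is a minimal sketch under stated assumptions, not the fix that went into PostgreSQL: the names `set_nonblocking`, `wait_fd` and `recv_timeout` are hypothetical, and it waits for readiness before calling recv() rather than calling it once, waiting, and retrying. A send_timeout would follow the same pattern with `for_write` set; a connect_timeout would additionally need to check SO_ERROR after the socket becomes writable.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Put fd into non-blocking mode. Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Wait until fd is readable (or writable if for_write is nonzero).
 * Returns 1 when ready, 0 on timeout, -1 on error.
 * Simplification: a signal interruption restarts the full timeout.
 */
static int wait_fd(int fd, int for_write, int timeout_sec)
{
    fd_set fds;
    struct timeval tv;
    int rc;

    do {
        FD_ZERO(&fds);
        FD_SET(fd, &fds);
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;
        rc = select(fd + 1,
                    for_write ? NULL : &fds,
                    for_write ? &fds : NULL,
                    NULL, &tv);
    } while (rc == -1 && errno == EINTR);
    return rc;
}

/*
 * recv() with an upper bound on how long the caller can be stuck.
 * Returns the byte count from recv(), or -1 with errno set
 * (ETIMEDOUT if nothing arrived within timeout_sec seconds).
 */
static ssize_t recv_timeout(int fd, void *buf, size_t len, int timeout_sec)
{
    int rc = wait_fd(fd, 0, timeout_sec);

    if (rc <= 0) {
        if (rc == 0)
            errno = ETIMEDOUT;
        return -1;
    }
    return recv(fd, buf, len, 0);
}
```

As noted in the message, this only bounds each call: the postmaster is still tied up for as long as the timeout lasts, so a slow or hostile ident server can still delay other connections.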
Malcolm Beattie <mbeattie@sable.ox.ac.uk> writes:
> It looks like the whole pg_hba thing isn't really designed to be
> asynchronous or event-driven.

Nope, the module would need a pretty thorough rewrite ...

> A cheap and cheerful fix would be to
> replace the blocking connect/send/recv in ident() in hba.c with
> foo_timeout ones (for foo one of connect/send/recv).

That was what I was thinking too, unless we find a volunteer to do the bigger job. I don't particularly care to spend that much time on this problem myself.

regards, tom lane