Thread: Mail setup broken (still/again?)
I just sent an email to -advocacy where I spelled the list name wrong. I did not get a bounce. Why? Because the mail system is (again or still) broken in its config.

What happens is:

1) The sender tries to deliver to svr1.postgresql.org. This machine responds that the user is unknown, *but does so with a 450 error code indicating that this is a temporary error*. This is of course wrong, it should be responding with 550.

1b) Also, that machine is supposedly named "postgresql.org" for some reason that I don't really understand. But the MX record still points to svr1, which is an alias. (I don't say this should be fixed, because I don't see the point in not calling the machine svr1, but that's probably just because I've forgotten the reason for it :-P But either way, it should be consistent)

2) Since it got a 450, it tries a secondary MX, in this case mx3.hub.org. Now:

2a) Why do we even bother with secondary MXes since all they do is relay back to svr1 anyway? It doesn't actually *help* us in any way that I can see, it only makes the configuration more complex.

2b) If we do relay, then the secondary MX must *also* know the list of users, so it can give a proper bounce. What happens now is that my email is queued up on mx3.hub.org and will stay there as it retries over and over for a couple of days, after which the bounce will be generated on that system.

2c) mx3 is then *graylisted* by svr1. A backup MX must *NOT* be graylisted by the primary machine. I know I have mentioned this several times before wrt other machines.

3) At the risk of sounding like a real broken record again, we really need *some kind of basic documentation* of this system.

Our mail infrastructure is critical to the project. We simply cannot afford having one that looks like this. My suggestion is the same as before - de-couple it from the hub.org infrastructure, thus making things a lot simpler and less likely to break.

If this is for some reason not acceptable, can we please, and *urgently*, have these issues listed above fixed?

//Magnus
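The failure described in points 1) and 2) comes down to how a sending server reacts to the class of the SMTP reply code. The following is a minimal sketch of that reaction, not a model of any particular mail server: the function names are hypothetical, and the reply codes fed to it are simply the ones reported in the message above.

```python
def classify_reply(code):
    """Map an SMTP reply code to the sender's next step (simplified)."""
    if 200 <= code < 300:
        return "accepted"   # the host took responsibility for the message
    if 400 <= code < 500:
        return "retry"      # temporary failure: try another MX or queue
    if 500 <= code < 600:
        return "bounce"     # permanent failure: report back to the sender
    raise ValueError(f"unexpected SMTP reply code: {code}")


def deliver(mx_replies):
    """Walk MX hosts in preference order.

    mx_replies is a list of (host, reply_code) pairs, best preference first.
    Returns (outcome, host); the host is None if every MX gave a soft error.
    """
    for host, code in mx_replies:
        action = classify_reply(code)
        if action != "retry":
            return action, host
    return "retry", None


# With the 450 from svr1, the message ends up accepted by the backup MX.
print(deliver([("svr1.postgresql.org", 450), ("mx3.hub.org", 250)]))
# With a 550 from svr1, the sender gets an immediate bounce.
print(deliver([("svr1.postgresql.org", 550), ("mx3.hub.org", 250)]))
```

In this simplified picture, the only difference between a mistyped address producing an immediate bounce and one sitting in a relay queue for days is the first digit of the reply code.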
On Tue, Oct 16, 2007 at 10:52:09AM +0200, Magnus Hagander wrote:
> 1) Tries to deliver to svr1.postgresql.org. This machine responds that the
> user is unknown, *but does so with a 450 error code indicating that this
> is a temporary error*. This is of course wrong, it should be responding
> with 550.

This does appear to be an error.

> 1b) Also, that machine is supposedly named "postgresql.org" for some reason
> that I don't really understand. But the MX record still points to svr1,
> which is an alias. (I don't say this should be fixed, because I don't see

Nit: it's not an alias; if it were (i.e. a CNAME or probably a DNAME), it would be an error, because MX records can't point to CNAMEs. It's just another name for the same address, which is perfectly acceptable. It does seem a little baroque.

> 2a) Why do we even bother with secondary MXes since all they do is relay
> back to svr1 anyway? It doesn't actually *help* us in any way that I can
> see, it only makes the configuration more complex.

If svr1 goes down, the secondaries queue up the mail. This is a robust answer, and a good one, because it prevents mail from getting queued up all over the Internet.

> 2b) If we do relay, then the secondary MX must *also* know the list of
> users, so it can give a proper bounce.

No. The classic way of relaying through a secondary that you normally don't use except in emergency is just to relay everything that arrives on the secondary. When it arrives at the restrictive server, that server bounces. You get into trouble from the soft error you're encountering, because if the user isn't in the map and that server is the final destination, then you really do need to bounce and be done.

> 2c) mx3 is then *graylisted* by svr1. A backup MX must *NOT* be graylisted
> by the primary machine. I know I have mentioned this several times before
> wrt other machines.

Absolutely.

> 3) At the risk of sounding like a real broken record again, we really
> need *some kind of basic documentation* of this system.

+1

> infrastructure, thus making things a lot simpler and less likely to break.

I don't think removing the secondaries makes things less likely to break; mail servers are easily overwhelmed and can produce soft errors all the time. Having store-and-forward secondaries is an extremely good idea, and I'd hate to see that feature abandoned. Your other remarks are right on the money, though.

A

-- 
Andrew Sullivan | ajs@crankycanuck.ca
A certain description of men are for getting out of debt, yet are against all taxes for raising money to pay it off.
		--Alexander Hamilton
On Tue, Oct 16, 2007 at 10:07:50AM -0400, Andrew Sullivan wrote:
> On Tue, Oct 16, 2007 at 10:52:09AM +0200, Magnus Hagander wrote:

<snip>

> > 2a) Why do we even bother with secondary MXes since all they do is relay
> > back to svr1 anyway? It doesn't actually *help* us in any way that I can
> > see, it only makes the configuration more complex.
>
> If svr1 goes down, the secondaries queue up the mail. This is a
> robust answer, and a good one, because it prevents mail from getting
> queued up all over the Internet.

Sure, but does it help us in any way at all? Why do we care where the mail is queued up, really?

> > 2b) If we do relay, then the secondary MX must *also* know the list of
> > users, so it can give a proper bounce.
>
> No. The classic way of relaying through a secondary that you
> normally don't use except in emergency is just to relay everything
> that arrives on the secondary. When it arrives at the restrictive
> server, that server bounces. You get into trouble from the soft
> error you're encountering, because if the user isn't in the map and
> that server is the final destination, then you really do need to
> bounce and be done.

If we reject it on the secondary MX, we'll be creating a whole bunch of bounces for invalid addresses that spammers sent to. If our secondary MX can just drop them, that never happens since they get a reject at the SMTP protocol level.

> > infrastructure, thus making things a lot simpler and less likely to break.
>
> I don't think removing the secondaries makes things less likely to
> break; mail servers are easily overwhelmed and can produce soft
> errors all the time. Having store-and-forward secondaries is an
> extremely good idea, and I'd hate to see that feature abandoned.
> Your other remarks are right on the money, though.

Well, we clearly don't entirely agree. But if we *do* want store-and-forward secondaries, I would propose we use dedicated ones so we can do things like push our own userlists over there.

//Magnus
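The disagreement over 2b) is about where an unknown recipient gets turned away. As a rough sketch under stated assumptions (the function name, the `has_user_list` flag, and the example.org addresses are all hypothetical), the two behaviours of a secondary MX can be contrasted like this:

```python
def handle_on_secondary(sender, recipient, known_recipients, has_user_list):
    """Return (smtp_reply_code, bounce_target) for one incoming message.

    bounce_target is the address that would later receive a bounce message
    generated by our own systems, or None if no bounce is generated.
    """
    known = recipient.lower() in known_recipients
    if has_user_list:
        # The secondary can refuse during the SMTP dialogue, so the
        # connecting client is the one left holding the message.
        return (250, None) if known else (550, None)
    # Relay-everything mode: accept now, find out later from the primary
    # that the user is unknown, then send a bounce to the envelope sender
    # (which may well be forged in the spam case).
    return 250, (None if known else sender)


users = {"list@example.org"}
print(handle_on_secondary("forged@example.com", "nosuchuser@example.org", users, True))
print(handle_on_secondary("forged@example.com", "nosuchuser@example.org", users, False))
```

The first call refuses with a 550 and generates nothing; the second accepts and leaves a bounce to be sent to the envelope sender, which is the case Magnus wants to avoid.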
On Tue, Oct 16, 2007 at 05:02:48PM +0200, Magnus Hagander wrote:
>
> Sure, but does it help us in any way at all? Why do we care where the mail
> is queued up, really?

We can't control the policies on all those servers, and some of them may not queue as long as we like. Also, it's polite to have more than one mail server, and not force others to queue mail when you have an outage. This is part of the reason one has more than one MX possible, after all.

> If we reject it on the secondary MX, we'll be creating a whole bunch of
> bounces for invalid addresses that spammers sent to. If our secondary MX
> can just drop them, that never happens since they get a reject at the SMTP
> protocol level.

You mustn't _ever_ "just drop them". Yes, I know people are doing that instead of bouncing, but it's wrong, bad, evil, and completely in contradiction of the totally plain MUSTs in the relevant RFCs.

I think you meant refuse, though, which is a different matter. It's not actually hard to rsync the user map among the various servers using postfix (I do it myself), so that seems to me to be an alternative, yes. And that can be done with multiple user lists.

There is another thing we could do, BTW, to try to reduce the spam-induced bounces, and still have multiple servers in place. What you do is add an MX with priority 0 that always gives a soft error. Most spambots won't try the next MX, so your "real" MX (with, say, priority 1) doesn't get the spam attempt.

A

-- 
Andrew Sullivan | ajs@crankycanuck.ca
Never get involved in litigation. Your hair will fall out, your bones will turn to sand. And it will still be going on.
		--Tom Waits
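The priority-0 trick can be illustrated with a small simulation. This is only a sketch of the idea as described above: the host names are invented, and the claim that most spambots stop after the first MX is Andrew's observation, modelled here by a `tries_all` flag.

```python
def attempt(mx_records, replies, tries_all):
    """Return the host that accepted the message, or None.

    mx_records: list of (preference, host) pairs, in any order
    replies:    dict mapping host -> SMTP reply code it always gives
    tries_all:  True for a standards-following sender that moves on to the
                next MX after a soft error, False for a client that gives
                up after the first host
    """
    for _preference, host in sorted(mx_records):  # lowest preference first
        code = replies[host]
        if 200 <= code < 300:
            return host
        if 500 <= code < 600 or not tries_all:
            return None
    return None


records = [(1, "mx-real.example.org"), (0, "mx-decoy.example.org")]
replies = {"mx-decoy.example.org": 450, "mx-real.example.org": 250}

print(attempt(records, replies, tries_all=True))   # legitimate sender
print(attempt(records, replies, tries_all=False))  # first-MX-only client
```

A legitimate sender pays one extra connection attempt; a client that never retries does not reach the real server at all.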
On Tue, Oct 16, 2007 at 11:22:46AM -0400, Andrew Sullivan wrote:
> On Tue, Oct 16, 2007 at 05:02:48PM +0200, Magnus Hagander wrote:
> >
> > Sure, but does it help us in any way at all? Why do we care where the mail
> > is queued up, really?
>
> We can't control the policies on all those servers, and some of them
> may not queue as long as we like. Also, it's polite to have more
> than one mail server, and not force others to queue mail when you
> have an outage. This is part of the reason one has more than one MX
> possible, after all.
>
> > If we reject it on the secondary MX, we'll be creating a whole bunch of
> > bounces for invalid addresses that spammers sent to. If our secondary MX
> > can just drop them, that never happens since they get a reject at the SMTP
> > protocol level.
>
> You mustn't _ever_ "just drop them". Yes, I know people are doing
> that instead of bouncing, but it's wrong, bad, evil, and completely
> in contradiction of the totally plain MUSTs in the relevant RFCs.

I meant reject, not drop. But it's better for us to reject them at the SMTP level than it is to generate our own bounce.

> I think you meant refuse, though, which is a different matter. It's
> not actually hard to rsync the user map among the various servers
> using postfix (I do it myself), so that seems to me to be an
> alternative, yes. And that can be done with multiple user lists.

As do I, so yeah, it's fairly simple. But if you have to interface with an external system (in this case, hub.org) that makes things a lot more complex quickly.

> There is another thing we could do, BTW, to try to reduce the
> spam-induced bounces, and still have multiple servers in place. What
> you do is add an MX with priority 0 that always gives a soft error.
> Most spambots won't try the next MX, so your "real" MX (with, say,
> priority 1) doesn't get the spam attempt.

Hmm. Interesting idea :) But I'm not sure how big of a problem that part really is.

//Magnus
--On Tuesday, October 16, 2007 10:07:50 -0400 Andrew Sullivan <ajs@crankycanuck.ca> wrote:

> On Tue, Oct 16, 2007 at 10:52:09AM +0200, Magnus Hagander wrote:
>> 1) Tries to deliver to svr1.postgresql.org. This machine responds that the
>> user is unknown, *but does so with a 450 error code indicating that this
>> is a temporary error*. This is of course wrong, it should be responding
>> with 550.
>
> This does appear to be an error.

Changed ... postfix's default is 550, but we had:

    unknown_local_recipient_reject_code = 450

Funny thing is, looking on google, this isn't that particularly unusual:

"On my Fedora Core box the following settings and comments were in main.cf

    # The default setting is 550 (reject mail) but it is safer to start
    # with 450 (try again later) until you are certain that your
    # local_recipient_maps settings are OK.
    #
    #unknown_local_recipient_reject_code = 550
    unknown_local_recipient_reject_code = 450"
- <http://www.webservertalk.com/message1926259.html>

I suspect that that is what I did when I set things up originally, went with the 'safer to start with 450', but didn't go back and change it to 550 ...

> Nit: it's not an alias; if it were (i.e. a CNAME or probably a
> DNAME), it would be an error, because MX records can't point to
> CNAMEs. It's just another name for the same address, which is
> perfectly acceptable. It does seem a little baroque.

Fixed, now MX 0 == mail.postgresql.org, also an A record ...

>> 2c) mx3 is then *graylisted* by svr1. A backup MX must *NOT* be graylisted
>> by the primary machine. I know I have mentioned this several times before
>> wrt other machines.
>
> Absolutely.

Fixed ... I had missed the mx3 IP in the mynetworks file on postgresql.org ... the other 3 mx servers should never have been affected, only the offsite one ...

----
Marc G. Fournier    Hub.Org Networking Services (http://www.hub.org)
Email . scrappy@hub.org    MSN . scrappy@hub.org
Yahoo . yscrappy    Skype: hub.org    ICQ . 7615664
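Since the 450 setting is easy to leave behind after testing, a check along the following lines could flag it. This is a hedged sketch, not a Postfix tool: the function name is hypothetical, it only handles simple single-line `name = value` settings, and it relies on the default of 550 stated in the message above and on main.cf keeping the last definition of a parameter.

```python
PARAM = "unknown_local_recipient_reject_code"


def effective_reject_code(main_cf_text, default=550):
    """Return the reject code a main.cf text would leave in effect."""
    code = default
    for line in main_cf_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = stripped.partition("=")
        if sep and name.strip() == PARAM:
            code = int(value.strip())  # a later definition overrides earlier ones
    return code


sample = """\
# The default setting is 550 (reject mail) but it is safer to start
# with 450 (try again later) until you are certain that your
# local_recipient_maps settings are OK.
#
#unknown_local_recipient_reject_code = 550
unknown_local_recipient_reject_code = 450
"""

code = effective_reject_code(sample)
if 400 <= code < 500:
    print(f"warning: unknown recipients get a temporary {code}, senders will retry")
```

Run against the Fedora sample quoted above, the check reports the temporary code; against a file without the setting it falls back to the default.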
Marc G. Fournier wrote:
>
> --On Tuesday, October 16, 2007 10:07:50 -0400 Andrew Sullivan
> <ajs@crankycanuck.ca> wrote:
>
>> On Tue, Oct 16, 2007 at 10:52:09AM +0200, Magnus Hagander wrote:
>>> 1) Tries to deliver to svr1.postgresql.org. This machine response that the
>>> user is unknown, *but does so with a 450 error code indicating that this
>>> is a temporary error*. This is of course wrong, it should be responding
>>> with 550.
>> This does appear to be an error.
>
> Changed ... postfix's default is 550, but we had:

Great.

> unknown_local_recipient_reject_code = 450
>
> Funny thing is, looking on google, this isn't that particularly unusual:

<snip>

Yeah, IIRC there was a recommendation when this feature was added to postfix that you do this. And it is a good recommendation - you just have to remember to switch it back to 550 when you've tested :-) (yes, I've forgotten that on my servers as well a couple of times)

>> Nit: it's not an alias; if it were (i.e. a CNAME or probably a
>> DNAME), it would be an error, because MX records can't point to
>> CNAMEs. It's just another name for the same address, which is
>> perfectly acceptable. It does seem a little baroque.
>
> Fixed, now MX 0 == mail.postgresql.org, also an A record ...

Good.

>>> 2c) mx3 is then *graylisted* by svr1. A backup MX must *NOT* be graylisted
>>> by the primary machine. I know I have mentioned this several times before
>>> wrt other machines.
>> Absolutely.
>
> Fixed ... I had missed the mx3 IP in the mynetworks file on postgresql.org ...
> the other 3 mx servers should never have been affected, only the offsite one ...

Ah, that's why it reappeared.

Thanks for the quick fixes!

//Magnus
--On Tuesday, October 16, 2007 18:46:28 +0200 Magnus Hagander <magnus@hagander.net> wrote:

> Thanks for the quick fixes!

Thanks for pointing them out, specially the 450 -> 550 change ... should reduce some of the repeat hits:

   4902 <BritneysBoudreaux@postgresql.org>:
    933 <CarrolljeffreyTyler@postgresql.org>:
    876 <EvanoctagonBall@postgresql.org>:
    863 <ErikasegregatePhipps@postgresql.org>:
    848 <JeremiahanthraciteCurry@postgresql.org>:
    845 <JenniehabitualMeade@postgresql.org>:
    805 <KipmoliereRocha@postgresql.org>:
    801 <GwendolynsolonWeston@postgresql.org>:
    793 <AngeliqueshipwreckZapata@postgresql.org>:
    788 <OlliedaggerJuarez@postgresql.org>:

And that was just today :)

----
Marc G. Fournier    Hub.Org Networking Services (http://www.hub.org)
Email . scrappy@hub.org    MSN . scrappy@hub.org
Yahoo . yscrappy    Skype: hub.org    ICQ . 7615664
Marc G. Fournier wrote:
>
> --On Tuesday, October 16, 2007 18:46:28 +0200 Magnus Hagander
> <magnus@hagander.net> wrote:
>
>> Thanks for the quick fixes!
>
> Thanks for pointing them out, specially the 450 -> 550 change ... should reduce
> some of the repeat hits:
>
>    4902 <BritneysBoudreaux@postgresql.org>:
>     933 <CarrolljeffreyTyler@postgresql.org>:
>     876 <EvanoctagonBall@postgresql.org>:
>     863 <ErikasegregatePhipps@postgresql.org>:
>     848 <JeremiahanthraciteCurry@postgresql.org>:
>     845 <JenniehabitualMeade@postgresql.org>:
>     805 <KipmoliereRocha@postgresql.org>:
>     801 <GwendolynsolonWeston@postgresql.org>:
>     793 <AngeliqueshipwreckZapata@postgresql.org>:
>     788 <OlliedaggerJuarez@postgresql.org>:
>
> And that was just today :)

What? Several of those aliases *should* exist, no? ;-)

I wonder how the hell they come up with those... I mean, what's the percentage that they exist at all. I can understand those that fake bill@ and joe@ and such addresses, but this...

//Magnus
--On Tuesday, October 16, 2007 19:01:26 +0200 Magnus Hagander <magnus@hagander.net> wrote:

> Marc G. Fournier wrote:
>>
>> --On Tuesday, October 16, 2007 18:46:28 +0200 Magnus Hagander
>> <magnus@hagander.net> wrote:
>>
>>> Thanks for the quick fixes!
>>
>> Thanks for pointing them out, specially the 450 -> 550 change ... should
>> reduce some of the repeat hits:
>>
>>    4902 <BritneysBoudreaux@postgresql.org>:
>>     933 <CarrolljeffreyTyler@postgresql.org>:
>>     876 <EvanoctagonBall@postgresql.org>:
>>     863 <ErikasegregatePhipps@postgresql.org>:
>>     848 <JeremiahanthraciteCurry@postgresql.org>:
>>     845 <JenniehabitualMeade@postgresql.org>:
>>     805 <KipmoliereRocha@postgresql.org>:
>>     801 <GwendolynsolonWeston@postgresql.org>:
>>     793 <AngeliqueshipwreckZapata@postgresql.org>:
>>     788 <OlliedaggerJuarez@postgresql.org>:
>>
>> And that was just today :)
>
> What? Several of those aliases *should* exist, no? ;-)
>
> I wonder how the hell they come up with those... I mean, what's the
> percentage that they exist at all. I can understand those that fake
> bill@ and joe@ and such addresses, but this...

I don't know, but the first one with 4902 attempts comes from mx3.cc.teu.ac.jp ... so it's not even coming from random 'English' sites :)

We've had 48535 distinct names attempted 647009 times so far today ... and two of them were for bill@postgresql.org, and none for joe :)

----
Marc G. Fournier    Hub.Org Networking Services (http://www.hub.org)
Email . scrappy@hub.org    MSN . scrappy@hub.org
Yahoo . yscrappy    Skype: hub.org    ICQ . 7615664
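A tally like the one quoted above can be produced from any list of rejected recipient addresses. The sketch below assumes the addresses have already been pulled out of the mail log, since the log format is not shown in the thread; the function names and the example.org addresses are hypothetical. It also works out the average implied by the figures in the message.

```python
from collections import Counter


def top_recipients(rejected_addresses, n=10):
    """Return the n most frequently attempted addresses with their counts."""
    counts = Counter(addr.lower() for addr in rejected_addresses)
    return counts.most_common(n)


def attempts_per_name(total_attempts, distinct_names):
    """Average number of delivery attempts per distinct address."""
    return total_attempts / distinct_names


# Tiny made-up sample, just to show the output shape.
sample = ["a@example.org", "b@example.org", "A@example.org"]
for address, count in top_recipients(sample, n=2):
    print(f"{count:>7} <{address}>:")

# Figures from the message: 647009 attempts across 48535 distinct names.
print(round(attempts_per_name(647009, 48535), 1))  # about 13.3 attempts per name
```

An average of roughly 13 attempts per name is consistent with senders retrying after a temporary error, although the thread itself does not break the number down.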
On Tue, 16 Oct 2007 14:08:14 -0300 "Marc G. Fournier" <scrappy@hub.org> wrote:

> --On Tuesday, October 16, 2007 19:01:26 +0200 Magnus Hagander
> <magnus@hagander.net> wrote:
>
> > Marc G. Fournier wrote:
> >>
> >> --On Tuesday, October 16, 2007 18:46:28 +0200 Magnus Hagander
> >> <magnus@hagander.net> wrote:
> >>
> >>> Thanks for the quick fixes!
> >>
> >> Thanks for pointing them out, specially the 450 -> 550 change ...
> >> should reduce some of the repeat hits:
> >>
> >>    4902 <BritneysBoudreaux@postgresql.org>:
> >>     933 <CarrolljeffreyTyler@postgresql.org>:
> >>     876 <EvanoctagonBall@postgresql.org>:
> >>     863 <ErikasegregatePhipps@postgresql.org>:
> >>     848 <JeremiahanthraciteCurry@postgresql.org>:
> >>     845 <JenniehabitualMeade@postgresql.org>:
> >>     805 <KipmoliereRocha@postgresql.org>:
> >>     801 <GwendolynsolonWeston@postgresql.org>:
> >>     793 <AngeliqueshipwreckZapata@postgresql.org>:
> >>     788 <OlliedaggerJuarez@postgresql.org>:
> >>
> >> And that was just today :)
> >
> > What? Several of those aliases *should* exist, no? ;-)
> >
> > I wonder how the hell they come up with those... I mean, what's the
> > percentage that they exist at all. I can understand those that fake
> > bill@ and joe@ and such addresses, but this...
>
> I don't know, but the first one with 4902 attempts comes from
> mx3.cc.teu.ac.jp ... so its not even coming from random 'English'
> sites :)

So why haven't we just blocked mx3.cc.teu.ac.jp at the firewall?

Joshua D. Drake

> We've had 48535 distinct names attempted 647009 times so far
> today ... and two of them were for bill@postgresql.org, and none for
> joe :)

-- 
      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
			UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
On Tue, Oct 16, 2007 at 10:13:30AM -0700, Joshua D. Drake wrote:
>
> So why haven't we just blocked mx3.cc.teu.ac.jp at the firewall?
>

Just because an MX happens to initiate spam doesn't mean it's not also a legitimate mail source. On a private domain, I would have no difficulty blocking it, but on a community service like this, it seems dodgier to me.

A

-- 
Andrew Sullivan | ajs@crankycanuck.ca
This work was visionary and imaginative, and goes to show that visionary and imaginative work need not end up well.
		--Dennis Ritchie
> --On Tuesday, October 16, 2007 18:46:28 +0200 Magnus Hagander
> <magnus@hagander.net> wrote:
>>>    4902 <BritneysBoudreaux@postgresql.org>:
>>>     933 <CarrolljeffreyTyler@postgresql.org>:
>>>     876 <EvanoctagonBall@postgresql.org>:
>>>     863 <ErikasegregatePhipps@postgresql.org>:
>>>     848 <JeremiahanthraciteCurry@postgresql.org>:
>>>     845 <JenniehabitualMeade@postgresql.org>:
>>>     805 <KipmoliereRocha@postgresql.org>:
>>>     801 <GwendolynsolonWeston@postgresql.org>:
>>>     793 <AngeliqueshipwreckZapata@postgresql.org>:
>>>     788 <OlliedaggerJuarez@postgresql.org>:

>> I wonder how the hell they come up with those... I mean, what's the
>> percentage that they exist at all. I can understand those that fake
>> bill@ and joe@ and such addresses, but this...

I've seen a fair amount of spam that has forged return addresses that look like those. I'm thinking that spammer A's name-generator has fooled spammer B's address-harvester ...

			regards, tom lane
--On Tuesday, October 16, 2007 10:13:30 -0700 "Joshua D. Drake" <jd@commandprompt.com> wrote:

> So why haven't we just blocked mx3.cc.teu.ac.jp at the firewall?

Several reasons, but foremost ... until Magnus pointed out the need to change 450 -> 550, nobody knew they were hitting us ... but, I just checked the log file, and that is the only address that that host has tried to get through, so would be a bit overkill to block one host for retrying one address repeatedly because we didn't tell them it was a permanent failure vs temporary one, no? :)

----
Marc G. Fournier    Hub.Org Networking Services (http://www.hub.org)
Email . scrappy@hub.org    MSN . scrappy@hub.org
Yahoo . yscrappy    Skype: hub.org    ICQ . 7615664
On Tue, Oct 16, 2007 at 05:31:33PM +0200, Magnus Hagander wrote:
>
> I meant reject, not drop. But it's better for us to reject them at the SMTP
> level than it is to generate our own bounce.

Yeah, that can be more efficient.

> As do I, so yeah, it's fairly simple. But if you have to interface with an
> external system (in this case, hub.org) that makes things a lot more
> complex quickly.

Well, not a _lot_ more complex, I think. We just need maps special to each domain, and while I've never done it before, the examples seem to me to suggest it's straightforward. I think the trick would be to design the postgresql.org site the same way as the hub site, and then leave parts out on postgresql.org, rather than the other way around. It _is_ more complex, but with careful choices of include-file names, it oughta be manageable? (Especially if we have some documentation ;-)

> Hmm. Interesting idea :) But I'm not sure how big of a problem that part
> really is.

We get a remarkable amount of spam on the lists I moderate, at least, and none of those are -general.

A

-- 
Andrew Sullivan | ajs@crankycanuck.ca