Thread: Bogus duplicate-message complaints from PG mail lists

Bogus duplicate-message complaints from PG mail lists

From
Tom Lane
Date:
Several times over the past couple of days, I have gotten complaints
like the one attached about duplicate message IDs.  I can see from my
sendmail daemon's log that I sent only one copy of this, and the
complaint is dated about twenty minutes after the fact, so I'm pretty
sure the duplicate is not of my making.

Usually when I get this type of thing I can tell from the Received:
lines that it's the fault of some list subscriber's MUA re-submitting
a message to the lists.  But the Received: lines in this example and
the last couple don't show that the message has gone anywhere outside
postgresql.org.  So I'm thinking it's a recently-introduced glitch in
the mail list server arrangements.

            regards, tom lane

------- Forwarded Message

Received: from malur.postgresql.org (malur.postgresql.org [217.196.149.56])
    by sss.pgh.pa.us (8.14.5/8.14.5) with ESMTP id r0FKV40w002187
    for <tgl@sss.pgh.pa.us>; Tue, 15 Jan 2013 15:31:04 -0500 (EST)
Received: from localhost ([127.0.0.1] helo=postgresql.org)
    by malur.postgresql.org with esmtp (Exim 4.72)
    (envelope-from <pgsql-hackers-owner@postgresql.org>)
    id 1TvD9z-00056W-US
    for tgl@sss.pgh.pa.us; Tue, 15 Jan 2013 20:31:03 +0000
MIME-Version: 1.0
X-Mailer: MIME-tools 5.428 (Entity 5.428)
Date: Tue, 15 Jan 2013 20:31:03 +0000
From: pgsql-hackers-owner@postgresql.org
To: Tom Lane <tgl@sss.pgh.pa.us>
Subject: Denied post to pgsql-hackers
Content-Type: multipart/mixed; boundary="----------=_1358281863-9659-3"
Message-ID: <c0d7609df0037c0e90c84e3266a9588ea777b9f6@postgresql.org>

This is a multi-part message in MIME format...

------------=_1358281863-9659-3
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
Content-Transfer-Encoding: 7bit
Content-Language: en

Your message to the pgsql-hackers list has been denied
for the following reason(s):

A message was previous posted with this Message-ID
Duplicate Message-ID - <1723.1358280662@sss.pgh.pa.us> (Tue Jan 15 20:11:10 2013)
Duplicate Message Checksum (Tue Jan 15 20:11:10 2013)
Duplicate Partial Message Checksum (Tue Jan 15 20:11:10 2013)


------------=_1358281863-9659-3
Content-Type: message/rfc822
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
Content-Description: Original message

Received: from makus.postgresql.org ([98.129.198.125])
    by malur.postgresql.org with esmtp (Exim 4.72)
    (envelope-from <tgl@sss.pgh.pa.us>)
    id 1TvD9z-00056R-Hi
    for pgsql-hackers@postgresql.org; Tue, 15 Jan 2013 20:31:03 +0000
Received: from sss.pgh.pa.us ([66.207.139.130])
    by makus.postgresql.org with esmtp (Exim 4.72)
    (envelope-from <tgl@sss.pgh.pa.us>)
    id 1TvCqf-0005ve-W6
    for pgsql-hackers@postgresql.org; Tue, 15 Jan 2013 20:11:09 +0000
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
    by sss.pgh.pa.us (8.14.5/8.14.5) with ESMTP id r0FKB2UL001724;
    Tue, 15 Jan 2013 15:11:03 -0500 (EST)
From: Tom Lane <tgl@sss.pgh.pa.us>
To: Bruce Momjian <bruce@momjian.us>
cc: Robert Haas <robertmhaas@gmail.com>, Simon Riggs <simon@2ndquadrant.com>,
        Jeff Janes <jeff.janes@gmail.com>,
        pgsql-hackers <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] [PERFORM] Slow query: bitmap scan troubles
In-reply-to: <20130115194639.GG27934@momjian.us>
References: <CA+U5nMLjf2-kTa4-AR-0XLKKwbc+=_fb4237i_UAWYzowW+-1Q@mail.gmail.com> <8539.1357513385@sss.pgh.pa.us>
<12791.1357580151@sss.pgh.pa.us><CA+U5nMKbOGVfQXfJi5_vOUPEatF_V_+e_HX4P5R=tb9JSo2ceA@mail.gmail.com>
<13842.1357583258@sss.pgh.pa.us><13967.1357866454@sss.pgh.pa.us>
<CA+TgmoYQ6Nq-tpHiDPCUH3CkH2N9D67=oDKJtLxuRRC=dRteSQ@mail.gmail.com><23869.1358184197@sss.pgh.pa.us>
<CA+Tg!moa+wzu9RBUK75veRn6UTWjSZZJa2aOjfvn0LD1_mx+rRg@mail.gmail.com><24605.1358186197@sss.pgh.pa.us>
<20130115194639.GG27934@momjian.us>
Comments: In-reply-to Bruce Momjian <bruce@momjian.us>
    message dated "Tue, 15 Jan 2013 14:46:39 -0500"
Date: Tue, 15 Jan 2013 15:11:02 -0500
Message-ID: <1723.1358280662@sss.pgh.pa.us>
X-Pg-Spam-Score: -1.9 (-)

Bruce Momjian <bruce@momjian.us> writes:
> On Mon, Jan 14, 2013 at 12:56:37PM -0500, Tom Lane wrote:
>> Remember also that "enable_seqscan=off" merely adds 1e10 to the
>> estimated cost of seqscans.  For sufficiently large tables this is not
>> exactly a hard disable, just a thumb on the scales.  But I don't know
>> what your definition of "extremely large indexes" is.

> Wow, do we need to bump up that value based on larger modern hardware?

I'm disinclined to bump it up very much.  If it's more than about 1e16,
ordinary cost contributions would disappear into float8 roundoff error,
causing the planner to be making choices that are utterly random except
for minimizing the number of seqscans.  Even at 1e14 or so you'd be
losing a lot of finer-grain distinctions.  What we want is for the
behavior to be "minimize the number of seqscans but plan normally
otherwise", so those other cost contributions are still important.

Anyway, at this point we're merely speculating about what's behind
Robert's report --- I'd want to see some concrete real-world examples
before changing anything.

            regards, tom lane


------------=_1358281863-9659-3--

------- End of Forwarded Message
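
The Received: check described above can be sketched in a few lines of Python. This is only an illustration, not a tool anyone in this thread used: it takes the Received: lines of the original message embedded in the bounce, lists the hops oldest first, and prints the delay before each hop. The `hops` helper and the `RAW` sample are names made up for this sketch, and the regular expression assumes the simple "from X ... by Y ...; date" shape seen in these headers.

```python
import re
from email import message_from_string
from email.utils import parsedate_to_datetime

# Received: lines copied from the original message inside the bounce.
RAW = """\
Received: from makus.postgresql.org ([98.129.198.125])
    by malur.postgresql.org with esmtp (Exim 4.72)
    (envelope-from <tgl@sss.pgh.pa.us>)
    id 1TvD9z-00056R-Hi
    for pgsql-hackers@postgresql.org; Tue, 15 Jan 2013 20:31:03 +0000
Received: from sss.pgh.pa.us ([66.207.139.130])
    by makus.postgresql.org with esmtp (Exim 4.72)
    (envelope-from <tgl@sss.pgh.pa.us>)
    id 1TvCqf-0005ve-W6
    for pgsql-hackers@postgresql.org; Tue, 15 Jan 2013 20:11:09 +0000
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
    by sss.pgh.pa.us (8.14.5/8.14.5) with ESMTP id r0FKB2UL001724;
    Tue, 15 Jan 2013 15:11:03 -0500 (EST)
Message-ID: <1723.1358280662@sss.pgh.pa.us>

(body omitted)
"""


def hops(raw):
    """Return (from_host, by_host, timestamp) tuples, oldest hop first."""
    msg = message_from_string(raw)
    found = []
    for value in msg.get_all("Received", []):
        flat = " ".join(value.split())           # undo header folding
        route, _, stamp = flat.rpartition(";")   # the date follows the last ';'
        match = re.search(r"from (\S+).*? by (\S+)", route)
        if match is None:
            continue                             # shape not handled by this sketch
        when = parsedate_to_datetime(stamp.strip())
        found.append((match.group(1), match.group(2), when))
    found.reverse()  # headers are prepended, so the newest hop comes first
    return found


route = hops(RAW)
previous = None
for sender, receiver, when in route:
    delay = "" if previous is None else f"  (+{(when - previous).total_seconds():.0f}s)"
    print(f"{sender} -> {receiver} at {when.isoformat()}{delay}")
    previous = when
```

On this sample, every hop after the sender's own machine is a postgresql.org host, and the delay before the last hop (makus to malur) is about twenty minutes, the same gap as noted above.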



Re: Bogus duplicate-message complaints from PG mail lists

From
Magnus Hagander
Date:
On Tue, Jan 15, 2013 at 9:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Several times over the past couple of days, I have gotten complaints
> like the one attached about duplicate message IDs.  I can see from my
> sendmail daemon's log that I sent only one copy of this, and the
> complaint is dated about twenty minutes after the fact, so I'm pretty
> sure the duplicate is not of my making.
>
> Usually when I get this type of thing I can tell from the Received:
> lines that it's the fault of some list subscriber's MUA re-submitting
> a message to the lists.  But the Received: lines in this example and
> the last couple don't show that the message has gone anywhere outside
> postgresql.org.  So I'm thinking it's a recently-introduced glitch in
> the mail list server arrangements.

Hi!

Stefan and Alvaro were looking into this earlier, and it looks like a
weird networking issue that we've had for a couple of days into one of
our hosting boxes. Exactly what it's coming from is unknown at this
point, but it seems to sometimes cause a connection drop - which, less
often still, happens at a point where the sending box never gets the
acknowledgement that the email was sent - and thus it retries.

Hopefully they can add some more details to the discussion, but I
wanted to throw out a quick "yeah, confirmed, we've seen something is
afoot" message.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: Bogus duplicate-message complaints from PG mail lists

From
Stefan Kaltenbrunner
Date:
On 01/15/2013 09:45 PM, Tom Lane wrote:
> Several times over the past couple of days, I have gotten complaints
> like the one attached about duplicate message IDs.  I can see from my
> sendmail daemon's log that I sent only one copy of this, and the
> complaint is dated about twenty minutes after the fact, so I'm pretty
> sure the duplicate is not of my making.
> 
> Usually when I get this type of thing I can tell from the Received:
> lines that it's the fault of some list subscriber's MUA re-submitting
> a message to the lists.  But the Received: lines in this example and
> the last couple don't show that the message has gone anywhere outside
> postgresql.org.  So I'm thinking it's a recently-introduced glitch in
> the mail list server arrangements.

Those are caused by a network issue between one of the .eu server
locations and the .eu location the listserver itself is in.
The problem is that TCP sessions get cut off mid-transmission, and
the two ends of the connection end up seeing different states of the
session.
In the particular case of the "mail duplication", the sender
(makus.postgresql.org - inbound MX) gets a TCP session timeout
after the end of DATA in the SMTP session, while the receiving
side (malur.postgresql.org - the listserver) actually got the full mail
AND acknowledged it, but the sender never got the reply back.
The sender (having no ack from the receiver) retries, generating a dup...
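
The failure mode described above can be reproduced with a toy model. The following is a minimal sketch, not the actual Exim or listserver logic: `ListServer`, `FlakyLink`, and `relay` are hypothetical names, the "link" simply discards the reply for chosen attempts after the receiver has already processed the mail, and the receiver flags any Message-ID it has seen before.

```python
class ListServer:
    """Toy receiver: accepts a mail once, flags repeats of its Message-ID."""

    def __init__(self):
        self.seen = set()
        self.accepted = []
        self.denied = []

    def deliver(self, message_id):
        if message_id in self.seen:
            self.denied.append(message_id)    # would trigger a "Denied post" notice
        else:
            self.seen.add(message_id)
            self.accepted.append(message_id)
        return "250 OK"                       # reply to end of DATA


class FlakyLink:
    """Toy network: the mail always arrives, but some replies are lost."""

    def __init__(self, server, drop_reply_on):
        self.server = server
        self.drop_reply_on = set(drop_reply_on)  # 0-based attempt numbers
        self.attempts = 0

    def send(self, message_id):
        attempt = self.attempts
        self.attempts += 1
        reply = self.server.deliver(message_id)  # receiver has the full mail
        if attempt in self.drop_reply_on:
            raise TimeoutError("no reply after end of DATA")
        return reply


def relay(link, message_id, max_attempts=3):
    """Toy sender: retry until a reply is seen; return the attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            link.send(message_id)
            return attempt
        except TimeoutError:
            continue  # the sender cannot tell a lost mail from a lost reply
    raise RuntimeError("giving up after repeated timeouts")


server = ListServer()
link = FlakyLink(server, drop_reply_on={0})   # lose only the first reply
attempts_used = relay(link, "<1723.1358280662@sss.pgh.pa.us>")
print(attempts_used, server.accepted, server.denied)
```

The sender behaves correctly from its own point of view: with no reply, retrying is the only safe choice, so the duplicate has to be caught on the receiving side, which is what the Message-ID check does.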

We are working on fully diagnosing what is causing this (other parts of
the infrastructure are impacted as well, just not as visibly), but the
intermittent nature and the complexity (and number) of networks involved
are not helping.


Stefan