Re: mailing list archiver chewing patches - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: mailing list archiver chewing patches
Date
Msg-id 9837222c1002010603u5b51a22dqfb67799dc64241e4@mail.gmail.com
In response to Re: mailing list archiver chewing patches  (Matteo Beccati <php@beccati.com>)
Responses Re: mailing list archiver chewing patches
List pgsql-hackers
2010/2/1 Matteo Beccati <php@beccati.com>:
> On 01/02/2010 10:26, Magnus Hagander wrote:
>>
>> Does the MBOX importer support incremental loading? Because majordomo
>> spits out MBOX files for us already.
>
> Unfortunately the aoximport shell command doesn't support incremental loading.
>
>> One option could be to use SMTP with a subscription as the primary way
>> (and we could set up a dedicated relaying from the mailserver for this
>> of course, so it's not subject to graylisting or anything like that),
>> and then daily or so load the MBOX files to cover anything that was
>> lost?
>
> I guess we could write a script that parses the mbox and adds whatever is missing, as long as we keep it as a last resort if we can't make the primary delivery fail-proof.
>
> My main concern is that we'd need to overcomplicate the thread detection algorithm so that it better deals with delayed messages: as it currently works, the replies to a missing message get linked to the "grand-parent". Injecting the missing message afterwards will put it at the same level as its replies. If it happens only once in a while I guess we can live with it, but definitely not if it happens tens of times a day.

That can potentially be a problem.

Consider the case where message A is sent. Message B is a response to
A, and message C is a response to B. If B is held for moderation
(because the poster is not on the list, or because it trips some
other filter), message C will definitely arrive before message B.
Is that going to cause problems with this method?

Another case where the same thing will happen is if delivery of
message B to the list is, for example, graylisted, or is just slow
from B's sender, but B gets delivered quickly to the author of
message A (because of a direct CC). In this case, the author of
message A may respond to it (making message D), and D will again
arrive before message B because the author of A is not graylisted.

So the system definitely needs to deal with out-of-order delivery.
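
To make the A/B/C/D scenario concrete, here is a minimal sketch of one way a threader could tolerate out-of-order arrival: when a reply shows up before its parent, create an empty placeholder for the parent and fill it in later. This is not how the archiver under discussion works (as far as I know); the ThreadIndex and Node names are made up for illustration, and it assumes each message carries a usable In-Reply-To value.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    msgid: str
    arrived: bool = False                 # False while this is only a placeholder
    parent: Optional["Node"] = None
    children: list["Node"] = field(default_factory=list)


class ThreadIndex:
    """Hypothetical in-memory thread index keyed by Message-ID."""

    def __init__(self) -> None:
        self.nodes: dict[str, Node] = {}

    def _get(self, msgid: str) -> Node:
        # Return the node for msgid, creating a placeholder if it is unknown.
        if msgid not in self.nodes:
            self.nodes[msgid] = Node(msgid)
        return self.nodes[msgid]

    def add(self, msgid: str, in_reply_to: Optional[str] = None) -> None:
        # Reuse a placeholder if replies to this message arrived first,
        # so those replies keep pointing at the right parent.
        node = self._get(msgid)
        node.arrived = True
        if in_reply_to and in_reply_to != msgid and node.parent is None:
            parent = self._get(in_reply_to)
            node.parent = parent
            parent.children.append(node)

    def path(self, msgid: str) -> list[str]:
        # Message-IDs from the thread root down to msgid.
        chain = []
        node: Optional[Node] = self.nodes[msgid]
        while node is not None:
            chain.append(node.msgid)
            node = node.parent
        return chain[::-1]


idx = ThreadIndex()
idx.add("A")
idx.add("C", in_reply_to="B")   # B is held for moderation, C arrives first
idx.add("D", in_reply_to="B")   # reply to the directly CC'd copy of B
idx.add("B", in_reply_to="A")   # B finally arrives and fills the placeholder
print(idx.path("C"))            # ['A', 'B', 'C']
print(idx.path("D"))            # ['A', 'B', 'D']
```

Until B arrives, C and D hang under an empty placeholder rather than under A. A real implementation would presumably also use the References header to attach that placeholder under A right away, and would have to persist all of this in the database instead of a dict.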

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

