On Tue, 12 Oct 2004 11:01:08 -0400, Jerry LeVan <jerry.levan@eku.edu> wrote:
> Hi,
> I am futzing around with Andrew Stuart's "Catchmail" program
> that stores emails into a postgresql database.
>
> I want to avoid inserting the same email more than once...
> (pieces of the email actually get emplaced into several
> tables).
>
> Is the "Message-ID" header field a globally unique identifier?
>
> I eventually want to have a cron job process my inbox and don't
> want successive cron tasks to keep re-entering the same email :)
In terms of Internet mail? The answer is... almost.
The idea is that each mail has a unique Message-ID, but there are
cases where several "different" mails get the same Message-ID. This can
happen with mailing lists, like the one you are reading right now.
Suppose you are cross-posting a message, with these headers:
To: pgsql-general@postgresql.org
Cc: linux-kernel@kernel.org
The message will arrive here, and a copy will be sent to each mailing list.
These twin messages will then be processed by the mailing list software:
pgsql-general will prepend [something] to the subject and append its
"TIPS" signature, while linux-kernel will append its own custom
"signature" at the end of the message.
Now suppose you are subscribed to both lists. You will receive both
messages (which look slightly different) but with the same Message-ID.
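
To make that concrete, here is a small Python sketch. The two raw
messages, the addresses in them, and the helper name message_id() are all
made up for illustration; only the standard library "email" module is
real. It just shows that the header comparison says "same mail" while the
subjects differ.

```python
from email import message_from_string

# Hypothetical copy as delivered via pgsql-general (subject tag, TIPS footer).
RAW_PGSQL = """\
From: someone@example.org
To: pgsql-general@postgresql.org
Cc: linux-kernel@kernel.org
Subject: [GENERAL] Is Message-ID unique?
Message-ID: <12345.abc@example.org>

Body text.
-- TIPS footer added by the list
"""

# Hypothetical copy of the same mail as delivered via linux-kernel.
RAW_LKML = """\
From: someone@example.org
To: pgsql-general@postgresql.org
Cc: linux-kernel@kernel.org
Subject: Is Message-ID unique?
Message-ID: <12345.abc@example.org>

Body text.
-- custom list signature
"""


def message_id(raw):
    """Return the stripped Message-ID header of a raw mail, or None."""
    value = message_from_string(raw)["Message-ID"]
    return value.strip() if value is not None else None


same_id = message_id(RAW_PGSQL) == message_id(RAW_LKML)
same_subject = (
    message_from_string(RAW_PGSQL)["Subject"]
    == message_from_string(RAW_LKML)["Subject"]
)
```

Here same_id comes out true and same_subject false, which is exactly the
"twin messages" situation.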
Oh, and if you store your "Sent-mail" in the same or a similar
fold^H^H^H^Htable, be warned that when this message comes back from
pgsql-general or almost any other mailing list, it will have the same
Message-ID.
So... I think you might want to discard messages with duplicate
Message-IDs (losing either the lkml or the pgsql-general copy, whichever
arrives later), but you should do it silently. The mail should not be
rejected, or you risk getting bounced off the mailing list.
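
A minimal sketch of the "discard silently" idea, in Python. Big caveat:
this is not how Catchmail does it, and it uses an in-memory SQLite
database as a stand-in so the example runs on its own. The table name
"mail", its columns, and the functions open_store() and store_once() are
invented for the example. In PostgreSQL a UNIQUE constraint on the
Message-ID column would play the same role (recent versions, 9.5 and
later, also have INSERT ... ON CONFLICT DO NOTHING).

```python
import sqlite3
from email import message_from_string


def open_store():
    """Create a throwaway in-memory store (SQLite stand-in, not PostgreSQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mail (message_id TEXT PRIMARY KEY, raw TEXT NOT NULL)"
    )
    return conn


def store_once(conn, raw):
    """Store a raw mail unless its Message-ID is already present.

    Returns True if the mail was stored, False if it was a duplicate.
    A duplicate is skipped quietly: no exception, no rejection.
    """
    mid = message_from_string(raw)["Message-ID"]
    if mid is None:
        # Mails without a Message-ID are out of scope for this sketch.
        raise ValueError("mail has no Message-ID header")
    cur = conn.execute(
        "INSERT OR IGNORE INTO mail (message_id, raw) VALUES (?, ?)",
        (mid.strip(), raw),
    )
    # rowcount is 1 when a row was inserted, 0 when the insert was ignored.
    return cur.rowcount == 1
```

A cron job could then call store_once() for every mail in the inbox on
every run; the second and later sightings of a Message-ID simply return
False and nothing is bounced.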
HTH, HAND,
Dawid