Re: Post-2018 messages in archives - Mailing list pgsql-www

From Noah Misch
Subject Re: Post-2018 messages in archives
Date
Msg-id 20181206061418.GC2945370@rfd.leadboat.com
In response to Re: Post-2018 messages in archives  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Post-2018 messages in archives
List pgsql-www

On Wed, Dec 05, 2018 at 11:31:39PM -0500, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > On Wed, Dec 05, 2018 at 09:39:18AM +0100, Magnus Hagander wrote:
> >>> Unfortunately we don't keep the ingest time separately. But for the future,
> >>> doing so would probably be a good idea, for other reasons as well.
> 
> > Works for me.  Pondering it more, the timestamp that matters most for archive
> > purposes is the timestamp at which list subscribers started to receive their
> > copies of the message.  Based on that, I'm thinking we should ignore the Date
> > header and always use the timestamp from a particular "Received ... by
> > HOSTNAME.postgresql.org" header.  Before settling on that, I'd want to check
> > how many messages change timestamp by more than ~100s, and I'd want to spot
> > check a few messages to see whether the change looks like an improvement.
> 
> Another point worth considering here is moderation queue delays, which
> are not infrequently measured in days :-(.  I am not quite sure whether
> it'd be better to tag a moderation-delayed message with the timestamp
> when it entered the queue or the time when it exited.  But either one
> would be better than believing the Date: header.

Good point.  I'd prefer to use the time when it exited the queue, which
conforms to "timestamp at which list subscribers started to receive their
copies of the message" mentioned above.  I usually download November's mbox in
the first few days of December.  If we use the timestamp of entering the queue
(or the Date header), there's no particular upper bound on when the November
mbox stops accruing new messages.
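
The idea above, taking the archive timestamp from a "Received ... by HOSTNAME.postgresql.org" header instead of the Date header, could be sketched roughly as follows. This is an illustration under stated assumptions, not the archive's actual ingest code: the function names `delivery_timestamp` and `archive_month` are hypothetical, the sketch assumes the topmost matching Received header (the most recent hop) is the one of interest, and it assumes that header is stamped after the message leaves the moderation queue. The thread does not settle which hop to use.

```python
from datetime import timezone
from email import message_from_string
from email.utils import parsedate_to_datetime


def delivery_timestamp(raw_message, host_suffix=".postgresql.org"):
    """Return the UTC time of the topmost Received header whose "by" host
    ends with host_suffix, or None if no such header can be parsed.

    Assumption: the topmost matching hop is the relevant one.
    """
    msg = message_from_string(raw_message)
    for received in msg.get_all("Received", []):
        # A Received header has the form "<trace tokens>; <date-time>".
        info, sep, stamp = received.rpartition(";")
        if not sep:
            continue
        tokens = info.split()
        by_host = None
        for i, tok in enumerate(tokens[:-1]):
            if tok.lower() == "by":
                by_host = tokens[i + 1]
                break
        if by_host is None or not by_host.lower().endswith(host_suffix):
            continue
        try:
            when = parsedate_to_datetime(stamp.strip())
        except (TypeError, ValueError):
            continue  # unparseable date; try the next header
        if when.tzinfo is None:
            # A "-0000" zone yields a naive datetime; treat it as UTC.
            when = when.replace(tzinfo=timezone.utc)
        return when.astimezone(timezone.utc)
    return None


def archive_month(when):
    """Return the (year, month) mbox bucket a timestamp would fall into."""
    return (when.year, when.month)
```

Under these assumptions, a message whose Date header says late November but which was relayed by a postgresql.org host in early December would be filed in the December mbox, so the November mbox stops changing once November ends.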

