Re: mailing list archiver chewing patches - Mailing list pgsql-hackers

From Dimitri Fontaine
Subject Re: mailing list archiver chewing patches
Date
Msg-id 87bpgzbsln.fsf@hi-media-techno.com
Whole thread Raw
In response to Re: mailing list archiver chewing patches  (Dave Page <dpage@pgadmin.org>)
Responses Re: mailing list archiver chewing patches  (Magnus Hagander <magnus@hagander.net>)
List pgsql-hackers
Dave Page <dpage@pgadmin.org> writes:
> 2010/1/12 Matteo Beccati <php@beccati.com>:
>> So, I've decided to spend a bit more time on this and here is a proof of concept web app that displays mailing list
archivesreading from the AOX database:
 
>>
>> http://archives.beccati.org/
>
> Seems to work.

Hehe, nice a beginning!

> So just to put this into perspective and give anyone paying attention
> an idea of the pain that lies ahead should they decide to work on
> this:
>
> - We need to import the old archives (of which there are hundreds of
> thousands of messages, the first few years of which have, umm, minimal
> headers.

Anyone having a local copy of this in his mailboxes? At some point there
were some NNTP gateway, so maybe there's a copy this way.

> - We need to generate thread indexes

We have CTEs :)

> - We need to re-generate the original URLs for backwards compatibility

I guess the message-id one ain't the tricky one... and it should be
possible to fill a relation table like monharc_compat(message_id, list, year, month, message_number);

Then we'd need some help from the webserver (rewrite rules I guess) so
that the current URL is transformed to call a catch-all script:
http://archives.postgresql.org/pgsql-xxx/YYYY-MM/msg01234.php
-> http://archives.postgresql.org/compat.php?l=xxx&y=YYYY&m=MM&n=01234

In that compat.php script you then issue the following query or the like
to get the message_id, then use the newer infrastructure to get to
display it:
 SELECT message_id    FROM monharc_compat   WHERE list = ? and year = ? and month = ? and message_number = ?;

Regards,
-- 
dim


pgsql-hackers by date:

Previous
From: Takahiro Itagaki
Date:
Subject: Pretty printed trigger in psql
Next
From: Dimitri Fontaine
Date:
Subject: Re: damage control mode