mail list traffic - Mailing list pgsql-general

From Daniel Verite
Subject mail list traffic
Date
Msg-id 391ffd2c-0ef2-4edb-bf3f-44b0bec7ccd6@mm
In response to Re: Postgres mail list traffic over time  (Gregory Stark <stark@enterprisedb.com>)
Responses Re: mail list traffic
List pgsql-general
Gregory Stark wrote:

> I would be curious to see the average lifespan of threads over time.

I happen to have the mail archives stored in a database, so I've
expressed this in SQL. Below are some results for pgsql-hackers and
pgsql-general, 2007-2008. "count" is the number of distinct threads
whose oldest message is in the specified month. A thread is started
as soon as a message has an In-Reply-To field pointing to an
existing Message-Id.
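
As a rough illustration of that definition (this is not the SQL that
produced the results), here is a minimal Python sketch. It links
messages through In-Reply-To, counts only threads that received at
least one reply, and groups them by the month of their oldest message.
The sample messages and the helpers thread_spans and threads_per_month
are hypothetical, and taking a thread's span to be the gap between its
oldest and newest message is an assumption.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows: (message_id, in_reply_to, date)
messages = [
    ("a@x", None, datetime(2007, 1, 3, 9, 0)),
    ("b@x", "a@x", datetime(2007, 1, 3, 15, 0)),
    ("c@x", "b@x", datetime(2007, 1, 5, 9, 0)),
    ("d@x", None, datetime(2007, 1, 10, 12, 0)),   # never answered: not a thread
    ("e@x", None, datetime(2007, 2, 1, 8, 0)),
    ("f@x", "e@x", datetime(2007, 2, 1, 20, 0)),
    ("g@x", "zzz@x", datetime(2007, 2, 2, 8, 0)),  # parent not in the archive
]

def thread_spans(msgs):
    """Return {root message_id: (oldest date, span)} for replied-to threads."""
    by_id = {mid: (parent, date) for mid, parent, date in msgs}

    def root_of(mid):
        # Follow In-Reply-To links while they point to a known Message-Id.
        seen = {mid}
        while True:
            parent = by_id[mid][0]
            if parent is None or parent not in by_id or parent in seen:
                return mid
            seen.add(parent)
            mid = parent

    dates = defaultdict(list)
    for mid, _parent, date in msgs:
        dates[root_of(mid)].append(date)

    # A thread exists only once some message replies to an existing one.
    return {root: (min(ds), max(ds) - min(ds))
            for root, ds in dates.items() if len(ds) > 1}

def threads_per_month(msgs):
    """Count threads by the month (YYYY-MM) of their oldest message."""
    counts = defaultdict(int)
    for oldest, _span in thread_spans(msgs).values():
        counts[oldest.strftime("%Y-%m")] += 1
    return dict(counts)
```

With the sample data, only the threads rooted at a@x and e@x are
counted; d@x has no reply, and g@x points to a Message-Id that is not
in the archive.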

Results for pgsql-hackers:

  month  |     avg span     |   median span   | count
---------+------------------+-----------------+-------
 2007-01 | 7 days 10:00:00  | 1 day 04:18:00  |   211
 2007-02 | 7 days 10:00:00  | 1 day 00:23:48  |   186
 2007-03 | 16 days 30:00:00 | 1 day 05:45:37  |   171
 2007-04 | 13 days 26:00:00 | 19:07:00        |   142
 2007-05 | 19 days 30:00:00 | 1 day 04:46:36  |   122
 2007-06 | 15 days 19:00:00 | 23:38:13        |   111
 2007-07 | 19 days 25:00:00 | 21:04:04        |   106
 2007-08 | 13 days 30:00:00 | 20:26:39        |   133
 2007-09 | 21 days 32:00:00 | 1 day 16:43:10  |   121
 2007-10 | 13 days 19:00:00 | 17:23:24        |   148
 2007-11 | 16 days 15:00:00 | 16:23:00        |   140
 2007-12 | 17 days 16:00:00 | 1 day 07:28:05  |    81
 2008-01 | 13 days 12:00:00 | 23:02:33        |   127
 2008-02 | 9 days 11:00:00  | 12:44:28        |   130
 2008-03 | 10 days 14:00:00 | 22:57:18        |   140
 2008-04 | 10 days 14:00:00 | 1 day 00:32:34  |   132
 2008-05 | 13 days 09:00:00 | 1 day 20:57:57  |   113
 2008-06 | 7 days 27:00:00  | 1 day 05:42:46  |   102
 2008-07 | 13 days 26:00:00 | 2 days 07:43:34 |   133
 2008-08 | 9 days 33:00:00  | 1 day 07:47:09  |   121
 2008-09 | 7 days 25:00:00  | 1 day 19:00:50  |   125
 2008-10 | 6 days 14:00:00  | 1 day 10:31:01  |   178

Results for pgsql-general:

  month  |    avg span     | median span | count
---------+-----------------+-------------+-------
 2007-01 | 1 day 25:00:00  | 10:57:11    |   329
 2007-02 | 2 days 28:00:00 | 10:50:38    |   295
 2007-03 | 3 days 08:00:00 | 14:54:08    |   310
 2007-04 | 6 days 18:00:00 | 17:40:55    |   244
 2007-05 | 3 days 22:00:00 | 16:43:54    |   287
 2007-06 | 2 days 13:00:00 | 11:26:46    |   297
 2007-07 | 2 days 19:00:00 | 11:59:40    |   263
 2007-08 | 3 days 14:00:00 | 16:35:16    |   335
 2007-09 | 3 days 14:00:00 | 13:23:09    |   245
 2007-10 | 2 days 16:00:00 | 08:46:09    |   302
 2007-11 | 3 days 07:00:00 | 08:28:06    |   294
 2007-12 | 2 days 31:00:00 | 10:25:14    |   255
 2008-01 | 2 days 14:00:00 | 13:23:12    |   248
 2008-02 | 2 days 14:00:00 | 10:02:16    |   257
 2008-03 | 1 day 25:00:00  | 13:20:06    |   245
 2008-04 | 1 day 30:00:00  | 08:26:06    |   238
 2008-05 | 3 days 22:00:00 | 18:58:27    |   211
 2008-06 | 2 days 24:00:00 | 14:46:02    |   191
 2008-07 | 1 day 29:00:00  | 10:37:17    |   221
 2008-08 | 1 day 22:00:00  | 14:14:45    |   205
 2008-09 | 1 day 24:00:00  | 14:26:26    |   202
 2008-10 | 1 day 19:00:00  | 12:32:56    |   219

"median span" is the median computed with the PL/R median function
applied to the intervals expressed as a number of seconds, then cast
back to intervals for display. I believe the median is a good way to
mitigate the contribution of messages with wrong dates and of posters
who reply to very old messages. The median span also appears to
differ a lot from the average span.
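
To make the seconds round trip concrete, here is a small Python sketch
of the same idea. It uses the standard library rather than PL/R, the
spans are made-up values, and as_interval is a hypothetical helper.
The last span mimics a reply to a very old message, which drags the
average far away while the median stays put.

```python
from datetime import timedelta
from statistics import mean, median

# Hypothetical thread spans; the last one mimics a reply to a very old message.
spans = [
    timedelta(hours=2),
    timedelta(hours=10),
    timedelta(days=1),
    timedelta(days=2),
    timedelta(days=400),
]

def as_interval(seconds):
    """Cast a number of seconds back to an interval for display."""
    return timedelta(seconds=seconds)

# Work on plain numbers of seconds, then convert the results back.
seconds = [s.total_seconds() for s in spans]
avg_span = as_interval(mean(seconds))
median_span = as_interval(median(seconds))

print("avg span:   ", avg_span)
print("median span:", median_span)
```

Here the median is exactly one day, while the single 400-day outlier
pushes the average past 80 days.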

If people feel like playing with the database to build other queries,
feel free to bug me off-list about it. I can arrange to make a dump
available or share the scripts to build it yourself from the
mailbox archives.

 Best regards,
--
 Daniel
 PostgreSQL-powered mail user agent and storage:
http://www.manitou-mail.org


