Let me give you a better dataset:
msg_sender_num | env_sender_num
----------------+---------------- 118 | 53003 118 | 51778 118 |
49679 118 | 49457 118 | 37434 118 | 37389 118 |
33644 118 | 33609 118 | 26043 118 | 26004 118 |
24288 118 | 23357 118 | 16246 118 | 16103 118
| 12967 118 | 12140 118 | 4191 118 | 122
118| 860
with the SQL:
SELECT record_of_claims.msg_sender_num, count(DISTINCT
record_of_claims.env_sender_num) AS env_from_claims_count FROM
record_of_claims WHERE (record_of_claims.msg_sender_num =118) GROUP BY
record_of_claims.msg_sender_num;
You get:
msg_sender_num | env_from_claims_count
----------------+----------------------- 118 | 19
Which is correct for the following reason:
msg_sender_num | envelope_from
----------------+--------------------------------------------- 118 |
ABVQ3QQBAQAFfLcB9QAAAAACAAAAAA@b.tpcper.com 118 | AjEywAQBAQAFgHcB9QAAOw4CAAAAAA@b.tpcper.com 118 |
AjEywAQBAQAFJkQB9QAAOw4CAAAAAA@b.tpcper.com 118 | AjEywAQBAQAFKhMB9QAAOw4CAAAAAA@b.tpcper.com 118 |
ABVQ3QQBAQAFKhMB9QAAAAACAAAAAA@b.tpcper.com 118 | ABVQ3QQBAQAFKz0B9QAAAAACAAAAAA@b.tpcper.com 118 |
AjEywAQBAQAFKz0B9QAAOw4CAAAAAA@b.tpcper.com 118 | ABVQ3QQBAQAFKiMB9QAAAAACAAAAAA@b.tpcper.com 118 |
AjEywAQBAQAFKiMB9QAAOw4CAAAAAA@b.tpcper.com 118 | AjEywAQBAQAFKxoB9QAAOw4CAAAAAA@b.tpcper.com 118 |
ABVQ3QQBAQAFKxoB9QAAAAACAAAAAA@b.tpcper.com 118 | AjEywAQBAQAFK0QB9QAAOw4CAAAAAA@b.tpcper.com 118 |
ABVQ3QQBAQAFK0QB9QAAAAACAAAAAA@b.tpcper.com 118 | ABVQ3QQBAQAFLuEB9QAAAAACAAAAAA@b.tpcper.com 118 |
AjEywAQBAQAFLuEB9QAAOw4CAAAAAA@b.tpcper.com 118 | AjEywAQBAQAFf8wB9QAAOw4CAAAAAA@b.tpcper.com 118 |
ABVQ3QQBAQAFf8wB9QAAAAACAAAAAA@b.tpcper.com 118 | AjEywAQBAQAFgAoB9QAAOw4CAAAAAA@b.tpcper.com 118 |
ABVQ3QQBAQAFgA4B9QAAAAACAAAAAA@b.tpcper.com
19 different envelope from names all claiming to be the same Message
from: 118 ("TopOffers TopOffers@Topica.com")
All of the above address would be added to blacklist for 120 days.
If I say every 15 minutes or so create a new table full of the results
of the SQL view it sort of solves my problem. Still takes forever to
process but the next view that needs these results can do an index scan
on the resulting table and not have to build the list all over again. In
this was I can cut my processing time to 6/10th of a second. But I have
to create/drop/rename tables on a time interval. Not the best solution.
Jerry Wintrode
Network Administrator
Tripos, Inc.