Thread: PostgreSQL to host e-mail?
I'm building an e-mail service that has two requirements: it should index messages on the fly to give lightning-fast search results, and it should be able to handle large amounts of data. The server is going to be dedicated only to e-mail, with 250GB of storage in Raid-5. I'd like to know how PostgreSQL could handle such a large amount of data. How much RAM would I need? I expect my users to have a 10GB quota per e-mail account.
Thanks for your advice,
-- Charles A. Landemaine.
On Thu, 2007-01-04 at 15:00 -0300, Charles A. Landemaine wrote:
> I'm building an e-mail service that has two requirements: It should
> index messages on the fly to have lightning search results, and it
> should be able to handle large amounts of space. The server is going
> to be dedicated only for e-mail with 250GB of storage in Raid-5.

Well, Raid 5 is likely a mistake. Consider RAID 10.

> I'd
> like to know how PostgreSQL could handle such a large amount of data.

250GB is not really that much data for PostgreSQL. I have customers with much larger data sets.

> How much RAM would I need?

Lots... which is about all I can tell you without more information. How many customers? Are you using table partitioning? How will you be searching? Full text or regex?

Joshua D. Drake

> I expect my users to have a 10GB quota per
> e-mail account.
> Thanks for your advice,
> --

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
On Thu, 4 Jan 2007 15:00:05 -0300 "Charles A. Landemaine" <landemaine@gmail.com> wrote:
> I'm building an e-mail service that has two requirements: It should
> index messages on the fly to have lightning search results, and it
> should be able to handle large amounts of space. The server is going
> to be dedicated only for e-mail with 250GB of storage in Raid-5. I'd
> like to know how PostgreSQL could handle such a large amount of data.
> How much RAM would I need? I expect my users to have a 10GB quota per
> e-mail account.

Well, this is a bit like asking "what's the top speed for a van that can carry 8 people?" It really isn't enough information to be able to give you a good answer. It depends on everything from the data model you use to represent your e-mail messages, to configuration, to hardware speeds, etc.

In general, you want as much RAM as you can afford for the project; the more the better. I'd say 2-4GB is the minimum. And RAID-5 isn't very good for database work in general; you'll get better performance from RAID 1+0.

---------------------------------
Frank Wiles <frank@wiles.org>
http://www.wiles.org
---------------------------------
Frank Wiles wrote:
> On Thu, 4 Jan 2007 15:00:05 -0300
> "Charles A. Landemaine" <landemaine@gmail.com> wrote:
>
>> I'm building an e-mail service that has two requirements: It should
>> index messages on the fly to have lightning search results, and it
>> should be able to handle large amounts of space. The server is going
>> to be dedicated only for e-mail with 250GB of storage in Raid-5. I'd
>> like to know how PostgreSQL could handle such a large amount of data.
>> How much RAM would I need? I expect my users to have a 10GB quota per
>> e-mail account.
>

I wouldn't do it this way; I would use Cyrus. It stores the messages in plain text, each in its own file (like maildir), but it also indexes the headers in bdb format and can also index the search database in bdb format. The result is a very simple mail store that can perform searches very fast. The only problem is that the search index is only updated periodically.

If you need more information, then look at the cyrus-users list, as this is WAY off topic.

schu
Charles A. Landemaine wrote:
| I'm building an e-mail service that has two requirements: It should
| index messages on the fly to have lightning search results, and it
| should be able to handle large amounts of space. The server is going
| to be dedicated only for e-mail with 250GB of storage in Raid-5. I'd
| like to know how PostgreSQL could handle such a large amount of data.
| How much RAM would I need? I expect my users to have a 10GB quota per
| e-mail account.
| Thanks for your advice,
|

Hello, Charles.

I'll second people's suggestions to stay away from RAID5; the kind of workload a mail store will have is approximately an even mix of writes (in database terms, INSERTs, UPDATEs and DELETEs) and reads, and we all know RAID5 is a loser when it comes to writing a lot, at least when you're building arrays with fewer than 10-15 drives. I'd suggest you go for RAID10 for the database cluster and an extra drive for WAL.

Another point of interest I'd like to mention is one particular aspect of the workflow of an e-mail user: we will typically touch the main inbox a lot and leave most of the other folders pretty much intact most of the time. This suggests a per-inbox quota might be useful, maybe in addition to the overall quota, because then you can calculate your database working set more easily, based on usage statistics for a typical account.

Namely, if the maximum size of an inbox is x MB, with y% average utilization, and you plan for z users, of which w% will typically be active in one day, your database working set will be somewhere in the general area of (x * y%) * (z * w%) MB. Add to that the size of the indexes you create, and you have a very approximate idea of the amount of RAM you need to place in your machines to keep your performance from becoming I/O-bound.

The main reason I'm writing this mail, though, is to suggest you take a look at Oryx, http://www.oryx.com/; they used to have this product called Mailstore, which was designed to be a mail store using PostgreSQL as a backend, and has since evolved into a bit more than just that, it seems. Perhaps it could be of help to you while building your system, and I'm sure the people at Oryx will be glad to hear from you while you build your system, and after.

Kind regards,
--
~ Grega Bremec
~ gregab at p0f dot net
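The working-set estimate in Grega Bremec's message, (x * y%) * (z * w%) MB plus index size, can be turned into a small calculator. What follows is only a rough sketch: the function name, the parameter names, and every figure in the example run are hypothetical and not taken from the thread, and the percentages are passed as fractions between 0 and 1.

```python
def estimate_working_set_mb(max_inbox_mb, avg_utilization, planned_users,
                            active_fraction, index_mb=0.0):
    """Rough working-set estimate in MB: (x * y) * (z * w) + index size.

    max_inbox_mb    -- x, maximum inbox size in MB
    avg_utilization -- y, average share of that quota in use (0..1)
    planned_users   -- z, number of accounts planned for
    active_fraction -- w, share of accounts active on a typical day (0..1)
    index_mb        -- size of the indexes, added on top (assumed known)
    """
    for name, value in (("avg_utilization", avg_utilization),
                        ("active_fraction", active_fraction)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a fraction between 0 and 1")

    hot_data_per_user_mb = max_inbox_mb * avg_utilization   # x * y%
    active_users = planned_users * active_fraction          # z * w%
    return hot_data_per_user_mb * active_users + index_mb


if __name__ == "__main__":
    # Hypothetical figures, purely for illustration:
    # 100 MB inbox quota, 30% used on average, 1000 users, 20% active daily,
    # plus an assumed 500 MB of indexes.
    estimate = estimate_working_set_mb(100, 0.30, 1000, 0.20, index_mb=500)
    print(f"Estimated working set: about {estimate / 1024:.1f} GB")
```

The result is only as good as the usage statistics fed into it, so it is best read as an order-of-magnitude guide for RAM sizing rather than a precise figure.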
On Fri, 2007-01-05 at 04:10 +0100, Grega Bremec wrote:
> The main reason I'm writing this mail though, is to suggest you take a look
> at Oryx, http://www.oryx.com/; They used to have this product called
> Mailstore, which was designed to be a mail store using PostgreSQL as a
> backend, and has since evolved to a bit more than just that, it seems.
> Perhaps it could be of help to you while building your system, and I'm sure
> the people at Oryx will be glad to hear from you while, and after you've
> built your system.
>
> Kind regards,
> --
> ~ Grega Bremec

re above... http://www.archiveopteryx.org/1.10.html
On Fri, Jan 05, 2007 at 01:15:44PM -0500, Reid Thompson wrote:
> On Fri, 2007-01-05 at 04:10 +0100, Grega Bremec wrote:
> > The main reason I'm writing this mail though, is to suggest you take a look
> > at Oryx, http://www.oryx.com/; They used to have this product called
> > Mailstore, which was designed to be a mail store using PostgreSQL as a
> > backend, and has since evolved to a bit more than just that, it seems.
> > Perhaps it could be of help to you while building your system, and I'm sure
> > the people at Oryx will be glad to hear from you while, and after you've
> > built your system.
> >
> > Kind regards,
> > --
> > ~ Grega Bremec
> re above...
> http://www.archiveopteryx.org/1.10.html

You should also look at http://dbmail.org/, which runs on several databases (PostgreSQL included).

--
Jim Nasby jim@nasby.net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)