Re: Performance nightmare with dspam (urgent) (resolved) - Mailing list pgsql-performance

From	John A Meinel
Subject	Re: Performance nightmare with dspam (urgent) (resolved)
Date	June 6, 2005 12:08:35
Msg-id	42A466E7.4080808@arbash-meinel.com
In response to	Re: Performance nightmare with dspam (urgent) (resolved) (Casey Allen Shobe <lists@seattleserver.com>)
Responses	Re: Performance nightmare with dspam (urgent) (resolved) Re: Performance nightmare with dspam (urgent) (resolved)
List	pgsql-performance

Casey Allen Shobe wrote:
> On Wednesday 01 June 2005 20:19, Casey Allen Shobe wrote:
>
...
> Long-term, whenever we hit the I/O limit again, it looks like we really don't
> have much of a solution except to throw more hardware (mainly lots of disks
> in RAID0's) at the problem. :(  Fortunately, with the above two changes I/O
> usage on the PG data disk is a quarter of what it was, so theoretically we
> should be able to quadruple the number of users on current hardware.
>

Be very careful in this situation. If any disk in a RAID0 fails, the
entire array is lost. You *really* want a RAID10. It takes more drives,
but then if a drive dies you don't lose everything.
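
To put rough numbers on that risk, here is a minimal sketch. It assumes
every disk fails independently with the same probability p over some
period, and it ignores rebuild windows and controller failures. The
function names and the value of p are made up for illustration, not
measured figures.

```python
def raid0_loss_probability(disks: int, p: float) -> float:
    """Chance a RAID0 is lost: a single failed disk takes the array down."""
    return 1.0 - (1.0 - p) ** disks


def raid10_loss_probability(disks: int, p: float) -> float:
    """Chance a RAID10 is lost: both disks of at least one mirrored pair fail."""
    if disks < 4 or disks % 2:
        raise ValueError("RAID10 needs an even number of disks, at least 4")
    pairs = disks // 2
    # A pair is lost only when both of its disks fail (probability p * p).
    return 1.0 - (1.0 - p * p) ** pairs


if __name__ == "__main__":
    p = 0.05  # assumed per-disk failure probability, purely illustrative
    print(f"3-disk RAID0 : {raid0_loss_probability(3, p):.4f}")
    print(f"4-disk RAID10: {raid10_loss_probability(4, p):.4f}")
```

Under those assumptions the striped array gets less safe with every
disk added, while the mirrored one only dies when both halves of a
pair fail.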

If you are running RAID0 and you *really* want performance, and aren't
concerned about safety (at all), you could also set fsync=false. That
should also speed things up. But you are really risking corruption/data
loss on your system.

> Our plan forward is to increase the number of disks in the two redundant mail
> servers, so that each has a single ultra320 disk for O/S and pg_xlog, and a
> 3-disk RAID0 for the data.  This should triple our current capacity.

I don't know if you can do it, but it would be nice to see this be 1
RAID1 for OS, 1 RAID10 for pg_xlog, and another RAID10 for data. That is
the recommended performance layout. It takes quite a few drives (minimum
of 10). But it means your data is safe, and your performance should be
very good.
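
The drive count works out as below. This is just the arithmetic,
assuming the smallest possible array for each level (2 drives for a
RAID1, 4 for a RAID10); the dictionary names are only for illustration.

```python
# Minimum drives per RAID level (RAID10 = two mirrored pairs striped together).
MIN_DRIVES = {"RAID1": 2, "RAID10": 4}

# Layout described above: volume -> RAID level.
layout = {"OS": "RAID1", "pg_xlog": "RAID10", "data": "RAID10"}


def total_drives(layout: dict) -> int:
    """Sum the minimum drive count over every volume in the layout."""
    return sum(MIN_DRIVES[level] for level in layout.values())


print(total_drives(layout))  # 2 + 4 + 4 = 10
```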

>
> The general opinion of the way dspam uses the database among people I've
> talked to on #postgresql is not very good, but of course the dspam folk blame
> PostgreSQL and say to use MySQL if you want reasonable performance.  Makes it
> real fun to be a DSpam+PostgreSQL user when limits are reached, since
> everyone denies responsibility.  Fortunately, PostgreSQL people are pretty
> helpful even if they think the client software sucks. :)
>

I can't say how dspam uses the database. But they certainly could be
making assumptions about how the db performs certain actions, which are
not quite true with postgres. (For instance, MySQL can use an index alone
to return information. Postgres cannot, because it supports transactions:
even though a row is in the index, it may not be visible to the current
transaction.)

They also might be doing stuff like "select max(row)" instead of "select
row ORDER BY row DESC LIMIT 1". In postgres the former will be a
sequential scan, the latter will be an index scan. Though I wonder about
"select max(row) ORDER BY row DESC LIMIT 1". To me, that should still
return the right answer, but I'm not sure.
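
A quick way to check that the first two forms give the same answer is
sketched below. It uses Python's built-in sqlite3 as a stand-in, so it
only compares results and says nothing about which plan postgres picks.
The table t and column val are hypothetical. One thing to watch when
porting the rewrite to postgres: ORDER BY ... DESC sorts NULLs first
there by default, so the rewritten query filters them out to match
max(), which ignores NULLs.

```python
import sqlite3


def max_two_ways(values):
    """Return (aggregate result, ORDER BY/LIMIT result) for the same data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (val INTEGER)")
    conn.execute("CREATE INDEX t_val_idx ON t (val)")
    conn.executemany("INSERT INTO t (val) VALUES (?)", [(v,) for v in values])

    # Aggregate form: always returns exactly one row, NULL for an empty table.
    agg = conn.execute("SELECT max(val) FROM t").fetchone()[0]

    # Rewritten form: NULLs filtered out so the result matches max().
    row = conn.execute(
        "SELECT val FROM t WHERE val IS NOT NULL ORDER BY val DESC LIMIT 1"
    ).fetchone()
    conn.close()

    # The rewritten form returns no row at all when nothing qualifies.
    top = row[0] if row is not None else None
    return agg, top


if __name__ == "__main__":
    print(max_two_ways([3, None, 17, 5]))
```

The one difference a caller has to handle is the empty case: the
aggregate yields a single NULL row, the rewritten query yields no rows.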

> Cheers,

Good luck,
John
=:->
