Re: Scaling concerns - Mailing list pgsql-performance

From Greg Smith
Subject Re: Scaling concerns
Date
Msg-id Pine.GSO.4.64.0612171856150.17147@westnet.com
Whole thread Raw
In response to Re: Scaling concerns  (tsuraan <tsuraan@gmail.com>)
Responses Re: Scaling concerns
List pgsql-performance
On Sun, 17 Dec 2006, tsuraan wrote:

> Since my application is constantly adding to the database (far more is
> written than is ever read), it would be nice to have a multiple-write,
> single reader solution, if such a thing exists.

You seem to be working from the assumption that you have a scaling issue,
and that therefore you should be researching how to scale your app to more
machines.  I'm not so sure you do; I would suggest that you drop that
entire idea for now, spend some time doing basic performance tuning for
Postgres instead, and only after then consider adding more machines.  It
does little good to add more incorrectly setup servers to the mix, and
solving the multiple-write problem is hard.  Let's take a quick tour
through your earlier messages:

> My postgres settings are entirely default with the exception of
> shared_buffers being set to 40,000 and max_connections set to 400. I'm
> not sure what the meaning of most of the other settings are, so I
> haven't touched them.

http://www.powerpostgresql.com/Downloads/annotated_conf_80.html is a good
intro to the various parameters you might set, with some valuable hints on
the effective range you should be considering.  I'd suggest you use that
to identify the most likely things to increase, then read the manuals at
http://www.postgresql.org/docs/8.2/interactive/runtime-config.html for
more detail on what you're actually adjusting.  To get you started,
consider increasing effective_cache_size, checkpoint_segments, and
work_mem; those are three whose defaults are very low for your
application, relative to your hardware.  The thought of how your poor
database is suffering when trying to manage a heavy write load with the
default checkpoint_segments in particular makes me sad, especially when we
add:

> The machines running the database servers are my home desktop (a
> dual-core athlon 3200+ with 2GB RAM and a 120GB SATA II drive), and a
> production server with two dual-core Intel chips, 4 GB RAM, and a RAID 5
> array of SATA II drives on a 3Ware 9550 controller.

One big RAID 5 volume is probably the worst setup available for what
you're doing.  Luke already gave you a suggestion for testing write speed;
you should run that test, but I wouldn't expect happy numbers there.  You
might be able to get by with the main database running like that, but
think about what you'd need to do to add more disks (or reorganize the
ones you have) so that you could dedicate a pair to a RAID-1 volume for
holding the WAL.  If you're limited by write performance, I think you'd
find adding a separate WAL drive set a dramatically more productive
upgrade than trying to split the app to another machine.  Try it on your
home machine first; that's a cheap upgrade, to add another SATA drive to
there, and you should see a marked improvement (especially once you get
the server parameters set to more appropriate values).

I'd also suggest that you'd probably be able to get more help from people
here if you posted a snippet of output from vmstat and iostat -x with a
low interval (say 5 seconds) during a period where the machine was busy;
that's helpful for figuring out where the bottleneck on your machine
really is.

> Trying to do "INSERT INTO Messages(path, msgid) SELECT (path, msgid)
> FROM tmpMessages" took a really long time before psql died with an
> out-of-memory error.

Do you have the exact text of the error?  I suspect you're falling victim
to the default parameters being far too low here as well, but without the
error it's hard to know exactly which.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

pgsql-performance by date:

Previous
From: tsuraan
Date:
Subject: Re: Scaling concerns
Next
From: tsuraan
Date:
Subject: Re: Scaling concerns