Thread: Configuring a Large RAM PostgreSQL Server

Configuring a Large RAM PostgreSQL Server

From: Alex Hochberger
Date:
Does anyone have any white papers or basic guides for a large RAM
server?

We are consolidating two databases, which currently run on a 4 GB and
a 2 GB machine, to enable better data mining.  The data issues on the
4 GB machine are numerous: things like "create index" fail and update
queries fail with out-of-memory errors.  Refactoring the data is
helping, but isn't finishing the job.

The new machine will have 48 GB of RAM, so figuring out starting
points for shared_buffers and work_mem/maintenance_work_mem is
going to be a crap shoot, since the defaults still seem to be based
upon 256 MB of RAM or less.

Usage:
    Most of the time, the database is being hit with a handful of
poorly written and unoptimized queries from a Ruby on Rails app that
is being refactored as a simple Ruby-DBI app, since we need to support
our legacy code but don't need the Rails environment, just a lot of
SQL writes.  Some stored procedures should streamline this.  However,
each transaction will do about 5 or 6 writes.
    Common Usage: we have a reporting tool that is also being
refactored, but it does a lot of aggregate queries.  None of these
take more than 500 ms after indexing on the 2 GB database, so we
assume that more RAM should help and eliminate the problems.
    Problem Usage: we have a 20 GB table with 120M rows that we are
splitting into some sub-tables.  Generally, we do large data pulls
from here, 1 million - 4 million records at a time, stored in a new
table for export.  These queries are problematic because we cannot
index the table for the queries that we run without hitting
out-of-memory errors.  Most of my cleanup has resorted to FOR-IN
loops via PL/pgSQL to manage the data one row at a time.  This is
problematic because many of these scripts take 4-5 days to run (a
set-based sketch of what we are aiming for follows below).
    Other usage: we will import between 10k and 10M rows at a time
from CSVs into the big database table.  I got my gig here because
this was all failing and the data was becoming worthless.  These
imports involve a lot of writes.

    Our simultaneous queries are small and currently run
acceptably.  It's the big imports, data-mining pulls, and system
manipulation, where we routinely wait days on a query, that we are
looking to speed up.
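
    For reference, this is roughly what we are aiming for on the
problem pulls and the CSV imports once indexing works; the table and
column names below are placeholders, not our real schema:

        -- build the export set in one pass instead of a FOR-IN loop
        CREATE TABLE export_batch AS
        SELECT id, account_id, event_date, amount
        FROM   big_table
        WHERE  event_date >= '2007-01-01'
          AND  event_date <  '2007-07-01';

        -- bulk-load a CSV server-side, if the imports aren't already
        -- going through COPY
        COPY big_table (id, account_id, event_date, amount)
        FROM '/var/tmp/import.csv' WITH CSV;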

Thanks,
Alex

Re: Configuring a Large RAM PostgreSQL Server

From: Richard Huxton
Date:
Alex Hochberger wrote:
> Does anyone have any white papers or basic guides for a large RAM server?
>
> We are consolidating two databases, which currently run on a 4 GB and
> a 2 GB machine, to enable better data mining.  The data issues on the
> 4 GB machine are numerous: things like "create index" fail and update
> queries fail with out-of-memory errors.

>    Problem Usage: we have a 20 GB table with 120M rows that we are
> splitting into some sub-tables.  Generally, we do large data pulls from
> here, 1 million - 4 million records at a time, stored in a new table for
> export.  These queries are problematic because we cannot index the
> table for the queries that we run without hitting out-of-memory errors.

Would it not make sense to find out why you are getting these errors first?

It's not normal to get "out of memory" when rebuilding an index.

--
   Richard Huxton
   Archonet Ltd

Re: Configuring a Large RAM PostgreSQL Server

From: Alex Hochberger
Date:
It's not on rebuilding the index, it's on CREATE INDEX.

I attribute it to wrong settings, Ubuntu bizarreness, and general
problems.

We need new hardware: the servers are running on aging
infrastructure, and we decided to get, all at once, a new system that
will last us the next 3-4 years.

But many large queries are getting out-of-memory errors.

Alex

On Nov 29, 2007, at 2:15 PM, Richard Huxton wrote:

> Alex Hochberger wrote:
>> Does anyone have any white papers or basic guides for a large RAM
>> server?
>> We are consolidating two databases, which currently run on a 4 GB
>> and a 2 GB machine, to enable better data mining.  The data issues
>> on the 4 GB machine are numerous: things like "create index" fail
>> and update queries fail with out-of-memory errors.
>
>>    Problem Usage: we have a 20 GB table with 120M rows that we are
>> splitting into some sub-tables.  Generally, we do large data pulls
>> from here, 1 million - 4 million records at a time, stored in a
>> new table for export.  These queries are problematic because we
>> cannot index the table for the queries that we run without hitting
>> out-of-memory errors.
>
> Would it not make sense to find out why you are getting these
> errors first?
>
> It's not normal to get "out of memory" when rebuilding an index.
>
> --
>   Richard Huxton
>   Archonet Ltd


Re: Configuring a Large RAM PostgreSQL Server

From: Alvaro Herrera
Date:
> On Nov 29, 2007, at 2:15 PM, Richard Huxton wrote:

>> Alex Hochberger wrote:
>>>    Problem Usage: we have a 20 GB table with 120M rows that we are
>>> splitting into some sub-tables.  Generally, we do large data pulls from
>>> here, 1 million - 4 million records at a time, stored in a new table for
>>> export.  These queries are problematic because we cannot index the table
>>> for the queries that we run without hitting out-of-memory errors.
>>
>> Would it not make sense to find out why you are getting these errors
>> first?

Alex Hochberger wrote:
> It's not on rebuilding the index, it's on CREATE INDEX.
>
> I attribute it to wrong settings, Ubuntu bizarreness, and general problems.

Please do not top-post.  I reformatted your message for clarity.

Richard is still correct: it is not normal to get out-of-memory errors
during index building, regardless of the age of the servers or the
Linux distro.
Perhaps you just have a maintenance_work_mem setting that's too large
for your server.
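
For what it's worth, maintenance_work_mem can also be overridden per
session, so you can test a smaller value for a single index build
without touching postgresql.conf (table and index names below are just
placeholders):

    -- check the current setting
    SHOW maintenance_work_mem;

    -- use a more modest value for this session only, then build the index
    SET maintenance_work_mem = '512MB';
    CREATE INDEX idx_big_table_event_date ON big_table (event_date);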

--
Alvaro Herrera                        http://www.advogato.org/person/alvherre
"Uno puede defenderse de los ataques; contra los elogios se esta indefenso"

Re: Configuring a Large RAM PostgreSQL Server

From: Josh Berkus
Date:
Alex,

> The new machine will have 48 GB of RAM, so figuring out starting
> points for shared_buffers and work_mem/maintenance_work_mem is
> going to be a crap shoot, since the defaults still seem to be based
> upon 256 MB of RAM or less.

Why a crap shoot?

Set shared_buffers to 12GB.  Set work_mem to 20GB / # of concurrent active
connections (check your logs).  Set maintenance_work_mem to 2GB (I don't
think we can actually use more).  Then tune from there.
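
As a rough sketch of what that might look like in postgresql.conf
(assuming, say, 20 concurrent active connections, so about 1GB of
work_mem each; check your logs and adjust):

    shared_buffers = 12GB          # roughly 1/4 of the 48GB of RAM
    work_mem = 1GB                 # ~20GB divided by ~20 active connections
    maintenance_work_mem = 2GB     # used by CREATE INDEX, VACUUM, etc.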

Also, use 8.2 or later, and you'd better compile 64-bit.

--
Josh Berkus
PostgreSQL @ Sun
San Francisco