Re: Help tuning a large table off disk and into RAM - Mailing list pgsql-general

From Bill Moran
Subject Re: Help tuning a large table off disk and into RAM
Date
Msg-id 20070926112429.74087e33.wmoran@potentialtech.com
In response to Help tuning a large table off disk and into RAM  ("James Williams" <james.wlms@googlemail.com>)
Responses Re: Help tuning a large table off disk and into RAM  ("Jimmy Choi" <JCHOI@altera.com>)
Re: Help tuning a large table off disk and into RAM  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
In response to "James Williams" <james.wlms@googlemail.com>:

> I'm stuck trying to tune a big-ish postgres db and wondering if anyone
> has any pointers.
>
> I cannot get Postgres to make good use of plenty of available RAM and
> stop thrashing the disks.
>
> One main table. ~30 million rows, 20 columns all integer, smallint or
> char(2).  Most have an index.  It's a table for holding webserver
> logs.  The main table is all foreign key ids.  Row size is ~100bytes.
>
> The typical query is an aggregate over a large number of rows (~25% say).
>
>  SELECT COUNT(*), COUNT(DISTINCT user_id)
>  FROM table
>  WHERE epoch > ...
>  AND epoch < ...
>  AND country = ...
>
> The box has 4 x Opterons, 4Gb RAM & five 15k rpm disks, RAID 5.  We
> wanted fast query/lookup.  We know we can get fast disk IO.
>
> Running a typical query like above seems to:
>
> * hardly tax a single CPU
> * plenty of RAM free
> * disks thrash about
>
> The last is based mostly on the observation that another tiddly
> unrelated mysql db which normally runs fast, grinds to a halt when
> we're querying the postgres db (and cpu, memory appear to have spare
> capacity).
>
> We've currently got these settings, and have tried doubling/halving
> them, restarted and benchmarked a test query.  They don't appear to
> materially alter our query time.
>
>  shared_buffers  = 128MB

shared_buffers = 1.5GB

Unless you've got a lot of stuff other than PostgreSQL running on this
machine.
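
For reference, a rough sketch of what that looks like in postgresql.conf
(the number is just a starting point for a dedicated 4GB box, changing
shared_buffers requires a server restart, and the conf parser wants an
integer value with a unit, e.g. 1536MB rather than 1.5GB):

  # postgresql.conf -- starting point for a dedicated 4GB machine
  shared_buffers = 1536MB    # up from 128MB; takes effect after a restart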

>  temp_buffers    = 160MB
>  work_mem        = 200MB
>  max_stack_depth = 7MB

These look reasonable, although I can't be sure without more details.

>
> We're less concerned about insert speed.  Typically 1 or 2 users, but
> want fast queries.
>
> Perhaps a little extreme, but I'm trying to find a way to express this
> in a way that Postgres understands:
>
> * Load this table, and one or two indexes (epoch, user_id) into RAM.

Give it enough shared_buffers and it will do that.  Your own figures put
the table at roughly 3GB (30 million rows at ~100 bytes each); try
pg_relation_size() on it to get an actual size.  If you really want to
keep _all_ of it resident all the time, you're probably going to need to
add RAM to the machine.  With 8GB you could allocate about 3GB to
shared_buffers, but that would still be ignoring the size of the indexes.
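
Something like this (swap in your real table name) will give you the
actual numbers, with and without the indexes:

  -- 'your_table' is a placeholder for the log table's name
  SELECT pg_size_pretty(pg_relation_size('your_table'))       AS table_only,
         pg_size_pretty(pg_total_relation_size('your_table')) AS table_plus_indexes;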

However, I think you'll be surprised how much performance improves
with 1.5G of shared_buffers.  You may not need any more.  128M is
really forcing PG to work within limited space.

> * All of the table, all of those indexes.
> * Keep them there, but keep a disk based backup for integrity.
> * Run all selects against the in RAM copy.  Always.

This is what PG does if you allocate enough shared_buffers.
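
If you want to verify that once things have warmed up, the contrib module
pg_buffercache will show you which relations are actually sitting in
shared_buffers.  A sketch (the module has to be installed from contrib
first):

  -- buffers are 8kB pages by default
  SELECT c.relname, count(*) AS buffers
    FROM pg_buffercache b
    JOIN pg_class c ON b.relfilenode = c.relfilenode
   WHERE b.reldatabase = (SELECT oid FROM pg_database
                           WHERE datname = current_database())
   GROUP BY c.relname
   ORDER BY buffers DESC
   LIMIT 10;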

--
Bill Moran
http://www.potentialtech.com
