Re: Data Warehousing Tuning - Mailing list pgsql-performance

From Frank Wosczyna
Subject Re: Data Warehousing Tuning
Date
Msg-id BB05A27C22288540A3A3E8F3749B45ABD6A955@MI8NYCMAIL06.Mi8.com
Whole thread Raw
In response to Data Warehousing Tuning  ("Paul Johnson" <paul@oxton.com>)
List pgsql-performance
Hi Paul, just some quick thoughts:


> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org
> [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of
> Paul Johnson
> Sent: Wednesday, July 06, 2005 10:16 AM
> To: pgsql-performance@postgresql.org
> Subject: [PERFORM] Data Warehousing Tuning
>

>
> Questions are as follows:
>
> 1) Should we have set the page size to 32MB when we compiled Postgres?
>
> We mainly do bulk loads using 'copy', full-table scans and
> large joins so this would seem sensible. Tables are typically
> 10 million rows at present.

I would defer changing page size to the "fine-tuning" category, our
experience with that has not produced substantial gains.  Would focus on
the other categories you mention below first.

Also, for heavy use of COPY, you may consider using the latest release
of Bizgres 0.6, which should speed loads:
http://www.bizgres.org/pages.php?pg=downloads or
http://www.greenplum.com/prod_download.html for compiled version.


>
> 2) What are the obvious changes to make to postgresql.conf?
>
> Things like shared_buffers, work_mem, maintenance_work_mem
> and checkpoint_segments seem like good candidates given our
> data warehousing workloads.

You're on the right track, it depends on nature of queries (sorry for
giving you the "consulting" answer on that one), but here are some
PostgreSQL configurations to consider:

a - Consider using separate disk partitions for transaction log, temp
space and WAL.  See separate note about WAL and directio in Solaris
tuning note, link below.  May put temp space on a separate partition, in
anticipation of forthcoming changes which take advantage of this.

b - Sizing temp space?  Should be as large as the largest index.  Set
max speed read/write config: minimal journaling, use write-through cache
on this.

c - Might try increasing checkpoint segments (64?).  More logs produces
significant benefit.  And turn checkpoint warnings on (Off by default).

d - Sort mem and work mem - What queries are you running?  Workmem used
in aggregation/sorts.  How many concurrent reports?  For 3 complex
queries, might try 256MB at work mem?

e - You probably do this already, but always ANALYZE after loads.

f - Maintenance work mem - used in vacuum, analyze, creating bulk
indexes, bulk checking for keys.  Might consider using 512 or 750?


>
> 3) Ditto /etc/system?

See http://www.bizgres.org/bb/viewtopic.php?t=6 for Solaris.


>
> 4) We moved the pg_xlog files off /data/postgres (disks 2-7)
> and into /opt/pg_xlog (disks 0-1), but it seemed like
> performance decreased, so we moved them back again.
>
> Has anyone experienced real performance gains by moving the
> pg_xlog files?
>

Likely to help only with COPY.

Feel free to contact me directly if you have any questions on my
statements above.  There is a Configurator in development which you
might find helpful when it is complete:
http://www.bizgres.org/pages.php?pg=developers%7Cprojects%7Cconfigurator



Kind Regards,
Frank

Frank Wosczyna
Systems Engineer
+1 650 224 7374

http://www.greenplum.com

GreenPlum, Inc.
1900 South Norfolk Street, Suite 224
San Mateo, California 94403 USA


pgsql-performance by date:

Previous
From: Stephan Szabo
Date:
Subject: Re: join and query planner
Next
From: Josh Berkus
Date:
Subject: Re: Data Warehousing Tuning