Re: Looking for tips - Mailing list pgsql-performance

From Kevin Grittner
Subject Re: Looking for tips
Msg-id s2dd0700.024@gwmta.wicourts.gov
In response to Looking for tips  (Oliver Crosby <ryusei@gmail.com>)
List pgsql-performance
Hi Oliver,

We had low resource utilization and poor throughput on inserts of
thousands of rows within a single database transaction.  There were a
lot of configuration parameters we changed, but the one which helped the
most was wal_buffers -- we wound up setting it to 1000.  This may be
higher than it needs to be, but when we got to something which ran well,
we stopped tinkering.  The default value clearly caused a bottleneck.
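
As a rough sketch, the change amounts to one line in postgresql.conf
(on 7.4 wal_buffers is counted in 8 kB disk pages, and a postmaster
restart is needed for it to take effect):

    # postgresql.conf -- 1000 is simply the value that worked for us
    wal_buffers = 1000        # up from the small default

You can confirm the running value afterwards with:

    SHOW wal_buffers;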

You might find this page useful:

http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html

-Kevin


>>> Oliver Crosby <ryusei@gmail.com> 07/19/05 1:21 PM >>>
I was hoping to start with tuning Postgres to match the hardware, but
in any case...

The queries are all simple insert or select statements on single tables,
e.g. select x from table where y=?; or insert into table (a, b, c)
values (?, ?, ?);
In the case of selects on a large table, there's an index on
the column being searched, so in terms of the example above, x is
either a pkey column or another related field, and y is a non-pkey
column.
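
Using the placeholder names from the example above (they're stand-ins,
not the real schema), the indexes look roughly like:

    CREATE INDEX my_table_y_idx ON my_table (y);
    -- supports: select x from my_table where y = ?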

I'm not sure what you mean by structure.

I tried explain analyse on the individual queries, but I'm not sure
what can be done to manipulate them when they don't do much.
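
For what it's worth, this is the sort of thing I ran (placeholder names
again, with a literal substituted for the parameter), mostly to confirm
the index was actually being used:

    EXPLAIN ANALYZE SELECT x FROM my_table WHERE y = 'some value';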

My test environment has about 100k-300k rows in each table, and for
production I'm expecting this to be on the order of 1M+.

The OS is Red Hat Enterprise Linux 3.

I'm using the time command when I call the scripts to get the total
running time from start to finish.

I don't know what we have for RAID, but I suspect it's just a single
10k or 15k rpm hdd.

----------------------------------------
I'll try your recommendations for shared_buffers and
effective_cache_size. Thanks John!
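
Roughly what I'll be trying in postgresql.conf -- the numbers below are
only illustrative for a box with 1.5 gigs of RAM, not John's exact
figures; on 7.4 both settings are counted in 8 kB pages:

    shared_buffers = 20000           # ~160 MB; needs a postmaster restart
    effective_cache_size = 131072    # ~1 GB; a hint about the OS disk cache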

We're trying to improve performance of a log processing script to the
point where it can run as close to real time as possible. A lot of
what gets inserted depends on what's already in the db, and it runs
item-by-item... so unfortunately I can't take advantage of COPY.

We tried dropping the indices, copying data in, then rebuilding them. That
works great for a bulk import, but the processing script ran a lot slower
without them. (Each insert is preceded by a local cache check and
then a db search to see if an ID already exists for an item.)
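
Roughly, the per-item pattern looks like this (hypothetical table and
column names, just to show the shape of it):

    SELECT item_id FROM items WHERE item_name = ?;   -- only on a local cache miss
    INSERT INTO items (item_name) VALUES (?);        -- only when no id was found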

We have no foreign keys at the moment. Would they help?


On 7/19/05, Joshua D. Drake <jd@commandprompt.com> wrote:
> Oliver Crosby wrote:
> > Hi,
> > I'm running Postgres 7.4.6 on a dedicated server with about 1.5 gigs
> > of RAM. Running scripts locally, it takes about 1.5x longer than
> > MySQL, and the load on the server is only about 21%.
>
> What queries?
> What is your structure?
> Have you tried explain analyze?
> How many rows in the table?
> Which OS?
> How are you testing the speed?
> What type of RAID?
>
>
>
> --
> Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
> PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
> Managed Services, Shared and Dedicated Hosting
> Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/
>


