Re: Performace Optimization for Dummies - Mailing list pgsql-performance

From Jim C. Nasby
Subject Re: Performace Optimization for Dummies
Date
Msg-id 20060928184413.GU34238@nasby.net
Whole thread Raw
In response to Re: Performace Optimization for Dummies  ("Carlo Stonebanks" <stonec.register@sympatico.ca>)
List pgsql-performance
On Thu, Sep 28, 2006 at 01:47:44PM -0400, Carlo Stonebanks wrote:
> > How are you loading the tables? Copy? Insert?
>
> Once the data is transformed, it is inserted. I don't have stats, but the
> programs visual feedback does not spend a lot of time on the "inserting
> data" message. Then again, if there is an asynchronous component to an
> insert, perhaps I am not seeing how slow an insert really is until I query
> the table.

Well, individual inserts are slow, especially if they're not wrapped up
in a transaction. And you also mentioned checking for dupes. I suspect
that you're not going to find any huge gains in tuning the database...
it sounds like the application (as in: how it's using the database) is
what needs help.

> >> work_mem = 32768
> >
> > Depending on what you are doing, this is could be to low or to high.
>
> Is this like "You could be too fat or too thin"? Aren't you impressed with
> the fact that I managed to pick the one number that was not right for
> anything?

For what you're doing, it's probably fine where it is... but while
you're in the single-thread case, you can safely make that pretty big
(like 1000000).

> >
> >> maintenance_work_mem = 32768
> >> checkpoint_segments = 128
> >> effective_cache_size = 10000
> >
> > This coudl probably be higher.

I'd suggest setting it to about 3G, or 375000.
> >
> >> random_page_cost = 3
> >> stats_start_collector = on
> >> stats_command_string = on
> >> stats_row_level = on
> >> autovacuum = on
> >
> > Stats are a hit... you need to determine if you actually need them.
>
> Unfortunately, this is the only way I know of of getting the query string to
> appear in the PostgreSQL server status display. While trying to figure out
> what is slowing things down, having that is really helpful. I also imagined
> that this sort of thing would be a performance hit when you are getting lots
> of small, concurrent queries. In my case, we have queries which are taking
> around a second to perform outer joins. They aren't competing with any other
> requests as the site is not running, we are just running one app to seed the
> data.

stats_command_string can extract a huge penalty pre-8.2, on the order of
30%. I'd turn it off unless you *really* need it. Command logging (ie:
log_min_duration_statement) is much less of a burden.

The fact that you're doing outer joins while loading data really makes
me suspect that the application needs to be changed for any real
benefits to be had. But you should still look at what EXPLAIN ANALYZE is
showing you on those queries; you might be able to find some gains
there.
--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)

pgsql-performance by date:

Previous
From: "Jim C. Nasby"
Date:
Subject: Re: Performace Optimization for Dummies
Next
From: "Jim C. Nasby"
Date:
Subject: Re: Performace Optimization for Dummies