Re: Performance Optimization for Dummies 2 - the SQL - Mailing list pgsql-performance

From Merlin Moncure
Subject Re: Performance Optimization for Dummies 2 - the SQL
Msg-id b42b73150610061153i3d11d7a4r23e1d7c5317feb9f@mail.gmail.com
In response to Re: Performance Optimization for Dummies 2 - the SQL  (Scott Marlowe <smarlowe@g2switchworks.com>)
Responses Re: Performance Optimization for Dummies 2 - the SQL
List pgsql-performance
On 10/6/06, Scott Marlowe <smarlowe@g2switchworks.com> wrote:
> On Fri, 2006-10-06 at 11:44, Carlo Stonebanks wrote:
> > This didn't work right away, but DID work after running a VACUUM FULL. In
> > other words, I was still stuck with a sequential scan until after the
> > vacuum.
> >
> > I turned autovacuum off in order to help with the import, but was performing
> > an ANALYZE with every 500 rows imported.

how did you determine that the analyze runs every 500 rows? 500 is also
the default autovacuum threshold.  if you followed my earlier
recommendations, you are aware that autovacuum (which also analyzes)
is not running during bulk inserts, right?

imo, the best way to do a big data import/conversion is to:
1. turn off all extra features (stats, logging, etc.)
2. use the COPY interface to load data into scratch tables, probably
with all-text fields
3. analyze (just once)
4. use big queries to transform, normalize, etc.
5. drop the scratch tables
6. set up postgresql.conf for production use (fsync, stats, etc.)
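
a rough sketch of steps 2 through 5, in case it helps to see the shape of it. this is not PostgreSQL code: it uses Python's built-in sqlite3 module so it runs without a server, and executemany stands in for COPY. the table names (scratch_people, people), the columns, and the sample rows are all made up for illustration. steps 1 and 6 are server configuration and are not modeled here.

```python
import sqlite3

# Hypothetical raw input: every field arrives as text, as from a flat file.
RAW_ROWS = [
    ("1", "Alice Smith ", "42"),
    ("2", "bob jones", "17"),
    ("3", "Carol White", "not-a-number"),
]


def bulk_import(raw_rows):
    """Load raw text rows into a scratch table, then transform them in one query."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Step 2: scratch table with all-text columns and no constraints,
    # so the load itself cannot fail on bad data.
    # (In PostgreSQL this load would be done with COPY.)
    cur.execute("CREATE TABLE scratch_people (id TEXT, name TEXT, age TEXT)")
    cur.executemany("INSERT INTO scratch_people VALUES (?, ?, ?)", raw_rows)

    # Step 3: analyze once, after the whole load, not every N rows.
    cur.execute("ANALYZE")

    # Step 4: one set-based query does the cleanup and type conversion.
    cur.execute(
        "CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER)"
    )
    cur.execute(
        """
        INSERT INTO people (id, name, age)
        SELECT CAST(id AS INTEGER),
               TRIM(name),
               -- keep age only when it is all digits; otherwise store NULL
               CASE WHEN age <> '' AND age NOT GLOB '*[^0-9]*'
                    THEN CAST(age AS INTEGER)
               END
        FROM scratch_people
        """
    )

    # Step 5: the scratch table is no longer needed.
    cur.execute("DROP TABLE scratch_people")
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = bulk_import(RAW_ROWS)
    for row in conn.execute("SELECT id, name, age FROM people ORDER BY id"):
        print(row)
```

the point of the all-text scratch table is that bad values get handled in the transform query (here a non-numeric age becomes NULL) instead of aborting the load.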

an important feature of analyze is that it tells the planner
approximately how big the tables are.

merlin
