Re: Insert performance, what should I expect? - Mailing list pgsql-performance

From Andrew McMillan
Subject Re: Insert performance, what should I expect?
Date
Msg-id 1098258644.22373.161.camel@lamb.mcmillan.net.nz
Whole thread Raw
In response to Insert performance, what should I expect?  (Brock Henry <brock.henry@gmail.com>)
List pgsql-performance
On Wed, 2004-10-20 at 11:53 +1000, Brock Henry wrote:
>
> Test 1, For each import, I'm dropping all indexes and pkeys/fkeys,
> then importing, then adding keys and indexes. Then I've got successive
> runs. I figure the reindexing will get more expensive as the database
> grows?

Sounds like the right approach to me, if the tables are empty before the
import.


> Successive Imports: 44,49,50,57,55,61,72 (seconds)
> = average 1051inserts/second (which now that I've written this seems
> fairly good)

(A) Are you doing the whole thing inside a transaction?  This will be
significantly quicker.  COPY would probably be quicker still, but the
biggest difference will be a single transaction.

(B) If you are starting with empty files, are you ensuring that the dead
records are vacuumed before you start?  I would recommend a "vacuum
full" on the affected tables prior to the first import run (i.e. when
the tables are empty).  This is likely to be the reason that the timing
on your successive imports increases so much.



> sort_mem = 4096

You probably want to increase this - if you have 1G of RAM then there is
probably some spare.  But if you actually expect to use 32 connections
then 32 * 4M = 128M might mean a careful calculation is needed.  If you
are really only likely to have 1-2 connections running concurrently then
increase it to (e.g.) 32768.

> max_fsm_relations = 300

If you do a "vacuum full verbose;" the last line will give you some
clues as to what to set this (and max_fsm_pages) too.


> effective_cache_size = 16000

16000 * 8k = 128M seems low for a 1G machine - probably you could say
64000 without fear of being wrong.  What does "free" show as "cached"?
Depending on how dedicated the machine is to the database, the effective
cache size may be as much as 80-90% of that.


> Can I expect it to go faster than this? I'll see where I can make my
> script itself go faster, but I don't think I'll be able to do much.
> I'll do some pre-prepare type stuff, but I don't expect significant
> gains, maybe 5-10%. I'd could happily turn off fsync for this job, but
> not for some other databases the server is hosting.

You can probably double the speed - maybe more.

Cheers,
                    Andrew,
-------------------------------------------------------------------------
Andrew @ Catalyst .Net .NZ  Ltd,  PO Box 11-053, Manners St,  Wellington
WEB: http://catalyst.net.nz/            PHYS: Level 2, 150-154 Willis St
DDI: +64(4)803-2201      MOB: +64(272)DEBIAN      OFFICE: +64(4)499-2267
              How many things I can do without! -- Socrates
-------------------------------------------------------------------------


Attachment

pgsql-performance by date:

Previous
From: Josh Berkus
Date:
Subject: Re: how much mem to give postgres?
Next
From: Aaron Werman
Date:
Subject: Re: Free PostgreSQL Training, Philadelphia, Oct 30