Re: Insert performance vs Table size - Mailing list pgsql-performance

From Praveen Raja
Subject Re: Insert performance vs Table size
Date
Msg-id 002401c57bc6$ca56e2c0$4c0ca8c0@sto.netlight.se
Whole thread Raw
In response to Re: Insert performance vs Table size  (Jacques Caron <jc@directinfos.com>)
Responses Re: Insert performance vs Table size
Re: Insert performance vs Table size
List pgsql-performance
I assume you took size to mean the row size? What I really meant was
does the number of rows a table has affect the performance of new
inserts into the table (just INSERTs) all other things remaining
constant. Sorry for the confusion.

I know that having indexes on the table adds an overhead but again does
this overhead increase (for an INSERT operation) with the number of rows
the table contains?

My instinct says no to both. If I'm wrong can someone explain why the
number of rows in a table affects INSERT performance?

Thanks again

-----Original Message-----
From: Jacques Caron [mailto:jc@directinfos.com]
Sent: 27 June 2005 14:05
To: Praveen Raja
Cc: pgsql-performance@postgresql.org
Subject: RE: [PERFORM] Insert performance vs Table size

Hi,

At 13:50 27/06/2005, Praveen Raja wrote:
>Just to clear things up a bit, the scenario that I'm interested in is a
>table with a large number of indexes on it (maybe 7-8).

If you're after performance you'll want to carefully consider which
indexes
are really useful and/or redesign your schema so that you can have less
indexes on that table. 7 or 8 indexes is quite a lot, and that really
has a
cost.

>  In this scenario
>other than the overhead of having to maintain the indexes (which I'm
>guessing is the same regardless of the size of the table)

Definitely not: indexes grow with the size of the table. Depending on
what
columns you index (and their types), the indexes may be a fraction of
the
size of the table, or they may be very close in size (in extreme cases
they
may even be larger). With 7 or 8 indexes, that can be quite a large
volume
of data to manipulate, especially if the values of the columns inserted
can
span the whole range of the index (rather than being solely id- or
time-based, for instance, in which case index updates are concentrated
in a
small area of each of the indexes), as this means you'll need to have a
majority of the indexes in RAM if you want to maintain decent
performance.

>does the size of the table play a role in determining insert
performance
>(and I mean
>only insert performance)?

In this case, it's really the indexes that'll cause you trouble, though
heavily fragmented tables (due to lots of deletes or updates) will also
incur a penalty just for the data part of the inserts.

Also, don't forget the usual hints if you are going to do lots of
inserts:
- batch them in large transactions, don't do them one at a time
- better yet, use COPY rather than INSERT
- in some situations, you might be better of dropping the indexes, doing

large batch inserts, then re-creating the indexes. YMMV depending on the

existing/new ratio, whether you need to maintain indexed access to the
tables, etc.
- pay attention to foreign keys

Jacques.



pgsql-performance by date:

Previous
From: Michael Fuhr
Date:
Subject: Re: Postgresql7.4.5 running slow on plpgsql function
Next
From: Jacques Caron
Date:
Subject: Re: Insert performance vs Table size