Re: Insert performance vs Table size - Mailing list pgsql-performance

From Jacques Caron
Subject Re: Insert performance vs Table size
Date
Msg-id 6.2.0.14.0.20050628123516.03ac59e0@wheresmymailserver.com
Whole thread Raw
In response to Re: Insert performance vs Table size  ("Praveen Raja" <praveen.raja@netlight.se>)
List pgsql-performance
Hi,

At 11:50 28/06/2005, Praveen Raja wrote:
>I assume you took size to mean the row size?

Nope, the size of the table.

>  What I really meant was
>does the number of rows a table has affect the performance of new
>inserts into the table (just INSERTs) all other things remaining
>constant. Sorry for the confusion.

As I said previously, in most cases it does. One of the few cases where it
doesn't would be an append-only table, no holes, no indexes, no foreign keys...

>I know that having indexes on the table adds an overhead but again does
>this overhead increase (for an INSERT operation) with the number of rows
>the table contains?

It depends on what you are indexing. If the index key is something that
grows monotonically (e.g. a unique ID or a timestamp), then the size of the
table (and hence of the indexes) should have a very limited influence on
the INSERTs. If the index key is anything else (and that must definitely be
the case if you have 7 or 8 indexes!), then that means updates will happen
all over the indexes, which means a lot of read and write activity, and
once the total size of your indexes exceeds what can be cached in RAM,
performance will decrease quite a bit. Of course if your keys are
concentrated in a few limited areas of the key ranges it might help.

>My instinct says no to both. If I'm wrong can someone explain why the
>number of rows in a table affects INSERT performance?

As described above, maintaining indexes when you "hit" anywhere in said
indexes is very costly. The larger the table, the larger the indexes, the
higher the number of levels in the trees, etc. As long as it fits in RAM,
it shouldn't be a problem. Once you exceed that threshold, you start
getting a lot of random I/O, and that's expensive.

Again, it depends a lot on your exact schema, the nature of the data, the
spread of the different values, etc, but I would believe it's more often
the case than not.

Jacques.



pgsql-performance by date:

Previous
From: "Praveen Raja"
Date:
Subject: Re: Insert performance vs Table size
Next
From: Christopher Kings-Lynne
Date:
Subject: Re: Too slow querying a table of 15 million records