Thread: inserting into brand new database faster than old database
Hello,

I have been a happy PostgreSQL developer for a few years now. Recently I have discovered a very strange phenomenon with regard to inserting rows.

My app inserts millions of records a day, averaging about 30 rows a second. I use autovac to make sure my stats and indexes are up to date. Rows are rarely deleted. Each day a brand new set of tables is created, and eventually the old tables are dropped. The app calls functions which, based on some simple logic, perform the correct inserts.

The problem I am seeing is that after a particular database gets fairly old, say a couple of months, performance begins to degrade. Even after creating brand new tables, my insert speed is slow (by a factor of 5 or more) compared with a brand new schema that has the exact same tables. I am running on an IBM 360 dual-processor Linux server with a 100 GB RAID array spanning 5 SCSI disks. The machine has 1 GB of RAM, of which 500 MB is dedicated to PostgreSQL.

Just to be clear, the question I have is: why would a brand new db schema allow inserts faster than an older schema with brand new tables? Since the tables are empty to start with, vacuuming should not be an issue at all. Each schema is identical in every way except the db name and creation date.

Any ideas are appreciated.

Thanks,

T.R. Missner
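The actual schema was not posted; as a minimal sketch of the daily-table plus insert-function pattern described above (all table, column, and function names here are made up for illustration), the workload might look something like this:

    -- Hypothetical sketch of the described workload; names are invented,
    -- not taken from the original application.
    CREATE TABLE events_20040707 (
        id          serial PRIMARY KEY,
        device_id   integer NOT NULL,
        recorded_at timestamptz NOT NULL DEFAULT now(),
        payload     text
    );

    CREATE INDEX events_20040707_device_idx ON events_20040707 (device_id);

    -- The app calls a function that decides which day's table to write into.
    CREATE FUNCTION log_event(integer, text) RETURNS void AS '
        INSERT INTO events_20040707 (device_id, payload) VALUES ($1, $2);
    ' LANGUAGE sql;

At ~30 calls per second this creates and drops a handful of tables and indexes every day, which matters for the catalog-bloat discussion later in the thread.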
I don't think I have enough detail about your app. A couple of questions: are there any tables that receive a lot of inserts / updates / deletes and are not dropped and recreated often? If so, one possibility is that your FSM settings are not large enough and the table is actually growing despite using autovac. Does that sound possible to you?

Matthew T. O'Connor
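If FSM bloat is the suspicion, a rough way to check (table name below is hypothetical) is to compare the table's page count over time and look at what VACUUM reports:

    -- relpages is the on-disk size in 8 KB pages as of the last
    -- VACUUM/ANALYZE; reltuples is the estimated row count. A table whose
    -- relpages keeps climbing while its live row count stays flat is not
    -- getting its dead space reused (FSM too small).
    SELECT relname, relpages, reltuples
    FROM pg_class
    WHERE relname = 'lookup_table';   -- hypothetical table name

    -- VACUUM VERBOSE reports how many dead tuples and pages it found;
    -- a database-wide VACUUM VERBOSE also prints an FSM usage summary
    -- on the releases that still use max_fsm_pages / max_fsm_relations.
    VACUUM VERBOSE lookup_table;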
I do have one table that acts as a lookup table and grows in size as the app runs; however, in the tests I have been doing I dropped and recreated all tables, including the lookup table.

I keep wondering how disk is allocated to a particular DB. Also, is there any way I could tell whether writes to disk are the bottleneck?

T.R. Missner
Level(3) Communications
SSID tools
Senior Software Engineer

-----Original Message-----
From: Matthew T. O'Connor [mailto:matthew@zeut.net]
Sent: Wednesday, July 07, 2004 1:17 PM
To: Missner, T. R.
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] inserting into brand new database faster than old database
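On how disk is allocated: each table and index is stored as its own file (in 1 GB segments) under the database's subdirectory in $PGDATA/base, and pg_class.relpages reports each relation's size in 8 KB blocks. A sketch of two quick checks, assuming the statistics collector's block-level counters are enabled (stats_block_level in older releases):

    -- Largest relations in the current database, in 8 KB blocks
    -- (relpages is refreshed by VACUUM/ANALYZE, so it can lag reality).
    SELECT relname, relkind, relpages
    FROM pg_class
    WHERE relkind IN ('r', 'i')
    ORDER BY relpages DESC
    LIMIT 20;

    -- Rough disk-read vs. cache-hit picture per table.
    SELECT relname, heap_blks_read, heap_blks_hit, idx_blks_read, idx_blks_hit
    FROM pg_statio_user_tables
    ORDER BY heap_blks_read DESC
    LIMIT 20;

For whether the disk itself is saturated, watching iostat or vmstat on the server during an insert run is the usual complementary check.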
Missner, T. R. wrote:
> My app inserts millions of records a day, averaging about 30 rows a second.
> [...] The app calls functions which, based on some simple logic, perform
> the correct inserts.

Have you profiled where the time goes in a brand new schema versus a degraded database? Is it I/O? Is it CPU? Is the function making the decisions becoming the bottleneck?

> Just to be clear, the question I have is: why would a brand new db schema
> allow inserts faster than an older schema with brand new tables?

You can do a few things:

- Get EXPLAIN ANALYZE output. Look at the difference between actual and projected timings; that difference is the hint about where the planner is going wrong.
- Is I/O your bottleneck? Are vacuums taking longer and longer? If so, you could try the vacuum delay patch. If your I/O is saturated for any reason, everything is going to crawl.
- Are your indexes bloat-free? If you are using a pre-7.4 release, VACUUM does not clean up indexes; you need to REINDEX.
- Have you vacuumed the complete database? If the catalogs collect dead space it could cause degradation too, but that is just a guess.

Basically, monitor the slow inserts and try to find out where the time is spent.

HTH

Shridhar
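A sketch of those checks in psql terms; the table, index, and column names below are the same placeholders used earlier, not anything from the original post:

    -- Compare planner estimates with actual timings for one of the slow
    -- statements. EXPLAIN ANALYZE really executes the insert, so wrap it
    -- in a transaction and roll back to discard the test row.
    BEGIN;
    EXPLAIN ANALYZE
        INSERT INTO events_20040707 (device_id, payload)
        VALUES (42, 'test row');      -- placeholder statement
    ROLLBACK;

    -- Rebuild a possibly bloated index (pre-7.4 VACUUM never shrinks indexes).
    REINDEX INDEX events_20040707_device_idx;

    -- Vacuum and re-analyze the whole database, catalogs included.
    VACUUM ANALYZE;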
"Missner, T. R." <T.R.Missner@Level3.com> writes: > ... Each day a brand new set of tables is > created and eventually the old tables are dropped. You did not say which PG version you are using (tut tut) but my first thought is that it's a pre-7.4 release and your problems trace to bloat in the system-catalog indexes. The indexes on pg_class and pg_attribute would be quite likely to suffer serious bloat if you continually create and drop tables, because the range of useful table OIDs will be continually shifting. We didn't fix this until 7.4. If you are seeing this in 7.4.* then more investigation is needed... regards, tom lane