Thread: inserting into brand new database faster than old database
Hello,

I have been a happy PostgreSQL developer for a few years now. Recently I have discovered a very strange phenomenon with regard to inserting rows.

My app inserts millions of records a day, averaging about 30 rows a second. I use autovac to make sure my stats and indexes are up to date. Rows are rarely deleted. Each day a brand new set of tables is created, and eventually the old tables are dropped. The app calls functions which, based on some simple logic, perform the correct inserts.

The problem I am seeing is that after a particular database gets fairly old, say a couple of months, performance begins to degrade. Even after creating brand new tables, my insert speed is slow (by a factor of 5 or more) compared with a brand new schema that has the exact same tables. I am running on an IBM 360 dual-processor Linux server with a 100 GB RAID array spanning 5 SCSI disks. The machine has 1 GB of RAM, of which 500 MB is dedicated to PostgreSQL.

Just to be clear, the question I have is: why would a brand new db schema allow inserts faster than an older schema with brand new tables? Since the tables are empty to start with, vacuuming should not be an issue at all. Each schema is identical in every way except the db name and creation date.

Any ideas are appreciated.

Thanks,

T.R. Missner
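The actual schema was not posted; as a minimal sketch of the daily-table plus insert-function pattern described above (all table, column, and function names here are made up for illustration), the workload might look something like this:

    -- Hypothetical sketch of the described workload; names are invented,
    -- not taken from the original application.
    CREATE TABLE events_20040707 (
        id          serial PRIMARY KEY,
        device_id   integer NOT NULL,
        recorded_at timestamptz NOT NULL DEFAULT now(),
        payload     text
    );

    CREATE INDEX events_20040707_device_idx ON events_20040707 (device_id);

    -- The app calls a function that decides which day's table to write into.
    CREATE FUNCTION log_event(integer, text) RETURNS void AS '
        INSERT INTO events_20040707 (device_id, payload) VALUES ($1, $2);
    ' LANGUAGE sql;

At ~30 calls per second this creates and drops a handful of tables and indexes every day, which matters for the catalog-bloat discussion later in the thread.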
I don't think I have enough detail about your app. A couple of questions: are there any tables that receive a lot of inserts / updates / deletes and are not dropped and recreated often? If so, one possibility is that your FSM settings are not large enough and the table is actually growing despite using autovac. Does that sound possible to you?

Matthew T. O'Connor
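If FSM bloat is the suspicion, a rough way to check (table name below is hypothetical) is to compare the table's page count over time and look at what VACUUM reports:

    -- relpages is the on-disk size in 8 KB pages as of the last
    -- VACUUM/ANALYZE; reltuples is the estimated row count. A table whose
    -- relpages keeps climbing while its live row count stays flat is not
    -- getting its dead space reused (FSM too small).
    SELECT relname, relpages, reltuples
    FROM pg_class
    WHERE relname = 'lookup_table';   -- hypothetical table name

    -- VACUUM VERBOSE reports how many dead tuples and pages it found;
    -- a database-wide VACUUM VERBOSE also prints an FSM usage summary
    -- on the releases that still use max_fsm_pages / max_fsm_relations.
    VACUUM VERBOSE lookup_table;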
I do have one table that acts as a lookup table and grows in size as the app runs; however, in the tests I have been doing I dropped and recreated all tables, including the lookup table.

I keep wondering how disk is allocated to a particular DB. Also, is there any way I could tell whether writes to disk are the bottleneck?

T.R. Missner
Level(3) Communications
SSID tools
Senior Software Engineer

-----Original Message-----
From: Matthew T. O'Connor [mailto:matthew@zeut.net]
Sent: Wednesday, July 07, 2004 1:17 PM
To: Missner, T. R.
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] inserting into brand new database faster than old database
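On how disk is allocated: each table and index is stored as its own file (in 1 GB segments) under the database's subdirectory in $PGDATA/base, and pg_class.relpages reports each relation's size in 8 KB blocks. A sketch of two quick checks, assuming the statistics collector's block-level counters are enabled (stats_block_level in older releases):

    -- Largest relations in the current database, in 8 KB blocks
    -- (relpages is refreshed by VACUUM/ANALYZE, so it can lag reality).
    SELECT relname, relkind, relpages
    FROM pg_class
    WHERE relkind IN ('r', 'i')
    ORDER BY relpages DESC
    LIMIT 20;

    -- Rough disk-read vs. cache-hit picture per table.
    SELECT relname, heap_blks_read, heap_blks_hit, idx_blks_read, idx_blks_hit
    FROM pg_statio_user_tables
    ORDER BY heap_blks_read DESC
    LIMIT 20;

For whether the disk itself is saturated, watching iostat or vmstat on the server during an insert run is the usual complementary check.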
Missner, T. R. wrote:
> My app inserts millions of records a day, averaging about 30 rows a second.
> [...] The app calls functions which, based on some simple logic, perform
> the correct inserts.

Have you profiled where the time goes in a brand new schema versus a degraded database? Is it I/O? Is it CPU? Is the function making the decisions becoming the bottleneck?

> Just to be clear, the question I have is: why would a brand new db schema
> allow inserts faster than an older schema with brand new tables?

You can do a few things:

- Get EXPLAIN ANALYZE output. Look at the difference between actual and projected timings; that difference is the hint about where the planner is going wrong.
- Is I/O your bottleneck? Are vacuums taking longer and longer? If so, you could try the vacuum delay patch. If your I/O is saturated for any reason, everything is going to crawl.
- Are your indexes bloat-free? If you are using a pre-7.4 release, VACUUM does not clean up indexes; you need to REINDEX.
- Have you vacuumed the complete database? If the catalogs collect dead space it could cause degradation too, but that is just a guess.

Basically, monitor the slow inserts and try to find out where the time is spent.

HTH

Shridhar
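A sketch of those checks in psql terms; the table, index, and column names below are the same placeholders used earlier, not anything from the original post:

    -- Compare planner estimates with actual timings for one of the slow
    -- statements. EXPLAIN ANALYZE really executes the insert, so wrap it
    -- in a transaction and roll back to discard the test row.
    BEGIN;
    EXPLAIN ANALYZE
        INSERT INTO events_20040707 (device_id, payload)
        VALUES (42, 'test row');      -- placeholder statement
    ROLLBACK;

    -- Rebuild a possibly bloated index (pre-7.4 VACUUM never shrinks indexes).
    REINDEX INDEX events_20040707_device_idx;

    -- Vacuum and re-analyze the whole database, catalogs included.
    VACUUM ANALYZE;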
"Missner, T. R." <T.R.Missner@Level3.com> writes: > ... Each day a brand new set of tables is > created and eventually the old tables are dropped. You did not say which PG version you are using (tut tut) but my first thought is that it's a pre-7.4 release and your problems trace to bloat in the system-catalog indexes. The indexes on pg_class and pg_attribute would be quite likely to suffer serious bloat if you continually create and drop tables, because the range of useful table OIDs will be continually shifting. We didn't fix this until 7.4. If you are seeing this in 7.4.* then more investigation is needed... regards, tom lane