Thread: indexes greatly slowing data entry
I'm having a problem with performance, and indexing seems to be the cause. I have a database with two tables with exactly the same structure. There are two fields in each which I need to have indexed: one is a varchar(12), the other a timestamp.

Data comes from another computer as an ASCII delimited text file at about 20,000 records per hour. Everything was fine until the database grew to about 10 gigabytes; then the computer could no longer keep up with putting the data into the Postgres database (using a Perl script with DBI). I found that when I dropped one of the varchar indexes, the problem went away. It did not matter from which table I dropped the index.

With both varchar indexes in place, storing data into the table whose varchar index was created second took 10 times as long as storing into the other table. From vmstat, blocks in (bi) is 10 times higher with both varchar indexes than with only one, and it looks like the RAID array is maxed out.

The indexes on timestamp did not seem to affect performance at all. There is no swapping to/from disk going on. Can anyone help me figure out the problem/solution?

I do an automated vacuum analyze nightly. Performance is not increased afterward.

pg 7.0.3 on Red Hat Linux 7.1.

************************************************************************
This e-mail and any of its attachments may contain Exelon Corporation proprietary information, which is privileged, confidential, or subject to copyright belonging to the Exelon Corporation family of Companies. This e-mail is intended solely for the use of the individual or entity to which it is addressed. If you are not the intended recipient of this e-mail, you are hereby notified that any dissemination, distribution, copying, or action taken in relation to the contents of and attachments to this e-mail is strictly prohibited and may be unlawful. If you have received this e-mail in error, please notify the sender immediately and permanently delete the original and any copy of this e-mail and any printout. Thank You.
************************************************************************
"DE Sesa, Michael C." <michael.desesa@exeloncorp.com> writes: > With both varchar indexes in place, data for the table with the varchar > index created second would take 10 times as long to store data into compared > to the other table. From vmstat: blocks in (bi) is 10 times higher with > both varchar indexes when compared to with only one, and it looks like the > raid array is maxed out. How much RAM do you have, and what is your shared_buffers setting? I'm speculating that you're simply thrashing because you don't have enough buffers to retain the top layers of all four indexes. Also, is the varchar data reasonably unique, or are there many duplicate values? btree is known to have some performance problems with large numbers of equal keys, particularly in pre-7.1 releases. > pg 7.0.3 on Red Hat Linux 7.1. Running a more modern version might help. regards, tom lane
> I do an automated vacuum analyze nightly. Performance is not increased
> afterward.

I have experienced problems with indexes on varchar columns causing updates to be very slow. You might also try doing a REINDEX TABLE command on the relevant tables. I have found this to give a performance increase for queries/updates, but generally still find character indexes to drag down insert/update performance.

Tom Burke
Eppend
http://www.eppend.com/
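[Editor's note: the REINDEX suggestion looks like this in 7.x syntax; the table and index names are made up for illustration, and note that REINDEX takes an exclusive lock on the table while it rebuilds:]

```sql
-- Rebuild every index on one table (names here are illustrative):
REINDEX TABLE readings;

-- Or rebuild just the suspect varchar index:
REINDEX INDEX readings_tag_idx;
```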
"Tom Burke" <lists@spamex.com> writes: > I have experienced problems with indexes on varchar columns causing > updates to be very slow. You might also try doing a REINDEX TABLE > command on the relevant tables. I have found this to give a > performance increase for queries/updates, but generally still find > character indexes to drag down insert/update performance. Coupla questions: did you build with --enable-locale? How about --enable-multibyte? If yes to either, which locale and encoding settings are you using exactly? Also, what platforms are involved here? The known problems with strcoll() in early 2.2.* glibc releases are in the back of my mind here, though the symptoms we have seen from that bug ran in the "backend dumps core" line rather than "backend is slow" line. Finally, do your indexed varchar columns tend to have lots and lots of repeated values? regards, tom lane
We are having the exact same problem on PostgreSQL 7.1.3. In our situation, vacuum analyze fixes the problem! There is hardly any hard disk activity (P3 933 MHz server with 512 MB RAM, 10,000 RPM SCSI drives). I have increased the shared-memory buffers in postgresql.conf and changed commits to be async (if I recall correctly).

We are importing anywhere from 10,000 to 100,000 records each day, and after this import, performance takes a major nosedive on us. And because this data gets imported during normal business hours, we cannot tie up the server with a vacuum. I understand 7.2 has a better vacuum, but this doesn't seem to solve our problem, because we need to be able to use the database during the 30 minutes it takes to import the data.

After about 5,000 records is when we notice the performance drop. There are approximately 25 users on the database at any given time. The problem seems slightly amplified as the number of logged-in users goes from 1 to 25.

Any ideas?!?!?!

Thanks, Eric.

"DE Sesa, Michael C." <michael.desesa@exeloncorp.com> wrote in message
news:AFAB5F8466DAD411B586000255B03E1416CCC1@mobntxch12.peco.com...
> I'm having a problem with performance and indexing seems to be the cause.
> I have a database with two tables with exactly the same structure. There
> are two fields in each which I need to have indexed. One is a
> varchar(12), the other a timestamp.
>
> Data comes from another computer as an ASCII delimited text file at about
> 20,000 records per hour. Everything was fine until the database grew to
> about 10 Gigabytes, then the computer could no longer keep up with
> putting the data into the postgres database (using a perl script with
> DBI.) I found that when I dropped one of the indexes on varchar, the
> problems went away. It did not matter from which table I dropped the
> index.
>
> With both varchar indexes in place, data for the table with the varchar
> index created second would take 10 times as long to store data into
> compared to the other table. From vmstat: blocks in (bi) is 10 times
> higher with both varchar indexes when compared to with only one, and it
> looks like the raid array is maxed out.
>
> The indexes on timestamp did not seem to impinge on performance at all.
> There is no swapping to/from disk going on. Can anyone help me figure out
> what is the problem/solution?
>
> I do an automated vacuum analyze nightly. Performance is not increased
> afterward.
>
> pg 7.0.3 on Red Hat Linux 7.1.
On Tue, Jun 25, 2002 at 04:16:05PM -0400, Eric wrote:
> We are importing anywhere from 10,000 - 100,000 records each day and
> after this import, performance takes a major nose dive on us.
>
> And, because this data gets imported during normal business hours we can
> not tie up the server with a vacuum. I understand 7.2 has a better
> vacuum but this doesn't seem to solve our problem because we need to be
> able to use the database during the 30 minutes it takes to import the
> data.

30 minutes for 100,000 records seems really high. How are you importing the data? Are you using COPY? If you're using INSERTs, are you putting multiple INSERTs inside a single transaction?

> There are approximately 25 users on the database at any given time.
> Problem seems slightly amplified when users logged goes from 1 to 25.

Upgrading to 7.2 would help the VACUUM performance, at any rate.

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC
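[Editor's note: the two loading strategies Neil contrasts look like this in 7.x syntax. The table name, file path, and delimiter are illustrative; COPY reads the file on the server, as the postgres user:]

```sql
-- Fastest: COPY pulls the delimited file in with a single command,
-- avoiding per-row parse/plan/commit overhead.
COPY readings FROM '/data/incoming/readings.txt' USING DELIMITERS '|';

-- If the Perl/DBI script must use INSERTs, wrap the whole batch in one
-- transaction so each row does not pay for its own commit (and fsync):
BEGIN;
INSERT INTO readings VALUES ('TAG000001', '2002-06-25 16:00:00');
INSERT INTO readings VALUES ('TAG000002', '2002-06-25 16:00:05');
-- ... thousands more rows ...
COMMIT;
```

With DBI, the same effect comes from turning AutoCommit off and calling $dbh->commit once at the end of the batch.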
I cannot find how to rename an index. This should be easy, no?

David Blood
Matraex, Inc
On Thu, Jul 11, 2002 at 04:02:01PM -0600, David Blood wrote:
> I cannot find how to rename an index? This should be easy no?

The same way you rename a table. Just use the index name as the table name.

--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> There are 10 kinds of people in the world, those that can do binary
> arithmetic and those that can't.
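[Editor's note: concretely, in these 7.x releases there is no ALTER INDEX command, so the rename is done with ALTER TABLE, putting the index name where the table name would go. The names below are illustrative:]

```sql
-- Rename an index by treating it as a table (PostgreSQL 7.x idiom);
-- later releases added ALTER INDEX ... RENAME TO for the same job.
ALTER TABLE readings_tag_idx RENAME TO readings_varchar_idx;
```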