indexes greatly slowing data entry

From
"DE Sesa, Michael C."
Date:
I'm having a problem with performance and indexing seems to be the cause.  I
have a database with two tables with exactly the same structure.  There are
two fields in each which I need to have indexed.  One is a varchar(12), the
other a timestamp.

Data comes from another computer as an ASCII delimited text file at about
20,000 records per hour. Everything was fine until the database grew to
about 10 Gigabytes, then the computer could no longer keep up with putting
the data into the Postgres database (using a Perl script with DBI). I found
that when I dropped one of the indexes on varchar, the problems went away.
It did not matter from which table I dropped the index.

With both varchar indexes in place, data for the table with the varchar
index created second would take 10 times as long to store data into compared
to the other table.  From vmstat: blocks in (bi) is 10 times higher with
both varchar indexes when compared to with only one, and it looks like the
raid array is maxed out.

The indexes on the timestamp columns did not seem to affect performance at all.
There is no swapping to/from disk going on.  Can anyone help me figure out
what is the problem/solution?

I do an automated vacuum analyze nightly.  Performance is not increased
afterward.

pg 7.0.3 on Red Hat Linux 7.1.


************************************************************************
This e-mail and any of its attachments may contain Exelon Corporation
proprietary information, which is privileged, confidential, or subject
to copyright belonging to the Exelon Corporation family of Companies.
This e-mail is intended solely for the use of the individual or entity
to which it is addressed.  If you are not the intended recipient of this
e-mail, you are hereby notified that any dissemination, distribution,
copying, or action taken in relation to the contents of and attachments
to this e-mail is strictly prohibited and may be unlawful.  If you have
received this e-mail in error, please notify the sender immediately and
permanently delete the original and any copy of this e-mail and any
printout. Thank You.
************************************************************************




Re: indexes greatly slowing data entry

From
Tom Lane
Date:
"DE Sesa, Michael C." <michael.desesa@exeloncorp.com> writes:
> With both varchar indexes in place, data for the table with the varchar
> index created second would take 10 times as long to store data into compared
> to the other table.  From vmstat: blocks in (bi) is 10 times higher with
> both varchar indexes when compared to with only one, and it looks like the
> raid array is maxed out.

How much RAM do you have, and what is your shared_buffers setting?
I'm speculating that you're simply thrashing because you don't have
enough buffers to retain the top layers of all four indexes.
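As a sketch (the number here is only illustrative), you can check the current value from psql and raise it in the config:

```sql
-- 7.1 and later: show the current setting (each buffer is 8 kB by default):
SHOW shared_buffers;

-- In postgresql.conf (7.1+), then restart the postmaster:
--   shared_buffers = 2048    -- roughly 16 MB of shared buffer cache
-- In 7.0 the equivalent is the postmaster's -B option, e.g. -B 2048.
```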

Also, is the varchar data reasonably unique, or are there many duplicate
values?  btree is known to have some performance problems with large
numbers of equal keys, particularly in pre-7.1 releases.

> pg 7.0.3 on Red Hat Linux 7.1.

Running a more modern version might help.

            regards, tom lane



Re: indexes greatly slowing data entry

From
"Tom Burke"
Date:
> I do an automated vacuum analyze nightly.  Performance is not increased
> afterward.

I have experienced problems with indexes on varchar columns causing
updates to be very slow. You might also try doing a REINDEX TABLE
command on the relevant tables. I have found this to give a
performance increase for queries/updates, but generally still find
character indexes to drag down insert/update performance.
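For example (the table and index names here are hypothetical):

```sql
-- Rebuild every index on the table from scratch, discarding accumulated bloat:
REINDEX TABLE readings;

-- Or rebuild only the suspect varchar index:
REINDEX INDEX readings_tag_idx;
```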

Tom Burke
Eppend
http://www.eppend.com/



Re: indexes greatly slowing data entry

From
Tom Lane
Date:
"Tom Burke" <lists@spamex.com> writes:
> I have experienced problems with indexes on varchar columns causing
> updates to be very slow. You might also try doing a REINDEX TABLE
> command on the relevant tables. I have found this to give a
> performance increase for queries/updates, but generally still find
> character indexes to drag down insert/update performance.

Coupla questions: did you build with --enable-locale?  How about
--enable-multibyte?  If yes to either, which locale and encoding
settings are you using exactly?

Also, what platforms are involved here?  The known problems with
strcoll() in early 2.2.* glibc releases are in the back of my mind
here, though the symptoms we have seen from that bug ran in the
"backend dumps core" line rather than "backend is slow" line.

Finally, do your indexed varchar columns tend to have lots and
lots of repeated values?

            regards, tom lane



Re: indexes greatly slowing data entry

From
"Eric"
Date:
We are having the exact same problem on PostgreSQL 7.1.3.  In our situation,
a VACUUM ANALYZE fixes the problem!
There is hardly any hard disk activity (P3 933 MHz server with 512 MB RAM,
10,000 RPM SCSI HDD).

I have increased the shared_buffers setting in postgresql.conf and changed
commits to be asynchronous (if I recall correctly).
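For reference, the relevant 7.1-era settings in postgresql.conf look roughly like this (the values are placeholders, and disabling fsync risks losing committed transactions on a crash):

```
shared_buffers = 2048    # shared buffer cache, 8 kB per buffer
fsync = false            # skip fsync at commit; faster but unsafe on crash
```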

We are importing anywhere from 10,000 - 100,000 records each day and after
this import, performance takes a major nose dive on us.

And because this data gets imported during normal business hours, we cannot
tie up the server with a vacuum.  I understand 7.2 has a better vacuum, but
that doesn't seem to solve our problem because we need to be able to use the
database during the 30 minutes it takes to import the data.  We notice the
performance drop after about 5,000 records.

There are approximately 25 users on the database at any given time.  The
problem seems slightly amplified as the number of logged-in users goes from 1 to 25.

Any ideas?!?!?!

Thanks, Eric.

""DE Sesa, Michael C."" <michael.desesa@exeloncorp.com> wrote in message
news:AFAB5F8466DAD411B586000255B03E1416CCC1@mobntxch12.peco.com...
> I'm having a problem with performance and indexing seems to be the cause.
> [...]



Re: indexes greatly slowing data entry

From
nconway@klamath.dyndns.org (Neil Conway)
Date:
On Tue, Jun 25, 2002 at 04:16:05PM -0400, Eric wrote:
> We are importing anywhere from 10,000 - 100,000 records each day and after
> this import, performance takes a major nose dive on us.
>
> And, because this data gets imported during normal business hours we can not
> tie up the server with a vacuum.  I understand 7.2 has a better vacuum but
> this doesn't seem to solve our problem because we need to be able to use the
> database during the 30 minutes it takes to import the data.

30 minutes for 100,000 records seems really high. How are you
importing the data? Are you using COPY? If you're using INSERTs,
are you putting multiple INSERTs inside a single transaction?
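For example (the table name and file path are hypothetical, and the COPY syntax shown is the 7.x form):

```sql
-- Bulk load in a single command; much faster than row-at-a-time INSERTs:
COPY readings FROM '/tmp/import.txt' USING DELIMITERS '|';

-- If INSERTs are unavoidable, wrap them in one transaction so each
-- row does not pay for its own commit:
BEGIN;
INSERT INTO readings VALUES ('TAG000001', '2002-06-25 16:00:00');
INSERT INTO readings VALUES ('TAG000002', '2002-06-25 16:00:05');
-- ... remaining rows ...
COMMIT;
```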

> There are approximately 25 users on the database at any given time.  Problem
> seems slightly amplified when users logged goes from 1 to 25.

Upgrading to 7.2 would help the VACUUM performance, at any rate.

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC



how to rename an index

From
"David Blood"
Date:
I cannot find how to rename an index. This should be easy, no?

David Blood
Matraex, Inc





Re: how to rename an index

From
Martijn van Oosterhout
Date:
On Thu, Jul 11, 2002 at 04:02:01PM -0600, David Blood wrote:
> I cannot  find how to rename an index? This should be easy no?

The same way you rename a table. Just use the index name as the table name.
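For example, with hypothetical names (there is no separate ALTER INDEX command in this era; ALTER TABLE ... RENAME accepts an index name):

```sql
ALTER TABLE idx_orders_custid RENAME TO idx_orders_customer_id;
```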

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> There are 10 kinds of people in the world, those that can do binary
> arithmetic and those that can't.