Thread: slow update but have an index

slow update but have an index

From
Feite Brekeveld
Date:
Hi,

I have a table with no relations to other tables. It has a sequence
number field (integer) and a status field being a char.

There is a unique index on the seqno field.

Now this table has about 80,000 records. I need to update 74,000 status
fields. So I made a dump, and hacked the dump into SQL statements like:

update accounting set status = 'C' where seqno = 1566385;
....
and the other 74,000



This is awfully slow. How come ? The index on the seqno should give
speedy access to the record.

Thanks,

--
Feite Brekeveld
feite.brekeveld@osiris-it.nl
http://www.osiris-it.nl




Re: slow update but have an index

From
Martijn van Oosterhout
Date:
On Fri, Aug 17, 2001 at 01:08:29PM +0200, Feite Brekeveld wrote:
> Now this table has about 80,000 records. I need to update 74,000 status
> fields. So I made a dump, and hacked the dump into SQL statements like:
>
> update accounting set status = 'C' where seqno = 1566385;
> ....
> and the other 74,000
>
> This is awfully slow. How come ? The index on the seqno should give
> speedy access to the record.

Well, an index speeds it up, but that times 80,000 will still take a while.
Is there any trickery or will this work?

update accounting set status = 'C';

If so, that will be much faster.

One sequential scan is faster than 80,000 index scans.

--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/
> It would be nice if someone came up with a certification system that
> actually separated those who can barely regurgitate what they crammed over
> the last few weeks from those who command secret ninja networking powers.

Re: slow update but have an index

From
Feite Brekeveld
Date:
Martijn van Oosterhout wrote:

> On Fri, Aug 17, 2001 at 01:08:29PM +0200, Feite Brekeveld wrote:
> > Now this table has about 80,000 records. I need to update 74,000 status
> > fields. So I made a dump, and hacked the dump into SQL statements like:
> >
> > update accounting set status = 'C' where seqno = 1566385;
> > ....
> > and the other 74,000
> >
> > This is awfully slow. How come ? The index on the seqno should give
> > speedy access to the record.
>
> Well, an index speeds it up, but that times 80,000 will still take a while.
> Is there any trickery or will this work?
>
> update accounting set status = 'C';
>
> If so, that will be much faster.

No that will not work, because they other 6000 need not to be changed. Of
course I could update the this way and change the other 6000 back to their
original status, but the query I issued is so slow that I think something is
wrong.



>
>
> One sequential scan is faster than 80,000 index scans.
>
> --
> Martijn van Oosterhout <kleptog@svana.org>
> http://svana.org/kleptog/
> > It would be nice if someone came up with a certification system that
> > actually separated those who can barely regurgitate what they crammed over
> > the last few weeks from those who command secret ninja networking powers.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html

--
Feite Brekeveld
feite.brekeveld@osiris-it.nl
http://www.osiris-it.nl




Re: slow update but have an index

From
Martijn van Oosterhout
Date:
On Fri, Aug 17, 2001 at 01:40:32PM +0200, Feite Brekeveld wrote:
> > Well, an index speeds it up, but that times 80,000 will still take a while.
> > Is there any trickery or will this work?
> >
> > update accounting set status = 'C';
> >
> > If so, that will be much faster.
>
> No that will not work, because they other 6000 need not to be changed. Of
> course I could update the this way and change the other 6000 back to their
> original status, but the query I issued is so slow that I think something is
> wrong.

Well, there's a bit of an issue here. Each time you do an insert, the table
gets larger, the index gets larger, etc. Disk accesses everywhere. If you
can do it one query then the sequential is much friendlier to disk caches
and the performance will be much more consistant.

Can you codify in an SQL query how you decide which records to change. I've
found the best way to improve performance is to minimise the number of
queries, letting the database do the maximum optimisation possible.

--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/
> It would be nice if someone came up with a certification system that
> actually separated those who can barely regurgitate what they crammed over
> the last few weeks from those who command secret ninja networking powers.

Re: slow update but have an index

From
Feite Brekeveld
Date:
Martijn van Oosterhout wrote:

> On Fri, Aug 17, 2001 at 01:40:32PM +0200, Feite Brekeveld wrote:
> > > Well, an index speeds it up, but that times 80,000 will still take a while.
> > > Is there any trickery or will this work?
> > >
> > > update accounting set status = 'C';
> > >
> > > If so, that will be much faster.
> >
> > No that will not work, because they other 6000 need not to be changed. Of
> > course I could update the this way and change the other 6000 back to their
> > original status, but the query I issued is so slow that I think something is
> > wrong.
>
> Well, there's a bit of an issue here. Each time you do an insert, the table
> gets larger, the index gets larger, etc. Disk accesses everywhere. If you
> can do it one query then the sequential is much friendlier to disk caches
> and the performance will be much more consistant.
>
> Can you codify in an SQL query how you decide which records to change. I've
> found the best way to improve performance is to minimise the number of
> queries, letting the database do the maximum optimisation possible.
>

hacked it with perl into several

update ... where seqno between x and y statements.

That went smoothly.



>
> --
> Martijn van Oosterhout <kleptog@svana.org>
> http://svana.org/kleptog/
> > It would be nice if someone came up with a certification system that
> > actually separated those who can barely regurgitate what they crammed over
> > the last few weeks from those who command secret ninja networking powers.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

--
Feite Brekeveld
feite.brekeveld@osiris-it.nl
http://www.osiris-it.nl




Re: slow update but have an index

From
Tom Lane
Date:
Feite Brekeveld <feite.brekeveld@osiris-it.nl> writes:
> fields. So I made a dump, and hacked the dump into SQL statements like:
> update accounting set status = 'C' where seqno = 1566385;
> ....
> and the other 74,000
> This is awfully slow. How come ? The index on the seqno should give
> speedy access to the record.

I see you've already solved your problem, but for the archives here's
a couple of suggestions:

1. Make sure you actually *are* getting an index scan --- use EXPLAIN
on the query to check.  If not, have you VACUUM ANALYZEd lately?

2. Wrap the series of commands into a single transaction to avoid
per-update transaction overhead:
    BEGIN;
    update ...;
    update ...;
    ...
    COMMIT;
This can save a good deal of disk activity, since each commit forces
fsync.

3. Rethink whether you can't accomplish the same thing in fewer SQL
commands.  The overhead of parsing and planning a query is way more
than the time taken to find and update one single record.  So, the
more work done per command, the better.

            regards, tom lane

Re: slow update but have an index

From
"Jeff Eckermann"
Date:
My experience with bulk inserts/updates is that indexes cause a major
performance hit, because they have to be rebuilt after every insert/update.
Also, each insert/update is wrapped in a transaction by default, so there is
transaction overhead for each one too.
Wrapping all of the statements in a BEGIN; .... COMMIT; would probably solve
both of these problems.  Dropping and recreating the index might help too.
Results will vary according to the situation.

----- Original Message -----
From: "Feite Brekeveld" <feite.brekeveld@osiris-it.nl>
To: "Martijn van Oosterhout" <kleptog@svana.org>
Cc: <pgsql-general@postgresql.org>
Sent: Friday, August 17, 2001 7:18 AM
Subject: Re: [GENERAL] slow update but have an index


> Martijn van Oosterhout wrote:
>
> > On Fri, Aug 17, 2001 at 01:40:32PM +0200, Feite Brekeveld wrote:
> > > > Well, an index speeds it up, but that times 80,000 will still take a
while.
> > > > Is there any trickery or will this work?
> > > >
> > > > update accounting set status = 'C';
> > > >
> > > > If so, that will be much faster.
> > >
> > > No that will not work, because they other 6000 need not to be changed.
Of
> > > course I could update the this way and change the other 6000 back to
their
> > > original status, but the query I issued is so slow that I think
something is
> > > wrong.
> >
> > Well, there's a bit of an issue here. Each time you do an insert, the
table
> > gets larger, the index gets larger, etc. Disk accesses everywhere. If
you
> > can do it one query then the sequential is much friendlier to disk
caches
> > and the performance will be much more consistant.
> >
> > Can you codify in an SQL query how you decide which records to change.
I've
> > found the best way to improve performance is to minimise the number of
> > queries, letting the database do the maximum optimisation possible.
> >
>
> hacked it with perl into several
>
> update ... where seqno between x and y statements.
>
> That went smoothly.
>
>
>
> >
> > --
> > Martijn van Oosterhout <kleptog@svana.org>
> > http://svana.org/kleptog/
> > > It would be nice if someone came up with a certification system that
> > > actually separated those who can barely regurgitate what they crammed
over
> > > the last few weeks from those who command secret ninja networking
powers.
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>
> --
> Feite Brekeveld
> feite.brekeveld@osiris-it.nl
> http://www.osiris-it.nl
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>
>


Re: slow update but have an index

From
Tod McQuillin
Date:
On Fri, 17 Aug 2001, Feite Brekeveld wrote:

> I have a table with no relations to other tables. It has a sequence
> number field (integer) and a status field being a char.
>
> There is a unique index on the seqno field.
>
> Now this table has about 80,000 records. I need to update 74,000 status
> fields. So I made a dump, and hacked the dump into SQL statements like:
>
> update accounting set status = 'C' where seqno = 1566385;
> ...
> and the other 74,000
>
> This is awfully slow. How come ? The index on the seqno should give
> speedy access to the record.

Unless you are putting all these updates in one transaction, each UPDATE
will be in its own separate transaction.  This means you get
per-transaction overhead on every query.

I bet you will find this runs a lot faster if you 'BEGIN TRANSACTION'
before all the updates and 'COMMIT TRANSACTION' at the end.
--
Tod McQuillin