Thread: updates (postgreSQL) very slow

updates (postgreSQL) very slow

From
"Bobbie van der Westhuizen"
Date:
Hi,

Can someone please help me. My PostgreSQL queries
are very slow, especially update statements. What
can I do to improve the speed? I have already tried
VACUUM and ANALYZE. From the command line I have run

    vacuumdb -z -a -f

and from inside psql:

    database=# VACUUM FULL ANALYZE;

I work on a Linux (Mandrake 9.1) computer with two Xeon
processors and 4 GB of RAM. The disk on which
PostgreSQL runs is an 80 GB drive that reads at
55 Mbit/sec. The following query is an example. The table
I work with here contains 747 524 records and 14
fields (columns).


bons_acc=# explain update edc_ww set edc=null;
                           QUERY PLAN
----------------------------------------------------------------
 Seq Scan on edc_ww  (cost=0.00..156793.91 rows=3491 width=184)
(1 row)

Time: 0.61 ms



bons_acc=# update edc_ww set edc=null;

UPDATE 747524
Time: 7628686.23 ms

This is just a Seq Scan where a numeric field must be updated to
NULL, but as you can see this "simple" query takes
forever (7628686.23 ms is over 2 hours just to update
747524 records!). I don't think the table is too big? Could it
be my hardware/software/PostgreSQL configuration? What can I do to
optimise PostgreSQL? I have already increased the shared buffers in
the conf file as well.

Bobbie
__________________________________________________________
Bobbie van der Westhuizen
Quantitative Animal Breeding (BLUP)
ARC - Animal Improvement Institute
+27-12-672-9128 (o/h)
+27-12-665-1419 (fax)
bobbie@irene.agric.za
__________________________________________________________


Re: updates (postgreSQL) very slow

From
"Fred Moyer"
Date:
> Can someone please help me. My PostgreSQL queries
> are very slow, especially update statements. What
> can I do to improve the speed? I have already tried
> VACUUM and ANALYZE. From the command line I have run
> vacuumdb -z -a -f, and from inside psql:
> database=# VACUUM FULL ANALYZE;

Good start to taking performance measurements - once the planner's
statistics are updated you will be able to get accurate analyses.

> I work on a Linux (Mandrake 9.1) computer with two Xeon
> processors and 4 GB of RAM. The disk on which
> PostgreSQL runs is an 80 GB drive that reads at
> 55 Mbit/sec. The following query is an example. The table
> I work with here contains 747 524 records and 14
> fields (columns).

So you have some decent-sized hardware here, but you may want a second
drive for backups or for housing the log files.  Also, your database is not
exceptionally large - PostgreSQL can handle much larger.

> bons_acc=# explain update edc_ww set edc=null;
>  Seq Scan on edc_ww  (cost=0.00..156793.91 rows=3491 width=184)
> (1 row)
> Time: 0.61 ms
>
> bons_acc=# update edc_ww set edc=null;
>
> UPDATE 747524
> Time: 7628686.23 ms
>
> This is just a Seq Scan where a numeric field must be updated to
> NULL, but as you can see this "simple" query takes
> forever (7628686.23 ms is over 2 hours just to update
> 747524 records!). I don't think the table is too big? Could it
> be my hardware/software/PostgreSQL configuration? What can I do to
> optimise PostgreSQL? I have already increased the shared buffers in
> the conf file as well.

To improve the performance of large updates such as this query, you will
need to adjust some other parameters in postgresql.conf, such as increasing
the number of checkpoint segments and setting logging to a minimum level.
Alternatively, to accomplish this particular update quickly, you can drop the
edc column and re-add it, letting its default of null apply to every row.
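A rough sketch of that drop-and-re-add approach (table and column names are
from your post; I'm assuming the column type is numeric, and you should
first check for views or indexes that depend on the column):

```sql
-- Adding a column with no default only touches the system catalogs,
-- so every existing row reads back as NULL without being rewritten.
BEGIN;
ALTER TABLE edc_ww DROP COLUMN edc;
ALTER TABLE edc_ww ADD COLUMN edc numeric;
COMMIT;
```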

However, updating every row to null in a 700k-row table is going to take a
while, and this query is probably not a good test case for judging your
database performance.  Try testing some of your other queries.  Post the
EXPLAIN ANALYZE results of those queries to the
pgsql-performance@postgresql.org list along with a complete copy of your
postgresql.conf file.  I don't think you are getting a good measurement of
your actual database performance with this simple test case where every
row is updated.
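For example, you can time a query without keeping its effects - note that
EXPLAIN ANALYZE actually executes the statement, so wrap data-modifying
queries in a transaction you roll back:

```sql
BEGIN;
EXPLAIN ANALYZE UPDATE edc_ww SET edc = NULL;  -- runs the update for real
ROLLBACK;  -- discard the changes after measuring
```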

Regards,

Fred

Re: updates (postgreSQL) very slow

From
Tom Lane
Date:
"Fred Moyer" <fred@redhotpenguin.com> writes:
>> This is just a Seq Scan where a numeric field must be updated to
>> NULL, but as you can see this "simple" query takes
>> forever (7628686.23 ms is over 2 hours just to update
>> 747524 records!).

> However updating every row to null with 700k rows is going to take a while

A while, sure, but 2 hours seems excessive to me too.  I'm betting that
there are triggers or foreign keys on the table being updated, and that
that's where the time is going.  It might be possible to improve that,
but Bobbie hasn't given us enough information.

Another thing that jumps out at me is that this table hasn't been
vacuumed or analyzed recently.  The planner thinks there are 3491 rows
when really there are 747524.  That's a bit of a big difference.  It
won't matter for the UPDATE itself --- a seqscan is a seqscan --- but
it might matter for planning foreign-key queries.
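A quick way to check for triggers on the table (foreign-key constraints are
implemented as triggers in this release, so they show up here too; psql's
\d edc_ww displays the same information) - a sketch against the system
catalogs:

```sql
SELECT tgname, tgisconstraint
FROM pg_trigger
WHERE tgrelid = 'edc_ww'::regclass;
-- Rows with tgisconstraint = true are RI (foreign-key) triggers.
```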

            regards, tom lane

Re: updates (postgreSQL) very slow

From
"scott.marlowe"
Date:
Have you run this update query again and again without vacuuming in between?

http://www.postgresql.org/docs/7.4/static/maintenance.html#ROUTINE-VACUUMING

If so, you might have millions and millions of dead tuples taking up
space and slowing things down.  If you're running 7.4, install the
pg_autovacuum daemon (in contrib) and turn it on.  It's a nice little
program that should mostly take care of this issue for you.
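One way to see how much dead space the table is carrying (the exact wording
of the output varies between versions):

```sql
VACUUM VERBOSE edc_ww;
-- Look at the "removed N row versions" and "nonremovable row versions"
-- lines in the output to gauge how bloated the table has become.
```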

Got any foreign keys on that field?  Triggers?




Re: updates (postgreSQL) very slow

From
Fred Moyer
Date:
On Wed, 2004-03-10 at 15:30, Tom Lane wrote:
> "Fred Moyer" <fred@redhotpenguin.com> writes:
> >> This is just a Seq Scan where a numeric field must be updated to
> >> NULL, but as you can see this "simple" query takes
> >> forever (7628686.23 ms is over 2 hours just to update
> >> 747524 records!).
>
> > However updating every row to null with 700k rows is going to take a while
>
> A while, sure, but 2 hours seems excessive to me too.  I'm betting that
> there are triggers or foreign keys on the table being updated, and that
> that's where the time is going.  It might be possible to improve that,
> but Bobbie hasn't given us enough information.

If there are no foreign keys or triggers, and updating each row takes
one drive seek (approximately 9 ms with the 80 GB IDE drive being used
here), then 747524 seeks will take 6727716 ms, about 10% less than
the time of 7628686 ms for the update above.  Is this an accurate
estimate, or are these numbers just coincidence?  It seems like this could
represent the least efficient update scenario.


Re: updates (postgreSQL) very slow

From
Tom Lane
Date:
Fred Moyer <fred@redhotpenguin.com> writes:
> On Wed, 2004-03-10 at 15:30, Tom Lane wrote:
>> A while, sure, but 2 hours seems excessive to me too.

> If there are no foreign keys or triggers and updating each row is taking
> one drive seek ( approximately 9 ms with the 80 gig IDE drive being used
> here ) then to do 747524 seeks will take 6727716 ms, about 10% less than
> the time of 7628686 ms for the update above.  Is this an accurate
> estimate or are these numbers just coincidence?

Probably coincidence.  There's no reason to think that a large UPDATE
would expend one disk seek per updated row on average --- there's enough
buffering between the UPDATE and the drive heads that under normal
circumstances the cost should be lots less.

If I had to bet at this point I'd bet on inefficient foreign-key checks,
but since we haven't seen any schema details that's purely speculation.

            regards, tom lane

Re: updates (postgreSQL) very slow

From
"Bobbie van der Westhuizen"
Date:
On 11 Mar 2004 at 2:01, Tom Lane wrote:

Fred Moyer <fred@redhotpenguin.com> writes:
> On Wed, 2004-03-10 at 15:30, Tom Lane wrote:
>> A while, sure, but 2 hours seems excessive to me too.

> If there are no foreign keys or triggers and updating each row is taking
> one drive seek ( approximately 9 ms with the 80 gig IDE drive being used
> here ) then to do 747524 seeks will take 6727716 ms, about 10% less than
> the time of 7628686 ms for the update above.  Is this an accurate
> estimate or are these numbers just coincidence?

Probably coincidence.  There's no reason to think that a large UPDATE
would expend one disk seek per updated row on average --- there's enough
buffering between the UPDATE and the drive heads that under normal
circumstances the cost should be lots less.

If I had to bet at this point I'd bet on inefficient foreign-key checks,
but since we haven't seen any schema details that's purely speculation.

            regards, tom lane

There are no foreign keys on this table.  What schema details do you
need? Then I can give them to you. I am a new user of PostgreSQL, so I am
not clued up on all of this stuff yet.
__________________________________________________________
Bobbie van der Westhuizen
Quantitative Animal Breeding (BLUP)
ARC - Animal Improvement Institute
+27-12-672-9128 (o/h)
+27-12-665-1419 (fax)
bobbie@irene.agric.za
__________________________________________________________


Re: updates (postgreSQL) very slow

From
"Joshua D. Drake"
Date:
Hello,

Is this a place where increasing default_statistics_target will help?


Sincerely,

J



Bobbie van der Westhuizen wrote:
> On 11 Mar 2004 at 2:01, Tom Lane wrote:
>
> Fred Moyer <fred@redhotpenguin.com> writes:
>
>>On Wed, 2004-03-10 at 15:30, Tom Lane wrote:
>>
>>>A while, sure, but 2 hours seems excessive to me too.
>
>
>>If there are no foreign keys or triggers and updating each row is taking
>>one drive seek ( approximately 9 ms with the 80 gig IDE drive being used
>>here ) then to do 747524 seeks will take 6727716 ms, about 10% less than
>>the time of 7628686 ms for the update above.  Is this an accurate
>>estimate or are these numbers just coincidence?
>
>
> Probably coincidence.  There's no reason to think that a large UPDATE
> would expend one disk seek per updated row on average --- there's enough
> buffering between the UPDATE and the drive heads that under normal
> circumstances the cost should be lots less.
>
> If I had to bet at this point I'd bet on inefficient foreign-key checks,
> but since we haven't seen any schema details that's purely speculation.
>
>             regards, tom lane
>
> There are no foreign keys on this table.  What schema details do you
> need? Then I can give them to you. I am a new user of PostgreSQL, so I am
> not clued up on all of this stuff yet.



Re: updates (postgreSQL) very slow

From
Tom Lane
Date:
"Bobbie van der Westhuizen" <Bobbie@idpi1.agric.za> writes:
>> If I had to bet at this point I'd bet on inefficient foreign-key checks,
>> but since we haven't seen any schema details that's purely
>> speculation.

> There are no foreign keys on this table.  What schema details do you
> need? Then I can give them to you. I am a new user of PostgreSQL, so I am
> not clued up on all of this stuff yet.

"pg_dump -s" is the easiest way of exhibiting the schema of a database.
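For example (database name taken from your earlier messages; the output
file name is arbitrary - redirect to any file you can post to the list):

```shell
pg_dump -s bons_acc > bons_acc-schema.sql
```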

            regards, tom lane