Thread: slow full table update
Hi,

I have a table with about 60,000 rows, and when I run a query such as:

UPDATE table SET column = 0;

after 10 minutes I have to stop the query because it is still running :(

I have Postgres 8.1 with all default settings in postgresql.conf.

Where is the problem?

Thank you for any tips.

best regards
Marek Fiala
Sorry, but you have to provide much more information about the table. The information you've provided is really not sufficient - the rows might be large or small. I guess it's the latter, combined with a lot of dead rows. Try this:

ANALYZE table;
SELECT relpages, reltuples FROM pg_class WHERE relname = 'table';

Anyway, is autovacuum running? What are its parameters? Try executing

VACUUM table;

and then run the two commands above again. That might 'clean' the table and improve the update performance. Don't forget that each such UPDATE actually creates a copy of all the modified rows (that's how PostgreSQL works), so if you don't run VACUUM periodically or the autovacuum daemon, the table will bloat (occupy much more disk space than it should).

If that does not help, try to determine whether the UPDATE is CPU or disk bound. I'd guess there is an I/O bottleneck (due to the bloating).
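For example, something like this should show whether autovacuum is enabled and what its parameters are (a generic pg_settings query; nothing here is specific to your setup):

-- List the autovacuum configuration (pg_settings is available in 8.1)
SELECT name, setting
FROM pg_settings
WHERE name LIKE 'autovacuum%';

regards
Tomas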
Hi,

thank you for your reply. Here is some additional information:

The problem occurs on every table, with small and large rows alike. Autovacuum is running.

relpages  reltuples
6213      54743

The tables are almost write-only. Munin graphs show that the problem is an I/O bottleneck.

I found out that updating 100 rows takes 0.3 s, but updating 1000 rows takes 50 s.

Is this better information?

Thanks for any help.

best regards
Marek Fiala
Hi,

so the table occupies about 50 MB, i.e. each row has about 1 kB, right? Updating 1000 rows should mean about 1 MB of data to be updated.

There might be a problem with the execution plan of the updates - I guess the 100-row update uses an index scan while the 1000-row update might use a seq scan.

Anyway, the table is not too big, so I wouldn't expect such an I/O bottleneck on a properly tuned system. Have you checked the postgresql.conf settings? What are the values for:

1) shared_buffers - 8kB pages used as a buffer (try to increase this a little, for example to 1000, i.e. 8 MB, or even more)

2) checkpoint_segments - number of 16MB checkpoint segments, aka transaction logs; this usually improves the write / update performance a lot, so try to increase the default value (3) to at least 8

3) wal_buffers - 8kB pages used to store WAL (minimal effect usually, but try to increase it to 16 - 64, just to be sure)

There is a nicely annotated config, with recommendations on how to set the values based on usage etc. See:

http://www.powerpostgresql.com/Downloads/annotated_conf_80.html
http://www.powerpostgresql.com/PerfList
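For illustration only - these are ballpark values, not a tuned recommendation (8.1 takes shared_buffers and wal_buffers as page counts):

# postgresql.conf - illustrative values only
shared_buffers = 1000        # 8kB pages, i.e. ~8 MB; raise further if RAM allows
checkpoint_segments = 8      # 16MB WAL segments (default 3)
wal_buffers = 16             # 8kB pages (default 8)

Note that shared_buffers and wal_buffers require a server restart; checkpoint_segments only needs a reload.

regards
Tomas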
Hi,

I've changed the settings, but with no effect on speed.

I tried to EXPLAIN ANALYZE the query; this is the result for 10,000 rows:

update songs set views = 0 where sid > 20000 and sid < 30000

Bitmap Heap Scan on songs  (cost=151.59..6814.29 rows=8931 width=526) (actual time=4.848..167.855 rows=8945 loops=1)
  Recheck Cond: ((sid > 20000) AND (sid < 30000))
  ->  Bitmap Index Scan on pk_songs2  (cost=0.00..151.59 rows=8931 width=0) (actual time=4.071..4.071 rows=9579 loops=1)
        Index Cond: ((sid > 20000) AND (sid < 30000))

Is there a way to run this query in a single pass, with no Recheck Cond?

Thank you.

best regards
Marek Fiala
firerox@centrum.cz wrote:
> Bitmap Heap Scan on songs  (cost=151.59..6814.29 rows=8931 width=526) (actual time=4.848..167.855 rows=8945 loops=1)

This says it is taking 167 milliseconds, not 10 minutes as your first message said. Is this query actually slow?

> Is there a way to run this query in a single pass, with no Recheck Cond?

Only a sequential scan.
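If you want to compare, you can disable bitmap scans for one transaction and re-run the test - a standard planner toggle, shown here purely as an experiment (note that EXPLAIN ANALYZE really executes the UPDATE, hence the rollback):

BEGIN;
SET LOCAL enable_bitmapscan = off;  -- steer the planner away from bitmap scans
EXPLAIN ANALYZE
UPDATE songs SET views = 0 WHERE sid > 20000 AND sid < 30000;
ROLLBACK;  -- undo the update performed by EXPLAIN ANALYZE

--
Richard Huxton
Archonet Ltd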
> Is there a way to run this query in a single pass, with no Recheck Cond?
"Recheck Cond" is somewhat misleading here.
A Bitmap Index Scan has practically zero "recheck" overhead as long as the whole bitmap fits in work_mem. That means a bitmap scan degrades only when the number of rows in the table (not the number of rows returned) exceeds work_mem * 1024 * 8, i.e. one bit per row. A bitmap scan over 60,000 rows requires 60,000 / 8 = 7,500 bytes ~ 8 kB of memory to run without additional rechecks, so I do not believe it hurts you in this particular case.
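A quick sanity check (8.1 reports this value in kB):

SHOW work_mem;        -- default is 1024, i.e. 1 MB - far more than the ~8 kB needed here
-- Only for much larger tables would raising it be worth trying, e.g.:
SET work_mem = 8192;  -- 8 MB, for the current session only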
Regards,
Vladimir Sitnikov
Richard Huxton <dev@archonet.com> writes:
> This says it is taking 167 milliseconds, not 10 minutes as your first
> message said. Is this query actually slow?

The explain plan tree only shows the time to fetch/compute the new rows, not to actually perform the update, update indexes, or fire triggers. If there is a big discrepancy then the extra time must be going into one of those steps.

8.1 does show trigger execution time separately, so the most obvious problem (unindexed foreign key reference) seems to be excluded, unless the OP just snipped that part of the output ...
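For example, EXPLAIN ANALYZE on the UPDATE itself reports a "Total runtime" that does include the heap and index updates, plus per-trigger times on 8.1 (and it really performs the update, so roll it back if that matters):

BEGIN;
EXPLAIN ANALYZE
UPDATE songs SET views = 0 WHERE sid > 20000 AND sid < 30000;
-- compare "Total runtime" and any trigger lines with the plan-node times
ROLLBACK;

			regards, tom lane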
On Mon, Nov 10, 2008 at 9:30 AM, <firerox@centrum.cz> wrote:
> I have a table with about 60,000 rows, and when I run a query such as:
> UPDATE table SET column = 0;
> after 10 minutes I have to stop the query because it is still running :(

What does

VACUUM VERBOSE table;

say? I'm wondering if it's gotten overly bloated. How long does

SELECT count(*) FROM table;

take to run? (Use \timing to time it.)
hi,

select count(*) from songs;
 count
-------
 54909
(1 row)
Time: 58.182 ms

update songs set views = 0;
UPDATE 54909
Time: 101907.837 ms

The time is actually less than 10 minutes, but it is still very long :(

VACUUM VERBOSE songs; said:

INFO:  vacuuming "public.songs"
INFO:  index "pk_songs2" now contains 54909 row versions in 595 pages
DETAIL:  193 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.06 sec.
INFO:  index "fk_albums_aid_index" now contains 54909 row versions in 1330 pages
DETAIL:  193 index row versions were removed.
812 index pages have been deleted, 812 are currently reusable.
CPU 0.01s/0.00u sec elapsed 0.04 sec.
INFO:  index "fk_artists_artid_index" now contains 54910 row versions in 628 pages
DETAIL:  193 index row versions were removed.
114 index pages have been deleted, 114 are currently reusable.
CPU 0.01s/0.00u sec elapsed 0.10 sec.
INFO:  index "fk_users_uid_karaoke_index" now contains 54910 row versions in 2352 pages
DETAIL:  193 index row versions were removed.
2004 index pages have been deleted, 2004 are currently reusable.
CPU 0.01s/0.00u sec elapsed 0.95 sec.
INFO:  index "datum_tag_indx" now contains 54910 row versions in 2083 pages
DETAIL:  193 index row versions were removed.
1728 index pages have been deleted, 1728 are currently reusable.
CPU 0.01s/0.00u sec elapsed 0.47 sec.
INFO:  index "datum_video_indx" now contains 54910 row versions in 1261 pages
DETAIL:  193 index row versions were removed.
826 index pages have been deleted, 826 are currently reusable.
CPU 0.01s/0.00u sec elapsed 0.06 sec.
INFO:  "songs": removed 193 row versions in 164 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  "songs": found 193 removable, 54909 nonremovable row versions in 6213 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 132969 unused item pointers.
0 pages are entirely empty.
CPU 0.07s/0.04u sec elapsed 1.74 sec.
INFO:  vacuuming "pg_toast.pg_toast_28178"
INFO:  index "pg_toast_28178_index" now contains 2700 row versions in 13 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_28178": found 0 removable, 2700 nonremovable row versions in 645 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 88 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM
Time: 1750.460 ms

best regards
Marek Fiala
This is the critical point. You have this line:

There were 132969 unused item pointers.

which says there are 132k or so dead row slots in your table, which means vacuum / autovacuum isn't keeping up. Did you start and then stop the update several times? Each time it starts and then gets killed, it creates dead rows. Try doing a VACUUM FULL followed by a REINDEX, or a CLUSTER, on this table and see if that helps.
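For example (pk_songs2 is just the obvious ordering index from your VACUUM output; note all of these take exclusive locks on the table):

VACUUM FULL songs;
REINDEX TABLE songs;

-- or, in one step: rewrite the table in index order and rebuild its indexes
CLUSTER pk_songs2 ON songs;  -- 8.1 syntax: CLUSTER indexname ON tablename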
> This is the critical point. You have this line:
>
> There were 132969 unused item pointers.
>
> which says there are 132k or so dead row slots in your table, which
> means vacuum / autovacuum isn't keeping up. Did you start and then stop
> the update several times? Each time it starts and then gets killed, it
> creates dead rows.

Try running just ANALYZE on the table and then execute

SELECT relpages, reltuples FROM pg_class WHERE relname = 'table';

again. I'd expect it to report about 20k pages, i.e. 160 MB. That might slow things down ;-)

> Try doing a VACUUM FULL followed by a REINDEX, or a CLUSTER, on this
> table and see if that helps.

Well, maybe the vacuum will fix the problem - have you executed the query that took 167 ms (according to the explain analyze output you posted) on a clean table? But I doubt the growth from 6,000 to 20,000 pages alone could cause degradation from 170 ms to several minutes ...

regards
Tomas
> The explain plan tree only shows the time to fetch/compute the new rows,
> not to actually perform the update, update indexes, or fire triggers.
> If there is a big discrepancy then the extra time must be going into
> one of those steps.

Yeah, that's quite a frequent problem with updates. Try to isolate it: create a copy of the table, i.e.

CREATE TABLE test_table AS SELECT * FROM table;

and try to execute the query on that. Which tables reference the original table via a foreign key? Do they have indexes on the foreign key column? How large are the referencing tables? Are those tables updated heavily and vacuumed properly (i.e. aren't they bloated with dead rows)?

I'm not sure whether the FK constraints are always checked, or only when the referenced column is updated. I guess the FK check is performed only on DELETE or when the value in the referenced column is modified (and updates of the primary key are not very frequent, I guess).

Are there any triggers and / or rules on the table?
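To find the referencing tables, a generic catalog query like this should work (assuming the table is called 'songs', as in your earlier output):

-- List FK constraints that reference 'songs', so each referencing
-- column can be checked for an index
SELECT conrelid::regclass AS referencing_table,
       conname,
       pg_get_constraintdef(oid) AS definition
FROM pg_constraint
WHERE confrelid = 'songs'::regclass
  AND contype = 'f';

regards
Tomas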
> update songs set views = 0;
> UPDATE 54909
> Time: 101907.837 ms
> The time is actually less than 10 minutes, but it is still very long :(

Wow.

test=> CREATE TABLE test (id SERIAL PRIMARY KEY, value INTEGER);
test=> INSERT INTO test (value) SELECT n FROM generate_series( 1,100000 ) AS n;
Time: 1706.495 ms
test=> UPDATE test SET value=0;
Time: 1972.420 ms

Note this is 8.3.3 on a desktop PC with the database and xlog on a Linux software RAID1 of rather slow drives (about 50 MB/s). Anyway, your 10 minutes are really wrong.

The first thing to check is whether there is a problem with your I/O subsystem: try the example queries above; you should get timings in the same ballpark. If you are 10x slower than that, you have a problem.

Are the rows large? I would believe so, because a "songs" table will probably contain things like artist, title, comments, and lots of other information in strings that are too small to be TOASTed.

Perhaps your problem is in index updates, too. So, make a copy of the songs table, with no indexes and no foreign keys:

CREATE TABLE songs2 AS SELECT * FROM songs;

Then try your UPDATE on this. How slow is it? Now drop the table and recreate it with the foreign keys; test the update again. Then drop it and recreate it with the foreign keys and the indexes; test the update again. This will give you some meaningful information.

You will probably update the 'views' column quite often; it may even be the most frequently updated column in your application. In that case, you could try moving it to a separate table with just (song_id, views) - that way you would be updating a very small table.
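A hypothetical sketch of that split - the table and column layout is made up, and sid is assumed to be the primary key of songs (as the earlier plans suggest):

-- Narrow side table holding only the frequently updated counter
CREATE TABLE song_views (
    sid   INTEGER PRIMARY KEY REFERENCES songs (sid),
    views INTEGER NOT NULL DEFAULT 0
);

INSERT INTO song_views (sid, views)
SELECT sid, views FROM songs;

-- The full reset now rewrites only small (sid, views) rows:
UPDATE song_views SET views = 0;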