Thread: Slow update SQL
I'm experiencing a very slow query. The table contains 611,564 rows of data. I vaccumed the table:<br /><br /> VACUUM ANALYZEncccr10;<br /><br /> SELECT count(*) FROM ncccr10;<br /> count<br /> --------<br /> 611564<br /> (1 row)<br /><br/> When I try to analyze the query plan with:<br /><br /> EXPLAIN ANALYZE<br /> UPDATE ncccr10<br /> SET key = facilityno||'-'||<br/> lastname||'-'||<br /> sex||'-'||<br /> ssno||'-'||<br /> birthdate||'-'||<br /> primarysit||'-'||<br/> dxdate||'-'||<br /> morphology3<br /> WHERE date_part('year',dxdate) > '2000';<br /><br /> Thequery just never finishes (even 1 hour later). The colum key100 is indexed, and I'm setting the value of this<br /> columnfrom other columns. Why is this so slow? <br /><br />
On Mon, Feb 13, 2006 at 05:48:45PM -0800, Ken Hill wrote: > When I try to analyze the query plan with: > > EXPLAIN ANALYZE > UPDATE ncccr10 > SET key = facilityno||'-'|| > lastname||'-'|| > sex||'-'|| > ssno||'-'|| > birthdate||'-'|| > primarysit||'-'|| > dxdate||'-'|| > morphology3 > WHERE date_part('year',dxdate) > '2000'; > > The query just never finishes (even 1 hour later). The colum key100 is > indexed, and I'm setting the value of this > column from other columns. Why is this so slow? If EXPLAIN ANALYZE is taking too long then could we at least see the EXPLAIN output? How many rows does the condition match? SELECT count(*) FROM ncccr10 WHERE date_part('year',dxdate) > '2000'; Do you have an expression index on date_part('year',dxdate)? Does the table have any triggers or rules? Have you queried pg_locks to see if the update is blocked on an ungranted lock? Do other tables have foreign key references to ncccr10? If so then you might need indexes on the referring columns. What version of PostgreSQL are you running? -- Michael Fuhr
[Please copy the mailing list on replies.] On Mon, Feb 13, 2006 at 06:48:06PM -0800, Ken Hill wrote: > On Mon, 2006-02-13 at 19:14 -0700, Michael Fuhr wrote: > > How many rows does the condition match? > > csalgorithm=# SELECT count(*) FROM ncccr10 WHERE > date_part('year',dxdate) > '2000'; > count > -------- > 199209 > (1 row) You're updating about a third of the table; an expression index on date_part probably wouldn't help because the planner is likely to stick with a sequential scan for such a large update. Even if it did help it's likely to be a small fraction of the total time. The table definition you sent me showed nine indexes. You might see a substantial performance improvement by dropping all the indexes, doing the update, then creating the indexes again (don't forget to vacuum and analyze the table after the update). However, dropping the indexes has obvious implications for other queries so you might need to do the update at a time when that doesn't matter. > > Have you queried pg_locks > > to see if the update is blocked on an ungranted lock? > > I don't know what that is. How do I query pg_locks? SELECT * FROM pg_locks; http://www.postgresql.org/docs/7.4/static/monitoring-locks.html > > What version of PostgreSQL are you running? > > 7.4.8. Thank you for your help. Newer versions generally perform better; consider upgrading to 8.0 or 8.1 if possible. -- Michael Fuhr
On Mon, 2006-02-13 at 22:17 -0700, Michael Fuhr wrote: <blockquote type="CITE"><pre> <font color="#000000">[Please copy the mailing list on replies.]</font> <font color="#000000">On Mon, Feb 13, 2006 at 06:48:06PM -0800, Ken Hill wrote:</font> <font color="#000000">> On Mon, 2006-02-13 at 19:14 -0700, Michael Fuhr wrote:</font> <font color="#000000">> > How many rows does the condition match?</font> <font color="#000000">></font> <font color="#000000">> csalgorithm=# SELECT count(*) FROM ncccr10 WHERE</font> <font color="#000000">> date_part('year',dxdate) > '2000';</font> <font color="#000000">> count</font> <font color="#000000">> --------</font> <font color="#000000">> 199209</font> <font color="#000000">> (1 row)</font> <font color="#000000">You're updating about a third of the table; an expression index on</font> <font color="#000000">date_part probably wouldn't help because the planner is likely to</font> <font color="#000000">stick with a sequential scan for such a large update. Even if it</font> <font color="#000000">did help it's likely to be a small fraction of the total time.</font> <font color="#000000">The table definition you sent me showed nine indexes. You might</font> <font color="#000000">see a substantial performance improvement by dropping all the</font> <font color="#000000">indexes, doing the update, then creating the indexes again (don't</font> <font color="#000000">forget to vacuum and analyze the table after the update). However,</font> <font color="#000000">dropping the indexes has obvious implications for other queries so</font> <font color="#000000">you might need to do the update at a time when that doesn't matter.</font> <font color="#000000">> > Have you queried pg_locks</font> <font color="#000000">> > to see if the update is blocked on an ungranted lock?</font> <font color="#000000">> </font> <font color="#000000">> I don't know what that is. How do I query pg_locks?</font> <font color="#000000">SELECT * FROM pg_locks;</font> <font color="#000000"><a href="http://www.postgresql.org/docs/7.4/static/monitoring-locks.html">http://www.postgresql.org/docs/7.4/static/monitoring-locks.html</a></font> <font color="#000000">> > What version of PostgreSQL are you running?</font> <font color="#000000">> </font> <font color="#000000">> 7.4.8. Thank you for your help.</font> <font color="#000000">Newer versions generally perform better; consider upgrading to 8.0</font> <font color="#000000">or 8.1 if possible.</font> </pre></blockquote> Removing the indexes, running the update SQL, and then adding back the indexes worked much faster. Thankyou for you help.
Hi, Ken, Ken Hill wrote: > Removing the indexes, running the update SQL, and then adding back the > indexes worked much faster. Thank you for you help. It might be a good idea to run VACUUM FULL between updating and reindexing. If you want to CLUSTER on an index, it will be best to create this index first, then CLUSTER the table, and then recreate the other indices. HTH, Markus -- Markus Schaber | Logical Tracking&Tracing International AG Dipl. Inf. | Software Development GIS Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
bungsu,<br /><br /> That does make the query work a bit faster. Thanks!<br /><br /> On Mon, 2006-02-27 at 09:42 +0700, BungsuputraLinan wrote: <blockquote type="CITE"><pre> <font color="#000000">Hi Ken,</font> <font color="#000000">I used to have the same problem. In my computer, using date_part in WHERE</font> <font color="#000000">clause will always slow down the system when the table has plenty of rows.</font> <font color="#000000">My suggestion is try to change the query to:</font> <font color="#000000">... WHERE dxdate >= '2001-01-01';</font> <font color="#000000">I hope this helps.</font> <font color="#000000">Regards,</font> <font color="#000000">bungsu</font> <font color="#000000">----- Original Message -----</font> <font color="#000000">From: Ken Hill</font> <font color="#000000">To: Postgres Help</font> <font color="#000000">Sent: Tuesday, February 14, 2006 8:48 AM</font> <font color="#000000">Subject: [SQL] Slow update SQL</font> <font color="#000000">I'm experiencing a very slow query. The table contains 611,564 rows of data.</font> <font color="#000000">I vaccumed the table:</font> <font color="#000000">VACUUM ANALYZE ncccr10;</font> <font color="#000000">SELECT count(*) FROM ncccr10;</font> <font color="#000000">count</font> <font color="#000000">--------</font> <font color="#000000">611564</font> <font color="#000000">(1 row)</font> <font color="#000000">When I try to analyze the query plan with:</font> <font color="#000000">EXPLAIN ANALYZE</font> <font color="#000000">UPDATE ncccr10</font> <font color="#000000">SET key = facilityno||'-'||</font> <font color="#000000">lastname||'-'||</font> <font color="#000000">sex||'-'||</font> <font color="#000000">ssno||'-'||</font> <font color="#000000">birthdate||'-'||</font> <font color="#000000">primarysit||'-'||</font> <font color="#000000">dxdate||'-'||</font> <font color="#000000">morphology3</font> <font color="#000000">WHERE date_part('year',dxdate) > '2000';</font> <font color="#000000">The query just never finishes (even 1 hour later). The colum key100 is</font> <font color="#000000">indexed, and I'm setting the value of this</font> <font color="#000000">column from other columns. Why is this so slow?</font> </pre></blockquote>
Hi Ken, I used to have the same problem. In my computer, using date_part in WHERE clause will always slow down the system when the table has plenty of rows. My suggestion is try to change the query to: ... WHERE dxdate >= '2001-01-01'; I hope this helps. Regards, bungsu ----- Original Message ----- From: Ken Hill To: Postgres Help Sent: Tuesday, February 14, 2006 8:48 AM Subject: [SQL] Slow update SQL I'm experiencing a very slow query. The table contains 611,564 rows of data. I vaccumed the table: VACUUM ANALYZE ncccr10; SELECT count(*) FROM ncccr10; count -------- 611564 (1 row) When I try to analyze the query plan with: EXPLAIN ANALYZE UPDATE ncccr10 SET key = facilityno||'-'|| lastname||'-'|| sex||'-'|| ssno||'-'|| birthdate||'-'|| primarysit||'-'|| dxdate||'-'|| morphology3 WHERE date_part('year',dxdate) > '2000'; The query just never finishes (even 1 hour later). The colum key100 is indexed, and I'm setting the value of this column from other columns. Why is this so slow?
<p><font size="2"><font face="Arial">You can try this too:<br /><br />...<br />WHERE EXTRACT(YEAR FROM dxdate::Date) >2000</font></font><p><font size="2"><font face="Arial"><br /><br />Mauricio Fernández A.<br />Ingeniero de Sistemas<br/>Universidad Autónoma de Manizales (Colombia)</font></font><p><font size="2"><font face="Arial"></font><br /><br/>-----Mensaje original-----<br />De: pgsql-sql-owner@postgresql.org<br />[<a href="mailto:pgsql-sql-owner@postgresql.org">mailto:pgsql-sql-owner@postgresql.org</a>]Ennombre de Bungsuputra Linan<br />Enviadoel: lunes, 27 febrero, 2006 3:42<br />Para: ken@scottshill.com; Postgres Help<br />Asunto: Re: [SQL] Slow updateSQL<br /><br /><br />Hi Ken,<br /><br />I used to have the same problem. In my computer, using date_part in WHERE<br/>clause will always slow down the system when the table has plenty of rows.<br /><br />My suggestion is try to changethe query to:<br />... WHERE dxdate >= '2001-01-01';<br /><br />I hope this helps.<br /><br />Regards,<br />bungsu<br/><br />----- Original Message -----<br />From: Ken Hill<br />To: Postgres Help<br />Sent: Tuesday, February 14,2006 8:48 AM<br />Subject: [SQL] Slow update SQL<br /><br /><br />I'm experiencing a very slow query. The table contains611,564 rows of data.<br />I vaccumed the table:<br /><br />VACUUM ANALYZE ncccr10;<br /><br />SELECT count(*) FROMncccr10;<br />count<br />--------<br />611564<br />(1 row)<br /><br />When I try to analyze the query plan with:<br /><br/>EXPLAIN ANALYZE<br />UPDATE ncccr10<br />SET key = facilityno||'-'||<br />lastname||'-'||<br />sex||'-'||<br />ssno||'-'||<br/>birthdate||'-'||<br />primarysit||'-'||<br />dxdate||'-'||<br />morphology3<br />WHERE date_part('year',dxdate)> '2000';<br /><br />The query just never finishes (even 1 hour later). The colum key100 is<br/>indexed, and I'm setting the value of this<br />column from other columns. Why is this so slow?<br /><br /><br />---------------------------(endof broadcast)---------------------------<br />TIP 6: explain analyze is your friend<br /></font>