Thread: strategies for dealing with frequently updated tables
Hi all,

I have a relation where a tuple typically undergoes a lifecycle something like:
1) Created
2) Updated maybe thousands of times (no updates to indexed columns though)
3) Rarely or never modified again

The following query takes about 100 minutes (3 seconds per tuple):
SELECT count(id) from T

(2) is causing a lot of auto vacuum/analyze activity (which is OK).  HOT seems
to get used for about 90% of the updates, but there are enough updates that
don't use it to cause issues.  I'm using pg version 9.0.3 on a 32-bit Windows
XP machine with 3GB of RAM.  The .conf file is using default settings.

Table Info:
n_live_tup        1799
n_dead_tup        191
pg_relation_size  2343mb
indexsize         10mb
toastsize         552kb
toastindexsize    16kb

This reports under 10kb for most tuples:
psql -A -c "select * from T where id=123" | wc -c

auto-vacuum and auto-analyze both ran yesterday with default settings.  There
are only one or two new tuples since yesterday.  The database is fairly old
(it was probably created using pg_restore about when 9.0.3 came out).

Here is the output from VACUUM VERBOSE:

INFO:  vacuuming "public.T"
INFO:  scanned index "idx1" to remove 249 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 2.00 sec.
INFO:  scanned index "idx2" to remove 249 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 2.12 sec.
INFO:  scanned index "idx3" to remove 249 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 2.90 sec.
INFO:  scanned index "idx4" to remove 249 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 2.32 sec.
INFO:  scanned index "idx5" to remove 249 row versions
DETAIL:  CPU 0.00s/0.00u sec elapsed 2.42 sec.
INFO:  "T": removed 249 row versions in 249 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.06 sec.
INFO:  index "idx1" now contains 1976 row versions in 252 pages
DETAIL:  249 index row versions were removed.
210 index pages have been deleted, 210 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  index "idx2" now contains 1976 row versions in 258 pages
DETAIL:  249 index row versions were removed.
209 index pages have been deleted, 209 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.07 sec.
INFO:  index "idx3" now contains 1976 row versions in 259 pages
DETAIL:  249 index row versions were removed.
217 index pages have been deleted, 217 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.06 sec.
INFO:  index "idx4" now contains 1976 row versions in 250 pages
DETAIL:  249 index row versions were removed.
206 index pages have been deleted, 206 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.07 sec.
INFO:  index "idx5" now contains 1976 row versions in 267 pages
DETAIL:  249 index row versions were removed.
217 index pages have been deleted, 217 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.06 sec.
INFO:  "T": found 0 removable, 1444 nonremovable row versions in 1522 out of 299964 pages
DETAIL:  30 dead row versions cannot be removed yet.
There were 10035 unused item pointers.
0 pages are entirely empty.
CPU 0.01s/0.00u sec elapsed 17.24 sec.
INFO:  vacuuming "pg_toast.pg_toast_17132"
INFO:  index "pg_toast_17132_index" now contains 279 row versions in 2 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  "pg_toast_17132": found 0 removable, 279 nonremovable row versions in 69 out of 69 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.56 sec.
VACUUM

I imagine CLUSTERing the table would make things happier, but I'm hoping for a
permanent solution that avoids periodic downtime.

One thought was to partition so that rows that are still changing live in a
separate table from the more stable rows.  I imagine the cardinality of (2)
rarely exceeds 10.  Can I still get into performance trouble with a table that
small after enough updates?  Anyone have other ideas?

- Dave
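P.S.  Roughly what I have in mind for the split, in case it helps the
discussion.  This is only a sketch: t_active, t_stable, t_all, and the
last_updated column are made-up names, and the migration step is naive
(it assumes nothing else touches the rows while they're being moved):

    -- Rows still being updated vs. rows that have settled down.
    CREATE TABLE t_active (LIKE T INCLUDING DEFAULTS INCLUDING INDEXES);
    CREATE TABLE t_stable (LIKE T INCLUDING DEFAULTS INCLUDING INDEXES);

    -- Readers see a single relation.
    CREATE VIEW t_all AS
        SELECT * FROM t_active
        UNION ALL
        SELECT * FROM t_stable;

    -- Periodically migrate rows that have stopped changing, e.g. anything
    -- untouched for a week (assumes a hypothetical last_updated column).
    BEGIN;
    INSERT INTO t_stable
        SELECT * FROM t_active
        WHERE last_updated < now() - interval '7 days';
    DELETE FROM t_active
        WHERE last_updated < now() - interval '7 days';
    COMMIT;

The idea is that t_active stays tiny (again, the cardinality of (2) rarely
exceeds 10), so the question above is really whether heavy update churn can
still bloat a table that small badly enough to matter.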
On 02/27/2012 06:55 PM, Dave Vitek wrote:
> Hi all,
>
> I have a relation where a tuple typically undergoes a lifecycle something like:
> 1) Created
> 2) Updated maybe thousands of times (no updates to indexed columns though)
> 3) Rarely or never modified again
>
> The following query takes about 100 minutes (3 seconds per tuple):
> SELECT count(id) from T
>
> (2) is causing a lot of auto vacuum/analyze activity (which is OK).  HOT seems to
> get used for about 90% of the updates, but there are enough updates that don't use
> it to cause issues.  I'm using pg version 9.0.3 on a 32-bit Windows XP machine with
> 3GB of RAM.  The .conf file is using default settings.
>
> Table Info:
> n_live_tup 1799
> n_dead_tup 191
> pg_relation_size 2343mb
> indexsize 10mb
> toastsize 552kb
> toastindexsize 16kb
>
> This reports under 10kb for most tuples:
> psql -A -c "select * from T where id=123" | wc -c
>
> auto-vacuum and auto-analyze both ran yesterday with default settings.  There are
> only one or two new tuples since yesterday.  The database is fairly old (it was
> probably created using pg_restore about when 9.0.3 came out).
>
> Here is the output from VACUUM VERBOSE:
> INFO:  vacuuming "public.T"
> 0 pages are entirely empty.
> CPU 0.00s/0.00u sec elapsed 0.56 sec.
> VACUUM
>
>
> I imagine CLUSTERing the table would make things happier, but I'm hoping for a
> permanent solution that avoids periodic downtime.
>
> One thought was to partition so that rows that are still changing live in a
> separate table from the more stable rows.  I imagine the cardinality of (2) rarely
> exceeds 10.  Can I still get into performance trouble with a table that small
> after enough updates?  Anyone have other ideas?
>
> - Dave
>

I'm a little confused... what's the problem?

> The following query takes about 100 minutes (3 seconds per tuple):
> SELECT count(id) from T

So table T has 18,000 rows?  (100 * 60 * 3)

> Table Info:
> n_live_tup 1799

Oh, so the table has 1,800 rows?

> This reports under 10kb for most tuples:
> psql -A -c "select * from T where id=123" | wc -c

10 kB?  What?  So that's one row?  And that one row is about 10,000 bytes?

So you have a large row, who knows how many, and the problem is
"SELECT count(id) from T" is slow?

You say:

> 2) Updated maybe thousands of times (no updates to indexed columns though)

Are the updates slow?  How slow, and how fast do you need them to be?  Is ID a
unique key?  Or does "update ... where id=123" update more than one row?

What does the actual table look like?  Looks like it has lots of indexes, but I'm
just guessing!  Are you sure you need all those indexes?  Updating a row has to
update all those indexes... I assume, but you give no details.

Also... in the end, you're on Windows, so you probably can't tell me if you are
I/O bound, but you probably need a better I/O subsystem.  And stop running on
RAID 5 (which I'm assuming since you don't mention anything).

Wow.  Re-reading that I seem to be angry.  Please don't take it that way.  Blood
sugar is high, makes me angry person.  Others will be along to scold me.  Bad
Andy!  Bad!  Go to your room until you find your happy face.

-Andy

ps: based on you're email gramma, I almost changed all your to you're, just to
see if it annoyed you.  Sounds like you work for grammar technology.
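pps: It might also help to post the plan for the slow count, something along
these lines (just an example; I believe the BUFFERS option is available in 9.0):

    EXPLAIN (ANALYZE, BUFFERS) SELECT count(id) FROM T;

That would at least show how many pages the seq scan touches and where the time
actually goes.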
On 2/27/2012 10:29 PM, Andy Colson wrote:
> On 02/27/2012 06:55 PM, Dave Vitek wrote:
>> Hi all,
>>
>> I have a relation where a tuple typically undergoes a lifecycle
>> something like:
>> 1) Created
>> 2) Updated maybe thousands of times (no updates to indexed columns
>> though)
>> 3) Rarely or never modified again
>>
>> The following query takes about 100 minutes (3 seconds per tuple):
>> SELECT count(id) from T
>>
>> (2) is causing a lot of auto vacuum/analyze activity (which is OK).
>> HOT seems to get used for about 90% of the updates, but there are
>> enough updates that don't use it to cause issues.  I'm using pg
>> version 9.0.3 on a 32-bit Windows XP machine with 3GB of RAM.  The
>> .conf file is using default settings.
>>
>> Table Info:
>> n_live_tup 1799
>> n_dead_tup 191
>> pg_relation_size 2343mb
>> indexsize 10mb
>> toastsize 552kb
>> toastindexsize 16kb
>>
>> This reports under 10kb for most tuples:
>> psql -A -c "select * from T where id=123" | wc -c
>>
>> auto-vacuum and auto-analyze both ran yesterday with default
>> settings.  There are only one or two new tuples since yesterday.  The
>> database is fairly old (it was probably created using pg_restore about
>> when 9.0.3 came out).
>>
>> Here is the output from VACUUM VERBOSE:
>> INFO:  vacuuming "public.T"
>> 0 pages are entirely empty.
>> CPU 0.00s/0.00u sec elapsed 0.56 sec.
>> VACUUM
>>
>>
>> I imagine CLUSTERing the table would make things happier, but I'm
>> hoping for a permanent solution that avoids periodic downtime.
>>
>> One thought was to partition so that rows that are still changing live
>> in a separate table from the more stable rows.  I imagine the
>> cardinality of (2) rarely exceeds 10.  Can I still get into performance
>> trouble with a table that small after enough updates?  Anyone have
>> other ideas?
>>
>> - Dave
>>
>
> I'm a little confused... what's the problem?

Sequential scans of this table seem to be pathologically slow for a
table with only 2000 rows.

>
>> The following query takes about 100 minutes (3 seconds per tuple):
>> SELECT count(id) from T
>
> So table T has 18,000 rows?  (100 * 60 * 3)

100 minutes * 60 (sec/min) / 3 (sec/tuple) = 2000 tuples

>
>> Table Info:
>> n_live_tup 1799
>
> Oh, so the table has 1,800 rows?

Correct.

>
>> This reports under 10kb for most tuples:
>> psql -A -c "select * from T where id=123" | wc -c
>
> 10 kB?  What?  So that's one row?  And that one row is about 10,000 bytes?

Correct.  I mentioned this since pg_relation_size would suggest that one
tuple is using over a meg including overhead, fragmentation, and free
space.  So only about 1% of the space is used by the payload, which
might be a red flag?

>
> So you have a large row, who knows how many, and the problem is
> "SELECT count(id) from T" is slow?

Correct.  I think anything requiring a sequential scan of the table
takes 100 minutes to run.  However, I wanted to be careful about drawing
conclusions and just present data.

>
> You say:
>
>> 2) Updated maybe thousands of times (no updates to indexed columns
>> though)
>
> Are the updates slow?  How slow, and how fast do you need them to be?

The updates are perfectly responsive.

> Is ID a unique key?  Or does "update ... where id=123" update more
> than one row?

It's the primary key, it's unique, and it updates a single row.

>
> What does the actual table look like?  Looks like it has lots of
> indexes, but I'm just guessing!  Are you sure you need all those
> indexes?  Updating a row has to update all those indexes... I assume,
> but you give no details.
It has 5 indices, which are warranted, but writes are performing fine --
it's just sequential scans that seem to be pathological.  Index scans
are fine too.

>
> Also... in the end, you're on Windows, so you probably can't tell me if
> you are I/O bound, but you probably need a better I/O subsystem.  And
> stop running on RAID 5 (which I'm assuming since you don't mention
> anything).

I think this is typical consumer hardware from maybe 3 years ago.
Copying a 1G file on the disk takes... 11 minutes when the machine is
under heavy load.  There are many instances of this database schema+app,
so making it work well on low-end hardware is important.

>
> Wow.  Re-reading that I seem to be angry.  Please don't take it that
> way.  Blood sugar is high, makes me angry person.  Others will be
> along to scold me.  Bad Andy!  Bad!  Go to your room until you find
> your happy face.

No problem, I should have been clearer about sequential scans being the
issue.

>
> -Andy
> ps: based on you're email gramma, I almost changed all your to you're,
> just to see if it annoyed you.  Sounds like you work for grammar
> technology.

How embarrassing!  After writing code all day I tend to have some
natural language issues.
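For what it's worth, the arithmetic seems consistent with the scan being
pure I/O over a bloated heap (rough numbers, assuming the default 8 kB
page size):

    299,964 pages * 8 kB/page  ~= 2343 MB of heap (matches pg_relation_size)
    2343 MB / (100 min * 60 s) ~= 0.4 MB/s effective scan rate

That's over a megabyte of heap per live row, read off a disk that copies
a 1G file in 11 minutes under load, so the 100 minutes isn't shocking.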
On 2/28/2012 2:09 PM, Dave Vitek wrote:
>>> The following query takes about 100 minutes (3 seconds per tuple):
>>> SELECT count(id) from T
>>
>> So table T has 18,000 rows?  (100 * 60 * 3)
>
> 100 minutes * 60 (sec/min) / 3 (sec/tuple) = 2000 tuples

Well... math was never my strong point :-)

>> So you have a large row, who knows how many, and the problem is
>> "SELECT count(id) from T" is slow?
>
> Correct.  I think anything requiring a sequential scan of the table takes
> 100 minutes to run.  However, I wanted to be careful about drawing
> conclusions and just present data.
>>
>>
>> -Andy
>> ps: based on you're email gramma, I almost changed all your to you're,
>> just to see if it annoyed you.  Sounds like you work for grammar
>> technology.
>
> How embarrassing!  After writing code all day I tend to have some natural
> language issues.

No, no, that's not what I meant.  Your email address, grammatech.com,
makes it sound like you are doing grammar technology.

Have you checked for table bloat?  I thought I saw a SQL statement
someplace that did that.

I'm not sure what it actually means, but this row would make me worry:

INFO:  "T": found 0 removable, 1444 nonremovable row versions in 1522
out of 299964 pages
DETAIL:  30 dead row versions cannot be removed yet.

Hopefully somebody else is reading along that can offer insight.

Have you always had this problem, or is it something new?  Or is
"select count..." new?

Also, not sure if you have seen it elsewhere, but "select count"'ing
an entire table is probably one of PG's slowest operations.  If you
really need the count, you might consider a separate summary table that
stores the count.

-Andy
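ps: If you do go the summary table route, the usual shape is a one-row
counter maintained by a trigger.  Untested sketch, every name in it is
made up, and it assumes insert/delete volume on T is low (sounds like it
is -- only a couple of new tuples a day):

    CREATE TABLE t_rowcount (n bigint NOT NULL);
    INSERT INTO t_rowcount SELECT count(*) FROM T;

    CREATE FUNCTION t_maintain_count() RETURNS trigger AS $$
    BEGIN
        IF TG_OP = 'INSERT' THEN
            UPDATE t_rowcount SET n = n + 1;
        ELSIF TG_OP = 'DELETE' THEN
            UPDATE t_rowcount SET n = n - 1;
        END IF;
        RETURN NULL;  -- AFTER trigger, return value is ignored
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER t_count
        AFTER INSERT OR DELETE ON T
        FOR EACH ROW EXECUTE PROCEDURE t_maintain_count();

Then "SELECT n FROM t_rowcount" replaces the slow count.  The catch is
that every insert and delete serializes on that one counter row, which
is fine at your volumes but would hurt on a busy table.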
On 2/28/2012 5:21 PM, Andy Colson wrote:
> On 2/28/2012 2:09 PM, Dave Vitek wrote:
>>>> The following query takes about 100 minutes (3 seconds per tuple):
>>>> SELECT count(id) from T
>>>
>>> So table T has 18,000 rows?  (100 * 60 * 3)
>> 100 minutes * 60 (sec/min) / 3 (sec/tuple) = 2000 tuples
>
> Well... math was never my strong point :-)
>
>>> So you have a large row, who knows how many, and the problem is
>>> "SELECT count(id) from T" is slow?
>> Correct.  I think anything requiring a sequential scan of the table takes
>> 100 minutes to run.  However, I wanted to be careful about drawing
>> conclusions and just present data.
>>>
>>>
>>> -Andy
>>> ps: based on you're email gramma, I almost changed all your to you're,
>>> just to see if it annoyed you.  Sounds like you work for grammar
>>> technology.
>> How embarrassing!  After writing code all day I tend to have some natural
>> language issues.
>
> No, no, that's not what I meant.  Your email address, grammatech.com,
> makes it sound like you are doing grammar technology.
>
> Have you checked for table bloat?  I thought I saw a SQL statement
> someplace that did that.
>
> I'm not sure what it actually means, but this row would make me worry:
>
> INFO:  "T": found 0 removable, 1444 nonremovable row versions in 1522
> out of 299964 pages
> DETAIL:  30 dead row versions cannot be removed yet.
>
> Hopefully somebody else is reading along that can offer insight.
>
> Have you always had this problem, or is it something new?  Or is
> "select count..." new?
>
> Also, not sure if you have seen it elsewhere, but "select count"'ing
> an entire table is probably one of PG's slowest operations.  If you
> really need the count, you might consider a separate summary table that
> stores the count.
>
> -Andy

Andy,

Thanks for the discussion.  The name has more to do with attribute
grammars, iirc.

Anyway, I'm sure the table is quite bloated.  I tried a query yesterday
from somewhere on the internet that claimed to report the amount of
bloat, but it reported 100.0% bloat, which seems suspicious given that
there is some measurable payload in there.  Still, that's probably not
far from the truth.

The count(id) query is just an example.  The application almost never
does sequential scans of this table... except when the server is
starting and for one user query that displays a significant portion of
the table.  I'd rather avoid creating even more indices to avoid the
sequential scan -- those indices would interfere with HOT updates.

I've made a schema change that moves a large, immutable text column out
of this table.  With this change, the remaining columns are simple
scalars (integer, float, timestamp) with the exception of a username and
a file path.  I'd be pretty shocked if any tuple had more than a
kilobyte of payload now.  It also eliminates the use of TOAST...

I'll be clustering and I guess we'll see what happens over the next year
or so.

- Dave
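P.S.  For anyone following along, the cluster-and-measure step I'm
planning is roughly this (the index name is made up, and CLUSTER takes
an exclusive lock, so it has to go in a maintenance window):

    CLUSTER T USING t_pkey;   -- rewrite the heap in index order, dropping the dead space
    ANALYZE T;
    SELECT pg_size_pretty(pg_relation_size('T'));   -- compare before/after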