Possible causes of sometimes slow single-row UPDATE with trivial indexed condition? - Mailing list pgsql-performance
From | Vlad Romascanu |
---|---|
Subject | Possible causes of sometimes slow single-row UPDATE with trivial indexed condition? |
Date | |
Msg-id | E0D4849884C14164BD0A19939F420018@accurevd3db132 Whole thread Raw |
Responses |
Re: Possible causes of sometimes slow single-row UPDATE
with trivial indexed condition?
Re: Possible causes of sometimes slow single-row UPDATE with trivial indexed condition? |
List | pgsql-performance |
Problem occurs when running (in production) Postgres 8.3.7 64-bit (from RPM) on Ubuntu 8.04.2, on an Amazon EC2 (xen) "Large" instance (8GB RAM), with the DB on a 50GB EC2 block device. Problem does not occur when running (in staging/pre-production) Postgres 8.3.5 32-bit (from RPM) on Ubuntu 8.04.1, on a less beefy Amazon EC2 (xen) "Small" instance, with the DB on a 5GB EC2 block device. I am running with slow query logging on, and somewhat inexplicably I've been getting the following slow UPDATE query several times in the past weeks (I'm also including some context lines above and below): 2009-09-14 08:07:06.238 UTC user@database pid=26524 ip=127.0.0.1(58380) sid=4aadf5ba.679c:1 LOG: duration: 103.905 ms statement: COMMIT 2009-09-14 08:10:19.025 UTC user@database pid=26524 ip=127.0.0.1(58380) sid=4aadf5ba.679c:2 LOG: duration: 124.341 ms statement: COMMIT 2009-09-14 08:10:47.359 UTC user@database pid=26524 ip=127.0.0.1(58380) sid=4aadf5ba.679c:3 LOG: duration: 126.896 ms statement: COMMIT >> 2009-09-14 08:12:30.363 UTC user@database pid=26474 ip=127.0.0.1(58364) >> sid=4aadf58d.676a:1 LOG: duration: 13472.892 ms >> statement: UPDATE "document_sets" SET "status" = E'rejected', >> "updated_at" = '2009-09-14 08:12:16.890054' WHERE "id" = 288 2009-09-14 08:13:41.237 UTC user@database pid=26474 ip=127.0.0.1(58364) sid=4aadf58d.676a:2 LOG: duration: 107.674 ms statement: SELECT * FROM "tenders" This is one of the "faster" occurrences; at times the query has been logged as having taken 100+ seconds: 2009-07-21 06:05:23.035 UTC user@database pid=24834 ip=127.0.0.1(34505) sid=4a6559e1.6102:1 LOG: duration: 156605.004 ms statement: UPDATE "document_sets" SET "status" = E'rejected', "updated_at" = '2009-07-21 06:02:46.430176' WHERE "id" = 318 ... 2009-07-21 06:16:32.148 UTC user@database pid=23500 ip=127.0.0.1(38720) sid=4a6552dd.5bcc:2 LOG: duration: 14833.439 ms statement: UPDATE "document_sets" SET "status" = E'rejected', "updated_at" = '2009-07-21 06:16:17.314905' WHERE "id" = 188 ... 2009-08-11 07:31:35.867 UTC user@database pid=1227 ip=127.0.0.1(55630) sid=4a811ded.4cb:1 LOG: duration: 29258.137 ms statement: UPDATE "document_sets" SET "status" = E'rejected', "updated_at" = '2009-08-11 07:31:06.609191' WHERE "id" = 251 ... 2009-08-13 11:16:40.027 UTC user@database pid=13442 ip=127.0.0.1(41127) sid=4a83f557.3482:1 LOG: duration: 10287.765 ms statement: UPDATE "document_sets" SET "status" = E'rejected', "updated_at" = '2009-08-13 11:16:29.738634' WHERE "id" = 273 ... 2009-08-16 05:30:09.082 UTC user@database pid=3505 ip=127.0.0.1(36644) sid=4a8798aa.db1:1 LOG: duration: 153523.612 ms statement: UPDATE "document_sets" SET "status" = E'approved', "updated_at" = '2009-08-16 05:27:35.558505' WHERE "id" = 369 ... 2009-08-16 05:30:09.673 UTC user@database pid=3518 ip=127.0.0.1(36655) sid=4a8798c8.dbe:1 LOG: duration: 114885.274 ms statement: UPDATE "document_sets" SET "status" = E'approved', "updated_at" = '2009-08-16 05:28:14.787689' WHERE "id" = 369 ... 2009-08-16 05:30:10.318 UTC user@database pid=3580 ip=127.0.0.1(36707) sid=4a879919.dfc:1 LOG: duration: 73107.179 ms statement: UPDATE "document_sets" SET "status" = E'approved', "updated_at" = '2009-08-16 05:28:57.210502' WHERE "id" = 369 ... 2009-08-20 06:27:54.961 UTC user@database pid=8312 ip=127.0.0.1(38488) sid=4a8cec7b.2078:1 LOG: duration: 18959.648 ms statement: UPDATE "document_sets" SET "status" = E'rejected', "updated_at" = '2009-08-20 06:27:36.001030' WHERE "id" = 255 ... 2009-09-10 06:30:08.176 UTC user@database pid=25992 ip=127.0.0.1(59692) sid=4aa89ac1.6588:1 LOG: duration: 27495.609 ms statement: UPDATE "document_sets" SET "status" = E'rejected', "updated_at" = '2009-09-10 06:29:40.680647' WHERE "id" = 346 ... 2009-09-14 08:12:30.363 UTC user@database pid=26474 ip=127.0.0.1(58364) sid=4aadf58d.676a:1 LOG: duration: 13472.892 ms statement: UPDATE "document_sets" SET "status" = E'rejected', "updated_at" = '2009-09-14 08:12:16.890054' WHERE "id" = 288 No other UPDATE or INSERT or DELETE operation has ever triggered the slow query logging. No other query of any kind has ever taken more than 500ms. When analyzing this UPDATE query (verbose output at end of message), as expected: > Index Scan using document_sets_pkey on document_sets (cost=0.00..8.27 > rows=1 width=47) (actual time=0.025..0.028 rows=1 loops=1) > Index Cond: (id = 288) > Total runtime: 0.137 ms Actual table rowcount: 423 rows. Table statistics: Sequential Scans 12174674 Sequential Tuples Read 2764651442 Index Scans 813 Index Tuples Fetched 813 Tuples Inserted 424 Tuples Updated 625 Tuples Deleted 1 Heap Blocks Read 9 Heap Blocks Hit 35888949 Index Blocks Read 11 Index Blocks Hit 1858 Toast Blocks Read 0 Toast Blocks Hit 0 Toast Index Blocks Read 0 Toast Index Blocks Hit 0 Table Size 40 kB Toast Table Size 8192 bytes Indexes Size 32 kB Only entries changed in postgresql.conf default (mainly to reflect running on an 8GB machine): > shared_buffers = 256MB # min 128kB or > max_connections*16kB > work_mem = 4MB # min 64kB > checkpoint_segments = 8 # in logfile segments, min > 1, 16MB each > effective_cache_size = 1280MB > log_min_duration_statement = 100 # -1 is disabled, 0 logs all > statements > log_line_prefix = '%m %u@%d pid=%p ip=%r sid=%c:%l ' sar data for period in question (CPU, processes, paging, disk, etc.) shows no obvious change in CPU usage, number of forks, load, or disk activity on the database device (sdh). # atopsar -b 7:00 -e 10:00 | grep all ip-XXXXXX 2.6.21.7-2.fc8xen #1 SMP Fri Feb 15 12:34:28 EST 2008 x86_64 2009/09/14 -------------------------- analysis date: 2009/09/14 -------------------------- 07:04:15 cpu %usr %nice %sys %irq %softirq %steal %wait %idle _cpu_ 07:14:15 all 1 0 0 0 0 0 0 98 07:24:15 all 2 0 0 0 0 0 0 97 07:34:15 all 3 0 0 0 0 0 0 97 07:44:15 all 1 0 0 0 0 0 0 99 07:54:15 all 2 0 0 0 0 0 0 98 08:04:15 all 2 0 0 0 0 0 0 98 08:14:15 all 2 0 0 0 0 0 0 97 <--- slow query occurred at 8:12 08:24:15 all 6 0 1 0 0 0 0 94 08:34:16 all 8 0 0 0 0 0 0 92 08:44:16 all 2 0 0 0 0 0 0 98 08:54:16 all 1 0 0 0 0 0 0 99 09:04:16 all 3 0 0 0 0 0 0 97 09:14:16 all 6 0 1 0 0 0 0 94 09:24:16 all 1 0 1 0 0 0 0 98 09:34:16 all 1 0 0 0 0 0 0 98 09:44:16 all 2 0 0 0 0 0 0 98 09:54:17 all 2 0 0 0 0 0 0 98 # atopsar -b 7:00 -e 10:00 -s ip-XXXXXX 2.6.21.7-2.fc8xen #1 SMP Fri Feb 15 12:34:28 EST 2008 x86_64 2009/09/14 -------------------------- analysis date: 2009/09/14 -------------------------- 07:04:15 pagescan/s swapin/s swapout/s commitspc commitlim _swap_ 07:14:15 0.00 0.00 0.00 1647M 3587M 07:24:15 0.00 0.00 0.00 1447M 3587M 07:34:15 0.00 0.00 0.00 2038M 3587M 07:44:15 0.00 0.00 0.00 1444M 3587M 07:54:15 0.00 0.00 0.00 2082M 3587M 08:04:15 0.00 0.00 0.00 1658M 3587M 08:14:15 0.00 0.00 0.00 1990M 3587M <--- slow query occurred at 8:12 08:24:15 0.00 0.00 0.00 1709M 3587M 08:34:16 0.00 0.00 0.00 1507M 3587M 08:44:16 0.00 0.00 0.00 1894M 3587M 08:54:16 0.00 0.00 0.00 1640M 3587M 09:04:16 0.00 0.00 0.00 1775M 3587M 09:14:16 0.00 0.00 0.00 2209M 3587M 09:24:16 0.00 0.00 0.00 2035M 3587M 09:34:16 0.00 0.00 0.00 1887M 3587M 09:44:16 0.00 0.00 0.00 1922M 3587M 09:54:17 0.00 0.00 0.00 2138M 3587M # atopsar -b 7:00 -e 10:00 -d ip-XXXXXXX 2.6.21.7-2.fc8xen #1 SMP Fri Feb 15 12:34:28 EST 2008 x86_64 2009/09/14 -------------------------- analysis date: 2009/09/14 -------------------------- 07:04:15 disk busy read/s KB/read write/s KB/writ avque avserv _disk_ 07:24:15 sdh 0% 0.00 0.0 3.10 14.2 3.41 1.17 ms 07:34:15 sdh 0% 0.00 0.0 3.43 14.0 3.85 1.01 ms 07:44:15 sdh 0% 0.00 0.0 2.76 13.8 4.51 1.14 ms 07:54:15 sdh 0% 0.00 0.0 3.11 13.7 2.87 1.11 ms 08:04:15 sdh 0% 0.00 0.0 3.33 13.4 3.85 1.13 ms 08:14:15 sdh 0% 0.00 0.0 3.71 13.3 3.37 1.25 ms <--- slow query occurred at 8:12 08:24:15 sdh 0% 0.00 0.0 2.95 13.4 5.04 1.13 ms 08:34:16 sdh 0% 0.00 0.0 3.27 13.4 4.17 1.05 ms 08:44:16 sdh 0% 0.00 0.0 2.73 13.7 3.31 1.17 ms 08:54:16 sdh 0% 0.00 0.0 1.57 13.9 4.02 1.14 ms 09:04:16 sdh 0% 0.00 0.0 2.05 14.2 4.75 0.96 ms 09:14:16 sdh 0% 0.00 0.0 3.87 14.0 4.19 1.10 ms 09:24:16 sdh 0% 0.00 0.0 3.26 13.9 4.17 1.11 ms 09:34:16 sdh 0% 0.00 0.0 2.17 14.1 3.67 1.18 ms 09:44:16 sdh 0% 0.00 0.0 2.72 15.0 3.63 0.99 ms 09:54:17 sdh 0% 0.00 0.0 3.15 15.0 4.40 1.21 ms # atopsar -b 7:00 -e 10:00 -p ip-XXXXXXX 2.6.21.7-2.fc8xen #1 SMP Fri Feb 15 12:34:28 EST 2008 x86_64 2009/09/14 -------------------------- analysis date: 2009/09/14 -------------------------- 07:04:15 pswch/s devintr/s forks/s loadavg1 loadavg5 loadavg15 _load_ 07:14:15 174 163 0.90 0.21 0.20 0.18 07:24:15 207 228 0.90 0.24 0.28 0.21 07:34:15 250 233 0.88 0.18 0.26 0.21 07:44:15 176 172 0.97 0.04 0.13 0.16 07:54:15 215 203 0.88 0.26 0.24 0.18 08:04:15 177 162 0.96 0.23 0.22 0.18 08:14:15 212 259 0.90 0.33 0.29 0.21 <--- slow query occurred at 8:12 08:24:15 220 266 1.08 1.29 0.75 0.40 08:34:16 207 290 0.84 0.25 0.51 0.49 08:44:16 178 175 0.95 0.09 0.21 0.33 08:54:16 159 156 0.82 0.12 0.12 0.20 09:04:16 185 324 0.78 0.44 0.33 0.24 09:14:16 279 505 0.92 0.49 0.56 0.38 09:24:16 231 222 0.87 0.18 0.33 0.34 09:34:16 164 166 0.74 0.21 0.15 0.21 09:44:16 210 191 0.78 0.78 0.34 0.23 09:54:17 240 224 1.03 0.32 0.27 0.23 Possible causes? 1) I/O bottleneck? Bad block on device? Yet "sar -d" shows no change in I/O service time 2) lock contention with (autovacuum?)? why would this not affect other statements on other tables? 3) clock change? Yet why only during this particular UPDATE query? 4) ??? Thank you, V. - - - - Explain verbose: > {INDEXSCAN > :startup_cost 0.00 > :total_cost 8.27 > :plan_rows 1 > :plan_width 47 > :targetlist ( > {TARGETENTRY > :expr > {VAR > :varno 1 > :varattno 1 > :vartype 23 > :vartypmod -1 > :varlevelsup 0 > :varnoold 1 > :varoattno 1 > } > :resno 1 > :resname id > :ressortgroupref 0 > :resorigtbl 0 > :resorigcol 0 > :resjunk false > } > {TARGETENTRY > :expr > {VAR > :varno 1 > :varattno 2 > :vartype 23 > :vartypmod -1 > :varlevelsup 0 > :varnoold 1 > :varoattno 2 > } > :resno 2 > :resname tender_id > :ressortgroupref 0 > :resorigtbl 0 > :resorigcol 0 > :resjunk false > } > {TARGETENTRY > :expr > {VAR > :varno 1 > :varattno 3 > :vartype 25 > :vartypmod -1 > :varlevelsup 0 > :varnoold 1 > :varoattno 3 > } > :resno 3 > :resname note > :ressortgroupref 0 > :resorigtbl 0 > :resorigcol 0 > :resjunk false > } > {TARGETENTRY > :expr > {CONST > :consttype 1043 > :consttypmod 259 > :constlen -1 > :constbyval false > :constisnull false > :constvalue 12 [ 48 0 0 0 114 101 106 101 99 116 101 100 ] > } > :resno 4 > :resname status > :ressortgroupref 0 > :resorigtbl 0 > :resorigcol 0 > :resjunk false > } > {TARGETENTRY > :expr > {VAR > :varno 1 > :varattno 5 > :vartype 1114 > :vartypmod -1 > :varlevelsup 0 > :varnoold 1 > :varoattno 5 > } > :resno 5 > :resname created_at > :ressortgroupref 0 > :resorigtbl 0 > :resorigcol 0 > :resjunk false > } > {TARGETENTRY > :expr > {CONST > :consttype 1114 > :consttypmod -1 > :constlen 8 > :constbyval false > :constisnull false > :constvalue 8 [ -58 44 34 -2 -125 22 1 0 ] > } > :resno 6 > :resname updated_at > :ressortgroupref 0 > :resorigtbl 0 > :resorigcol 0 > :resjunk false > } > {TARGETENTRY > :expr > {VAR > :varno 1 > :varattno -1 > :vartype 27 > :vartypmod -1 > :varlevelsup 0 > :varnoold 1 > :varoattno -1 > } > :resno 7 > :resname ctid > :ressortgroupref 0 > :resorigtbl 0 > :resorigcol 0 > :resjunk true > } > ) > :qual <> > :lefttree <> > :righttree <> > :initPlan <> > :extParam (b) > :allParam (b) > :scanrelid 1 > :indexid 18771 > :indexqual ( > {OPEXPR > :opno 96 > :opfuncid 65 > :opresulttype 16 > :opretset false > :args ( > {VAR > :varno 1 > :varattno 1 > :vartype 23 > :vartypmod -1 > :varlevelsup 0 > :varnoold 1 > :varoattno 1 > } > {CONST > :consttype 23 > :consttypmod -1 > :constlen 4 > :constbyval true > :constisnull false > :constvalue 4 [ 32 1 0 0 0 0 0 0 ] > } > ) > } > ) > :indexqualorig ( > {OPEXPR > :opno 96 > :opfuncid 65 > :opresulttype 16 > :opretset false > :args ( > {VAR > :varno 1 > :varattno 1 > :vartype 23 > :vartypmod -1 > :varlevelsup 0 > :varnoold 1 > :varoattno 1 > } > {CONST > :consttype 23 > :consttypmod -1 > :constlen 4 > :constbyval true > :constisnull false > :constvalue 4 [ 32 1 0 0 0 0 0 0 ] > } > ) > } > ) > :indexstrategy (i 3) > :indexsubtype (o 23) > :indexorderdir 1 > }
pgsql-performance by date: