Context-switch storm in 8.1.15 - Mailing list pgsql-admin
From | Iñigo Martinez Lasala |
---|---|
Subject | Context-switch storm in 8.1.15 |
Date | |
Msg-id | 1FBF68E78578447EAD1570BC0DDF1041@DEIMOS |
Responses | Re: Context-switch storm in 8.1.15 |
List | pgsql-admin |
Hi everybody.

Our company was recently awarded a contract to maintain an online store. The website was developed on J2EE with Postgres 8.1 as the database backend. The system worked without problems for several months, but with Christmas, traffic to the web portal has risen sharply.

The database suffers from a performance problem under high load: a large number of context switches occur, reaching up to 200,000 cs per second. The system has 16 GB of RAM and 4 Intel Xeon MP CPUs with HT enabled, RAID10 iSCSI storage, and kernel 2.4.21 (RHAS 3). About half of the CPU power is lost to system time, as you can see.

vmstat under high load:

     r  b swpd    free   buff    cache si so bi   bo   in     cs us sy id wa
    19  0    0  281852 150316 13732396  0  0 32   80 1071 128209 41 43 16  0
    75  0    0  282040 150316 13732396  0  0  0    0  719 148023 40 38 22  0
     3  0    0  284208 150324 13732412  0  0 16  484  728 145371 39 40 21  0
    12  0    0  278364 150324 13732508  0  0 80   56  660 157533 35 42 23  1
     6  0    0  284972 150324 13732580  0  0 32  200  685 142014 39 41 20  0
     8  0    0  296424 150324 13732624  0  0 40  136  554 139601 41 39 20  0
    85  0    0  265004 150324 13732664  0  0 32   48  642 142437 48 32 20  0
    32  0    0  267432 150324 13732680  0  0  0  788 1003 144409 37 42 21  0
    13  0    0  270468 150324 13732676  0  0  0   24  724 146663 42 40 19

vmstat 20 seconds after stopping the portal:

     r  b swpd    free   buff    cache si so bi   bo   in     cs us sy id wa
     8  0    0  962388 206744 13771548  0  0  0    0  131 199784 11 38 51  0
     3  0    0  970212 206744 13771548  0  0  0 1856  305 203639 12 40 48  0
    10  0    0  975036 206744 13771588  0  0  0  128  212 201899 11 36 52  0
     3  0    0  970272 206744 13771652  0  0 16  232  685 202672 14 41 44  0
     6  0    0 1008320 206744 13771656  0  0  0   40  198 196298 14 46 39  0
     3  0    0 1034836 206744 13771656  0  0  0    0  147 202731 12 39 50  0
     3  0    0 1037764 206752 13771656  0  0  0  952  202 202933 11 39 50  0
     5  0    0 1078132 206752 13771656  0  0  0    0  154 203408 18 35 47  0
     6  0    0 1110572 206752 13771656  0  0  0    0  153 196864 18 41 41  0
     4  0    0 1105440 206752 13771824  0  0 16  592  461 207538 12 37 51  1

I've read that this problem affects versions prior to 8.2. However, migrating to 8.2 is not possible at the moment because of the number of stored procedures; we don't have enough time to test ALL of them in order to migrate to 8.2 (or 8.3). We have run light tests with 8.2 under high load, and there the problem is solved or mitigated.

Now the question: is there a backport patch for 8.1 that solves the context-switch storm? The patch I'm looking for is this one or a similar one (this one is for 8.2):

---

A Itagaki Takahiro/Tom Lane patch which arranges for GetSnapshotData to copy live-subtransaction XIDs from the PGPROC array into snapshots, and use this information to avoid visits to pg_subtrans in HeapTupleSatisfiesSnapshot. This appears to solve the pg_subtrans-related context swap storm problem that's been reported by several people for 8.1. While at it, modify GetSnapshotData to not take an exclusive lock on ProcArrayLock, as closer analysis shows that shared lock is always sufficient.

---

Thanks in advance.
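For anyone who wants to check the vmstat figures quoted in this message, here is a minimal Python sketch that averages the context-switch and CPU columns of both samples. It assumes the standard vmstat column order (r b swpd free buff cache si so bi bo in cs us sy id wa), which the post does not state explicitly; the names `HIGH_LOAD`, `AFTER_STOP` and `averages` are made up for this illustration.

```python
# Rows copied from the vmstat samples quoted in the message.
# Assumption: standard vmstat column order
#   r b swpd free buff cache si so bi bo in cs us sy id wa
# so cs, us and sy are the 12th, 13th and 14th fields (indexes 11, 12, 13).
HIGH_LOAD = """\
19 0 0 281852 150316 13732396 0 0 32 80 1071 128209 41 43 16 0
75 0 0 282040 150316 13732396 0 0 0 0 719 148023 40 38 22 0
3 0 0 284208 150324 13732412 0 0 16 484 728 145371 39 40 21 0
12 0 0 278364 150324 13732508 0 0 80 56 660 157533 35 42 23 1
6 0 0 284972 150324 13732580 0 0 32 200 685 142014 39 41 20 0
8 0 0 296424 150324 13732624 0 0 40 136 554 139601 41 39 20 0
85 0 0 265004 150324 13732664 0 0 32 48 642 142437 48 32 20 0
32 0 0 267432 150324 13732680 0 0 0 788 1003 144409 37 42 21 0
13 0 0 270468 150324 13732676 0 0 0 24 724 146663 42 40 19
"""

AFTER_STOP = """\
8 0 0 962388 206744 13771548 0 0 0 0 131 199784 11 38 51 0
3 0 0 970212 206744 13771548 0 0 0 1856 305 203639 12 40 48 0
10 0 0 975036 206744 13771588 0 0 0 128 212 201899 11 36 52 0
3 0 0 970272 206744 13771652 0 0 16 232 685 202672 14 41 44 0
6 0 0 1008320 206744 13771656 0 0 0 40 198 196298 14 46 39 0
3 0 0 1034836 206744 13771656 0 0 0 0 147 202731 12 39 50 0
3 0 0 1037764 206752 13771656 0 0 0 952 202 202933 11 39 50 0
5 0 0 1078132 206752 13771656 0 0 0 0 154 203408 18 35 47 0
6 0 0 1110572 206752 13771656 0 0 0 0 153 196864 18 41 41 0
4 0 0 1105440 206752 13771824 0 0 16 592 461 207538 12 37 51 1
"""

CS, US, SY = 11, 12, 13  # assumed field positions, see comment above


def averages(block):
    """Return mean cs, us and sy over all rows of a vmstat sample."""
    rows = [line.split() for line in block.strip().splitlines()]

    def mean(col):
        return sum(int(row[col]) for row in rows) / len(rows)

    return {"cs": mean(CS), "us": mean(US), "sy": mean(SY)}


if __name__ == "__main__":
    for label, block in (("high load", HIGH_LOAD), ("after stop", AFTER_STOP)):
        avg = averages(block)
        print(f"{label}: cs/s={avg['cs']:.0f} us={avg['us']:.1f}% sy={avg['sy']:.1f}%")
```

Under that column assumption, system time stays near 40% in both samples, while the context-switch rate is higher after the portal is stopped than under load.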