Thread: pgpgout/s without swapping--what does it mean?
We had a big increase in load, iowait, and disk i/o on a dedicated database host the other night.
Looking at the sar logs, the problem shows itself in a big increase in pgpgout/s, which I believe is postgres paging out parts of itself to disk?
02:15:01 AM pgpgin/s pgpgout/s fault/s majflt/s pgfree/s pgscank/s pgscand/s pgsteal/s %vmeff
02:25:01 AM 49.63 163.83 836.80 0.00 467.69 4.80 0.00 4.80 100.00
02:35:01 AM 44.85 230.29 1248.68 0.00 677.18 16.21 0.00 16.21 100.00
02:45:01 AM 47.67 190.82 1059.58 0.00 517.72 5.33 0.00 5.33 100.00
02:55:01 AM 59.00 175.58 986.87 0.00 514.08 18.13 0.00 18.13 100.00
03:05:01 AM 24.67 849.77 1382.60 0.00 1267.94 37.35 0.00 37.35 100.00
03:15:01 AM 28.67 1701.67 717.88 0.00 1231.48 0.00 0.00 0.00 0.00
03:25:02 AM 42.64 21342.02 4086.19 0.04 9701.70 415.92 0.00 414.44 99.64
03:35:01 AM 35.60 28290.69 4305.38 0.10 12906.73 623.89 0.00 615.85 98.71
03:45:01 AM 36.94 31119.30 3675.34 0.01 12456.54 527.55 0.00 521.61 98.87
03:55:01 AM 42.77 29020.72 3458.96 0.01 12165.57 557.57 0.00 553.10 99.20
Average: 41.25 11306.39 2175.59 0.02 5189.70 220.62 0.00 218.63 99.10
However, there isn't a corresponding increase in pages *in*, so if postgres is writing portions of itself out to disk, they can't be very important. And there's no swapping going on:
02:15:01 AM pswpin/s pswpout/s
02:25:01 AM 0.00 0.00
02:35:01 AM 0.00 0.00
02:45:01 AM 0.00 0.00
02:55:01 AM 0.00 0.00
03:05:01 AM 0.00 0.00
03:15:01 AM 0.00 0.00
03:25:02 AM 0.00 0.00
03:35:01 AM 0.00 0.00
03:45:01 AM 0.00 0.00
03:55:01 AM 0.00 0.00
Average: 0.00 0.00
Can anybody help me understand what these statistics are suggesting, what's actually going on on this box/in postgresql? What is it writing to disk, and why? Is it just writing out new/changed rows, or what?
Kevin M. Goess
Software Engineer
Berkeley Electronic Press
kgoess@bepress.com
510-665-1200 x179
www.bepress.com
bepress: sustainable scholarly publishing
Kevin Goess wrote > Can anybody help me understand what these statistics are suggesting, > what's > actually going on on this box/in postgresql? What is it writing to disk, > and why? Is it just writing out new/changed rows, or what? Not a clue on the statistics but most likely you are seeing checkpoint activity. At a simplified level all changes to the database are first persisted to disk using WAL (write-ahead-logs). The changes are written to disk, into WAL files, during commit via fsync. The original data files are not affected immediately thus improving performance at the time of commit by instead risking a prolonged delay in situations where an unclean shutdown occurs. However, at some point the WAL files need to be removed and the physical table files updated. This occurs during a checkpoint. A checkpoint basically causes the on-disk files to become baselined to the current reality so that only subsequent WAL files need be applied. There is considerably more to this whole concept than I can go into off the top of my head but in addition to looking at just I/O it would help to look, simultaneously, at what processes are active. HTH David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/pgpgout-s-without-swapping-what-does-it-mean-tp5796346p5796355.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
On Monday, March 17, 2014, Kevin Goess <kgoess@bepress.com> wrote:
We had a big increase in load, iowait, and disk i/o on a dedicated database host the other night.Looking at the sar logs, the problem shows itself in a big increase in pgpgout/s, which I believe is postgres paging out parts of itself to disk?02:15:01 AM pgpgin/s pgpgout/s fault/s majflt/s pgfree/s pgscank/s pgscand/s pgsteal/s %vmeff
...
However, there isn't a corresponding increase in pages *in*, so if postgres is writing portions of itself out to disk, they can't be very important.
As far as I can tell, pgpgout/s includes all data written to disk, not just process memory being paged. So it includes WAL and data files being written, for example due to bulk loads. Seems like a odd name for that parameter, and I don't know how it differs from bwrtn/s, other than the different units.
If it is a bulk load, that would explain why it is not being read back in. Also, it could be that the data is needed, but when it is needed it is still in cache and so doesn't lead to disk reads. But it still needs to be written for durability reasons.
Cheers,
Jeff
That makes sense, so updates to rows that are already in memory, either in blocks in the kernel page cache or in blocks in the postgres cache, would trigger writes but not reads. Thanks for the sanity check.
On Mon, Mar 17, 2014 at 11:04 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
On Monday, March 17, 2014, Kevin Goess <kgoess@bepress.com> wrote:We had a big increase in load, iowait, and disk i/o on a dedicated database host the other night.Looking at the sar logs, the problem shows itself in a big increase in pgpgout/s, which I believe is postgres paging out parts of itself to disk?02:15:01 AM pgpgin/s pgpgout/s fault/s majflt/s pgfree/s pgscank/s pgscand/s pgsteal/s %vmeff...However, there isn't a corresponding increase in pages *in*, so if postgres is writing portions of itself out to disk, they can't be very important.As far as I can tell, pgpgout/s includes all data written to disk, not just process memory being paged. So it includes WAL and data files being written, for example due to bulk loads. Seems like a odd name for that parameter, and I don't know how it differs from bwrtn/s, other than the different units.If it is a bulk load, that would explain why it is not being read back in. Also, it could be that the data is needed, but when it is needed it is still in cache and so doesn't lead to disk reads. But it still needs to be written for durability reasons.Cheers,Jeff
Kevin M. Goess
Software Engineer
Berkeley Electronic Press
kgoess@bepress.com
510-665-1200 x179
www.bepress.com
bepress: sustainable scholarly publishing
Hi , Me too have the same issue on PosgreSQL server. pgpgout/s is going high and the same time didn't observe any swapping or memory pressure on the host. Please refer below sar O/P. 12:00:01 PM pgpgin/s pgpgout/s fault/s majflt/s pgfree/s pgscank/s pgscand/s pgsteal/s %vmeff 12:10:01 PM 0.00 4.16 114.55 0.00 106.62 0.00 0.00 0.00 0.00 12:20:01 PM 0.00 3.94 38.37 0.00 57.85 0.00 0.00 0.00 0.00 12:30:01 PM 7.63 4.65 206.51 0.07 108.31 0.00 0.00 0.00 0.00 12:40:01 PM 3.08 4.99 110.57 0.02 93.15 0.00 0.00 0.00 0.00 12:50:01 PM 0.39 4.43 133.92 0.00 88.53 0.00 0.00 0.00 0.00 01:00:01 PM 0.00 4.93 134.93 0.00 88.07 0.00 0.00 0.00 0.00 01:10:01 PM 17.10 20652.46 233.39 0.18 14035.87 4165.03 0.00 4164.51 99.99 01:20:01 PM 9.37 15831.25 140.93 0.13 11912.22 4116.36 0.00 4116.36 100.00 01:30:01 PM 15.40 19688.21 164.96 0.18 14261.36 4938.36 0.00 4938.32 100.00 01:40:01 PM 11.52 11529.86 149.57 0.12 8133.76 2784.77 0.00 2784.77 100.00 01:50:01 PM 9.16 12416.42 140.14 0.13 8890.65 3057.49 0.00 3057.49 100.00 02:00:01 PM 12.19 9076.26 142.05 0.11 6756.87 2320.05 1.88 2321.92 100.00 02:10:01 PM 14.29 12839.23 219.15 0.17 9501.58 3257.99 0.00 3257.99 100.00 02:20:01 PM 12.78 10694.65 145.12 0.11 8268.03 2465.23 0.00 2465.23 100.00 02:30:01 PM 5.79 4.93 140.92 0.03 94.61 0.00 0.00 0.00 0.00 02:40:01 PM 0.06 4.57 147.94 0.00 109.82 0.00 0.00 0.00 0.00 02:50:01 PM 4.94 5.11 140.54 0.03 94.58 0.00 0.00 0.00 0.00 03:00:01 PM 0.00 4.92 138.82 0.00 94.25 0.00 0.00 0.00 0.00 03:10:01 PM 7.66 7.84 267.43 0.12 177.26 0.00 0.00 0.00 0.00 03:20:01 PM 0.00 4.57 138.89 0.00 94.06 0.00 0.00 0.00 0.00 03:30:01 PM 0.00 4.85 140.98 0.00 95.61 0.00 0.00 0.00 0.00 03:40:01 PM 0.00 4.54 148.36 0.00 110.78 0.00 0.00 0.00 0.00 03:50:01 PM 0.00 4.51 137.96 0.00 93.90 0.00 0.00 0.00 0.00 04:00:01 PM 0.00 5.08 143.63 0.00 95.18 0.00 0.00 0.00 0.00 04:10:01 PM 22.53 50746.42 230.90 0.22 35820.37 18562.07 0.00 11605.90 62.52 04:20:01 PM 23.93 60429.47 136.53 0.19 43166.45 22012.66 0.00 14525.16 65.99 04:30:01 PM 17.97 3110.80 140.26 0.10 2822.69 1505.53 0.00 601.17 39.93 And also observed that during this time . there some rsync jobs started by pgsql and consuming CPU resources. % CPU PID 30.8 10452 sshd 24.9 10454 rsync 3.0 10455 flush-253:2 postgres 10452 30.8 0.1 104372 6256 ? S 16:00 6:02 sshd: postgres@notty postgres 10453 0.0 0.0 110512 1496 ? Ss 16:00 0:00 rsync --server -logDtpre.iLs . /backup/ARCHIVELOGS/CLUSTER1 postgres 10454 24.9 0.0 110772 1056 ? S 16:00 4:53 rsync --server -logDtpre.iLs . /backup/ARCHIVELOGS/CLUSTER1 root 10455 3.0 0.0 0 0 ? S 16:00 0:35 [flush-253:2] VM stat . procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu----- r b swpd free buff cache si so bi bo in cs us sy id wa st 2 1 15704 173136 234692 5197108 0 0 167 4141 1 1 3 2 93 2 0 0 1 15704 120684 234692 5250312 0 0 0 242356 4374 3450 18 17 25 39 0 1 0 15704 131224 234688 5247532 0 0 0 102396 5525 2630 22 17 45 16 0 2 0 15704 159744 234688 5219992 0 0 0 56 3809 2555 17 10 74 0 0 1 0 15704 157636 234688 5221132 0 0 0 172 6719 4522 26 16 58 0 0 Can anyone suggest how to get rid of this pageout alert. -- View this message in context: http://postgresql.nabble.com/pgpgout-s-without-swapping-what-does-it-mean-tp5796346p5867809.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.