Hi hackers,
For a few months now, we have occasionally seen .ready files appearing on some
slave instances in various contexts. The two I have in mind are running 9.2.x.
I tried to investigate a bit. These .ready files are created when a WAL file
from pg_xlog has no corresponding file in pg_xlog/archive_status. I could
easily reproduce this by deleting such a file: it is created again at the next
restartpoint or checkpoint received from the master.
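
To make the check above easier to repeat on other standbys, here is a minimal
sketch that lists each WAL segment in pg_xlog together with its state in
pg_xlog/archive_status. The function name segments_by_status is hypothetical,
and the only assumptions are the 24-hex-digit segment file names and the
.ready/.done suffixes; it is not what the server itself does.

```python
import os
import re

# A WAL segment file name is 24 uppercase hexadecimal digits.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def segments_by_status(pg_xlog_dir):
    """Map each WAL segment in pg_xlog_dir to 'ready', 'done' or 'none'.

    Hypothetical helper: it only looks at file names, it never reads or
    modifies the segments or the status files.
    """
    status_dir = os.path.join(pg_xlog_dir, "archive_status")
    result = {}
    for name in sorted(os.listdir(pg_xlog_dir)):
        if not WAL_NAME.match(name):
            continue  # skip archive_status itself, .history files, etc.
        if os.path.exists(os.path.join(status_dir, name + ".ready")):
            result[name] = "ready"
        elif os.path.exists(os.path.join(status_dir, name + ".done")):
            result[name] = "done"
        else:
            result[name] = "none"  # no status file at all
    return result
```

Calling segments_by_status() on the pg_xlog path of a standby and printing the
entries marked "ready" or "none" should show the same segments as described
here.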
Looking at the WALs in the pg_xlog folder corresponding to these .ready files,
they are all much older than the current WAL "cycle", both by mtime and by
their position in the file name sequence. For instance, on one of these boxes
we currently have 6 of those "ghost" WALs:
0000000200001E53000000FF 0000000200001F18000000FF 0000000200002047000000FF 00000002000020BF000000FF
0000000200002140000000FF 0000000200002370000000FF 000000020000255D000000A8 000000020000255D000000A9
[...normal WAL sequence...] 000000020000255E0000009D
And on another box:
000000010000040E000000FF 0000000100000414000000DA 000000010000046E000000FF 0000000100000470000000FF
00000001000004850000000F 000000010000048500000010 [...normal WAL sequence...] 000000010000048500000052
So it seems that, for some reason, these old WALs were "forgotten" by the
restartpoint mechanism when they should have been recycled/deleted.
For one of these servers, I could correlate this with some abrupt disconnections
of the streaming replication appearing in its logs. But there was no known SR
disconnection on the second one.
Any idea about this weird behaviour? What can we do to help you investigate
further?
Regards,
--
Jehan-Guillaume de Rorthais
Dalibo
http://www.dalibo.com