Thread: WAL questions
We have a system with 1202 files in the WAL directory (pg_xlog).
When we start postmaster, it goes into the starting state for 5 minutes
and then crashes.
Questions:
1) What is the biggest number of WAL files you've seen and what were
you doing to the database at the time?
2) When postmaster starts, it replays the WAL files. During this time
any connection is rejected with an error indicating that the database
is starting up. What is the longest amount of time that you'd expect
postmaster to be in the "starting up" state?
"Steve Oualline" <soualline@stbernard.com> writes: > We have a system with 1202 files in the WAL directory (pg_xlog). > When we start postmaster, it goes into the starting state for 5 minutes > and then crashes. Define "crash". If you don't show us the *exact* messages you're seeing, it's difficult to guess what's going on. Also, what happened when the postmaster stopped the first time? The most interesting part of this from my point of view is how did you get into this state in the first place --- unless you had set insanely high values for checkpoint_segments and checkpoint_timeout, you should not have gotten up to that many files in pg_xlog. A plausible guess is that something was preventing checkpoints from completing, but any such problem should have left traces in the postmaster log. If you've still got the pre-crash log it would be very interesting to see. regards, tom lane