Thread: URGENT issue: pg-xlog growing on master!

URGENT issue: pg-xlog growing on master!

From

Niels Kristian Schjødt

Date:

10 June 2013, 11:29:56

Hi, My pg_xlog dir has been growing rapidly for the last 4 days, and my disk is now almost full (1000 GB) even though the
database is only 50 GB. I have a streaming replication server running, and in the log of the slave it says:

cp: cannot stat `/var/lib/postgresql/9.2/wals/0000000200000E1B000000A9': No such file or directory
cp: cannot stat `/var/lib/postgresql/9.2/wals/0000000200000E1B000000A9': No such file or directory
2013-06-10 11:21:45 GMT FATAL:  could not connect to the primary server: could not connect to server: No route to host
        Is the server running on host "192.168.0.4" and accepting
        TCP/IP connections on port 5432?

All the time.

I have tried to restart the server, but that didn't help. I checked the master, and the file
/var/lib/postgresql/9.2/wals/0000000200000E1B000000A9 does not exist! I'm pretty lost here. Can someone help me solve
this and get my master server cleaned up? What is causing this, and what do I need to do?

Kind regards

Re: URGENT issue: pg-xlog growing on master!

From

Dinesh Kumar

Date:

10 June 2013, 11:47:59

IIRC, this kind of situation can be expected when the archive command has failed on the master side. Could you check how many "000000xxxxxxx.ready" files reside under the master's pg_xlog/archive_status directory? Also, check the master server's recent pg_log file to find the root cause of the master server being down.
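
The check suggested above could be sketched roughly as follows. This is only an illustration: the function name count_ready is made up for this example, and the data directory path in the usage note is an assumption (a typical Debian layout), not something stated in this thread.

```shell
# Count WAL segments that are still waiting to be archived.
# Usage: count_ready /path/to/pg_xlog/archive_status
count_ready() {
    status_dir=$1
    # Each "<segment>.ready" file marks a segment the archiver has not yet
    # archived successfully; "<segment>.done" marks a finished one.
    find "$status_dir" -maxdepth 1 -type f -name '*.ready' | wc -l | tr -d ' '
}
```

For example, count_ready /var/lib/postgresql/9.2/main/pg_xlog/archive_status (path assumed). A large and steadily growing number would be consistent with a failing archive command.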

Dinesh

--
Dinesh Kumar

Software Engineer

www.enterprisedb.com

Re: URGENT issue: pg-xlog growing on master!

From

bricklen

Date:

10 June 2013, 14:36:11

On Mon, Jun 10, 2013 at 4:29 AM, Niels Kristian Schjødt <nielskristian@autouncle.com> wrote:

2013-06-10 11:21:45 GMT FATAL: could not connect to the primary server: could not connect to server: No route to host
Is the server running on host "192.168.0.4" and accepting
TCP/IP connections on port 5432?

Did anything get changed on the standby or master around the time this message started occurring?

On the master, what do the following show?

show port;
show listen_addresses;

The master's IP is still 192.168.0.4?

Have you tried connecting to the master using something like:
psql -h 192.168.0.4 -p 5432 -U postgres -d postgres

Does that throw a useful error or warning?
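
Before involving psql at all, plain TCP reachability of the master can be tested from the standby. The following is a minimal sketch, assuming bash and the coreutils timeout command are available; check_tcp is a hypothetical helper name, not a PostgreSQL tool.

```shell
# Return 0 if a TCP connection to host:port can be opened within 3 seconds,
# non-zero otherwise (connection refused, no route to host, or timeout).
check_tcp() {
    host=$1
    port=$2
    # bash's /dev/tcp pseudo-device opens a TCP socket on redirection.
    timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}
```

For example: check_tcp 192.168.0.4 5432 && echo reachable || echo unreachable. A failure here points at the network or at listen_addresses rather than at authentication.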

Re: URGENT issue: pg-xlog growing on master!

From

Niels Kristian Schjødt

Date:

10 June 2013, 15:35:51

It turned out that the switch port that the server was connected to was faulty, and hence no successful connection between master and slave was established. This resulted in pg_xlog building up very fast, because our system performs a lot of changes on the data we store.
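
To keep an eye on this kind of build-up, the number of segment files can be turned into a rough size estimate. This is a sketch that assumes the default 16 MB WAL segment size; wal_usage is a hypothetical helper name.

```shell
# Print the number of WAL segment files in a directory and their
# approximate total size, assuming the default 16 MB segment size.
wal_usage() {
    wal_dir=$1
    # WAL segment file names consist of 24 hexadecimal characters.
    count=$(ls "$wal_dir" | grep -cE '^[0-9A-F]{24}$')
    echo "$count segments, about $((count * 16)) MB"
}
```

Under the same assumption, a nearly full 1000 GB disk holding a 50 GB database corresponds to roughly 60,000 retained segments.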

I ended up running pg_archivecleanup on the master to get some space freed urgently. Then I got the switch replaced with a new one. Now I'm trying to set up streaming replication from scratch again, but with no luck.
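
For reference, pg_archivecleanup takes a directory and the name of the oldest segment to keep, and removes everything older. A cautious way to pick that name and preview the effect might look like the sketch below; oldest_segment_to_keep is a hypothetical helper, and the number of segments to keep is an arbitrary assumption.

```shell
# Print the name of the oldest WAL segment to keep so that the newest
# "keep" segments in the directory survive a cleanup.
oldest_segment_to_keep() {
    dir=$1
    keep=$2
    # Segment names sort in WAL order within a single timeline.
    ls "$dir" | grep -E '^[0-9A-F]{24}$' | LC_ALL=C sort | tail -n "$keep" | head -n 1
}
```

The result can be passed to a dry run first, for example pg_archivecleanup -n /var/lib/postgresql/9.2/wals "$(oldest_segment_to_keep /var/lib/postgresql/9.2/wals 100)", which only prints the files that would be removed. Note that pg_archivecleanup is designed for cleaning a WAL archive; segments removed from a live pg_xlog directory that are still needed cannot be recovered.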

I can't seem to figure out which steps I need to take to get the standby server wiped and started as a streaming replica again from scratch. I tried to follow the steps, from step 6 onward, at http://wiki.postgresql.org/wiki/Streaming_Replication but the process seems to fail when I reach the point where I run psql -c "SELECT pg_stop_backup()". It just says:

NOTICE: pg_stop_backup cleanup done, waiting for required WAL segments to be archived

WARNING: pg_stop_backup still waiting for all required WAL segments to be archived (60 seconds elapsed)