Re: pg_xlog Concern - Mailing list pgsql-general

From PT
Subject Re: pg_xlog Concern
Date
Msg-id 20150520070911.096a82b21b975098cdec7369@potentialtech.com
Whole thread Raw
In response to Re: pg_xlog Concern  (Koray Eyidoğan <korayey@gmail.com>)
List pgsql-general
(I'm not going to try to fix the top-posting)

I've seen what I think you're describing on several occasions.

What we basically discovered is very simple: disks have a top speed and
you can't exceed it.

If I understand you correctly, you see the # of wal segments grow very
large during and shortly after bulk data changes, then they shrink back
down to what you'd expect, but grow again during the next bulk data
change.

Essentially, what we discovered was happening was that we were doing
bulk data manipulations at about 100% the speed of the disk subsystem.
As a result, there was no additional capacity for the wal archiving to
copy files with. Archiving still runs, but it's much slowed down, just
like any other disk activity when the disks are very busy. Since
PostgreSQL thinks your data is important, it won't discard wal files
that have not yet been archived, so they stick around until it can
catch up.

If I'm diagnosing your situation correclty, you have a few options:

1) Just accept it, that's what we decided since the bulk operations
   only happened about once a month and the problem cleared up in a
   few hours.
2) Faster disks: move to SSDs or a better RAID controller or whatever
   it takes to make the disks fast enough not to have the problem.
3) Move the wal and/or the wal archive directories onto a different disk
   subsystem, which essentially increases the speed of the disks through
   "divide and conquer". You many not even need new hardware to accomplish
   this -- if you have enough disks you might benefit from rearranging
   how they're organized in the RAID controller. Of course, you'll have
   to back up and restore the system to do so.
4) Change the application that does the bulk loading to throttle itself
   so it doesn't overload the disks, which will then allow wal archiving
   to keep up through the process.

Of course, each of these solutions has its benefits and drawbacks, so
you'll have to decide which is right for you.

On Mon, 18 May 2015 17:34:21 +0300
Koray Eyidoğan <korayey@gmail.com> wrote:

> Hi,
>
> Any kind of slowness on your archive directory may cause the archiving
> process fall behind thus accumulating segment files in your cluster's
> pg_xlog directory.
>
> I assume that you are on PostgreSQL 9.4. Could you please check your
> archiver status by typing "select * from pg_catalog.pg_stat_archiver;" in
> psql ? If the last_archived_wal column's value is not so close to your
> current xlog location, then it probably means a slow write speed on your
> archive path compared to your pg_xlog path.
>
> You can check your current xlog file by typing "select
> pg_xlogfile_name(pg_current_xlog_location());" in psql.
>
> If you are not on PostgreSQL 9.4, you can check your archiver status by
> typing "ps -ef | grep archiver" in your shell.
>
> I've also assumed that you are not using replication slots.
>
> As far as I know, long running transactions (just sitting in idle) won't
> affect pg_xlog directory size. Correct me if I'm wrong.
>
> Hope that helps.
>
> Have a nice day.
>
> Koray
>
>
>
>
> On Mon, May 18, 2015 at 5:00 PM, Torsten Förtsch <torsten.foertsch@gmx.net>
> wrote:
>
> > On 18/05/15 13:44, Sachin Srivastava wrote:
> > > But currently my pg_xlog size is 60 GB and there are 3740 WAL file in
> > > this folder and in Last week this was 400 GB(pg_xlog folder) and WAL
> > > file were approx. 30000. Due to archiving pg_xlog folder size is
> > > decreasing now but it’s taking one week to come in normal size.
> >
> > Any chance you have unfinished transactions running for a week?
> >
> > pg_stat_activity should be able to tell you that.
> >
> > Torsten
> >
> >
> > --
> > Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-general
> >


--
PT <wmoran@potentialtech.com>


pgsql-general by date:

Previous
From: Naveed Shaikh
Date:
Subject: Re: Memory Utilization Issue
Next
From: George Neuner
Date:
Subject: partitioned tables