Re: log files - Mailing list pgsql-novice

From Amit Langote
Subject Re: log files
Date
Msg-id CA+HiwqGrGAOzttxV9mwGhi8EF22i+q1fi2O8J2b+qW1w_sgcxw@mail.gmail.com
Whole thread Raw
In response to Re: log files  (jjurban <jjurban@attglobal.net>)
List pgsql-novice
On Fri, May 17, 2013 at 12:19 AM, jjurban <jjurban@attglobal.net> wrote:
> Kevin Grittner wrote:
>>
>> jjurban <jjurban@attglobal.net> wrote:
>>
>>>
>>> Amit Langote wrote:
>>>
>>
>>
>>
>>>
>>> There are over 5.7 Gigabytes of these files.  This cannot be
>>> correct.
>>>
>>
>>
>> That depends entirely on your configuration.  In some environments
>> that would be excessive; in others it might be just right.
>>
>
>
>
> Here's what I did - I checked those pg_xlog files and discovered that the
> last 2 entries in the pg_subtrans directory had dates of 4/12/2013 and
> 5/10/2013 - are these considered "checkpoints?"

You can see the last checkpoint using pg_controldata program. You can
pass to it the path of your data directory and it tells (among other
things) the latest checkpoint location or rather latest checkpoint's
REDO location.

http://www.postgresql.org/docs/8.0/static/app-pgcontroldata.html

You can copy that location and use an administrative xlog function to
see which WAL segment file it is contained it and safely delete all
the files before that file. A sample use of that function:


[amit@amit-centos64 pgbench]$ ls ~/pgdata/pg/pg_xlog/
00000001000000000000001F  000000010000000000000021
000000010000000000000023  000000010000000000000025  archive_status
000000010000000000000020  000000010000000000000022
000000010000000000000024  000000010000000000000026


[amit@amit-centos64 pgbench]$ pg_controldata ~/pgdata/pg | head -n 9
pg_control version number:            937
Catalog version number:               201305061
Database system identifier:           5878462236927524454
Database cluster state:               in production
pg_control last modified:             Fri 17 May 2013 09:34:09 AM JST
Latest checkpoint location:           0/2384D628
Prior checkpoint location:            0/208F1830
Latest checkpoint's REDO location:    0/22005E38
Latest checkpoint's REDO WAL file:    000000010000000000000022


[amit@amit-centos64 pgbench]$ psql -d postgres -c "select
pg_xlogfile_name('0/22005E38')"
Password:
     pg_xlogfile_name
--------------------------
 000000010000000000000022
(1 row)


So that I can safely delete 00000001000000000000001F,
000000010000000000000020  and 000000010000000000000021 or under proper
configuration they would be recycled or deleted by PostgreSQL itself.

You need to be cautious here not to use the file modified times from
ls listing to make that decision, since, some of the previous files
are possibly recycled from the old segment files which you otherwise
would have deleted). I suggest you rather read a bit about how WAL
segments are named so that you can understand which ones are *before*
a given file and can safely be deleted. In fact proper WAL
configurtion based on your workload requirements would never require
you to bother yourself with this manual process, since, PostgreSQL
itself takes care of keeping the pg_xlog clean.

I have found this article to be useful to learn about WAL:


http://www.depesz.com/2011/07/14/write-ahead-log-understanding-postgresql-conf-checkpoint_segments-checkpoint_timeout-checkpoint_warning/


>
> I still am not clear on what I have to do to avoid generating these big
> files that I really don't need.  I don't want to do archiving and I always
> back up by simply zipping my PGSData folder.
>

Well, depending on your crash safety requirements you would want to
keep at least *some* of the WAL around by proper configuration. With
no archiving (archive_mode=off), you would want to keep
wal_keep_segments to a decent value for crash safety. You are taking
backups anyway, so keeping WAL around would always help to recover if
need be. You would want to have a look at checkpoint_segments, which
decides both how often checkpoints happen and and also how many WAL
segments pg_xlog would contain at any given time, though exact formula
to calculate that number is different). Do read the references from
which you will get an idea as to what it all means and what is needed
to accomplish what you are trying to without risking safety of your
own data.

Amit Langote


pgsql-novice by date:

Previous
From: jjurban
Date:
Subject: Re: log files
Next
From: "Birchall, Austen"
Date:
Subject: pg_ctl stop failure