Thread: Curiosity: what is PostgreSQL doing with data when "nothing" is happening?

Curiosity: what is PostgreSQL doing with data when "nothing" is happening?

From: Gavan Schneider

Context:
After an unfortunate experience I have implemented a WAL
archive backup strategy as per the documentation, with the
minor extensions that:
     1. The archive files are compressed (the wrapper script
is sketched below);
     2. The WAL segments are only 8MB (it's a low-activity
setup);
     3. A WAL segment switch is forced every 10 minutes
(still under examination); and,
     4. The archive directory can be rotated when desired,
i.e., when taking a new baseline archive.
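
For reference, the wrapper behind archive_command is essentially
a small shell script along these lines (simplified sketch; the
real one also handles the directory rotation, and the dated
archive directory shown here is only illustrative):

     #!/bin/sh
     # called by PostgreSQL as: archive_command %p %f
     #   $1 = path of the WAL segment, $2 = segment file name
     ARCHIVE_DIR=/Volumes/SQL/postgres/Archive/20121121
     # never overwrite an existing archive file
     test ! -f "$ARCHIVE_DIR/$2.bz2" || exit 1
     # compress into the archive; a non-zero exit makes PostgreSQL retry
     bzip2 -c "$1" > "$ARCHIVE_DIR/$2.bz2"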

With this debugged and working I restored my salvaged data.

Assumptions (please correct me here):

     From my reading I understand the WAL file sequence as
recording all changes to the cluster, such that replaying
them will recreate the cluster in a consistent state as at
the time the last WAL file was written to disk. From this I
have extrapolated that no activity should mean no
alterations need to be written to a WAL file, i.e., minimal
information in the WAL file when PostgreSQL is idle.
     I am also assuming the compressed size of a WAL file is
a rough measure of real information, as opposed to padding
of the file. And I did see compressed WAL files of less than
300 bytes (0.3KB!) before my data was loaded into the
database, and much bigger (compressed) files reflecting the
data being imported. All very sensible and understandable.
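
As a rough check on that assumption, a segment's worth of pure
zeros compresses to almost nothing with bzip2, which is
consistent with the sub-300-byte archives seen before the data
went in (one-liner as run on OS X; Linux dd spells it bs=1M):

     # 8MB of zeros, i.e. the same size as one WAL segment here
     dd if=/dev/zero bs=1m count=8 2>/dev/null | bzip2 -c | wc -c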

My surprise is that with the data on board, and nothing
happening, those WAL files now appear to have a lot more real
content, i.e., I am assuming this means the data and/or
configuration are getting altered even when there are no
connections. I know vacuum can change things but surely it
should run out of things to do after a few passes.

This is a snapshot of the WAL archive files. There is a file
every ten minutes. No data has been changed for 12 hours, no
connections open to the database for at least 30 minutes. No
automatic processes other than default vacuum settings, etc. The
archive commit process always seems to be 6-8 files behind.


dir:Archive postgres$ ls -ls *21 active_wal_files/
20121121:
total 351744
  ...
  584 -rw-------+ 1 postgres  _postgres   295904 Nov 24 10:46 000000010000000100000087.bz2
  232 -rw-------+ 1 postgres  _postgres   116480 Nov 24 10:56 000000010000000100000088.bz2
3104 -rw-------+ 1 postgres  _postgres  1585487 Nov 24 11:06 000000010000000100000089.bz2
1224 -rw-------+ 1 postgres  _postgres   624755 Nov 24 11:16 00000001000000010000008A.bz2
  744 -rw-------+ 1 postgres  _postgres   377394 Nov 24 11:26 00000001000000010000008B.bz2
   80 -rw-------+ 1 postgres  _postgres    38856 Nov 24 11:36 00000001000000010000008C.bz2
   32 -rw-------+ 1 postgres  _postgres    16147 Nov 24 11:46 00000001000000010000008D.bz2
   48 -rw-------+ 1 postgres  _postgres    21710 Nov 24 11:56 00000001000000010000008E.bz2
6816 -rw-------+ 1 postgres  _postgres  3487066 Nov 21 07:56 base_archive_20121121_075602.tbz

active_wal_files/:
total 131080
     8 -rw-------+ 1 postgres  _postgres      381 Nov 21 07:56 0000000100000000000000C5.00000020.backup
16384 -rw-------+ 1 postgres  _postgres  8388608 Nov 24 11:46 00000001000000010000008D
16384 -rw-------+ 1 postgres  _postgres  8388608 Nov 24 11:56 00000001000000010000008E
16384 -rw-------+ 1 postgres  _postgres  8388608 Nov 24 10:46 00000001000000010000008F
16384 -rw-------+ 1 postgres  _postgres  8388608 Nov 24 10:56 000000010000000100000090
16384 -rw-------+ 1 postgres  staff      8388608 Nov 24 11:06 000000010000000100000091
16384 -rw-------+ 1 postgres  staff      8388608 Nov 24 11:16 000000010000000100000092
16384 -rw-------+ 1 postgres  _postgres  8388608 Nov 24 11:26 000000010000000100000093
16384 -rw-------+ 1 postgres  _postgres  8388608 Nov 24 11:36 000000010000000100000094
     0 drwx------+ 2 postgres  staff          170 Nov 24 11:56 archive_status
dir:Archive postgres$
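
As an aside, the archive_status directory shows which segments
the archiver is still behind on: a *.ready marker means the
segment has not yet been archived, and it is renamed to *.done
once archive_command succeeds:

     # *.ready = waiting for archive_command; *.done = archived
     ls active_wal_files/archive_status/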


Question: what is behind the idle content in the WAL files?

Thank you for your time.
Regards
Gavan
--
Gavan Schneider             <gavan@galeel.org>
Ph. +61 2 6355 6256         "Pendari" Anarel Road
Mb. +61 405 124 883         Sodwalls NSW 2790
                             Australia



Re: Curiosity: what is PostgreSQL doing with data when "nothing" is happening?

From: Tom Lane

Gavan Schneider <pg-gts@snkmail.com> writes:
> My surprise is that with the data on board, and nothing
> happening, those WAL files now appear to have a lot more real
> content, i.e., I am assuming this means the data and/or
> configuration are getting altered even when there are no
> connections. I know vacuum can change things but surely it
> should run out of things to do after a few passes.

In the base configuration, an idle Postgres installation won't generate
any fresh WAL; but this isn't true if you've got hot-standby replication
configured, because of the effect described here:
http://archives.postgresql.org/pgsql-hackers/2011-10/msg00207.php

The quick-and-dirty patch that was proposed in that thread was later
found to have problems of its own and reverted, so right at the moment
we're back to the behavior Fujii-san described.  I'm not sure if anyone
is thinking about other ways to avoid it; a few bytes of WAL every ten
minutes is not the end of the world exactly, especially since hot
standby is typically used in installations that are far from idle.

However, you've got a bigger problem because it appears you're getting
a WAL file switch after each checkpoint, which does *not* happen in a
minimal replication configuration.  I suspect you have archive_timeout
activated.  You might want to reconsider those settings: the combination
of hot standby and archive timeout seems a bit redundant to me.  If
you're relying on a hot standby then you don't really need to worry
about keeping the WAL archive fully up-to-date, and contrarily if you
aren't using a hot standby then you should back off the wal_level
setting to eliminate unnecessary WAL traffic.

            regards, tom lane


Re: Curiosity: what is PostgreSQL doing with data when "nothing" is happening?

From: Gavan Schneider

On Saturday, November 24, 2012 at 04:18, Tom Lane wrote:

>Gavan Schneider <pg-gts@snkmail.com> writes:
>>My surprise is that with the data on board, and nothing
>>happening, those WAL files now appear to have a lot more real
>>content, i.e., I am assuming this means the data and/or
>>configuration are getting altered even when there are no
>>connections. I know vacuum can change things but surely it
>>should run out of things to do after a few passes.
>
>In the base configuration, an idle Postgres installation won't generate
>any fresh WAL; but this isn't true if you've got hot-standby replication
>configured, because of the effect described here:
>http://archives.postgresql.org/pgsql-hackers/2011-10/msg00207.php
>
....

>However, you've got a bigger problem because it appears you're getting
>a WAL file switch after each checkpoint, which does *not* happen in a
>minimal replication configuration.  I suspect you have archive_timeout
>activated.
>
Correct

>You might want to reconsider those settings: the combination
>of hot standby and archive timeout seems a bit redundant to me.  If
>you're relying on a hot standby then you don't really need to worry
>about keeping the WAL archive fully up-to-date, and contrarily if you
>aren't using a hot standby then you should back off the wal_level
>setting to eliminate unnecessary WAL traffic.
>
But no hot standby...

My current settings (with some bulk stripped out):

     ./configure \
         --prefix=${VOL} \
         --datarootdir=${VOL}/configuration \
         --htmldir=${VOL} \
         --with-pgport=${PORT} \
         --with-perl --with-tcl --with-python \
         --with-openssl --with-pam --with-ldap \
         --with-bonjour \
         --with-libxml --with-libxslt \
         --with-system-tzdata=/usr/share/zoneinfo \
         --with-wal-segsize=8

     #------------------------------------------------------------------------------
     # WRITE AHEAD LOG
     #------------------------------------------------------------------------------
     wal_level = archive     # minimal, archive, or hot_standby
     #fsync = on                     # turns forced synchronization on or off
     #synchronous_commit = on        # synchronization level;
     #wal_sync_method = fsync        # the default is the first option
     #full_page_writes = on          # recover from partial page writes
     #wal_buffers = -1               # min 32kB, -1 sets based on shared_buffers
     #wal_writer_delay = 200ms       # 1-10000 milliseconds
     #commit_delay = 0               # range 0-100000, in microseconds
     #commit_siblings = 5            # range 1-1000
     #checkpoint_segments = 3        # in logfile segments, min 1, 16MB each
     #checkpoint_timeout = 5min      # range 30s-1h
     #checkpoint_completion_target = 0.5 # checkpoint target duration, 0.0 - 1.0
     #checkpoint_warning = 30s       # 0 disables
     archive_mode = on               # allows archiving to be done
     archive_command = '/Volumes/SQL/postgres/Archive/bin/archive_command %p %f'
     archive_timeout = 600           # force a logfile segment switch after this

     #------------------------------------------------------------------------------
     # REPLICATION
     #------------------------------------------------------------------------------
     #max_wal_senders = 0            # max number of walsender processes
     #wal_keep_segments = 0          # in logfile segments, 16MB each; 0 disables
     #replication_timeout = 60s      # in milliseconds; 0 disables
     #synchronous_standby_names = '' # standby servers that provide sync rep
     #vacuum_defer_cleanup_age = 0   # number of xacts by which cleanup is delayed
     #hot_standby = off              # "on" allows queries during recovery
     #max_standby_archive_delay = 30s    # max delay before canceling queries
     #max_standby_streaming_delay = 30s  # max delay before canceling queries
     #wal_receiver_status_interval = 10s # send replies at least this often
     #hot_standby_feedback = off     # send info from standby to prevent query conflicts

I think my observation of WAL content is explained by Depesz:

<http://www.depesz.com/2011/07/14/write-ahead-log-understanding-postgresql-conf-checkpoint_segments-checkpoint_timeout-checkpoint_warning/>

or <http://tinyurl.com/bpvjv6z>

In summary: I now realize that the WAL segment files are
recycled, not zeroed, between rewrites, so what I am seeing is
past activity partially overwritten by new activity. From now
on there will always be something in those files; when the
server is idle, most of each file is old content, with only a
small amount of new material written over the beginning.
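
A crude way to convince myself of this is to look at one of the
recycled segments in the active WAL directory that is waiting to
be reused (file name taken from the listing above; illustrative
only):

     # recycled for future use and not yet rewritten, yet it still
     # compresses to a substantial size because it is carrying old
     # WAL records rather than zero padding
     bzip2 -c < active_wal_files/000000010000000100000090 | wc -c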

Therefore I probably need to turn off the archive_timeout and
let each WAL file fill with relevant new material before it cycles.

This leaves me with the need to find the best settings to allow
the WAL files to be properly filled, but not lose (too much)
data if the machine goes down before the WAL file is rotated.

Since this is mostly a low volume setup I figure a smaller WAL
file is sensible.

Therefore I wonder if I should just go down to 4MB, no
archive_timeout, and accept the (max) 4MB of updated pages as my
(worst case) loss if everything should go wrong at once? And,
maybe do the segment downsize when a new version needs
installing :)
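
For concreteness, the change I have in mind is roughly this
(untested sketch; changing the segment size means a rebuild and
a fresh initdb):

     # configure: only the segment size changes, everything else as above
     #     --with-wal-segsize=4      # 4MB WAL segments

     # postgresql.conf: keep archiving, drop the forced 10-minute switch
     wal_level = archive
     archive_mode = on
     archive_command = '/Volumes/SQL/postgres/Archive/bin/archive_command %p %f'
     archive_timeout = 0             # 0 disables the forced segment switch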

Regards
Gavan



Re: Curiosity: what is PostgreSQL doing with data when "nothing" is happening?

From: Tom Lane

Gavan Schneider <pg-gts@snkmail.com> writes:
> On Saturday, November 24, 2012 at 04:18, Tom Lane wrote:
>> In the base configuration, an idle Postgres installation won't generate
>> any fresh WAL; but this isn't true if you've got hot-standby replication
>> configured, because of the effect described here:
>> http://archives.postgresql.org/pgsql-hackers/2011-10/msg00207.php
>> ....
>> You might want to reconsider those settings: the combination
>> of hot standby and archive timeout seems a bit redundant to me.  If
>> you're relying on a hot standby then you don't really need to worry
>> about keeping the WAL archive fully up-to-date, and contrarily if you
>> aren't using a hot standby then you should back off the wal_level
>> setting to eliminate unnecessary WAL traffic.

> But no hot standby...

Yeah, I took a second look and realized that you'd get this effect even
without hot_standby --- it's the alternation of checkpoint records and
forced segment switches from archive_timeout that creates the effect.

If you're concerned about minimizing WAL traffic at idle, you might want
to think about setting up a streaming replication arrangement instead of
relying on archive segment switches.  archive_timeout is really kind
of a hack.
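
At its simplest that's something along these lines (sketch for a
9.1/9.2-era setup; the host, address and replication role are
placeholders):

     # primary postgresql.conf
     wal_level = archive             # hot_standby only if the standby runs queries
     max_wal_senders = 1
     wal_keep_segments = 32          # cushion if the standby falls behind briefly

     # primary pg_hba.conf
     host  replication  repuser  192.0.2.10/32  md5

     # standby recovery.conf (after a base backup, e.g. with pg_basebackup)
     standby_mode = 'on'
     # password can come from ~/.pgpass rather than primary_conninfo
     primary_conninfo = 'host=primary.example port=5432 user=repuser'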

            regards, tom lane