Re: Curiosity: what is PostgreSQL doing with data when "nothing" is happening? - Mailing list pgsql-novice

From Gavan Schneider
Subject Re: Curiosity: what is PostgreSQL doing with data when "nothing" is happening?
Date
Msg-id 16532-1353848086-612384@sneakemail.com
Whole thread Raw
In response to Re: Curiosity: what is PostgreSQL doing with data when "nothing" is happening?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Curiosity: what is PostgreSQL doing with data when "nothing" is happening?
List pgsql-novice
On Saturday, November 24, 2012 at 04:18, Tom Lane wrote:

>Gavan Schneider <pg-gts@snkmail.com> writes:
>>My surprise is that with the data on board, and nothing
>>happening, those WAL files now appear to have a lot more real
>>content, i.e., I am assuming this means the data and/or
>>configuration are getting altered even when there are no
>>connections. I know vacuum can change things but surely it
>>should run out of things to do after a few passes.
>
>In the base configuration, an idle Postgres installation won't generate
>any fresh WAL; but this isn't true if you've got hot-standby replication
>configured, because of the effect described here:
>http://archives.postgresql.org/pgsql-hackers/2011-10/msg00207.php
>
....

>However, you've got a bigger problem because it appears you're getting
>a WAL file switch after each checkpoint, which does *not* happen in a
>minimal replication configuration.  I suspect you have archive_timeout
>activated.
>
Correct

>You might want to reconsider those settings: the combination
>of hot standby and archive timeout seems a bit redundant to me.  If
>you're relying on a hot standby then you don't really need to worry
>about keeping the WAL archive fully up-to-date, and contrarily if you
>aren't using a hot standby then you should back off the wal_level
>setting to eliminate unnecessary WAL traffic.
>
But no hot standby...

My current settings (with some bulk stripped out):

     ./configure \
         --prefix=${VOL} \
         --datarootdir=${VOL}/configuration \
         --htmldir=${VOL} \
         --with-pgport=${PORT} \
         --with-perl --with-tcl --with-python\
         --with-openssl --with-pam with-ldap \
         --with-bonjour \
         --with-libxml --with-libxslt \
         --with-system-tzdata=/usr/share/zoneinfo \
         --with-wal-segsize=8

     #------------------------------------------------------------------------------
     # WRITE AHEAD LOG
     #------------------------------------------------------------------------------
     wal_level = archive     # minimal, archive, or hot_standby
     #fsync = on             # turns forced synchronization on
or off
     #synchronous_commit = on        # synchronization level;
     #wal_sync_method = fsync        # the default is the first option
     #full_page_writes = on          # recover from partial page writes
     #wal_buffers = -1           # min 32kB, -1 sets based on shared_buffers
     #wal_writer_delay = 200ms       # 1-10000 milliseconds
     #commit_delay = 0           # range 0-100000, in microseconds
     #commit_siblings = 5            # range 1-1000
     #checkpoint_segments = 3        # in logfile segments, min
1, 16MB each
     #checkpoint_timeout = 5min      # range 30s-1h
     #checkpoint_completion_target = 0.5 # checkpoint target
duration, 0.0 - 1.0
     #checkpoint_warning = 30s       # 0 disables
     archive_mode = on       # allows archiving to be done
     archive_command =
'/Volumes/SQL/postgres/Archive/bin/archive_command %p %f'
     archive_timeout = 600   # force a logfile segment switch
after this

     #------------------------------------------------------------------------------
     # REPLICATION
     #------------------------------------------------------------------------------
     #max_wal_senders = 0        # max number of walsender processes
     #wal_keep_segments = 0      # in logfile segments, 16MB
each; 0 disables
     #replication_timeout = 60s  # in milliseconds; 0 disables
     #synchronous_standby_names = '' # standby servers that
provide sync rep
     #vacuum_defer_cleanup_age = 0   # number of xacts by which
cleanup is delayed
     #hot_standby = off          # "on" allows queries during recovery
     #max_standby_archive_delay = 30s    # max delay before
canceling queries
     #max_standby_streaming_delay = 30s  # max delay before
canceling queries
     #wal_receiver_status_interval = 10s # send replies at least
this often
     #hot_standby_feedback = off     # send info from standby to prevent

I think my observation of WAL content is explained by Depesz:

<http://www.depesz.com/2011/07/14/write-ahead-log-understanding-postgresql-conf-checkpoint_segments-checkpoint_timeout-checkpoint_warning/>

or <http://tinyurl.com/bpvjv6z>

In summary: I now realize the WAL files are recycled and not
zeroed between rewrites. So what I am seeing is past activity
possibly overwritten by new activity. So there is always going
to be something in those files from now on. When the server is
idle most of the file will be the old stuff with a minimum
amount of new material overwritten at the beginning.

Therefore I probably need to turn off the archive_timeout and
let each WAL file fill with relevant new material before it cycles.

This leaves me with the need to find the best settings to allow
the WAL files to be properly filled, but not lose (too much)
data if the machine goes down before the WAL file is rotated.

Since this is mostly a low volume setup I figure a smaller WAL
file is sensible.

Therefore I wonder if I should just go down to 4MB, no
archive_timeout, and accept the (max) 4MB of updated pages as my
(worst case) loss if everything should go wrong at once? And,
maybe do the segment downsize when a new version needs
installing :)

Regards
Gavan



pgsql-novice by date:

Previous
From: Tom Lane
Date:
Subject: Re: Curiosity: what is PostgreSQL doing with data when "nothing" is happening?
Next
From: "Kevin Grittner"
Date:
Subject: Re: Curiosity: what is PostgreSQL doing with data when "nothing" is happening?