Re: Incremental Refresh - PostgreSQL 9.2 - Mailing list pgsql-admin

From Matheus de Oliveira
Subject Re: Incremental Refresh - PostgreSQL 9.2
Date
Msg-id CAJghg4KJzKqqP8gPtUm1Ws0V-+a84imPNM5JiwgEvJv1P+0jaw@mail.gmail.com
Whole thread Raw
In response to Incremental Refresh - PostgreSQL 9.2  ("drum.lucas@gmail.com" <drum.lucas@gmail.com>)
List pgsql-admin

On Tue, Feb 23, 2016 at 6:59 PM, drum.lucas@gmail.com <drum.lucas@gmail.com> wrote:
I.E: /usr/pgsql-9.2/bin/pg_dump --exclude-table-data='junk.*' --format=custom geoop_live > bigdump.sql

How are restoring it? Have tried using pg_restore with "-j" or "--jobs" option? This tend to make the restore process way faster. And as you should be doing logical backups routinely, you can simple use those.

In fact, if you are on 9.3+, you can also use directory format on pg_dump to have -j option there also.

Now that the test server has a full copy from master, how could I just do a incremental refreshing once a month?

Besides the already mentioned options, I see two more, but both are based on physical backup of the primary dabase, and not a logical backup (by using pg_dump):

Option 1: you can use rsync to copy from primary to test database, while on the first run it will take awhile, the next runs might be way faster *if* good part of the files haven't changed. Problem with rsync is that you either need to have the primary database completely shtdown or you need to use --checksum option of rsync (it is not safe otherwise and it makes a read the entire data set, but transfer only the changed files). Another option is to use rsync from an standby, streaming replica, database. If you choose to use rsync with --checksum, you need to do somthing in the lines of:

    a) shutdown test database
    b) on primary: psql -c "SELECT pg_start_backup('rsync-test');"
    c) on any node: rsync -azvP --checksum <primary>:/path/to/pgdata <test>:/path/to/pgata
    d) on primary: psql -c "SELECT pg_stop_backup();"
    e) copy archivelogs to test database and setup recovery.conf with restore_command at least
    f) start test database

You need to check how much rsync --checksum really buy you, sometimes it is just faster to copy everything (but then pg_dump/pg_restore is probably fast enough either), but, if the nodes are geographic apart and most of the files haven't changed, then --checksum is probably faster.

Option 2: use pg_rewind [1], if you can afford to save every WAL segment generated since each "refresh".
Best regards,
--
Matheus de Oliveira


pgsql-admin by date:

Previous
From: Albe Laurenz
Date:
Subject: Re: PostgreSQL 9.3.5 - Enable SSL
Next
From: Stephen Frost
Date:
Subject: Re: Incremental Refresh - PostgreSQL 9.2