Re: LVM snapshots - Mailing list pgsql-admin
From: Murthy Kambhampaty
Subject: Re: LVM snapshots
Date:
Msg-id: 2D92FEBFD3BE1346A6C397223A8DD3FC0921C6@THOR.goeci.com
In response to: LVM snapshots ("Matt Clark" <matt@ymogen.net>)
Responses: Re: LVM snapshots
List: pgsql-admin
Here's what I have so far: the attachment includes a revised version of a script for backing up the database cluster, and the logs from a couple of runs of it. During these runs the server was under the following workload, a script that:

1. Copies a roughly 380,000-row, 85-column file into a newly created table called tbl1
2. Creates 3 single-column indexes on tbl1
3. In 3 parallel subshells, runs 3 separate insert ... from select ... queries that put data from tbl1 into tbl2, tbl3, and tbl4
4. Upon completion of step 3, drops tbl1 and repeats the process for 12 different text files with distinct data and similar size
5. Adds 3 single-column indexes to each of tbl2, tbl3, and tbl4, in 3 parallel subshells

Server hardware: quad PIII Xeon 500 MHz/2MB L2, 6 GB RAM; 168 GB HW RAID5 (data), with free space on a second 168 GB HW RAID5 for snapshots, on a dual-channel Mylex AcceleRAID 352; XFS filesystem logs on a Linux MD RAID1 connected with an Adaptec 39320-R.

Column sums on the larger tables on the production server and on the backup server gave identical values. Using a SAS System facility called PROC COMPARE on the smaller tables, they were found to be identical. Note, however, that (i) my rsync options fetch the WHOLE FILE each time a table changes, and (ii) because the attached logs are from tables that were being updated on the production server, the copies of these tables have been verified only for recoverability (VACUUM ANALYZE completes successfully), not for "accuracy". Users in transaction-processing environments will therefore have to do their own testing. (I have verified that doing incremental rsyncs until the data stabilize is also reliable; I'm just being paranoid on this one, and the cost is not too high.)

I have also successfully run the backup simultaneously with the above workload on a dual PIII 1 GHz/512KB L2 with 528 MB RAM, with the 60 GB IDE data disk and the 6 GB IDE snapshot disk on the SAME IDE channel.
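For concreteness, the workload above can be sketched roughly as follows. This is a minimal sketch, not the actual script: the table names tbl1..tbl4, columns c1..c3, and file names data01.txt etc. are assumptions, and RUN=echo makes it a dry run that prints each command instead of executing it (drop RUN to run for real against a live database).

```shell
# Dry-run sketch of the loader workload (hypothetical names throughout).
RUN="echo"

load_one_file() {
    f=$1
    $RUN psql -c "DROP TABLE IF EXISTS tbl1"             # fresh load table
    $RUN psql -c "COPY tbl1 FROM '$f'"                   # 1. bulk load ~380k rows
    for col in c1 c2 c3; do                              # 2. three single-column indexes
        $RUN psql -c "CREATE INDEX tbl1_$col ON tbl1 ($col)"
    done
    for t in tbl2 tbl3 tbl4; do                          # 3. three parallel inserts
        $RUN psql -c "INSERT INTO $t SELECT ... FROM tbl1" &
    done
    wait                                                 # 4. all done; drop and repeat
}

for f in data01.txt data02.txt; do                       # ...12 files in the original
    load_one_file "$f"
done
```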
This version uses a two-phase rsync: the first rsync copies the live $PGDATA directory, excluding the pg_xlog/ folder, from the production server to the backup server; the second runs against an LVM snapshot of the $PGDATA folder. The postmaster is started and stopped on the backup server, so that any problems can be identified right away. (Notice the "ReadRecord: unexpected pageaddr 13C/98EDA000 in log file 317, segment 11, offset 15572992" in the later log. This seems to be a non-critical error; VACUUM ANALYZE gave a short series of:

NOTICE: Rel <table name>: Uninitialized page 54300 - fixing
...
VACUUM

) Others have suggested the two-phase rsync, but there the second phase required shutting down the postmaster. Besides the obvious advantage, it turns out that because snapshots are fast -- especially if fsync=true -- the second rsync has little to do (if you turn off --exclude='pg_xlog/' in the first phase, you may find that the backup is fully complete and the snapshot is removed almost immediately after creation). Thus, the time during which the snapshot is up and dragging down filesystem operations is minimized.

ADDITIONAL NOTES:

Data recovery from backup is done with:

pg_dump -Fc -h bkHost -U pguser [ -t <tbl-to-recover> ] db-to-recover | pg_restore -h pgHost -U pguser -c -d db-to-recover

As the 7.4devel docs say: "WAL offers the opportunity for a new method for database on-line backup and restore (BAR). To use this method, one would have to make periodic saves of data files to another disk, a tape or another host and also archive the WAL log files. The database file copy and the archived log files could be used to restore just as if one were restoring after a crash. Each time a new database file copy was made the old log files could be removed. Implementing this facility will require the logging of data file and index creation and deletion; it will also require development of a method for copying the data files (operating system copy commands are not suitable)."
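The two phases can be sketched as below. This is a hedged sketch, not the attached script: the volume group vg0, logical volume pgdata, backup host bkHost, snapshot size, and mount options are all assumptions, and RUN=echo turns the privileged commands into a dry run.

```shell
# Dry-run sketch of the two-phase rsync backup (assumed names: vg0, bkHost).
RUN="echo"
PGDATA=/var/lib/pgsql/data
SNAP=/mnt/snap_pgdata

two_phase_backup() {
    # Phase 1: copy the live cluster, skipping WAL; the postmaster stays up.
    $RUN rsync -a --delete --exclude='pg_xlog/' "$PGDATA/" "bkHost:$PGDATA/"

    # Freeze, snapshot, thaw: the filesystem is held still only briefly.
    $RUN xfs_freeze -f "$PGDATA"
    $RUN lvcreate -s -L 2G -n snap_pgdata /dev/vg0/pgdata
    $RUN xfs_freeze -u "$PGDATA"

    # Phase 2: copy from the snapshot; phase 1 left little to transfer.
    $RUN mount -o ro,nouuid /dev/vg0/snap_pgdata "$SNAP"
    $RUN rsync -a --delete "$SNAP/" "bkHost:$PGDATA/"
    $RUN umount "$SNAP"
    $RUN lvremove -f /dev/vg0/snap_pgdata
}
two_phase_backup
```

The nouuid mount option is needed because an XFS snapshot carries the same filesystem UUID as its origin.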
If you can stand doing it with filesystem-level tools (LVM or EVMS on Linux; hw snapshots on other OSes?), you can do this already, as shown here. Besides BAR, here are a couple of applications to consider:

1. A limited form of PITR: you can recover any object to its state at the last snapshot backup. (Of course, log roll-forward is missing.) From the Oracle 9i documentation: "Introduction to RMAN TSPITR Recovery Manager (RMAN) automated tablespace point-in-time recovery (TSPITR) enables you to quickly recover one or more tablespaces to a time that is different from that of the rest of the database. RMAN TSPITR is most useful for recovering the following: * An erroneous DROP TABLE or TRUNCATE TABLE statement * A table that has become logically corrupted * An incorrect batch job or other DML statement that has affected only a subset of the database ..." Can do here.

2. It is much easier to upgrade versions: initdb the new version on pgHost, then restore (pg_dumpall -h bkHost | psql) from the upgraded server.

3. If $PGDATA/pg_xlog is on a different set of disks than $PGDATA, you should be able to adapt with:

xfs_freeze -f $PGXLOG_FS
xfs_freeze -f $PGDATA_FS
lvcreate -s -L <size> -n snap_pgdata <the $PGDATA_FS device>
lvcreate -s -L <size> -n snap_pgxlog <the $PGXLOG_FS device>
xfs_freeze -u $PGDATA_FS
xfs_freeze -u $PGXLOG_FS

(on the assumption that the postmaster isn't doing anything while waiting for pg_xlog/ writes to complete). You can then mount the snapshots in the proper places, rsync, and go. This has not been tested, though it's on my agenda. Could the postgresql developers comment on this?

As Matt points out, this strategy is a useful addition to the postgresql administrator's tools. It seems useful to update the documentation, and to point out that the WAL design already gives the benefit of online backup and restore when combined with externally available snapshot facilities. QED.
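The untested two-snapshot variant for a separate pg_xlog filesystem can be put in runnable (dry-run) form as follows; the device paths and mount points are assumptions. pg_xlog is frozen first and thawed last, so no WAL write can land between the two snapshots.

```shell
# Dry-run sketch of the dual-snapshot sequence (hypothetical device names).
RUN="echo"

snapshot_both() {
    $RUN xfs_freeze -f /pg_xlog                          # freeze WAL first
    $RUN xfs_freeze -f /pg_data
    $RUN lvcreate -s -L 1G -n snap_pgdata /dev/vg0/pgdata
    $RUN lvcreate -s -L 1G -n snap_pgxlog /dev/vg0/pgxlog
    $RUN xfs_freeze -u /pg_data
    $RUN xfs_freeze -u /pg_xlog                          # thaw WAL last
}
snapshot_both
```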
Cheers,
Murthy

-----Original Message-----
From: Peter Eisentraut [mailto:peter_e@gmx.net]
Sent: Sunday, March 16, 2003 09:05
To: Matt
Cc: Murthy Kambhampaty; pgsql-admin@postgresql.org
Subject: Re: [ADMIN] LVM snapshots

Matt writes:

> If the theory can be robustly demonstrated to work in practice (which
> you may have already done) then I say this strategy needs to be
> recognised in the docs (or at least the techdocs) as a valid and useful one.

The note in the documentation is rather old. If you can conclusively
disprove it, then we'll be happy to alter it.

--
Peter Eisentraut   peter_e@gmx.net