Thread: Backup routine
Hi
Where I work, backups of the PostgreSQL databases are currently done the following way:
there is a daily copy of nearly the whole hard disk (excluding /tmp, /proc, /dev and so on) of the machine the databases live on,
and in addition there is a script that runs pg_dump for each of the databases on the server.
This daily disk copy is made while the postmaster is active (without stopping the daemon), so the data
under /usr/local/pgsql/data would not be 100% consistent, I guess.
Supposing there was a failure and the whole thing had to be restored, I think the recovery
procedure would be the following:
1) Copy the data from the backup disk to a new disk
2) Once that is done, delete the postmaster.pid file and start the postmaster
3) Drop all databases and recreate them from the pg_dump files
I have some questions about this backup routine:
If I recover data from that "inconsistent" backup disk, I know the binaries (psql, pg_dump and so on)
will be fine, but the data may have some inconsistencies. Would the postmaster still start and work
properly despite the possibly inconsistent data? Would it start, work normally, and keep the
information about users and groups? I mention users and groups because that information is not
dumped by pg_dump; I was thinking about using "pg_dump -g" to generate it.
I was also thinking about excluding /usr/local/pgsql/data from the backup routine, since the data is
already covered by the files generated by pg_dump. The problem is that this directory holds not only
the database data but also some config files, such as postgresql.conf.
Opinions are welcome.
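To make the setup concrete, here is a sketch of what such a nightly per-database dump script might look like. It is illustrative only, not the actual script; the backup location and the database-list query are assumptions:

#!/bin/sh
# Illustrative sketch of a nightly per-database dump script -- not the
# actual script; the backup location and the database list are assumptions.
BACKUPDIR=/var/backups/pgsql/`date +%Y%m%d`
mkdir -p "$BACKUPDIR"

# Dump every database that accepts connections, one file per database.
for db in `psql -At -d template1 -c "SELECT datname FROM pg_database WHERE datallowconn"`
do
    pg_dump "$db" > "$BACKUPDIR/$db.dump"
done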
"Enio Schutt Junior" <enio@pmpf.rs.gov.br> writes: > Here, where I work, the backups of the postgresql databases are being done = > the following way: > There is a daily copy of nearly all the hd (excluding /tmp, /proc, /dev and= > so on) in which databases are=20 > and besides this there is also one script which makes the pg_dump of each o= > ne of the databases on the server. > This daily copy of the hd is made with postmaster being active (without sto= > pping the daemon), so the data > from /usr/local/pgsql/data would not be 100% consistent, I guess.=20 > Supposing there was a failure and it was needed to restore the whole thing,= > I think the procedure to > recovery would be the following: > 1) Copy data from the backup hd to a new hd > 2) Once this was done, delete the postmaster.pid file and start the postmas= > ter service > 3) Drop all databases and recreate them from those pg_dump files I would just initdb and then load the pg_dump files. An unsynchronized copy of /usr/local/pgsql/data is just about completely untrustworthy. You should use pg_dumpall to make a dump of user and group status; pg_dump will not do that. > I was also thinking about excluding /usr/local/pgsql/data from the > backup routine, as the data is also in other files generated by > pg_dump. The problem is that this directory has not only the databases > data but also some config files, like postgresql.conf. Yeah. Instead, exclude the directories below it ($PGDATA/base, etc). regards, tom lane
On Friday 08 August 2003 18:59, Enio Schutt Junior wrote:
> This daily disk copy is made while the postmaster is active (without
> stopping the daemon), so the data under /usr/local/pgsql/data would not
> be 100% consistent, I guess.
> Would the postmaster still start and work properly despite the possibly
> inconsistent data?

In most cases, yes, so long as you have not actually had a hard disk
failure.  When postgres comes up after it has crashed, it uses its WAL
logs to clean up the database.  So if your machine crashes without a
disk failure, your database should come up fine.

If you have had a disk failure, the problem is that the beginning of the
database may be backed up before the end, so you will need to use the
dump; i.e. the WAL logs may not agree with the database, and table A may
not agree with table B, etc.  Hence the general advice is to bring it up
from the dump.  The database will either come up, so you can drop the
databases, or fail badly, in which case you can always work out what's
gone wrong and work around it.

However, there is a third way, which should be safe, though some people
may disagree with me.  If you can "freeze" the disk while you take the
backup, the backup can be used as if the computer had crashed with no
hard disk failure at all.  I.e. the WAL will be consistent, and the
database may take longer to come up, but once it is up it will be safe
(as in the first paragraph above).  Freezing a disk for backup is not
that difficult, and you should be doing it anyway for the consistency of
user files.  (You don't want the first 30 pages of your document to
disagree with the end because someone was saving it during the backup!)
Freezing a disk can be done with LVM, for example
(http://www.tldp.org/HOWTO/LVM-HOWTO/), and is not difficult.  RAID can
also be used to avoid disk failure in the first place.

Always keep a dump, however, as the dump will be your only way out if
your basic sanity checks fail!

Peter Childs
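For reference, the kind of LVM snapshot sequence Peter alludes to might look roughly like this; the volume group, volume names, snapshot size and mount points are all invented for illustration, and the LVM HOWTO has the real details:

# Snapshot the volume holding $PGDATA, copy from the snapshot, then drop it.
# vg0/pgdata, sizes and mount points are assumptions.
lvcreate -L 1G -s -n pgdata_snap /dev/vg0/pgdata
mkdir -p /mnt/pgdata_snap
mount -o ro /dev/vg0/pgdata_snap /mnt/pgdata_snap
tar czf /backup/pgdata-snapshot.tar.gz -C /mnt/pgdata_snap .
umount /mnt/pgdata_snap
lvremove -f /dev/vg0/pgdata_snap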
The world rejoiced as peterandsarah@blueyonder.co.uk (Peter and Sarah Childs) wrote:
> However, there is a third way, which should be safe, though some people
> may disagree with me.  If you can "freeze" the disk while you take the
> backup, the backup can be used as if the computer had crashed with no
> hard disk failure at all.

I heard D'Arcy Cain indicate that some SAN systems (I think he mentioned
NetApp) support this sort of thing, too.  Digital's AdvFS also supports it.

Of course, if you take this approach, you have to make _certain_ that when
you "freeze" a replica of a filesystem, _ALL_ of the database is contained
in that one filesystem.  If you move WAL to a different filesystem, bets
would be off again...
-- 
let name="cbbrowne" and tld="ntlug.org" in String.concat "@" [name;tld];;
http://www3.sympatico.ca/cbbrowne/
The real problem with the year 2000 is that there are too many zero bits
and that adversely affects the global bit density.
-- Boyd Roberts <boyd@france3.fr>
Christopher Browne wrote:
> Of course, if you take this approach, you have to make _certain_ that when
> you "freeze" a replica of a filesystem, _ALL_ of the database is contained
> in that one filesystem.  If you move WAL to a different filesystem, bets
> would be off again...

Also, I assume you have to stop the server just for a moment while you do
the freeze, right?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Tom Lane wrote:
> I would just initdb and then load the pg_dump files.  An unsynchronized
> copy of /usr/local/pgsql/data is just about completely untrustworthy.
>
> You should use pg_dumpall to make a dump of user and group status;
> pg_dump will not do that.
>
> > I was also thinking about excluding /usr/local/pgsql/data from the
> > backup routine, since the data is already covered by the files generated
> > by pg_dump.  The problem is that this directory holds not only the
> > database data but also some config files, such as postgresql.conf.
>
> Yeah.  Instead, exclude the directories below it ($PGDATA/base, etc).

Yes, but if he restores everything but /base, he gets xlog and clog, and
other stuff he shouldn't be restoring.  This is why I would like the
config files moved into a separate directory under /data.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom Lane wrote:
>> Yeah.  Instead, exclude the directories below it ($PGDATA/base, etc).

> Yes, but if he restores everything but /base, he gets xlog and clog, and
> other stuff he shouldn't be restoring.

That's why I said "etc".  Only the loose files in the top-level directory
are interesting.

			regards, tom lane
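As a concrete reading of "only the loose files in the top-level directory", something like the following would capture postgresql.conf, pg_hba.conf, pg_ident.conf, PG_VERSION and so on while skipping base/, pg_xlog/ and the other data directories. A sketch; the paths are assumptions:

PGDATA=/usr/local/pgsql/data
mkdir -p /backup/pgdata-conf
# Copy only the regular files sitting directly in $PGDATA; the
# subdirectories holding the actual cluster data are skipped.
find "$PGDATA" -maxdepth 1 -type f -exec cp -p {} /backup/pgdata-conf/ \;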
Hi Enio,

Enio Schutt Junior wrote:
> There is a daily copy of nearly the whole hard disk (excluding /tmp,
> /proc, /dev and so on) of the machine the databases live on, and in
> addition there is a script that runs pg_dump for each of the databases
> on the server.

Hmm, I don't really see what you would be doing with a backup of /tmp,
/proc and /dev anyway.  I mean, /tmp might be ok, but /proc shouldn't be
backed up in my opinion, as /proc is NOT on your hard disk but points
directly at kernel memory.  I would not dare to restore such a backup!
And /dev as well; these are your devices, so it's completely
hardware-bound.

> This daily disk copy is made while the postmaster is active (without
> stopping the daemon), so the data under /usr/local/pgsql/data would not
> be 100% consistent, I guess.

You need to stop Postgres, else forget about your backup.  The DB might
not even come up again.  Here at my site we have a nice little script
which can be configured to run certain actions before a backup of a given
directory, and others after the backup.

> If I recover data from that "inconsistent" backup disk, I know the
> binaries (psql, pg_dump and so on) will be fine, but the data may have
> some inconsistencies.  Would the postmaster still start and work
> properly despite the possibly inconsistent data?

I would really not go down this road.

Regards,
Dani
In article <200308110313.h7B3DCv06482@candle.pha.pa.us>,
Bruce Momjian <pgman@candle.pha.pa.us> wrote:
>Also, I assume you have to stop the server just for a moment while you
>do the freeze, right?

It depends on whether you need a known state or just a consistent state.

Taking a snapshot of the system will get you a consistent state, just like
if the machine crashed.  You can restore that snapshot, bring PG back up,
and everything will work.  Of course, you really have no way of knowing
which transactions were committed and which were not.

On the other hand, stop the server/snapshot/start the server gives you not
only consistency but a known state.  That is, you know for sure that
whatever was done before you stopped the server is what was done.

Which approach is necessary for you depends on your needs, as always.

mrc
-- 
     Mike Castle      dalgoda@ix.netcom.com      www.netcom.com/~dalgoda/
    We are all of us living in the shadow of Manhattan.  -- Watchmen
fatal ("You are in a maze of twisty compiler features, all different"); -- gcc
Oh, good point.  By grabbing xlog along with the data files, you do get a
consistent snapshot, just like if it had crashed.

---------------------------------------------------------------------------

Mike Castle wrote:
> Taking a snapshot of the system will get you a consistent state, just like
> if the machine crashed.  You can restore that snapshot, bring PG back up,
> and everything will work.  Of course, you really have no way of knowing
> which transactions were committed and which were not.
>
> On the other hand, stop the server/snapshot/start the server gives you not
> only consistency but a known state.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Centuries ago, Nostradamus foresaw when pgman@candle.pha.pa.us (Bruce Momjian) would write:
> Christopher Browne wrote:
>> Of course, if you take this approach, you have to make _certain_ that when
>> you "freeze" a replica of a filesystem, _ALL_ of the database is contained
>> in that one filesystem.  If you move WAL to a different filesystem, bets
>> would be off again...
>
> Also, I assume you have to stop the server just for a moment while you do
> the freeze, right?

I'm sure that's _preferable_.

Supposing you don't, the result is that the backup will be treated much
like the condition where a server is "terminated by power failure," and,
at restart, the system will have to rummage around the WAL to clean up a
bit.

Obviously not what you'd want in an _ideal_ world, but it fits into what
WAL is supposed to be able to protect against, right?

If and when PITR gets into place, I'd think one valued feature would be
the notion of being able to signal the postmaster to tell it to close off
a WAL file and open a new one (even though we might not strictly be due
for it).  (Note that the "O-guys" can do something like this with their
"archive logs.")  If that signal can be submitted, then someone might be
able to take this sort of "cloned filesystem" backup, and just drop off
the last WAL file as irrelevant.

That might not be quite exactly what's imminent for 7.5, mind you...
-- 
select 'cbbrowne' || '@' || 'acm.org';
http://www.ntlug.org/~cbbrowne/sap.html
(eq? 'truth 'beauty)  ; to avoid unassigned-var error, since compiled code
                      ; will pick up previous value to var set!-ed,
                      ; the unassigned object.
-- from BBN-CL's cl-parser.scm
On Sun, 10 Aug 2003, Christopher Browne wrote:
> Supposing you don't, the result is that the backup will be treated much
> like the condition where a server is "terminated by power failure," and,
> at restart, the system will have to rummage around the WAL to clean up a
> bit.

Note that many NAS storage systems support a "snapshot" mechanism that
basically does this in about 1/10th of a second.  It produced horrible
hiccups for our Linux boxen with default NFS settings, whereby our NFS
mounts would disappear for three or four minutes after a snapshot.

I've never, by the way, run my database on a NAS, and probably never will,
if I have any say in it.  We just use NAS for static document storage
(hundreds of gigs of it).
There's an extensive discussion of doing filesystem-level backups off an
LVM snapshot taken on a frozen XFS filesystem (XFS provides the xfs_freeze
command, which allows the sysadmin to freeze and unfreeze a given
filesystem at will).  See
http://marc.theaimsgroup.com/?l=postgresql-admin&w=2&r=1&s=LVM+snapshots&q=b
(as Mike Castle points out in this thread, this gives you a "consistent
state" but not a known one).

If your hardware gives you the ability to freeze I/O, you might get
similar capabilities with filesystems/OSes other than XFS on Linux.

In that thread, I also discussed a strategy for getting consistent backups
when the $PGDATA/pg_xlog/ folder is on a different filesystem:

  freeze $PGDATA/pg_xlog/
  freeze $PGDATA
  create a snapshot of $PGDATA/pg_xlog/ and mount it at $PGDATA.snap/pg_xlog
  create a snapshot of $PGDATA and mount it at $PGDATA.snap
  unfreeze $PGDATA
  unfreeze $PGDATA/pg_xlog/
  (notice that since you only need to freeze the filesystems during
  snapshot creation, which is fast, you don't hold the postmaster up very
  long)
  rsync $PGDATA.snap/ to a backup volume
  umount and lvremove the snapshot of $PGDATA/pg_xlog/
  umount and lvremove the snapshot of $PGDATA/

Filesystem performance is adversely affected as long as the snapshots
exist, so I do a "preliminary rsync" (giving an inconsistent backup) at
the beginning of the sequence to minimize the amount of data to be backed
up from the snapshot(s).

An example bash script which handles $PGDATA/pg_xlog/ being on a different
filesystem (and mounted, not symlinked, to $PGDATA/pg_xlog) is attached.
It has been tested under a fairly heavy load and seems to handle the job.
(A simplified, illustrative sketch of the sequence also appears just after
this message.)  I'd be interested in comments on the
strategy/implementation from the postgresql gurus on this list (the script
is rather messy, sorry).

Thanks,
	Murthy

>-----Original Message-----
>From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
>Sent: Sunday, August 10, 2003 23:13
>Subject: Re: [ADMIN] Backup routine
>
>Also, I assume you have to stop the server just for a moment while you
>do the freeze, right?
Attachment
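The attachment itself is not reproduced here; the following is only a simplified sketch of the sequence Murthy outlines, assuming XFS filesystems on LVM with pg_xlog on its own volume. Device names, mount points and snapshot sizes are invented for illustration:

#!/bin/sh
# Simplified sketch of the freeze/snapshot/rsync sequence described above.
# NOT the attached script; volume names, sizes and mount points are invented.
PGDATA=/var/lib/pgsql/data
BACKUP=/backup/pgdata

# 1. Freeze pg_xlog first, then the main data filesystem.
xfs_freeze -f $PGDATA/pg_xlog
xfs_freeze -f $PGDATA

# 2. Snapshot both while frozen (snapshot creation is fast).
lvcreate -L 2G -s -n pgxlog_snap /dev/vg0/pgxlog
lvcreate -L 4G -s -n pgdata_snap /dev/vg0/pgdata

# 3. Thaw in the reverse order; the postmaster is held up only briefly.
xfs_freeze -u $PGDATA
xfs_freeze -u $PGDATA/pg_xlog

# 4. Mount the snapshots so they look like a single $PGDATA tree.
mkdir -p /mnt/pgdata.snap
mount -o ro,nouuid /dev/vg0/pgdata_snap /mnt/pgdata.snap
mount -o ro,nouuid /dev/vg0/pgxlog_snap /mnt/pgdata.snap/pg_xlog

# 5. Copy off the consistent image, then clean up.
rsync -a --delete /mnt/pgdata.snap/ $BACKUP/
umount /mnt/pgdata.snap/pg_xlog
umount /mnt/pgdata.snap
lvremove -f /dev/vg0/pgdata_snap /dev/vg0/pgxlog_snap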
On Monday, August 11, 2003 17:26, Mike Castle (dalgoda@ix.netcom.com) wrote:
>It depends on whether you need a known state or just a consistent state.
>
>Taking a snapshot of the system will get you a consistent state, just like
>if the machine crashed.  You can restore that snapshot, bring PG back up,
>and everything will work.  Of course, you really have no way of knowing
>which transactions were committed and which were not.
>
>On the other hand, stop the server/snapshot/start the server gives you not
>only consistency but a known state.  That is, you know for sure that
>whatever was done before you stopped the server is what was done.

But these considerations apply to pg_dumps as well, no?  I guess with
pg_dump you CAN dump one database at a time, and you can "quiesce" each
database before dumping (disallow connections to that database for the
duration of the pg_dump, and wait for all transactions to complete before
starting pg_dump), which is a little more flexible.  Given the time it
takes to do a pg_dump on databases over a few gigabytes in size, though, I
can't say I find the flexibility valuable.

Cheers,
	Murthy
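One rough way to do the per-database "quiescing" Murthy describes is to toggle datallowconn for that database around the dump. A sketch only: it requires superuser rights, it does not disconnect sessions that are already open, and the database name and paths are placeholders:

#!/bin/sh
# Block new connections to one database, dump it, then allow connections again.
DB=mydb
psql -d template1 -c "UPDATE pg_database SET datallowconn = false WHERE datname = '$DB'"
# ... wait here for existing backends on $DB to finish their work ...
pg_dump "$DB" > "/backup/$DB.dump"
psql -d template1 -c "UPDATE pg_database SET datallowconn = true WHERE datname = '$DB'"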
On Wed, 13 Aug 2003, Murthy Kambhampaty wrote:
> But these considerations apply to pg_dumps as well, no?  I guess with
> pg_dump you CAN dump one database at a time, and you can "quiesce" each
> database before dumping (disallow connections to that database for the
> duration of the pg_dump, and wait for all transactions to complete before
> starting pg_dump), which is a little more flexible.  Given the time it
> takes to do a pg_dump on databases over a few gigabytes in size, though,
> I can't say I find the flexibility valuable.

But that's still not exactly the same.  If you pg_dump a single database
in a cluster, THAT database will be consistent with itself, guaranteed by
MVCC.

Sure, one database in a cluster may not be consistent with another
database, but generally, separate databases are treated / considered to be
independent.
Hello all,

I am trying to test the following functions, but I don't know how to test
them.  Please tell me how to test these functions, and also give some
examples:

1. pg_get_viewdef
2. pg_get_ruledef
3. pg_get_indexdef
4. pg_get_userbyid
5. obj_description

Thanks in advance,
--mathan
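These are ordinary SQL functions, so the simplest way to try them is from psql against objects you already have, looking up the OIDs in the system catalogs. A sketch; the database and object names are placeholders, so create a small test table, view and index first:

psql -d mydb <<'SQL'
-- pg_get_viewdef / pg_get_indexdef / obj_description take the OID of the
-- object, which you can look up in pg_class by name ('myview', 'myindex'
-- and 'mytable' are placeholders).
SELECT pg_get_viewdef(oid)              FROM pg_class WHERE relname = 'myview';
SELECT pg_get_indexdef(oid)             FROM pg_class WHERE relname = 'myindex';
SELECT obj_description(oid, 'pg_class') FROM pg_class WHERE relname = 'mytable';

-- pg_get_ruledef takes the OID of a row in pg_rewrite:
SELECT rulename, pg_get_ruledef(oid) FROM pg_rewrite LIMIT 5;

-- pg_get_userbyid turns a user id (e.g. a table's owner) back into a name:
SELECT relname, pg_get_userbyid(relowner) FROM pg_class WHERE relname = 'mytable';
SQL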
Morning all, bit of a general question here... consider:

begin;
update a set col1 = 'p' where id = '1';
update a set col2 = 'q' where id = '1';
commit;

versus:

update a set col1 = 'p', col2 = 'q' where id = '1';

Does the first case generate any more dead tuples that will need vacuuming
than the second case, or are the two updates 'merged' when the transaction
is committed?  Or is the answer 'sometimes' (I would guess depending on
checkpoints)?

Cheers

Matt
"Matt Clark" <matt@ymogen.net> writes: > consider: > begin; > update a set col1 = 'p' where id = '1'; > update a set col2 = 'q' where id = '1'; > commit; > versus: > update a set col1 = 'p', col2 = 'q' where id = '1'; > Does the first case generate any more dead tuples that will need > vacuuming than the second case, Yes. Takes twice as long, too. regards, tom lane
On Wednesday, August 13, 2003 18:50, scott.marlowe [mailto:scott.marlowe@ihs.com] wrote:
>But that's still not exactly the same.  If you pg_dump a single database
>in a cluster, THAT database will be consistent with itself, guaranteed by
>MVCC.
>
>Sure, one database in a cluster may not be consistent with another
>database, but generally, separate databases are treated / considered to be
>independent.

What I had in mind was:

1. If you stop the server and run either (a) pg_dumpall or (b) a copy of
$PGDATA from a snapshot created on a frozen filesystem (the one containing
$PGDATA), then you get a consistent backup with a known state.

2. If you leave the server running and run either (a) pg_dumpall or (b) a
copy of $PGDATA from a snapshot created on a frozen filesystem (the one
containing $PGDATA), then you get a consistent backup with an unknown
state.

Additionally, with pg_dump you can, unlike with "the consistent
filesystem-level backup strategy", independently "quiesce" each database
while other databases continue normal operations.  This gives you the
"flexibility" to get pg_dumps with known state for each individual
database, without taking down ALL databases at once (by shutting down the
server).  This does not seem very valuable.

Sorry for any confusion I created,
	Murthy