Thread: pg_upgrade + streaming replication?
I've got a 3 node cluster (1 master/2 slaves) running 9.0.x with streaming replication. I'm in the planning stages of upgrading to 9.1.x, and am looking into the most efficient way to do the upgrade with the goal of minimizing downtime & risk. After googling, the only discussion that I've found of using pg_upgrade with a streaming replication setup seems to be this (nearly) year old thread: http://web.archiveorange.com/archive/v/9FNVlDWGQtpyWVL54jlK In summary, there is no way to use both pg_upgrade and streaming replication simultaneously. I'd have to either use pg_upgrade and then effectively rebuild/redeploy the slaves, or not use pg_upgrade, and reimport all of the data. Is that still the latest status, or are there other options? thanks
On Thu, Mar 01, 2012 at 02:01:31PM -0800, Lonni J Friedman wrote: > I've got a 3 node cluster (1 master/2 slaves) running 9.0.x with > streaming replication. I'm in the planning stages of upgrading to > 9.1.x, and am looking into the most efficient way to do the upgrade > with the goal of minimizing downtime & risk. After googling, the only > discussion that I've found of using pg_upgrade with a streaming > replication setup seems to be this (nearly) year old thread: > http://web.archiveorange.com/archive/v/9FNVlDWGQtpyWVL54jlK > > In summary, there is no way to use both pg_upgrade and streaming > replication simultaneously. I'd have to either use pg_upgrade and > then effectively rebuild/redeploy the slaves, or not use pg_upgrade, > and reimport all of the data. Is that still the latest status, or are > there other options? You can shut down all three servers, run pg_upgrade on all of them, then restart them as 9.1 servers. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
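Concretely, "shut down all three servers, run pg_upgrade on all of them" would look something like the following on every node. This is a hedged sketch only: the binary and data-directory paths are assumptions from common packaging, none of it comes from the thread itself, and it is untested (whether the standbys can actually resume streaming afterward is exactly what the rest of the thread questions).

```shell
# Sketch of "stop, pg_upgrade, restart" for ONE node; the same steps
# would run on the master and on each standby. All paths below are
# assumptions based on typical PGDG packaging -- adjust to taste.
OLD_BIN=/usr/pgsql-9.0/bin
NEW_BIN=/usr/pgsql-9.1/bin
OLD_DATA=/var/lib/pgsql/9.0/data
NEW_DATA=/var/lib/pgsql/9.1/data

"$OLD_BIN/pg_ctl" -D "$OLD_DATA" stop -m fast   # stop the 9.0 server
"$NEW_BIN/initdb" -D "$NEW_DATA"                # create an empty 9.1 cluster
"$NEW_BIN/pg_upgrade" \
    --old-bindir="$OLD_BIN"   --new-bindir="$NEW_BIN" \
    --old-datadir="$OLD_DATA" --new-datadir="$NEW_DATA"
"$NEW_BIN/pg_ctl" -D "$NEW_DATA" start          # bring it back up as 9.1
```

On large clusters, pg_upgrade's `--link` mode avoids copying the user data files, which shortens the downtime window considerably.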
On Mon, Mar 19, 2012 at 12:30 PM, Bruce Momjian <bruce@momjian.us> wrote: > On Thu, Mar 01, 2012 at 02:01:31PM -0800, Lonni J Friedman wrote: >> I've got a 3 node cluster (1 master/2 slaves) running 9.0.x with >> streaming replication. I'm in the planning stages of upgrading to >> 9.1.x, and am looking into the most efficient way to do the upgrade >> with the goal of minimizing downtime & risk. After googling, the only >> discussion that I've found of using pg_upgrade with a streaming >> replication setup seems to be this (nearly) year old thread: >> http://web.archiveorange.com/archive/v/9FNVlDWGQtpyWVL54jlK >> >> In summary, there is no way to use both pg_upgrade and streaming >> replication simultaneously. I'd have to either use pg_upgrade and >> then effectively rebuild/redeploy the slaves, or not use pg_upgrade, >> and reimport all of the data. Is that still the latest status, or are >> there other options? > > You can shut down all three servers, run pg_upgrade on all of them, then > restart them as 9.1 servers. > Thanks for your reply. This is very good news.
On Mon, 2012-03-19 at 15:30 -0400, Bruce Momjian wrote: > On Thu, Mar 01, 2012 at 02:01:31PM -0800, Lonni J Friedman wrote: > > I've got a 3 node cluster (1 master/2 slaves) running 9.0.x with > > streaming replication. I'm in the planning stages of upgrading to > > 9.1.x, and am looking into the most efficient way to do the upgrade > > with the goal of minimizing downtime & risk. After googling, the only > > discussion that I've found of using pg_upgrade with a streaming > > replication setup seems to be this (nearly) year old thread: > > http://web.archiveorange.com/archive/v/9FNVlDWGQtpyWVL54jlK > > > > In summary, there is no way to use both pg_upgrade and streaming > > replication simultaneously. I'd have to either use pg_upgrade and > > then effectively rebuild/redeploy the slaves, or not use pg_upgrade, > > and reimport all of the data. Is that still the latest status, or are > > there other options? > > You can shut down all three servers, run pg_upgrade on all of them, then > restart them as 9.1 servers. After running pg_upgrade on each server individually, they will have different system IDs, and potentially different on-disk representation of the catalogs, right? So how can you resume streaming without rebuilding the slaves? Regards, Jeff Davis
On Mon, Mar 19, 2012 at 03:07:02PM -0700, Jeff Davis wrote: > On Mon, 2012-03-19 at 15:30 -0400, Bruce Momjian wrote: > > On Thu, Mar 01, 2012 at 02:01:31PM -0800, Lonni J Friedman wrote: > > > I've got a 3 node cluster (1 master/2 slaves) running 9.0.x with > > > streaming replication. I'm in the planning stages of upgrading to > > > 9.1.x, and am looking into the most efficient way to do the upgrade > > > with the goal of minimizing downtime & risk. After googling, the only > > > discussion that I've found of using pg_upgrade with a streaming > > > replication setup seems to be this (nearly) year old thread: > > > http://web.archiveorange.com/archive/v/9FNVlDWGQtpyWVL54jlK > > > > > > In summary, there is no way to use both pg_upgrade and streaming > > > replication simultaneously. I'd have to either use pg_upgrade and > > > then effectively rebuild/redeploy the slaves, or not use pg_upgrade, > > > and reimport all of the data. Is that still the latest status, or are > > > there other options? > > > > You can shut down all three servers, run pg_upgrade on all of them, then > > restart them as 9.1 servers. > > After running pg_upgrade on each server individually, they will have > different system IDs, and potentially different on-disk representation > of the catalogs, right? > > So how can you resume streaming without rebuilding the slaves? Oh, wow, I never thought of the fact that the system tables will be different? I guess you could assume the pg_dump restore is going to create things exactly the same on all the systems, but I never tested that. Do the system id's have to match? That would be a problem because you are initdb'ing on each server. OK, crazy idea, but I wonder if you could initdb on the master, then copy that to the slaves, then run pg_upgrade on each of them. Obviously this needs some testing. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Tue, Mar 20, 2012 at 11:46 AM, Bruce Momjian <bruce@momjian.us> wrote: > On Mon, Mar 19, 2012 at 03:07:02PM -0700, Jeff Davis wrote: >> On Mon, 2012-03-19 at 15:30 -0400, Bruce Momjian wrote: >> > On Thu, Mar 01, 2012 at 02:01:31PM -0800, Lonni J Friedman wrote: >> > > I've got a 3 node cluster (1 master/2 slaves) running 9.0.x with >> > > streaming replication. I'm in the planning stages of upgrading to >> > > 9.1.x, and am looking into the most efficient way to do the upgrade >> > > with the goal of minimizing downtime & risk. After googling, the only >> > > discussion that I've found of using pg_upgrade with a streaming >> > > replication setup seems to be this (nearly) year old thread: >> > > http://web.archiveorange.com/archive/v/9FNVlDWGQtpyWVL54jlK >> > > >> > > In summary, there is no way to use both pg_upgrade and streaming >> > > replication simultaneously. I'd have to either use pg_upgrade and >> > > then effectively rebuild/redeploy the slaves, or not use pg_upgrade, >> > > and reimport all of the data. Is that still the latest status, or are >> > > there other options? >> > >> > You can shut down all three servers, run pg_upgrade on all of them, then >> > restart them as 9.1 servers. >> >> After running pg_upgrade on each server individually, they will have >> different system IDs, and potentially different on-disk representation >> of the catalogs, right? >> >> So how can you resume streaming without rebuilding the slaves? > > Oh, wow, I never thought of the fact that the system tables will be > different? I guess you could assume the pg_dump restore is going to > create things exactly the same on all the systems, but I never tested > that. Do the system id's have to match? That would be a problem > because you are initdb'ing on each server. OK, crazy idea, but I > wonder if you could initdb on the master, then copy that to the slaves, > then run pg_upgrade on each of them. Obviously this needs some testing. 
Wouldn't it be easier to just pg_upgrade the master, then setup the slaves from scratch (with rsync, etc)? It certainly wouldn't be any more work to do it that way (although still a lot more work than simply running pg_upgrade on all servers).
On Tue, Mar 20, 2012 at 11:56:29AM -0700, Lonni J Friedman wrote: > >> So how can you resume streaming without rebuilding the slaves? > > > > Oh, wow, I never thought of the fact that the system tables will be > > different? I guess you could assume the pg_dump restore is going to > > create things exactly the same on all the systems, but I never tested > > that. Do the system id's have to match? That would be a problem > > because you are initdb'ing on each server. OK, crazy idea, but I > > wonder if you could initdb on the master, then copy that to the slaves, > > then run pg_upgrade on each of them. Obviously this needs some testing. > > Wouldn't it be easier to just pg_upgrade the master, then setup the > slaves from scratch (with rsync, etc)? It certainly wouldn't be any > more work to do it that way (although still a lot more work than > simply running pg_upgrade on all servers). Hey, wow, that is an excellent idea because rsync is going to realize that all the user-data files are exactly the same and skip them --- that is the winner solution. I should probably add this to the pg_upgrade documentation. Thanks. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
On Tue, Mar 20, 2012 at 02:58:20PM -0400, Bruce Momjian wrote: > On Tue, Mar 20, 2012 at 11:56:29AM -0700, Lonni J Friedman wrote: > > >> So how can you resume streaming without rebuilding the slaves? > > > > > > Oh, wow, I never thought of the fact that the system tables will be > > > different? I guess you could assume the pg_dump restore is going to > > > create things exactly the same on all the systems, but I never tested > > > that. Do the system id's have to match? That would be a problem > > > because you are initdb'ing on each server. OK, crazy idea, but I > > > wonder if you could initdb on the master, then copy that to the slaves, > > > then run pg_upgrade on each of them. Obviously this needs some testing. > > > > Wouldn't it be easier to just pg_upgrade the master, then setup the > > slaves from scratch (with rsync, etc)? It certainly wouldn't be any > > more work to do it that way (although still a lot more work than > > simply running pg_upgrade on all servers). > > Hey, wow, that is an excellent idea because rsync is going to realize > that all the user-data files are exactly the same and skip them --- that > is the winner solution. I should probably add this to the pg_upgrade > documentation. Thanks. Actually, I am not sure how well rsync will work, because by default it only skips files with matching file timestamp and size, and I bet many of the files will have different times because of streaming replication lag, and server time lag. I think we need this rsync option: -c, --checksum This changes the way rsync checks if the files have been changed and are in need of a transfer. Without this option, rsync uses a "quick check" that (by default) checks if each file's size and time of last modification match between the sender and receiver. This option changes this to compare a 128-bit checksum for each file that has a matching size. 
Generating the check sums means that both sides will expend a lot of disk I/O reading all the data in the files in the transfer (and this is prior to any reading that will be done to transfer changed files), so this can slow things down significantly. The sending side generates its checksums while it is doing the file-system scan that builds the list of the available files. The receiver generates its checksums when it is scanning for changed files, and will checksum any file that has the same size as the corresponding sender's file: files with either a changed size or a changed checksum are selected for transfer. and I suspect that will be slow. Probably better than nothing, but not super-fast either. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
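The quick-check-vs-checksum distinction can be reproduced without rsync at all. The sketch below (plain coreutils; file names are made up) builds two files that differ by one byte but share size and mtime, then applies both comparisons the man page describes:

```shell
#!/bin/sh
# Reproduce rsync's two file-comparison modes by hand: the default
# "quick check" (size + mtime) and the -c/--checksum comparison.
# The two files mimic a master/standby pair that diverged by one byte
# while keeping identical metadata.
set -e
dir=$(mktemp -d)
printf 'AAAA' > "$dir/master_copy"
printf 'AAAB' > "$dir/standby_copy"              # same size, one byte differs
touch -r "$dir/master_copy" "$dir/standby_copy"  # force identical mtimes

# Quick check: size and mtime match, so default rsync would skip this file.
if [ "$(stat -c '%s %Y' "$dir/master_copy")" = \
     "$(stat -c '%s %Y' "$dir/standby_copy")" ]; then
    echo "quick check: looks unchanged (would be skipped)"
fi

# Checksum comparison: the divergence is caught.
if [ "$(md5sum < "$dir/master_copy")" != "$(md5sum < "$dir/standby_copy")" ]; then
    echo "checksum: contents differ (would be transferred)"
fi
rm -rf "$dir"
```

Against a real pair of data directories the equivalent would be something like `rsync -a --checksum master:/path/to/data/ /path/to/data/` (paths assumed), paying on both sides the full read cost that makes it slow.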
On Tue, 2012-03-20 at 16:49 -0400, Bruce Momjian wrote: > On Tue, Mar 20, 2012 at 02:58:20PM -0400, Bruce Momjian wrote: > > On Tue, Mar 20, 2012 at 11:56:29AM -0700, Lonni J Friedman wrote: > > > >> So how can you resume streaming without rebuilding the slaves? > > > > > > > > Oh, wow, I never thought of the fact that the system tables will be > > > > different? I guess you could assume the pg_dump restore is going to > > > > create things exactly the same on all the systems, but I never tested > > > > that. Do the system id's have to match? That would be a problem > > > > because you are initdb'ing on each server. OK, crazy idea, but I > > > > wonder if you could initdb on the master, then copy that to the slaves, > > > > then run pg_upgrade on each of them. Obviously this needs some testing. This sounds promising. Fundamentally, the user data files aren't changing, and if you can upgrade the master you can upgrade the slaves. So there is no fundamental problem here, but there will be some careful bookkeeping. I think we need to look at this as a new feature that needs its own testing and documentation. It's important though, because as you point out downthread, rsync doesn't really solve the problem (still takes time proportional to the user data size). Regards, Jeff Davis
Actually, rsync works fine at the file level and is good for manual syncing. It really checks the files with the stat command, so a single bit change will trigger the copy. In practice you need to keep an eye on the completeness of the rsync action. Try to use it without compression for large data sets; strangely, it saves time. Met vriendelijk groet, with kind regards Henk Bronk On 20 Mar 2012, at 21:49, Bruce Momjian <bruce@momjian.us> wrote: > On Tue, Mar 20, 2012 at 02:58:20PM -0400, Bruce Momjian wrote: >> On Tue, Mar 20, 2012 at 11:56:29AM -0700, Lonni J Friedman wrote: >>>>> So how can you resume streaming without rebuilding the slaves? >>>> >>>> Oh, wow, I never thought of the fact that the system tables will be >>>> different? I guess you could assume the pg_dump restore is going to >>>> create things exactly the same on all the systems, but I never tested >>>> that. Do the system id's have to match? That would be a problem >>>> because you are initdb'ing on each server. OK, crazy idea, but I >>>> wonder if you could initdb on the master, then copy that to the slaves, >>>> then run pg_upgrade on each of them. Obviously this needs some testing. >>> >>> Wouldn't it be easier to just pg_upgrade the master, then setup the >>> slaves from scratch (with rsync, etc)? It certainly wouldn't be any >>> more work to do it that way (although still a lot more work than >>> simply running pg_upgrade on all servers). >> >> Hey, wow, that is an excellent idea because rsync is going to realize >> that all the user-data files are exactly the same and skip them --- that >> is the winner solution. I should probably add this to the pg_upgrade >> documentation. Thanks. > > Actually, I am not sure how well rsync will work, because by default it > only skips files with matching file timestamp and size, and I bet many > of the files will have different times because of streaming replication > lag, and server time lag. 
> I think we need this rsync option: > > -c, --checksum > This changes the way rsync checks if the files have been > changed and are in need of a transfer. Without this option, > rsync uses a "quick check" that (by default) checks if each > file's size and time of last modification match > between the sender and receiver. This option changes > this to compare a 128-bit checksum for each file that has > a matching size. Generating the checksums means that > both sides will expend a lot of disk I/O reading all the > data in the files in the transfer (and this is prior to > any reading that will be done to transfer changed files), > so this can slow things down significantly. > > The sending side generates its checksums while it is > doing the file-system scan that builds the list of the > available files. The receiver generates its checksums when > it is scanning for changed files, and will checksum any file > that has the same size as the corresponding sender's file: > files with either a changed size or a changed checksum are > selected for transfer. > > and I suspect that will be slow. Probably better than nothing, but not > super-fast either. > > -- > Bruce Momjian <bruce@momjian.us> http://momjian.us > EnterpriseDB http://enterprisedb.com > > + It's impossible for everything to be true. +
On Tue, 2012-03-20 at 22:21 +0100, Henk Bronk wrote: > actually rsync works fine on file level and is good for manual syncing. > it check really the files with the stat command, so a bit change will trigger the copy > in practice you need to keep an eye on compleetness of the rsync action. Rsync still needs to examine the entire file. It has no information to know that the file is the same on master and slave. We could try to give it the appropriate information on which it can make that assumption -- e.g. keep the timestamps the same so that rsync assumes the contents are the same. But that seems fragile and I don't see a good way of doing it, anyway. We need a way to take a base backup of just the catalogs, essentially, and leave the user data intact. Probably quite a few details to sort out though. Regards, Jeff Davis
On Tue, Mar 20, 2012 at 8:27 PM, Jeff Davis <pgsql@j-davis.com> wrote:
On Tue, 2012-03-20 at 22:21 +0100, Henk Bronk wrote:
> actually rsync works fine on file level and is good for manual syncing.
> it check really the files with the stat command, so a bit change will trigger the copy
> in practice you need to keep an eye on compleetness of the rsync action.
Rsync still needs to examine the entire file. It has no information to
know that the file is the same on master and slave.
We could try to give it the appropriate information on which it can make
that assumption -- e.g. keep the timestamps the same so that rsync
assumes the contents are the same. But that seems fragile and I don't
see a good way of doing it, anyway.
We need a way to take a base backup of just the catalogs, essentially,
and leave the user data intact. Probably quite a few details to sort out
though.
Regards,
Jeff Davis
rsync can take a file that contains a list of files to sync. Is there a convenient way to generate such a control file?
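On the mechanical side, rsync reads such a list via `--files-from`, with paths given relative to the source directory. A sketch against a mock directory tree follows; the choice of `global/` as the "catalog" subset is purely illustrative, since deciding which files really belong on the list is the open problem here:

```shell
#!/bin/sh
# Sketch of driving rsync from an explicit file list (--files-from).
# The layout below is a mock stand-in for a data directory; on a real
# server the list would be generated inside $PGDATA.
set -e
src=$(mktemp -d)
mkdir -p "$src/global" "$src/base/16384"
echo ctl > "$src/global/pg_control"
echo rel > "$src/base/16384/12345"

# Build the list with paths relative to the source directory.
( cd "$src" && find global -type f | sort ) > "$src/sync-list.txt"
cat "$src/sync-list.txt"        # -> global/pg_control

# The transfer itself would then be (not run here; host/path assumed):
#   rsync -a --files-from="$src/sync-list.txt" "$src/" standby:/path/to/data/
rm -rf "$src"
```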
On Linux, you can also do a

cp -rpuv source destination

or script something like this in good old bash:

ls directory | grep 'extension' >> copy.txt
list=$(< /full/path/copy.txt)
location=/target/directory
scp $list user@host:"$location"

or rsync:

rsync -avz --progress -e ssh user@host:source-directory/ /target-directory/

Check the trailing slashes: without them, a new directory is created.

Hope this is helpful.
Met vriendelijk groet, with kind regards
Henk Bronk
On Thursday, March 1, 2012 at 14:01 -0800, Lonni J Friedman wrote: > I've got a 3 node cluster (1 master/2 slaves) running 9.0.x with > streaming replication. I'm in the planning stages of upgrading to > 9.1.x, and am looking into the most efficient way to do the upgrade > with the goal of minimizing downtime & risk. See this recent post: http://archives.postgresql.org/pgsql-admin/2012-01/msg00115.php -- Vincent Veyron http://marica.fr/ Software for managing insurance claims and litigation files for legal departments
On Wed, 2012-03-21 at 09:12 +0100, Henk Bronk wrote: > On Linux, you can also do a > > > cp -rpuv source destination My point was that we should not take shortcuts that avoid the work of a full base backup for the replicas until we've determined a safe way to do that. As far as I know, nobody has successfully and safely done a pg_upgrade of a set of replicas without the need for full base backups (which need to copy all of the user data). Until someone determines that it's safe, adds the necessary functionality to pg_upgrade and/or replication, and documents it, I do _not_ recommend such a thing for any production system. Regards, Jeff Davis