Thread: pg_upgrade 8.3 to 9.0, shutdown is too slow
Hi all,

I am trying to migrate our PostgreSQL databases in place from 8.3 to 9.0. So far it worked on the test databases I set up; it was really quick, and I was looking forward to migrating the live databases. But there I hit some issues I didn't have in the test environment.

If I run this command:

"""
su postgres -c "cd /usr/lib/postgresql/9.0/bin/; /usr/lib/postgresql/9.0/bin/pg_upgrade --link --check -g -v -d /var/lib/postgresql/8.3/main/ -D /var/lib/postgresql/9.0/main/ -b /usr/lib/postgresql/8.3/bin/ -B /usr/lib/postgresql/9.0/bin/ -l /tmp/migration.log"
"""

it starts the checks, and at the end it tries to stop the database via pg_ctl. That is where the error seems to be:

"""
"/usr/lib/postgresql/8.3/bin/pg_ctl" -l "/tmp/migration.log" -D "/var/lib/postgresql/8.3/main" stop >> "/tmp/migration.log" 2>&1
waiting for server to shut down...2011-01-26 14:21:59 CET LOG: received smart shutdown request
.......2011-01-26 14:22:05 CET FATAL: the database system is shutting down
..................................................... failed
pg_ctl: server does not shut down

There were problems executing "/usr/lib/postgresql/8.3/bin/pg_ctl" -l "/tmp/migration.log" -D "/var/lib/postgresql/8.3/main" stop >> "/tmp/migration.log" 2>&1
"/usr/lib/postgresql/8.3/bin/pg_ctl" -l "/tmp/migration.log" -D "/var/lib/postgresql/8.3/main" -m fast stop >> "/tmp/migration.log" 2>&1
2011-01-26 14:22:59 CET LOG: received fast shutdown request
2011-01-26 14:22:59 CET LOG: aborting any active transactions
2011-01-26 14:22:59 CET FATAL: terminating connection due to administrator command
waiting for server to shut down....2011-01-26 14:22:59 CET LOG: shutting down
2011-01-26 14:22:59 CET LOG: database system is shut down
done
server stopped
"""

In the end it stopped, but with return code 1. So it is broken here... :(

When I first saw this, I thought I might have forgotten to terminate some db connections, but all services were down. A "ps -axuf" gives me the following output:

"""
postgres 26253 2.8 0.3 426264 10684 pts/1 S+ 14:21 0:01 /usr/lib/postgresql/8.3/bin/postgres -D /var/lib/postgresql/8.3/main -
postgres 26255 0.0 0.0 426396 2044 ? Ss 14:21 0:00 \_ postgres: writer process
postgres 26257 0.0 0.0 154368 1616 ? Ss 14:21 0:00 \_ postgres: stats collector process
postgres 26258 0.0 0.1 427612 4188 ? Ss 14:21 0:00 \_ postgres: grepo DB_NAME LOCAL_IP(PORT) idle
"""

It seems that it waits for the termination of its own process. With my test setup I didn't get this error, maybe because it was much faster, since the data there was new and not fragmented.

If I run /etc/init.d/postgresql-8.3 stop as root, it takes only 5 seconds and the database is down.

Well, now I am stuck at this point and can't upgrade my databases. Is there any way to increase the timeout? Or is there something else I can do before the upgrade to reduce the stop and start time?

Hope you have some hints for me.

greetz
Bernhard
Wow, that is odd. Good thing you were only running in check mode. What happens if you run that pg_ctl command manually? Is /etc/init.d/postgresql-8.3 stop running pg_ctl or something different?

---------------------------------------------------------------------------

Bernhard Schrader wrote:
> Well, now I am stuck at this point and can't upgrade my databases. Is
> there any way to increase the timeout? Or is there something else I
> can do before the upgrade to reduce the stop and start time?
>
> --
> Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-admin

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +
It's the same behaviour: it doesn't stop.

What may also be interesting: if I start the database manually, I see a high I/O wait of 23% for maybe 2-5 minutes. I think that is because of the fragmentation of the tables.

Is the shutdown so slow because of the I/O wait? But then why can I stop it as root without problems...

I will test the command later today, and wait for the I/O wait to finish...

On Wednesday, 26.01.2011, at 21:41 -0500, Bruce Momjian wrote:
> Wow, that is odd. Good thing you were only running in check mode. What
> happens if you run that pg_ctl command manually? Is
> /etc/init.d/postgresql-8.3 stop running pg_ctl or something different?
Bernhard Schrader <bernhard.schrader@innogames.de> wrote:

> What may also be interesting: if I start the database manually, I
> see a high I/O wait of 23% for maybe 2-5 minutes. I think that is
> because of the fragmentation of the tables.
>
> Is the shutdown so slow because of the I/O wait? But then why can I
> stop it as root without problems...

How do you "stop it as user root", exactly? If you kill PostgreSQL too abruptly, the recovery could easily run for several minutes with a lot of I/O while it recovers.

Taking pg_upgrade out of the picture for a minute -- how long do shutdowns and startups normally take you?

-Kevin
On Thursday, 27.01.2011, at 09:18 -0600, Kevin Grittner wrote:
> How do you "stop it as user root", exactly? If you kill PostgreSQL
> too abruptly, the recovery could easily run for several minutes with
> a lot of I/O while it recovers.
>
> Taking pg_upgrade out of the picture for a minute -- how long do
> shutdowns and startups normally take you?

Well, I have now tested it as user postgres with the same command the program uses, and it didn't shut down the server. The log shows that it wants to shut down, but it doesn't happen; maybe this stop method is too "friendly". I tested it with the database already running, so there was no I/O wait or anything. With the option "-m fast" it stops without problems.

Normally I do it as root with /etc/init.d/postgresql-8.3 (stop|start). The shutdown takes about 5-10 seconds; the start takes about the same, but comes with some I/O wait, which means up to 5 minutes of bad performance.
Bernhard Schrader wrote:
> Normally I do it as root with /etc/init.d/postgresql-8.3 (stop|start).
> The shutdown takes about 5-10 seconds; the start takes about the same,
> but comes with some I/O wait, which means up to 5 minutes of bad
> performance.

I suggest you take pg_upgrade out of the testing and find out why you are having startup/shutdown delays. Once they are fixed, pg_upgrade should work fine.

My initial reaction is that something is wrong with your system, either the I/O or the way it is being shut down by the script. I would start to look in the script and do some pg_ctl tests starting/stopping the server.

--
Bruce Momjian
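The pg_ctl stop tests suggested above could be scripted roughly as follows. This is a minimal sketch, not part of pg_upgrade: the function name `timed_stop` and the variables `PG_CTL` and `OLD_DATA` are made up for illustration, and the default paths are simply the ones quoted earlier in this thread. It relies on the documented behaviour that a smart shutdown waits for all sessions to disconnect, while a fast shutdown terminates them.

```shell
#!/bin/sh
# Illustrative defaults taken from the paths quoted in this thread.
PG_CTL="${PG_CTL:-/usr/lib/postgresql/8.3/bin/pg_ctl}"
OLD_DATA="${OLD_DATA:-/var/lib/postgresql/8.3/main}"

# timed_stop MODE
# Stops the server with the given shutdown mode (smart or fast), then
# prints the mode, pg_ctl's exit code, and the elapsed seconds.
timed_stop() {
    mode=$1
    start=$(date +%s)
    "$PG_CTL" -D "$OLD_DATA" -m "$mode" stop
    rc=$?
    end=$(date +%s)
    echo "mode=$mode rc=$rc seconds=$((end - start))"
    return "$rc"
}
```

Running `timed_stop smart` as the postgres user and, after restarting the server, `timed_stop fast` shows whether only the smart mode hangs. If so, that would point to a leftover session rather than to slow I/O.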
On Thu, Jan 27, 2011 at 9:12 AM, Bruce Momjian <bruce@momjian.us> wrote:
> My initial reaction is that something is wrong with your system, either
> the I/O or the way it is being shut down by the script. I would start to
> look in the script and do some pg_ctl tests starting/stopping the
> server.

It could be that his application or whatever is making connections while he's trying to do this. An open connection that's actually doing something will stop a normal shutdown.

Is there a reason the pg upgrade script does not use -m fast?
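One way to check for such connections before attempting a normal (smart) shutdown is to look at `pg_stat_activity`. A minimal sketch, assuming a superuser named `postgres` can connect locally through `psql`; the function name `other_sessions` and the `PSQL` variable are illustrative. `procpid` is the column name on 8.3; later releases renamed it to `pid`, so the query would need adjusting there.

```shell
#!/bin/sh
# Illustrative: path to psql, overridable from the environment.
PSQL="${PSQL:-psql}"

# other_sessions
# Prints one line per session other than our own, as
# procpid|usename|datname|client_addr (unaligned, tuples only).
other_sessions() {
    "$PSQL" -At -U postgres -d postgres -c \
        "SELECT procpid, usename, datname, client_addr
           FROM pg_stat_activity
          WHERE procpid <> pg_backend_pid();"
}
```

If `other_sessions` prints anything, something is still connected and a smart shutdown can be expected to wait for it.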
Scott Marlowe wrote:
> It could be that his application or whatever is making connections
> while he's trying to do this. An open connection that's actually
> doing something will stop a normal shutdown.
>
> Is there a reason the pg upgrade script does not use -m fast?

Uh, well, I assume that the person has already shut down all db connections, and opened it only for super-users. If the system is not shutting down, that should signal to the user that they have not locked down the system properly. We would not want someone to connect during pg_upgrade processing, and doing -m fast is not going to help with that.

--
Bruce Momjian
On Thursday, 27.01.2011, at 12:09 -0500, Bruce Momjian wrote:
> Uh, well, I assume that the person has already shut down all db
> connections, and opened it only for super-users. If the system is not
> shutting down, that should signal to the user that they have not locked
> down the system properly.

Hi,

well, I shut down every client connection that could occur, but ps auxf gives me the following output:

postgres 26255 0.0 0.0 426396 2044 ? Ss 14:21 0:00 \_ postgres: writer process
postgres 26257 0.0 0.0 154368 1616 ? Ss 14:21 0:00 \_ postgres: stats collector process
postgres 26258 0.0 0.1 427612 4188 ? Ss 14:21 0:00 \_ postgres: grepo DB_NAME LOCAL_IP(PORT) idle

So there are some connections, but as far as I can tell, nothing from a client program. Do these connections belong to postgres itself? Is that possible? pg_upgrade has to check the tables anyway, so there must be this connection, or am I wrong?

greetz
The first two are normal processes you will see when you start Postgres. That last one looks odd --- did you mask it somehow? That looks like an active idle connection for user 'grepo'.

--
Bruce Momjian
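The distinction drawn above can be approximated with a filter over `ps` output. This is a rough heuristic sketch, not an official tool: it assumes, based on the listing in this thread, that server helper processes have titles ending in "process" while client sessions show user, database, and client address. The function name `client_backends` is illustrative.

```shell
#!/bin/sh
# client_backends
# Reads `ps` output on stdin and prints only the lines that look like
# client sessions. The [p] in the pattern keeps a grep process that shows
# up in the ps listing from matching itself.
client_backends() {
    grep '[p]ostgres: ' | grep -v ' process *$'
}
```

Typical use would be `ps auxf | client_backends`. On the listing from this thread it keeps only the `grepo` line; some helper titles may not end in "process", so any hit deserves a second look before it is treated as a client.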
On Thursday, 27.01.2011, at 13:12 -0500, Bruce Momjian wrote:
> The first two are normal processes you will see when you start Postgres.
> That last one looks odd --- did you mask it somehow? That looks like an
> active idle connection for user 'grepo'.

Good hint, thanks a lot. I had stopped all services, including a Java daemon which connects to the database as this user. The init script told me that the Java server had been shut down properly, but it hadn't; kill -9 helped here. After that I tried the shutdown again and it worked well. :)

Thanks a lot so far, now I can migrate all of our databases :)

regards
Bernhard
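Because the init script reported success while the daemon was still running, it can help to verify that the process is really gone rather than trusting the script's message. A minimal sketch, assuming the daemon's PID is known (for example from a PID file) and the caller is allowed to signal it; `stop_and_verify` is a hypothetical helper, not part of any init system.

```shell
#!/bin/sh
# stop_and_verify PID GRACE_SECONDS
# Sends SIGTERM, polls once per second for up to GRACE_SECONDS, and only
# then escalates to SIGKILL. Returns 0 once the process is gone.
stop_and_verify() {
    pid=$1
    grace=$2
    # If the signal cannot be delivered, treat the process as already gone.
    kill -TERM "$pid" 2>/dev/null || return 0
    waited=0
    while [ "$waited" -lt "$grace" ]; do
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        waited=$((waited + 1))
    done
    kill -0 "$pid" 2>/dev/null || return 0
    echo "PID $pid still alive after ${grace}s, sending SIGKILL" >&2
    kill -KILL "$pid" 2>/dev/null
    sleep 1
    ! kill -0 "$pid" 2>/dev/null
}
```

Trying SIGTERM first gives the daemon a chance to close its database connection cleanly; SIGKILL is kept as the last resort, as it was here.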