Thread: odd problem !
Hi there, below is the problem I just bitten when play with toy db. I did: 1.initdb -D ./t1 2. pg_ctl -D ./t1 start 3. createdb test 4. psql test -c "create table a (f integer);" 5. run script which populates table a in background perl bgupdate.pl & 6. cp -a ./t1 ./t2 6. pg_ctl -D ./t1 stop it's waiting for bgupdate.pl, so I killed script 7. pg_ct -D ./t2 start see, that everything is fine. 8. pg_ctl -D ./t2 stop 9. pg_ctl -D ./t1 start 10. pg@zen:~/test$ psql -l List of databases Name | Owner | Encoding -----------+----------+---------- template0 | postgres | KOI8 template1 | postgres | KOI8 test | postgres | KOI8 (3 rows) 11. pg@zen:~/test$ psql test FATAL: database "test" does not exist psql: FATAL: database "test" does not exist So, how it's possible ? Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
OK, here is more cleaner cut and paste from my notebook: pg@zen:~/test$ initdb -D ./t1 pg@zen:~/test$ pg_ctl -D ./t1 start postmaster starting pg@zen:~/test$ LOG: database system was shut down at 2005-03-23 01:09:34 MSK LOG: checkpoint record is at 0/A2C844 LOG: redo record is at 0/A2C844; undo record is at 0/0; shutdown TRUE LOG: next transaction ID: 544; next OID: 17230 LOG: database system is ready pg@zen:~/test$ createdb test CREATE DATABASE pg@zen:~/test$ psql test -c "create table a (f integer);" CREATE TABLE $i); commit;"; done COMMIT COMMIT COMMIT COMMIT COMMIT COMMIT COMMIT COMMIT COMMIT COMMIT pg@zen:~/test$ cp -a ./t1 ./t2 pg@zen:~/test$ rm ./t2/postmaster.pid rm: remove regular file ./t2/postmaster.pid'? y pg@zen:~/test$ pg_ctl -D ./t1 stop LOG: received smart shutdown request LOG: shutting down waiting for postmaster to shut down....LOG: database system is shut down done postmaster stopped pg@zen:~/test$ LOG: database system was interrupted at 2005-03-23 01:10:06 MSK LOG: checkpoint record is at 0/A2C844 LOG: redo record is at 0/A2C844; undo record is at 0/0; shutdown TRUE LOG: next transaction ID: 544; next OID: 17230 LOG: database system was not properly shut down; automatic recovery in progress LOG: redo starts at 0/A2C880 LOG: record with zero length at 0/A4FD38 LOG: redo done at 0/A4FD10 LOG: database system is ready pg@zen:~/test$ psql test -c "select count(*) from a;" count ------- 10 (1 row) pg@zen:~/test$ pg_ctl -D ./t2 stop waiting for postmaster to shut down....LOG: received smart shutdown request LOG: shutting down LOG: database system is shut down done postmaster stopped pg@zen:~/test$ pg_ctl -D ./t1 start postmaster starting pg@zen:~/test$ LOG: database system was shut down at 2005-03-23 01:11:40 MSK LOG: checkpoint record is at 0/A4FD38 LOG: redo record is at 0/A4FD38; undo record is at 0/0; shutdown TRUE LOG: next transaction ID: 568; next OID: 17243 LOG: database system is ready pg@zen:~/test$ psql test -c "select count(*) from a;" ERROR: relation "a" does not exist ERROR: relation "a" does not exist So, where is my table 'a' ? Oleg On Wed, 23 Mar 2005, Oleg Bartunov wrote: > Hi there, > > below is the problem I just bitten when play with toy db. I did: > > 1.initdb -D ./t1 > 2. pg_ctl -D ./t1 start > 3. createdb test > 4. psql test -c "create table a (f integer);" > 5. run script which populates table a in background > perl bgupdate.pl & > 6. cp -a ./t1 ./t2 > 6. pg_ctl -D ./t1 stop > it's waiting for bgupdate.pl, so I killed script > 7. pg_ct -D ./t2 start > see, that everything is fine. > 8. pg_ctl -D ./t2 stop > 9. pg_ctl -D ./t1 start > 10. > > pg@zen:~/test$ psql -l > List of databases > Name | Owner | Encoding -----------+----------+---------- > template0 | postgres | KOI8 > template1 | postgres | KOI8 > test | postgres | KOI8 > (3 rows) > > 11. > pg@zen:~/test$ psql test > FATAL: database "test" does not exist > psql: FATAL: database "test" does not exist > > So, how it's possible ? > > > Regards, > Oleg > _____________________________________________________________ > Oleg Bartunov, sci.researcher, hostmaster of AstroNet, > Sternberg Astronomical Institute, Moscow University (Russia) > Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ > phone: +007(095)939-16-83, +007(095)939-23-83 > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
Oleg Bartunov <oleg@sai.msu.su> writes: > below is the problem I just bitten when play with toy db. I did: > 1.initdb -D ./t1 > 2. pg_ctl -D ./t1 start > 3. createdb test > 4. psql test -c "create table a (f integer);" > 5. run script which populates table a in background > perl bgupdate.pl & > 6. cp -a ./t1 ./t2 I would not really expect this to produce a usable copy at all... > 6. pg_ctl -D ./t1 stop > it's waiting for bgupdate.pl, so I killed script > 7. pg_ct -D ./t2 start > see, that everything is fine. Exactly how much did you test? However, that doesn't seem relevant to your subsequent problem with the original. > 8. pg_ctl -D ./t2 stop > 9. pg_ctl -D ./t1 start > 10. > pg@zen:~/test$ psql -l > List of databases > Name | Owner | Encoding > -----------+----------+---------- > template0 | postgres | KOI8 > template1 | postgres | KOI8 > test | postgres | KOI8 > (3 rows) > 11. > pg@zen:~/test$ psql test > FATAL: database "test" does not exist > psql: FATAL: database "test" does not exist What PG version is this exactly? I suppose that you're seeing one of the GetRawDatabase corner cases, but I'm not quite sure why the shutdown and restart after the createdb wouldn't have fixed it. Does vacuuming pg_database and then checkpointing make it work? regards, tom lane
On Tue, 22 Mar 2005, Tom Lane wrote: >> template1 | postgres | KOI8 >> test | postgres | KOI8 >> (3 rows) > >> 11. >> pg@zen:~/test$ psql test >> FATAL: database "test" does not exist >> psql: FATAL: database "test" does not exist > > What PG version is this exactly? I suppose that you're seeing one of > the GetRawDatabase corner cases, but I'm not quite sure why the shutdown > and restart after the createdb wouldn't have fixed it. Does vacuuming > pg_database and then checkpointing make it work? REL8_0_STABLE, I believe. I posted another problem, now "cut and pasted". > > regards, tom lane > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
Oleg Bartunov <oleg@sai.msu.su> writes: >> What PG version is this exactly? > REL8_0_STABLE, I believe. I posted another problem, now "cut and pasted". I've been able to duplicate this here. What is happening is that the damage to ./t1 is being done when you start the postmaster in ./t2. It looks to me like the fundamental problem is that the t2 postmaster is replaying the WAL-logged CREATE DATABASE command from t1's xlog, and *that log entry contains an absolute path name*. So the CREATE replay is wiping out t1's "test" database subdirectory. This isn't a problem in normal use of course, but it'd be a serious issue for someone engaging in WAL-shipping, if their backup postmaster were living at a different absolute path. We probably need to think about whether we can make CREATE DATABASE log only relative paths. Log-shipping CREATE TABLESPACE commands is even more interesting :-(. Not sure how to deal with that. regards, tom lane
On Tue, 22 Mar 2005, Tom Lane wrote: > Oleg Bartunov <oleg@sai.msu.su> writes: >>> What PG version is this exactly? > >> REL8_0_STABLE, I believe. I posted another problem, now "cut and pasted". > > I've been able to duplicate this here. What is happening is that the > damage to ./t1 is being done when you start the postmaster in ./t2. > It looks to me like the fundamental problem is that the t2 postmaster > is replaying the WAL-logged CREATE DATABASE command from t1's xlog, and > *that log entry contains an absolute path name*. So the CREATE replay > is wiping out t1's "test" database subdirectory. I suspected this. btw,is there any utility to see WAL log in human-readable format ? > > This isn't a problem in normal use of course, but it'd be a serious > issue for someone engaging in WAL-shipping, if their backup postmaster > were living at a different absolute path. We probably need to think right, this is normal situation if you backup to the same server. Not sure how it's usefull, but still > about whether we can make CREATE DATABASE log only relative paths. any problem ? > > Log-shipping CREATE TABLESPACE commands is even more interesting :-(. > Not sure how to deal with that. > in general case it's impossible. Just speculating, what if we have some dedicated directory doing symbolical links there for all tablespaces ? > regards, tom lane > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
Oleg Bartunov <oleg@sai.msu.su> writes: >> This isn't a problem in normal use of course, but it'd be a serious >> issue for someone engaging in WAL-shipping, if their backup postmaster >> were living at a different absolute path. We probably need to think > right, this is normal situation if you backup to the same server. > Not sure how it's usefull, but still >> about whether we can make CREATE DATABASE log only relative paths. > any problem ? I've committed a fix --- please check that it solves your problem. (I have to leave for a few hours so I can't check it myself right away.) >> Log-shipping CREATE TABLESPACE commands is even more interesting :-(. >> Not sure how to deal with that. > in general case it's impossible. Yeah :-( ... needs more thought. regards, tom lane
Tom Lane wrote: > Oleg Bartunov <oleg@sai.msu.su> writes: > >> This isn't a problem in normal use of course, but it'd be a serious > >> issue for someone engaging in WAL-shipping, if their backup postmaster > >> were living at a different absolute path. We probably need to think > > > right, this is normal situation if you backup to the same server. > > Not sure how it's usefull, but still > > >> about whether we can make CREATE DATABASE log only relative paths. > > > any problem ? > > I've committed a fix --- please check that it solves your problem. > (I have to leave for a few hours so I can't check it myself right away.) > > >> Log-shipping CREATE TABLESPACE commands is even more interesting :-(. > >> Not sure how to deal with that. > > > in general case it's impossible. > > Yeah :-( ... needs more thought. Is this a TODO? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
I checked stable branch. no problem now. Oleg On Tue, 22 Mar 2005, Tom Lane wrote: > Oleg Bartunov <oleg@sai.msu.su> writes: >>> This isn't a problem in normal use of course, but it'd be a serious >>> issue for someone engaging in WAL-shipping, if their backup postmaster >>> were living at a different absolute path. We probably need to think > >> right, this is normal situation if you backup to the same server. >> Not sure how it's usefull, but still > >>> about whether we can make CREATE DATABASE log only relative paths. > >> any problem ? > > I've committed a fix --- please check that it solves your problem. > (I have to leave for a few hours so I can't check it myself right away.) > >>> Log-shipping CREATE TABLESPACE commands is even more interesting :-(. >>> Not sure how to deal with that. > >> in general case it's impossible. > > Yeah :-( ... needs more thought. > > regards, tom lane > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
On Tue, 2005-03-22 at 21:42 -0500, Bruce Momjian wrote: > Tom Lane wrote: > > Oleg Bartunov <oleg@sai.msu.su> writes: > > >> This isn't a problem in normal use of course, but it'd be a serious > > >> issue for someone engaging in WAL-shipping, if their backup postmaster > > >> were living at a different absolute path. We probably need to think > > > > > right, this is normal situation if you backup to the same server. > > > Not sure how it's usefull, but still > > > > >> about whether we can make CREATE DATABASE log only relative paths. > > > > > any problem ? > > > > I've committed a fix --- please check that it solves your problem. > > (I have to leave for a few hours so I can't check it myself right away.) > > > > >> Log-shipping CREATE TABLESPACE commands is even more interesting :-(. > > >> Not sure how to deal with that. > > > > > in general case it's impossible. > > > > Yeah :-( ... needs more thought. > > Is this a TODO? > Yes, it is. In my experience, most people create only a single Database, then define their Tablespaces (or change them rarely, if ever). So I've always regarded CREATE DATABASE and TABLESPACE as lower priority items. I'd always recommend taking another full backup immediately following such changes, rather than rely on everything working in the rollforward. That was true with PITR and similarly true with log shipping based upon PITR. Best Regards, Simon Riggs
On Tue, 2005-03-22 at 19:07 -0500, Tom Lane wrote: > Oleg Bartunov <oleg@sai.msu.su> writes: > >> This isn't a problem in normal use of course, but it'd be a serious > >> issue for someone engaging in WAL-shipping, if their backup postmaster > >> were living at a different absolute path. We probably need to think > > > right, this is normal situation if you backup to the same server. > > Not sure how it's usefull, but still > > >> about whether we can make CREATE DATABASE log only relative paths. > > > any problem ? > > I've committed a fix --- please check that it solves your problem. > (I have to leave for a few hours so I can't check it myself right away.) > > >> Log-shipping CREATE TABLESPACE commands is even more interesting :-(. > >> Not sure how to deal with that. > > > in general case it's impossible. > > Yeah :-( ... needs more thought. Just realised that this isn't just a problem with log-shipping. If you restore a database onto a server that already has a data directory on it in the same place as the one referenced by the restored copy, then you'll overwrite the existing one. That means for RPM distros where the path is initially fixed, you could wipe out a database when trying to perform a recovery from another system... Oh dear. Yes, needs more thought and a bug fix for inclusion in 8.0.2+ is probably warranted. Best Regards, Simon Riggs
On Wed, 2005-03-23 at 13:07 +0000, Simon Riggs wrote: > On Tue, 2005-03-22 at 19:07 -0500, Tom Lane wrote: > > Oleg Bartunov <oleg@sai.msu.su> writes: > > >> This isn't a problem in normal use of course, but it'd be a serious > > >> issue for someone engaging in WAL-shipping, if their backup postmaster > > >> were living at a different absolute path. We probably need to think > > > > > right, this is normal situation if you backup to the same server. > > > Not sure how it's usefull, but still > > > > >> about whether we can make CREATE DATABASE log only relative paths. > > > > > any problem ? > > > > I've committed a fix --- please check that it solves your problem. > > (I have to leave for a few hours so I can't check it myself right away.) > > > > >> Log-shipping CREATE TABLESPACE commands is even more interesting :-(. > > >> Not sure how to deal with that. > > > > > in general case it's impossible. > > > > Yeah :-( ... needs more thought. > > Just realised that this isn't just a problem with log-shipping. ...Just a problem with me reading the whole of a mail... :-( Best Regards, Simon Riggs
Simon Riggs <simon@2ndquadrant.com> writes: > On Tue, 2005-03-22 at 21:42 -0500, Bruce Momjian wrote: >> Is this a TODO? > Yes, it is. > In my experience, most people create only a single Database, then define > their Tablespaces (or change them rarely, if ever). So I've always > regarded CREATE DATABASE and TABLESPACE as lower priority items. I'd > always recommend taking another full backup immediately following such > changes, rather than rely on everything working in the rollforward. That > was true with PITR and similarly true with log shipping based upon PITR. In the post-initlocation world, I think CREATE/DROP DATABASE are no big problems. The bug Oleg encountered was really just a thinko or premature optimization on my part: these operations should log database and tablespace OIDs and let the recovery process recompute directory pathnames from those, but I had stupidly made them log absolute paths instead. There may still be some race-condition-type risks associated with CREATE DATABASE but I don't think they are any worse for WAL replay than they are for normal operation. You just can't safely modify a DB that is an active CREATE DATABASE template. DROP TABLESPACE isn't a problem either; we were already logging just the tablespace OID, so that should work fine. The hard part is CREATE TABLESPACE, and the reason it's hard is that someone might possibly want the tablespace to be located at a different place on the recipient machine than it is on the master. I do not see a reasonable way to support that at the moment. For the moment I think we should just document that as a gotcha, and make a TODO item to find a way to do it. regards, tom lane
On Wed, 23 Mar 2005, Tom Lane wrote: > Simon Riggs <simon@2ndquadrant.com> writes: > > On Tue, 2005-03-22 at 21:42 -0500, Bruce Momjian wrote: > >> Is this a TODO? > > > Yes, it is. > > > In my experience, most people create only a single Database, then define > > their Tablespaces (or change them rarely, if ever). So I've always > > regarded CREATE DATABASE and TABLESPACE as lower priority items. I'd > > always recommend taking another full backup immediately following such > > changes, rather than rely on everything working in the rollforward. That > > was true with PITR and similarly true with log shipping based upon PITR. > [snip] > The hard part is CREATE TABLESPACE, and the reason it's hard is that > someone might possibly want the tablespace to be located at a different > place on the recipient machine than it is on the master. I do not see a > reasonable way to support that at the moment. For the moment I think we > should just document that as a gotcha, and make a TODO item to find a > way to do it. Here is a docs patch which adds a note to the section on PITR. Gavin
Gavin Sherry <swm@linuxworld.com.au> writes: >> The hard part is CREATE TABLESPACE, and the reason it's hard is that >> someone might possibly want the tablespace to be located at a different >> place on the recipient machine than it is on the master. I do not see a >> reasonable way to support that at the moment. For the moment I think we >> should just document that as a gotcha, and make a TODO item to find a >> way to do it. > Here is a docs patch which adds a note to the section on PITR. I already applied something about this to CVS. Do you want to modify the wording of what's there now? regards, tom lane
On Thu, 24 Mar 2005, Tom Lane wrote: > Gavin Sherry <swm@linuxworld.com.au> writes: > >> The hard part is CREATE TABLESPACE, and the reason it's hard is that > >> someone might possibly want the tablespace to be located at a different > >> place on the recipient machine than it is on the master. I do not see a > >> reasonable way to support that at the moment. For the moment I think we > >> should just document that as a gotcha, and make a TODO item to find a > >> way to do it. > > > Here is a docs patch which adds a note to the section on PITR. > > I already applied something about this to CVS. Do you want to modify > the wording of what's there now? No, it looks good. I didn't see your commit. Gavin