Thread: pg_group_name_index corrupt?
This worry anyone? :)

v7.0, compiled today based off of CVS ...

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org

---------- Forwarded message ----------

pg_banners=> vacuum;
NOTICE:  Index pg_group_sysid_index: NUMBER OF INDEX' TUPLES (0) IS NOT THE SAME AS HEAP' (1).
        Recreate the index.
NOTICE:  Index pg_group_name_index: NUMBER OF INDEX' TUPLES (0) IS NOT THE SAME AS HEAP' (1).
        Recreate the index.
VACUUM
pg_banners=> \d pg_group
Table    = pg_group
+----------------------------------+----------------------------------+-------+
|              Field               |              Type                | Length|
+----------------------------------+----------------------------------+-------+
| groname                          | name                             |    32 |
| grosysid                         | int4                             |     4 |
| grolist                          | int4[]                           |   var |
+----------------------------------+----------------------------------+-------+
Indices:  pg_group_name_index
          pg_group_sysid_index
pg_banners=> \d pg_group_name_index
Table    = pg_group_name_index
+----------------------------------+----------------------------------+-------+
|              Field               |              Type                | Length|
+----------------------------------+----------------------------------+-------+
| groname                          | name                             |    32 |
+----------------------------------+----------------------------------+-------+
pg_banners=> drop index pg_group_name_index;
ERROR:  class "pg_group_name_index" is a system catalog index
pg_banners=>
The Hermit Hacker <scrappy@hub.org> writes:
> This worry anyone? :)
> NOTICE:  Index pg_group_sysid_index: NUMBER OF INDEX' TUPLES (0) IS NOT THE SAME AS HEAP' (1).
> Recreate the index.
> NOTICE:  Index pg_group_name_index: NUMBER OF INDEX' TUPLES (0) IS NOT THE SAME AS HEAP' (1).
> Recreate the index.

Not if you had other transactions running in parallel with the vacuum --- if the vacuum was the only thing running then I'd want to know what you were doing before that...

			regards, tom lane
On Thu, 4 May 2000, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> > This worry anyone? :)
> > NOTICE:  Index pg_group_sysid_index: NUMBER OF INDEX' TUPLES (0) IS NOT THE SAME AS HEAP' (1).
> > Recreate the index.
> > NOTICE:  Index pg_group_name_index: NUMBER OF INDEX' TUPLES (0) IS NOT THE SAME AS HEAP' (1).
> > Recreate the index.
>
> Not if you had other transactions running in parallel with the
> vacuum --- if the vacuum was the only thing running then I'd want
> to know what you were doing before that...

I can't guarantee whether I was or not :(  Right now I'm assuming that 'other transactions' would include any database on that server, not just the database that I was vacuuming at the time, since the error pops up even if I go in and do a vacuum on 'template1' ...

It says to 'recreate the index', but if I try to, it tells me it's a system table (of course) ... is there a way of fixing this without having to do a dump/reload?

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
The Hermit Hacker <scrappy@hub.org> writes:
>>>> This worry anyone? :)
>>>> NOTICE:  Index pg_group_sysid_index: NUMBER OF INDEX' TUPLES (0) IS NOT THE SAME AS HEAP' (1).
>>>> Recreate the index.
>>>> NOTICE:  Index pg_group_name_index: NUMBER OF INDEX' TUPLES (0) IS NOT THE SAME AS HEAP' (1).
>>>> Recreate the index.

It occurred to me that this would be the likely symptom if a CREATE GROUP or ALTER GROUP command had neglected to update the indexes on pg_group.  However, I can't spot any such problem in the code nor cause it to happen by hand.  Anyone else have an idea?

BTW Marc, do you happen to know what sequence of user and/or group create/alter/drops might have been executed on your setup?

			regards, tom lane
On Thu, 4 May 2000, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> >>>> This worry anyone? :)
> >>>> NOTICE:  Index pg_group_sysid_index: NUMBER OF INDEX' TUPLES (0) IS NOT THE SAME AS HEAP' (1).
> >>>> Recreate the index.
> >>>> NOTICE:  Index pg_group_name_index: NUMBER OF INDEX' TUPLES (0) IS NOT THE SAME AS HEAP' (1).
> >>>> Recreate the index.
>
> It occurred to me that this would be the likely symptom if a CREATE
> GROUP or ALTER GROUP command had neglected to update the indexes on
> pg_group.  However, I can't spot any such problem in the code nor
> cause it to happen by hand.  Anyone else have an idea?
>
> BTW Marc, do you happen to know what sequence of user and/or group
> create/alter/drops might have been executed on your setup?

The only group that is in pg_group is:

template1=> select * from pg_group;
groname      |grosysid|grolist
-------------+--------+--------------
trends_acctng|       0|{70,1007,1008}
(1 row)

and it was added ~2 weeks ago ... I tried deleting and re-adding it this morning; now it says (1 of 2) vs (0 of 1) ...

I had a server lock up the other day running an RC2 code base, on which a 'truss -p' on postmaster produced nothing ... I did a kill on the server, upgraded to the latest code and restarted it, after which this problem occurred ...

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
The Hermit Hacker <scrappy@hub.org> writes:
> The only group that is in pg_group is:

> template1=> select * from pg_group;
> groname      |grosysid|grolist
> -------------+--------+--------------
> trends_acctng|       0|{70,1007,1008}
> (1 row)

> and it was added ~2 weeks ago ...

> I tried deleting and re-adding it this morning; now it says (1 of 2) vs (0
> of 1) ...

Hmm, that's what I was going to suggest doing.  I think there must be a dead tuple in pg_group that VACUUM doesn't want to clean out for some reason --- maybe it thinks there is still an open transaction somewhere that could see the tuple as still live?  Odd.

Did you try doing a vacuum of pg_group in between deleting and re-adding the one valid group?

> I had a server lock up the other day running an RC2 code base, on which a
> 'truss -p' on postmaster produced nothing ... I did a kill on the server,
> upgraded to the latest code and restarted it, after which this problem
> occurred ...

Wish you'd taken more careful notes when that happened.  What do you mean by lock up, exactly?  Do you know what it was doing just beforehand?

			regards, tom lane
I wrote:
> It occurred to me that this would be the likely symptom if a CREATE
> GROUP or ALTER GROUP command had neglected to update the indexes on
> pg_group.  However, I can't spot any such problem in the code nor
> cause it to happen by hand.  Anyone else have an idea?

After further looking, I notice that users.c is one of the few places that will drop AccessExclusiveLock at heap_close time rather than holding it till xact commit.  I wonder whether this is a bug... it could allow another backend to get in and start vacuuming the file before our updates have committed.  I am not sure that vacuum would do the wrong thing in that case, but maybe so.  Comments anyone (Vadim?)

I also notice that there definitely is a glaring bug there: write_password_file() leaks one kernel file descriptor each time it runs (note the creat() call).  Alter enough pg_shadow entries in one session and your backend stops working.  I think this is a "must fix" problem --- any objections?

			regards, tom lane
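[For context, the leak pattern being described looks roughly like the sketch below --- a minimal illustration, not the actual users.c source; the function names and file contents here are placeholders:]

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Minimal sketch of the descriptor leak (illustrative, not the real
 * users.c code).  creat() returns an open file descriptor; if that
 * return value is discarded and the file is then reopened by name,
 * one fd leaks per call.  Do it enough times in one process and
 * open() starts failing with EMFILE.
 */
static void
write_password_file_leaky(const char *path)
{
    FILE *fp;

    creat(path, 0600);              /* fd returned here is never closed */

    fp = fopen(path, "w");          /* a second, independent open */
    if (fp == NULL)
        return;
    fprintf(fp, "user:password\n"); /* placeholder content */
    fclose(fp);                     /* closes only the fopen() descriptor */
}

/* One obvious fix in this sketch: keep the creat() descriptor and close it. */
static void
write_password_file_fixed(const char *path)
{
    int fd = creat(path, 0600);

    if (fd >= 0)
        close(fd);
    /* ... then fopen()/fprintf()/fclose() as before ... */
}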
On Thu, 4 May 2000, Tom Lane wrote:

> I wrote:
> > It occurred to me that this would be the likely symptom if a CREATE
> > GROUP or ALTER GROUP command had neglected to update the indexes on
> > pg_group.  However, I can't spot any such problem in the code nor
> > cause it to happen by hand.  Anyone else have an idea?
>
> After further looking, I notice that users.c is one of the few places
> that will drop AccessExclusiveLock at heap_close time rather than
> holding it till xact commit.  I wonder whether this is a bug...
> it could allow another backend to get in and start vacuuming the file
> before our updates have committed.  I am not sure that vacuum would do
> the wrong thing in that case, but maybe so.  Comments anyone (Vadim?)
>
> I also notice that there definitely is a glaring bug there:
> write_password_file() leaks one kernel file descriptor each time it runs
> (note the creat() call).  Alter enough pg_shadow entries in one session
> and your backend stops working.  I think this is a "must fix" problem
> --- any objections?

None from me ...

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
On Thu, 4 May 2000, Tom Lane wrote:

> Hmm, that's what I was going to suggest doing.  I think there must be a
> dead tuple in pg_group that VACUUM doesn't want to clean out for some
> reason --- maybe it thinks there is still an open transaction somewhere
> that could see the tuple as still live?  Odd.
>
> Did you try doing a vacuum of pg_group in between deleting and re-adding
> the one valid group?

Hadn't thought of that ... but ... and you aren't going to like this ... if I delete/vacuum/insert/vacuum, INDEX TUPLES increases by 1 and HEAP increases by 1 ... I'm up to 3->4 now, and would go 4->5 if I were to do it again ...

> > I had a server lock up the other day running an RC2 code base, on which a
> > 'truss -p' on postmaster produced nothing ... I did a kill on the server,
> > upgraded to the latest code and restarted it, after which this problem
> > occurred ...
>
> Wish you'd taken more careful notes when that happened.  What do you
> mean by lock up, exactly?  Do you know what it was doing just
> beforehand?

Don't know ... one of the problems I'm having with my FreeBSD machine right now is that, for some reason, setproctitle() isn't working, so all my backends show the same thing in ps: 'postmaster' and its startup options :(

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
The Hermit Hacker <scrappy@hub.org> writes:
> Hadn't thought of that ... but ... and you aren't going to like this
> ... if I delete/vacuum/insert/vacuum, INDEX TUPLES increases by 1 and HEAP
> increases by 1 ... I'm up to 3->4 now, and would go 4->5 if I were to do
> it again ...

That definitely sounds like VACUUM thinks there's an old open transaction somewhere and so it doesn't want to clean out the dead tuples.

I believe we have a mechanism for deciding that an old transaction must have aborted (it involves looking to see if any active backend claims to be running that transaction).  But I wonder whether that mechanism is being used when VACUUM decides whether it can clean out a dead tuple or not.  Vadim?

> Don't know ... one of the problems I'm having with my FreeBSD machine
> right now is that, for some reason, setproctitle() isn't working, so all
> my backends show the same thing in ps: 'postmaster' and its startup options :(

Now you know how the other half lives ;-).

			regards, tom lane
> -----Original Message-----
> From: pgsql-hackers-owner@hub.org
> [mailto:pgsql-hackers-owner@hub.org]On Behalf Of The Hermit Hacker
>
> On Thu, 4 May 2000, Tom Lane wrote:
>
> > The Hermit Hacker <scrappy@hub.org> writes:
> > > This worry anyone? :)
> > > NOTICE:  Index pg_group_sysid_index: NUMBER OF INDEX' TUPLES (0) IS NOT THE SAME AS HEAP' (1).
> > > Recreate the index.
> > > NOTICE:  Index pg_group_name_index: NUMBER OF INDEX' TUPLES (0) IS NOT THE SAME AS HEAP' (1).
> > > Recreate the index.
> >
> > Not if you had other transactions running in parallel with the
> > vacuum --- if the vacuum was the only thing running then I'd want
> > to know what you were doing before that...
>
> I can't guarantee whether I was or not :(  Right now I'm assuming that
> 'other transactions' would include any database on that server, not just
> the database that I was vacuuming at the time, since the error pops up
> even if I go in and do a vacuum on 'template1' ...

Why does pg_group exist under $PGDATA though the indexes exist under each $PGDATA/base/db_name?  Could it be consistent on all databases?

> It says to 'recreate the index', but if I try to, it tells me it's a system
> table (of course) ... is there a way of fixing this without having to do
> a dump/reload?

Run "reindex table pg_group force;" in a standalone backend (postgres) with options -O and -P.  You must shut down the postmaster first.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp
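[Spelled out, the recovery procedure Hiroshi describes looks roughly like this --- a sketch only, with an assumed data directory path and prompts; in 7.0, -O permits system-table modifications and -P makes the backend ignore system indexes:]

$ pg_ctl stop                                # stop the postmaster first
$ postgres -D /usr/local/pgsql/data -O -P pg_banners
backend> reindex table pg_group force;
backend> ^D                                  # EOF exits the standalone backend
$ pg_ctl start                               # or however you normally start it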
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes: > Why does pg_group exist under $PGDATA though the indexes exist > under each $PGDATA/base/db_name ? > Could it be consistent on all databases ? Oh my, I think you've got it! The indexes must be SharedSystemRelations!! Backend 1 in database regression: regression=# vacuum analyze pg_group; VACUUM Backend 2 in database other is also happy: other=# vacuum analyze pg_group; VACUUM Now create a group with backend 1: regression=# create group g; CREATE GROUP Backend 1 sees the index entries: regression=# vacuum analyze pg_group; VACUUM But backend 2 doesn't: other=# vacuum analyze pg_group; NOTICE: Index pg_group_sysid_index: NUMBER OF INDEX' TUPLES (0) IS NOT THE SAME AS HEAP' (1). Recreate the index. NOTICE: Index pg_group_name_index: NUMBER OF INDEX' TUPLES (0) IS NOT THE SAME AS HEAP' (1). Recreate the index. VACUUM pg_shadow would have the same problem if it had indices, which I thought it did but they seem to have been disabled. Can you say "one more initdb"? I knew you could... regards, tom lane
> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes: >> Why does pg_group exist under $PGDATA though the indexes exist >> under each $PGDATA/base/db_name ? >> Could it be consistent on all databases ? > Oh my, I think you've got it! The indexes must be SharedSystemRelations!! Yup, Hiroshi has spotted the problem. Turning the indexes on pg_group into shared relations fixes the cross-database misbehavior shown in my prior message, and I'll bet this bug explains Marc's report too. We never had any indexes on pg_group (or any shared relation) before, which is why we hadn't seen this kind of failure before. (Another limitation of the regression tests exposed --- they don't test cross-database behaviors.) So, now what? This is a simple fix, but it will require initdb (or at least pg_upgrade), which I'd really rather not do at this point in the release cycle. But I'm not sure we have any choice. As it stands, pg_group is broken. If we are going to have to force a new initdb here, we probably ought to reconsider a couple of recent past decisions that were put off on grounds of "we don't want another initdb before 7.0". I'm thinking of the remaining ODBC support functions and the new LIKE estimator in particular. Do we want to revisit those decisions, or leave well enough alone? regards, tom lane
> The Hermit Hacker <scrappy@hub.org> writes:
>> Hadn't thought of that ... but ... and you aren't going to like this
>> ... if I delete/vacuum/insert/vacuum, INDEX TUPLES increases by 1 and HEAP
>> increases by 1 ... I'm up to 3->4 now, and would go 4->5 if I were to do
>> it again ...

> That definitely sounds like VACUUM thinks there's an old open transaction
> somewhere and so it doesn't want to clean out the dead tuples.

After looking some more at this, I'm not convinced that the problem Hiroshi found explains everything Marc is seeing.  Having multiple copies of the pg_group indexes would certainly explain missing index tuples in any one copy.  But what about the behavior above?  It still seems that VACUUM is not cleaning out dead tuples from the table itself.

Marc, is it possible that you have any backends that are in the middle of a transaction and have just been sitting that way for hours or days?  If so, that'd explain VACUUM's refusal to remove tuples.

Another symptom you should look for is notices like

	NOTICE:  RegisterSharedInvalid: SI buffer overflow
	NOTICE:  InvalidateSharedInvalid: cache state reset

in the postmaster log --- there'll probably be quite a few if there's a hung backend somewhere.

			regards, tom lane
On Thu, 4 May 2000, Tom Lane wrote:

> But backend 2 doesn't:
>
> other=# vacuum analyze pg_group;
> NOTICE:  Index pg_group_sysid_index: NUMBER OF INDEX' TUPLES (0) IS NOT THE SAME AS HEAP' (1).
>         Recreate the index.
> NOTICE:  Index pg_group_name_index: NUMBER OF INDEX' TUPLES (0) IS NOT THE SAME AS HEAP' (1).
>         Recreate the index.
> VACUUM
>
> pg_shadow would have the same problem if it had indices, which I thought
> it did but they seem to have been disabled.
>
> Can you say "one more initdb"?  I knew you could...

Would a pg_upgrade fix this, or do I have to do a whole dump/reload?

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
On Thu, 4 May 2000, Tom Lane wrote:

> > "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> >> Why does pg_group exist under $PGDATA though the indexes exist
> >> under each $PGDATA/base/db_name ?
> >> Could it be consistent on all databases ?
>
> > Oh my, I think you've got it!  The indexes must be SharedSystemRelations!!
>
> Yup, Hiroshi has spotted the problem.  Turning the indexes on pg_group
> into shared relations fixes the cross-database misbehavior shown in my
> prior message, and I'll bet this bug explains Marc's report too.
>
> We never had any indexes on pg_group (or any shared relation) before,
> which is why we hadn't seen this kind of failure before.  (Another
> limitation of the regression tests exposed --- they don't test
> cross-database behaviors.)
>
> So, now what?  This is a simple fix, but it will require initdb (or at
> least pg_upgrade), which I'd really rather not do at this point in the
> release cycle.  But I'm not sure we have any choice.  As it stands,
> pg_group is broken.
>
> If we are going to have to force a new initdb here, we probably ought
> to reconsider a couple of recent past decisions that were put off on
> grounds of "we don't want another initdb before 7.0".  I'm thinking of
> the remaining ODBC support functions and the new LIKE estimator in
> particular.  Do we want to revisit those decisions, or leave well enough
> alone?

Leave well enough alone ... this fixed, IMHO, a *very* important potential bug, whereas the other two can be worked around.  At this *really* late stage in the cycle, fixing just one bug only opens us up to the possibility of one new bug ... the ODBC/LIKE stuff isn't mission critical, and really only affects a relatively small group of people in comparison ...

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
On Fri, 5 May 2000, Hiroshi Inoue wrote:

> > It says to 'recreate the index', but if I try to, it tells me it's a system
> > table (of course) ... is there a way of fixing this without having to do
> > a dump/reload?
>
> Run "reindex table pg_group force;" in a standalone backend (postgres)
> with options -O and -P.  You must shut down the postmaster first.

Okay, just looked at the postgres man page, and found the -O option, but no -P ... this is in the users-lounge/7.0/docs directory ...

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
Coolness, that worked like a charm ... so I take it I have to do this for each and every database on the system? :)

On Fri, 5 May 2000, Hiroshi Inoue wrote:

> > -----Original Message-----
> > From: pgsql-hackers-owner@hub.org
> > [mailto:pgsql-hackers-owner@hub.org]On Behalf Of The Hermit Hacker
> >
> > I can't guarantee whether I was or not :(  Right now I'm assuming that
> > 'other transactions' would include any database on that server, not just
> > the database that I was vacuuming at the time, since the error pops up
> > even if I go in and do a vacuum on 'template1' ...
>
> Why does pg_group exist under $PGDATA though the indexes exist
> under each $PGDATA/base/db_name?  Could it be consistent on all databases?
>
> > It says to 'recreate the index', but if I try to, it tells me it's a system
> > table (of course) ... is there a way of fixing this without having to do
> > a dump/reload?
>
> Run "reindex table pg_group force;" in a standalone backend (postgres)
> with options -O and -P.  You must shut down the postmaster first.
>
> Regards.
>
> Hiroshi Inoue
> Inoue@tpf.co.jp

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
The Hermit Hacker <scrappy@hub.org> writes:
>> Can you say "one more initdb"?  I knew you could...

> Would a pg_upgrade fix this, or do I have to do a whole dump/reload?

pg_upgrade *should* work, but if I were you I'd make a backup dump first, just in case.  I don't trust pg_upgrade very much...

			regards, tom lane
The Hermit Hacker <scrappy@hub.org> writes:
>> If we are going to have to force a new initdb here, we probably ought
>> to reconsider a couple of recent past decisions that were put off on
>> grounds of "we don't want another initdb before 7.0".  I'm thinking of
>> the remaining ODBC support functions and the new LIKE estimator in
>> particular.  Do we want to revisit those decisions, or leave well enough
>> alone?

> Leave well enough alone ... this fixed, IMHO, a *very* important potential
> bug, whereas the other two can be worked around.  At this *really* late
> stage in the cycle, fixing just one bug only opens us up to the
> possibility of one new bug ... the ODBC/LIKE stuff isn't mission
> critical, and really only affects a relatively small group of people in
> comparison ...

That's a fair objection for the LIKE estimator, which after all hasn't gotten much testing.  I'll leave well enough alone there.  But those missing ODBC functions are just another dozen SQL-function entries for pg_proc; hard to see how they can break anything else, even if (worst case) they're wrong themselves ...

			regards, tom lane
The Hermit Hacker <scrappy@hub.org> writes:
> Coolness, that worked like a charm ... so I take it I have to do this for
> each and every database on the system? :)

Yup :-(.  But actually there's not much point, since you're looking at a pg_upgrade or initdb for the real fix anyway.

			regards, tom lane
>> Would a pg_upgrade fix this, or do I have to do a whole dump/reload?

> pg_upgrade *should* work, but if I were you I'd make a backup dump
> first, just in case.  I don't trust pg_upgrade very much...

Oh, btw: pg_upgrade will *not* work to save and reload your group definitions, because neither it nor pg_dumpall do anything at all with pg_group!  For that matter, a full dumpall/reload won't preserve groups either!

I griped about that a week or so ago, but no one seems to have picked up on it.  Do you want to consider that a "must fix" problem as well?  I think it's a pretty minor fix, but considering how late we are in the cycle...

			regards, tom lane
The Hermit Hacker <scrappy@hub.org> writes:
>> Run "reindex table pg_group force;" in a standalone backend
>> with options -O and -P.  You must shut down the postmaster first.

> Okay, just looked at the postgres man page, and found the -O option, but
> no -P ... this is in the users-lounge/7.0/docs directory ...

Hiroshi did commit a change to doc/src/sgml/ref/postgres-ref.sgml for that just a few days ago, but I guess it didn't make it into the derived files that Thomas was preparing off-line.  I believe there were several other small doco fixes while Thomas was on vacation --- I dunno if he plans to try to get those into the 7.0 derived docs or not.

			regards, tom lane
On Thu, 4 May 2000, Tom Lane wrote:

> >> Would a pg_upgrade fix this, or do I have to do a whole dump/reload?
>
> > pg_upgrade *should* work, but if I were you I'd make a backup dump
> > first, just in case.  I don't trust pg_upgrade very much...
>
> Oh, btw: pg_upgrade will *not* work to save and reload your group
> definitions, because neither it nor pg_dumpall do anything at all with
> pg_group!  For that matter, a full dumpall/reload won't preserve
> groups either!
>
> I griped about that a week or so ago, but no one seems to have picked up
> on it.  Do you want to consider that a "must fix" problem as well?
> I think it's a pretty minor fix, but considering how late we are in the
> cycle...

Considering where the problem is, I think that if it can be safely done, please do it ...

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
The Hermit Hacker <scrappy@hub.org> writes:
>> I griped about that a week or so ago, but no one seems to have picked up
>> on it.  Do you want to consider that a "must fix" problem as well?
>> I think it's a pretty minor fix, but considering how late we are in the
>> cycle...

> Considering where the problem is, I think that if it can be safely done,
> please do it ...

Done and done.  I also realized that pg_upgrade had another nasty bug in it: the VACUUMs were not necessarily executed as superuser, but as whichever user happened to own the item dumped last by pg_dump in each database.  That would result in VACUUM skipping over tables it thought it didn't have permission to vacuum --- like, say, all the system tables.  Perhaps this explains the failures that some people have reported.

Another day, another bug swatted ...

			regards, tom lane

PS: when you announce RC5, don't forget to mention the required initdb ;-)
On Thu, 4 May 2000, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> >> I griped about that a week or so ago, but no one seems to have picked up
> >> on it.  Do you want to consider that a "must fix" problem as well?
> >> I think it's a pretty minor fix, but considering how late we are in the
> >> cycle...
>
> > Considering where the problem is, I think that if it can be safely done,
> > please do it ...
>
> Done and done.  I also realized that pg_upgrade had another nasty bug
> in it: the VACUUMs were not necessarily executed as superuser, but as
> whichever user happened to own the item dumped last by pg_dump in each
> database.  That would result in VACUUM skipping over tables it thought
> it didn't have permission to vacuum --- like, say, all the system
> tables.  Perhaps this explains the failures that some people have
> reported.
>
> Another day, another bug swatted ...

Okay, before I package this up and announce it ... I just posted something that I *think* is probably related, but just want to make sure ... if I do a pg_dump of a database, I'm getting an ERROR about 'Inhrel' Attribute not found ... not sure why, or if it matters ...
The Hermit Hacker <scrappy@hub.org> writes:
> if I do a pg_dump of a database, I'm getting an ERROR about 'Inhrel' Attribute not
> found ... not sure why, or if it matters ...

That's the sign of a 6.5 pg_dump applied to a 7.0 database, or was it the other way round?  Anyway, check your PATH ...

			regards, tom lane
On Thu, 4 May 2000, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> > if I do a pg_dump of a database, I'm getting an ERROR about 'Inhrel' Attribute not
> > found ... not sure why, or if it matters ...
>
> That's the sign of a 6.5 pg_dump applied to a 7.0 database, or was it
> the other way round?  Anyway, check your PATH ...

Okay, it would be a v6.5 pg_dump applied to a v7.0 database ... I thought it was me, but I suspect Jeff was on the old v6.5 server running pg_dump at the same time without me thinking to ask ...

Am upgrading my server now, and will do RC5 tonight also ... Great work, once more :)

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
Okay, something that I think needs to be clarified ... RC5 requires an initdb, so you have to do a pg_dumpall first, then initdb, then reload ...

Given your recent fixes ... should we be running pg_dumpall from RC5 on our RC4 databases, or does it not matter?  I'm using the RC5 one right now, and all appears correct, but I figured I'd ask ...

On Thu, 4 May 2000, Tom Lane wrote:

> >> Would a pg_upgrade fix this, or do I have to do a whole dump/reload?
>
> > pg_upgrade *should* work, but if I were you I'd make a backup dump
> > first, just in case.  I don't trust pg_upgrade very much...
>
> Oh, btw: pg_upgrade will *not* work to save and reload your group
> definitions, because neither it nor pg_dumpall do anything at all with
> pg_group!  For that matter, a full dumpall/reload won't preserve
> groups either!
>
> I griped about that a week or so ago, but no one seems to have picked up
> on it.  Do you want to consider that a "must fix" problem as well?
> I think it's a pretty minor fix, but considering how late we are in the
> cycle...
>
> 			regards, tom lane

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
The Hermit Hacker <scrappy@hub.org> writes:
> Okay, something that I think needs to be clarified ... RC5 requires an
> initdb, so you have to do a pg_dumpall first, then initdb, then reload ...

> Given your recent fixes ... should we be running pg_dumpall from RC5 on our RC4
> databases, or does it not matter?  I'm using the RC5 one right now, and
> all appears correct, but I figured I'd ask ...

pg_upgrade should work, or at least it's worth trying --- see the message I just posted.  If you have anything in pg_group then the best procedure is to use the RC5 pg_dumpall, since RC4 and before's pg_dumpall neglects to dump pg_group.  In any case, RC4 and before's pg_upgrade is now known to be broken, so be sure you use RC5's script at that point.

Or just use dump/initdb/reload, but it'd be nice to get some pounding on pg_upgrade and find out if it's trustworthy now.

I'd definitely recommend a full pg_dumpall before experimenting with pg_upgrade, just in case things go worng ;-)

			regards, tom lane
> I wrote:
> > It occurred to me that this would be the likely symptom if a CREATE
> > GROUP or ALTER GROUP command had neglected to update the indexes on
> > pg_group.  However, I can't spot any such problem in the code nor
> > cause it to happen by hand.  Anyone else have an idea?
>
> After further looking, I notice that users.c is one of the few places
> that will drop AccessExclusiveLock at heap_close time rather than
> holding it till xact commit.  I wonder whether this is a bug...
> it could allow another backend to get in and start vacuuming the file
> before our updates have committed.  I am not sure that vacuum would do
> the wrong thing in that case, but maybe so.  Comments anyone (Vadim?)
>
> I also notice that there definitely is a glaring bug there:
> write_password_file() leaks one kernel file descriptor each time it runs
> (note the creat() call).  Alter enough pg_shadow entries in one session
> and your backend stops working.  I think this is a "must fix" problem
> --- any objections?

Please fix both, I think.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Okay, I did the RC5 pg_dumpall and am doing a reload ... call me chicken :)

On Fri, 5 May 2000, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> > Okay, something that I think needs to be clarified ... RC5 requires an
> > initdb, so you have to do a pg_dumpall first, then initdb, then reload ...
>
> > Given your recent fixes ... should we be running pg_dumpall from RC5 on our RC4
> > databases, or does it not matter?  I'm using the RC5 one right now, and
> > all appears correct, but I figured I'd ask ...
>
> pg_upgrade should work, or at least it's worth trying --- see the
> message I just posted.  If you have anything in pg_group then the
> best procedure is to use the RC5 pg_dumpall, since RC4 and before's
> pg_dumpall neglects to dump pg_group.  In any case, RC4 and before's
> pg_upgrade is now known to be broken, so be sure you use RC5's script
> at that point.
>
> Or just use dump/initdb/reload, but it'd be nice to get some pounding
> on pg_upgrade and find out if it's trustworthy now.
>
> I'd definitely recommend a full pg_dumpall before experimenting with
> pg_upgrade, just in case things go worng ;-)
>
> 			regards, tom lane

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
> "Hiroshi Inoue" <Inoue@tpf.co.jp> writes: > > Why does pg_group exist under $PGDATA though the indexes exist > > under each $PGDATA/base/db_name ? > > Could it be consistent on all databases ? > > Oh my, I think you've got it! The indexes must be SharedSystemRelations!! > ... > > pg_shadow would have the same problem if it had indices, which I thought > it did but they seem to have been disabled. > > Can you say "one more initdb"? I knew you could... You know, I am reading through this message, and thinking, gee, how is he going to get out of this without initdb? Well I have my answer. -- Bruce Momjian | http://www.op.net/~candle pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> After further looking, I notice that users.c is one of the few places
>> that will drop AccessExclusiveLock at heap_close time rather than
>> holding it till xact commit.  I wonder whether this is a bug...
>> it could allow another backend to get in and start vacuuming the file
>> before our updates have committed.  I am not sure that vacuum would do
>> the wrong thing in that case, but maybe so.  Comments anyone (Vadim?)
>>
>> I also notice that there definitely is a glaring bug there:
>> write_password_file() leaks one kernel file descriptor each time it runs
>> (note the creat() call).  Alter enough pg_shadow entries in one session
>> and your backend stops working.  I think this is a "must fix" problem
>> --- any objections?

> Please fix both, I think.

I fixed the file descriptor leak, which was pretty simple and easily checked.  After looking at VACUUM I think it will behave reasonably with a not-yet-committed tuple, so I left that issue alone for now.

My current thought on releasing locks at heap_close time is that it's probably best to release the lock if and only if your use of the table was strictly read-only.  If you modified the table then your lock should be held till xact commit.  I'm not sure about that rule however.  Any comments?

			regards, tom lane
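[To make the two lock-release styles concrete, here is a sketch against the 7.0-era heap access API --- illustrative only, not the actual users.c code; the table name and function are examples:]

#include "postgres.h"
#include "access/heapam.h"

/*
 * Illustrative sketch (not the real users.c source).  heap_close()
 * takes a lock mode as its second argument: passing the mode acquired
 * at open time releases the lock immediately, while passing NoLock
 * keeps it held until the transaction commits.
 */
void
update_shared_catalog_sketch(void)
{
	Relation	rel;

	rel = heap_openr("pg_group", AccessExclusiveLock);

	/* ... heap_insert() / heap_update() calls on rel go here ... */

	/*
	 * What users.c did: the exclusive lock is dropped right here,
	 * before our transaction commits --- another backend could start
	 * vacuuming the file in that window.
	 */
	heap_close(rel, AccessExclusiveLock);

	/*
	 * The rule Tom proposes for writers would instead be:
	 *
	 *		heap_close(rel, NoLock);
	 */
}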
> > "Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> >> Why does pg_group exist under $PGDATA though the indexes exist
> >> under each $PGDATA/base/db_name ?
> >> Could it be consistent on all databases ?
>
> > Oh my, I think you've got it!  The indexes must be SharedSystemRelations!!
>
> Yup, Hiroshi has spotted the problem.  Turning the indexes on pg_group
> into shared relations fixes the cross-database misbehavior shown in my
> prior message, and I'll bet this bug explains Marc's report too.
>
> We never had any indexes on pg_group (or any shared relation) before,
> which is why we hadn't seen this kind of failure before.  (Another
> limitation of the regression tests exposed --- they don't test
> cross-database behaviors.)

Oh, so it was my indexes on system relations, and because pg_group is a shared one, there was a problem.

> So, now what?  This is a simple fix, but it will require initdb (or at
> least pg_upgrade), which I'd really rather not do at this point in the
> release cycle.  But I'm not sure we have any choice.  As it stands,
> pg_group is broken.
>
> If we are going to have to force a new initdb here, we probably ought
> to reconsider a couple of recent past decisions that were put off on
> grounds of "we don't want another initdb before 7.0".  I'm thinking of
> the remaining ODBC support functions and the new LIKE estimator in
> particular.  Do we want to revisit those decisions, or leave well enough
> alone?

Well, pg_group is not used very much.  Can we disable the cache from using the index?  I think that would get us out of the problem.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Well, pg_group is not used very much.  Can we disable the cache from
> using the index?  I think that would get us out of the problem.

But they're unique indexes ... can you guarantee that a unique index that's only getting told about some of the updates to its table is not going to kick out any inappropriate errors?  It might be OK but it doesn't give me a warm feeling.

I think we were kinda stuck on this one.  The correct fix is clear and I don't think we'd have been doing anyone any favors by trying to invent a workaround.  I'm just glad we found it now and not a few days *after* 7.0 release...

			regards, tom lane
> That's a fair objection for the LIKE estimator, which after all hasn't
> gotten much testing.  I'll leave well enough alone there.  But those
> missing ODBC functions are just another dozen SQL-function entries for
> pg_proc; hard to see how they can break anything else, even if (worst
> case) they're wrong themselves ...

Agreed on the ODBC.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
> The Hermit Hacker <scrappy@hub.org> writes:
> >> Can you say "one more initdb"?  I knew you could...
>
> > Would a pg_upgrade fix this, or do I have to do a whole dump/reload?
>
> pg_upgrade *should* work, but if I were you I'd make a backup dump
> first, just in case.  I don't trust pg_upgrade very much...

Aw, badmouthing pg_upgrade again.  :-)

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
> >> Would a pg_upgrade fix this, or do I have to do a whole dump/reload?
>
> > pg_upgrade *should* work, but if I were you I'd make a backup dump
> > first, just in case.  I don't trust pg_upgrade very much...
>
> Oh, btw: pg_upgrade will *not* work to save and reload your group
> definitions, because neither it nor pg_dumpall do anything at all with
> pg_group!  For that matter, a full dumpall/reload won't preserve
> groups either!
>
> I griped about that a week or so ago, but no one seems to have picked up
> on it.  Do you want to consider that a "must fix" problem as well?
> I think it's a pretty minor fix, but considering how late we are in the
> cycle...

Please give me something to add to the TODO.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> I griped about that a week or so ago, but no one seems to have picked up
>> on it.  Do you want to consider that a "must fix" problem as well?
>> I think it's a pretty minor fix, but considering how late we are in the
>> cycle...

> Please give me something to add to the TODO.

Try "DONE" ;-)

			regards, tom lane
> Hiroshi did commit a change to doc/src/sgml/ref/postgres-ref.sgml for
> that just a few days ago, but I guess it didn't make it into the derived
> files that Thomas was preparing off-line.  I believe there were several
> other small doco fixes while Thomas was on vacation --- I dunno if he
> plans to try to get those into the 7.0 derived docs or not.

I'll move a new copy of the html into users-lounge before release, and will try poking at the hardcopy versions too (but I consider the hardcopy fixes noncritical since we are so far past the "freeze" point).

Depends on the actual release schedule as to whether they will get into the hardcopy; the pg_group stuff doesn't open a large can of worms, but we might want a few days for it to settle out??

btw, I'm all for fixing pg_group (it is really nonoptional imho), and the ODBC stuff is noncritical since loading contrib/odbc/odbc.sql is sufficient to get the functions and operators defined, the support code being already in the backend.

- Thomas

--
Thomas Lockhart                    lockhart@alumni.caltech.edu
South Pasadena, California
> The Hermit Hacker <scrappy@hub.org> writes:
> >> I griped about that a week or so ago, but no one seems to have picked up
> >> on it.  Do you want to consider that a "must fix" problem as well?
> >> I think it's a pretty minor fix, but considering how late we are in the
> >> cycle...
>
> > Considering where the problem is, I think that if it can be safely done,
> > please do it ...
>
> Done and done.  I also realized that pg_upgrade had another nasty bug
> in it: the VACUUMs were not necessarily executed as superuser, but as
> whichever user happened to own the item dumped last by pg_dump in each
> database.  That would result in VACUUM skipping over tables it thought
> it didn't have permission to vacuum --- like, say, all the system
> tables.  Perhaps this explains the failures that some people have
> reported.
>
> Another day, another bug swatted ...

Yes, good eye.  Thanks.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
On Fri, 5 May 2000, Thomas Lockhart wrote:

> > Hiroshi did commit a change to doc/src/sgml/ref/postgres-ref.sgml for
> > that just a few days ago, but I guess it didn't make it into the derived
> > files that Thomas was preparing off-line.  I believe there were several
> > other small doco fixes while Thomas was on vacation --- I dunno if he
> > plans to try to get those into the 7.0 derived docs or not.
>
> I'll move a new copy of the html into users-lounge before release, and
> will try poking at the hardcopy versions too (but I consider the
> hardcopy fixes noncritical since we are so far past the "freeze"
> point).
>
> Depends on the actual release schedule as to whether they will get into
> the hardcopy; the pg_group stuff doesn't open a large can of worms,
> but we might want a few days for it to settle out??

I have no probs with extending the release a couple of days ... does Wednesday sound good for everyone, as things stand now?

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
> Okay, something that I think needs to be clarified ... RC5 requires an
> initdb, so you have to do a pg_dumpall first, then initdb, then reload ...
>
> Given your recent fixes ... should we be running pg_dumpall from RC5 on our RC4
> databases, or does it not matter?  I'm using the RC5 one right now, and
> all appears correct, but I figured I'd ask ...

Doesn't matter, except that RC5 dumps groups properly.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> Done and done.  I also realized that pg_upgrade had another nasty bug
>> in it: the VACUUMs were not necessarily executed as superuser, but as
>> whichever user happened to own the item dumped last by pg_dump in each
>> database.  That would result in VACUUM skipping over tables it thought
>> it didn't have permission to vacuum --- like, say, all the system
>> tables.  Perhaps this explains the failures that some people have
>> reported.
>>
>> Another day, another bug swatted ...

> Yes, good eye.  Thanks.

BTW, I believe that this was actually a bug I created --- in prior releases VACUUM would happily vacuum anything in sight, whether you had privileges to lock the target tables or not.  So pg_upgrade didn't have to worry before.  But it does now...

			regards, tom lane
Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
> btw, I'm all for fixing pg_group (it is really nonoptional imho), and
> the ODBC stuff is noncritical since loading contrib/odbc/odbc.sql
> is sufficient to get the functions and operators defined, the
> support code being already in the backend.

OK, I thought you might want to argue more strongly for adding the ODBC stuff while we had the chance ... but if you are happy with the status quo then I'm happy.

			regards, tom lane
> OK, I thought you might want to argue more strongly for adding the
> ODBC stuff while we had the chance ... but if you are happy with the
> status quo then I'm happy.

No, I've already been embarrassed once recently; I don't want to touch anything in the last few days.  Especially since I've been *so* busy at work and with the docs that it will be hard to give it enough attention (even though in principle it is pretty trivial).

- Tom

--
Thomas Lockhart                    lockhart@alumni.caltech.edu
South Pasadena, California
> I have no probs with extending the release a couple of days ... does Wednesday
> sound good for everyone, as things stand now?

Sigh... we are going to make a free commemorative CD-ROM (yes, we are going to have a general meeting of the local PostgreSQL users' group in Japan --- it's almost one year since we established it; visit http://www.jp.postgresql.org/ if you can read Japanese) with the brand new PostgreSQL 7.0 source in it, of course, and the deadline for making the master CD-R is scheduled for 5/8...
--
Tatsuo Ishii
> > Hiroshi did commit a change to doc/src/sgml/ref/postgres-ref.sgml for
> > that just a few days ago, but I guess it didn't make it into the derived
> > files that Thomas was preparing off-line.  I believe there were several
> > other small doco fixes while Thomas was on vacation --- I dunno if he
> > plans to try to get those into the 7.0 derived docs or not.
>
> I'll move a new copy of the html into users-lounge before release, and
> will try poking at the hardcopy versions too (but I consider the
> hardcopy fixes noncritical since we are so far past the "freeze"
> point).

Thomas, do you want PDFs rather than PostScript files?  Let me know.  I can easily download them and generate PDFs.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Peter Eisentraut <e99re41@DoCS.UU.SE> writes:
>> After further looking, I notice that users.c is one of the few places
>> that will drop AccessExclusiveLock at heap_close time rather than
>> holding it till xact commit.  I wonder whether this is a bug...

> Well it was you who did it while introducing that second argument in the
> first place.  I think the safest thing to do is definitely to hold any lock
> until transaction end.  I'm not sure, shouldn't the transaction isolation
> level apply here as well?  In the end, is there ever a good reason for
> releasing a lock in heap_close?

I provided that hook because of stuff like the LISTEN/NOTIFY support, which grabs and releases locks on pg_listener --- as it's presently designed, not releasing the lock at end of statement could hang the entire system until you commit your transaction.  (Probably LISTEN/NOTIFY should be re-examined to see if it couldn't use a less drastic lock than AccessExclusiveLock.  That code hasn't been gone over since before MVCC.)

Also, I still think that in general it's OK to release a read lock on a system table early.  Write locks, maybe not.

>> I also notice that there definitely is a glaring bug there:
>> write_password_file() leaks one kernel file descriptor each time it runs
>> (note the creat() call).

> Wow, this has been there for over two years.

Yeah, a long time :-(.  It's not the first resource leak we've found in the password-related code, either.  I wonder if there are more...

			regards, tom lane
> -----Original Message-----
> From: lockhart@sd.tpf.co.jp [mailto:lockhart@sd.tpf.co.jp]
>
> > Hiroshi did commit a change to doc/src/sgml/ref/postgres-ref.sgml for
> > that just a few days ago, but I guess it didn't make it into the derived
> > files that Thomas was preparing off-line.  I believe there were several
> > other small doco fixes while Thomas was on vacation --- I dunno if he
> > plans to try to get those into the 7.0 derived docs or not.
>
> I'll move a new copy of the html into users-lounge before release, and
> will try poking at the hardcopy versions too (but I consider the
> hardcopy fixes noncritical since we are so far past the "freeze"
> point).

I'm sorry to have inconvenienced you.  In addition, I have to mention that I gave up trying to verify the SGML format myself.  I apologize again for the inconvenience.

Thanks.

Hiroshi Inoue
Inoue@tpf.co.jp
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>
> The Hermit Hacker <scrappy@hub.org> writes:
> >> I griped about that a week or so ago, but no one seems to have picked up
> >> on it.  Do you want to consider that a "must fix" problem as well?
> >> I think it's a pretty minor fix, but considering how late we are in the
> >> cycle...
>
> > Considering where the problem is, I think that if it can be safely done,
> > please do it ...
>
> Done and done.  I also realized that pg_upgrade had another nasty bug
> in it: the VACUUMs were not necessarily executed as superuser, but as
> whichever user happened to own the item dumped last by pg_dump in each
> database.  That would result in VACUUM skipping over tables it thought
> it didn't have permission to vacuum --- like, say, all the system
> tables.  Perhaps this explains the failures that some people have
> reported.
>
> Another day, another bug swatted ...

If I remember correctly, pg_upgrade doesn't shut down the postmaster after (or before) moving OLD data to the target dir, though it tells us "You must stop/start the postmaster ...".  How about calling pg_ctl from pg_upgrade to stop the postmaster?

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp
> I'm sorry to have inconvenienced you.  In addition, I have to mention that
> I gave up trying to verify the SGML format myself.  I apologize again for
> the inconvenience.

No, it is better to have the right info *somewhere* than to wait to put it in later.  Even if I miss something now, it will be correct next time, which is the best thing.

Regards.

- Thomas

--
Thomas Lockhart                    lockhart@alumni.caltech.edu
South Pasadena, California
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes: > If I remember correctly,pg_upgrade doesn't shutdown the postmaster > after(or before) moving OLD data to the target dir though it tells us > the message "You must stop/start the postmaster ...". > How about calling pg_ctl from pg_upgrade to stop the postmaster ? What I would actually like to see happen is that pg_upgrade uses a standalone backend, with no postmaster running at all for the entire procedure. Having a live postmaster connected to the system just opens the door to getting screwed up by some other user connecting to the database. But that's a bigger change than I dare try to make right now... regards, tom lane
> If I remember correctly, pg_upgrade doesn't shut down the postmaster
> after (or before) moving OLD data to the target dir, though it tells us
> "You must stop/start the postmaster ...".
> How about calling pg_ctl from pg_upgrade to stop the postmaster?

Added to TODO list.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
> If I remember correctly, pg_upgrade doesn't shut down the postmaster
> after (or before) moving OLD data to the target dir, though it tells us
> "You must stop/start the postmaster ...".
> How about calling pg_ctl from pg_upgrade to stop the postmaster?

Great idea.  We never had an automated way to do that before.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> How about calling pg_ctl from pg_upgrade to stop the postmaster?

> Great idea.  We never had an automated way to do that before.

I'd just as soon *not* have pg_upgrade assuming that it knows how I like my postmaster started.  If pg_ctl will work to stop a postmaster whether or not it was started with pg_ctl, then it'd probably be a good idea to use pg_ctl to ensure that there is no postmaster running in the target database.  I'd suggest not trying to restart the postmaster automatically afterwards, though.  Too much site dependency in that.

			regards, tom lane
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> How about calling pg_ctl from pg_upgrade to stop the postmaster?
>
> > Great idea.  We never had an automated way to do that before.
>
> I'd just as soon *not* have pg_upgrade assuming that it knows how
> I like my postmaster started.  If pg_ctl will work to stop a postmaster
> whether or not it was started with pg_ctl, then it'd probably be a good
> idea to use pg_ctl to ensure that there is no postmaster running in the
> target database.  I'd suggest not trying to restart the postmaster
> automatically afterwards, though.  Too much site dependency in that.

But doesn't it find the params from the last startup?

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> I'd suggest not trying to restart the postmaster
>> automatically afterwards, though.  Too much site dependency in that.

> But doesn't it find the params from the last startup?

What last startup?  This may be a virgin database we're talking about (probably would be, if I get my way about not using a postmaster at all during pg_upgrade).  More to the point, people who are using system-boot-time scripts to start postgres may expect their postmasters to be started in a different environment than what pg_ctl would produce.  (Just because pg_ctl is available is not a good reason to assume that people are using it, particularly not existing dbadmins who will have developed their own procedures.)

The environment issue is potentially a pretty nasty gotcha; you'll recall the problem reports we've heard in the past that turned out to trace to different settings of LOCALE or whathaveyou between postmasters started by hand and postmasters started by scripts.  Also, IIRC, pg_ctl doesn't currently support sending the postmaster log anywhere but /dev/null, which will annoy at least some people ;-).  One might also guess that some sites run their postmasters with higher or lower process priority than normal, or several other things that pg_ctl knows nothing about.

So I think it's not really a good idea to wire use of pg_ctl into other tools just yet.  Maybe after pg_ctl has been around for a few releases...

As I said, I see no harm in using pg_ctl to *stop* a postmaster, if it can do that.  I just don't want to have it used automatically to *start* one.

			regards, tom lane
> What last startup?  This may be a virgin database we're talking about
> (probably would be, if I get my way about not using a postmaster at
> all during pg_upgrade).  More to the point, people who are using
> system-boot-time scripts to start postgres may expect their postmasters
> to be started in a different environment than what pg_ctl would produce.
>
> The environment issue is potentially a pretty nasty gotcha; you'll
> recall the problem reports we've heard in the past that turned out to
> trace to different settings of LOCALE or whathaveyou between postmasters
> started by hand and postmasters started by scripts.  Also, IIRC, pg_ctl
> doesn't currently support sending the postmaster log anywhere but
> /dev/null, which will annoy at least some people ;-).  One might also
> guess that some sites run their postmasters with higher or lower process
> priority than normal, or several other things that pg_ctl knows nothing
> about.

Tom is right, except that pg_ctl does not send the postmaster log to /dev/null --- pg_ctl actually does nothing with it.  So if the postmaster sends its log to stdout, it goes to stdout; if the postmaster is invoked with the -S option, logs are sent to /dev/null.  That's all.

BTW, currently the best way to get logs from the postmaster would be compiling it with USE_SYSLOG (why is this not enabled by configure?) and sending the log to syslog, IMHO.  Sending logs to a plain file is just too dumb: the log file grows without limit, and to rotate the log we have to stop the postmaster...

> So I think it's not really a good idea to wire use of pg_ctl into other
> tools just yet.  Maybe after pg_ctl has been around for a few releases...

Maybe the way we should go is to have pg_ctl send a signal to the postmaster and let it restart by itself.  This would solve some of the problems Tom mentioned, at least the environment-variable ones.  (BTW is it really possible for the postmaster to remember its process priority?)

> As I said, I see no harm in using pg_ctl to *stop* a postmaster, if it
> can do that.

Definitely it can, if the postmaster is 7.0 or later.
--
Tatsuo Ishii
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> (BTW is it really possible for the postmaster to remember its process
> priority?)

No, but a site might have a startup script that nice's the postmaster to a desired priority.

> BTW, currently the best way to get logs from the postmaster would be
> compiling it with USE_SYSLOG (why is this not enabled by configure?),

Good question.  Before we standardize on that, however, some testing might be in order.  I haven't tried stuffing multimegabyte querytree dumps into syslog ... will it work?  On how many platforms?  The syslog applications I've seen never write more than a hundred or so characters per log entry, so I'm a tad nervous about assuming that we will get reasonable behavior for large messages...

			regards, tom lane
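[A quick way to probe the platform behavior in question is a standalone test like this --- a sketch, not PostgreSQL code; it pushes one oversized message through syslog so you can see whether the local syslogd delivers it intact, truncates it, or splits it:]

#include <stdlib.h>
#include <string.h>
#include <syslog.h>

int
main(void)
{
	size_t	len = 64 * 1024;	/* far beyond the ~1 KB cap of many syslogds */
	char   *msg = malloc(len + 1);

	if (msg == NULL)
		return 1;
	memset(msg, 'x', len);
	msg[len] = '\0';

	openlog("biglogtest", LOG_PID, LOG_LOCAL0);
	syslog(LOG_DEBUG, "%s", msg);	/* then inspect the log destination */
	closelog();

	free(msg);
	return 0;
}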
> -----Original Message-----
> From: Bruce Momjian [mailto:pgman@candle.pha.pa.us]
>
> > If I remember correctly, pg_upgrade doesn't shut down the postmaster
> > after (or before) moving OLD data to the target dir, though it tells us
> > "You must stop/start the postmaster ...".
> > How about calling pg_ctl from pg_upgrade to stop the postmaster?
>
> Added to TODO list.

Hmm, what I meant is simply to insert a call to 'pg_ctl -w stop' before moving old data to the target directory in pg_upgrade.  It seems dangerous to leave alive a postmaster that doesn't know about the changes to the empty tables created by pg_upgrade.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp
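[Concretely, the change being proposed would amount to a couple of lines near the top of the pg_upgrade shell script --- a sketch only; the surrounding script context and exit-status handling are assumptions:]

# Proposed pg_upgrade change (illustrative): make sure no live postmaster
# can see the data directory while old files are moved into place.
pg_ctl -w stop || exit 1	# -w waits for the shutdown to complete

# ... then move old data into the target directory, run the VACUUMs, etc ...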
Tom Lane writes:
>> BTW, currently the best way to get logs from the postmaster would be
>> compiling it with USE_SYSLOG (why is this not enabled by configure?),

An --enable-syslog switch will be in the next release unless someone protests.

> Good question.  Before we standardize on that, however, some testing
> might be in order.  I haven't tried stuffing multimegabyte querytree
> dumps into syslog ... will it work?  On how many platforms?  The
> syslog applications I've seen never write more than a hundred or so
> characters per log entry, so I'm a tad nervous about assuming that we
> will get reasonable behavior for large messages...

Indeed.  My syslog becomes a little erratic with messages longer than half a screen or so.  Nothing critical, but it splits the message into separate chunks together with `Cannot glue message parts together' entries.  It also sends out an empty log message to everyone's terminal.

I recall that while doing some work on getting exim acquainted with syslog we had the same problems with long messages.  Also, I bet this syslog implementation (which claims to be derived from "stock BSD") is very widespread.

It's debatable whether these sorts of things are appropriate for syslog, though.  Who cares about the parse tree the next day?

--
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden
Peter Eisentraut <peter_e@gmx.net> writes:
> An --enable-syslog switch will be in the next release unless someone
> protests.

Seems reasonable.

>> Good question.  Before we standardize on that, however, some testing
>> might be in order.  I haven't tried stuffing multimegabyte querytree
>> dumps into syslog ... will it work?  On how many platforms?

> It's debatable whether these sorts of things are appropriate for syslog,
> though.  Who cares about the parse tree the next day?

Yeah.  We really need multiple log destinations to do this right.  Tim Holloway had some ideas about that last year, which I thought were overkill at the time, but I'm beginning to see his point...

			regards, tom lane