Thread: pg_upgrade should truncate/remove its logs before running
I have seen this numerous times but had not dug into it, until now.

If pg_upgrade fails and is re-run, it appends to its logfiles, which is
confusing since, if it fails again, it then looks like the original error
recurred and wasn't fixed. The "append" behavior dates back to 717f6d608.

I think it should either truncate the logfiles, or error early if any of the
files exist. Or it could put all its output files into a newly-created
subdirectory. Or this message could be output to the per-db logfiles, and not
just the static ones:
| "pg_upgrade run on %s".

For the per-db logfiles with OIDs in their names, changing open() from "append"
mode to truncate mode doesn't work, since they're written to in parallel.
They have to be removed/truncated in advance.

This is one possible fix. You can test its effect by deliberately breaking one
of the calls to exec_prog(), like this:

- "\"%s/pg_restore\" %s %s --exit-on-error --verbose "
+ "\"%s/pg_restore\" %s %s --exit-on-error --verboose "
[attachment: proposed patch]
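The constraint Justin describes (per-db logfiles can't simply be opened in truncate mode, because parallel workers all append to them) implies the cleanup has to happen once, up front, before any worker starts. A minimal sketch of that idea in C is below; the filename pattern and function name are illustrative only, not pg_upgrade's actual code:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical sketch: remove stale per-db logfiles left by a previous
 * pg_upgrade run, before any parallel worker opens them in append mode.
 * Removing them serially here avoids the race that truncate-on-open
 * would introduce among concurrent writers.
 */
static void
remove_stale_logs(const char *basedir, const unsigned int *db_oids, int ndbs)
{
	char		path[1024];
	int			i;

	for (i = 0; i < ndbs; i++)
	{
		snprintf(path, sizeof(path), "%s/pg_upgrade_dump_%u.log",
				 basedir, db_oids[i]);
		/* remove() failing with ENOENT is fine: first run, nothing stale */
		remove(path);
	}
}
```

After this runs, each worker can keep opening its logfile with `fopen(path, "a")` as before, since the file it appends to is guaranteed to be fresh.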
On Sat, Dec 11, 2021 at 08:50:17PM -0600, Justin Pryzby wrote:
> I have seen this numerous times but had not dug into it, until now.
>
> If pg_upgrade fails and is re-run, it appends to its logfiles, which is
> confusing since, if it fails again, it then looks like the original error
> recurred and wasn't fixed. The "append" behavior dates back to 717f6d608.
>
> I think it should either truncate the logfiles, or error early if any of the
> files exist. Or it could put all its output files into a newly-created
> subdirectory. Or this message could be output to the per-db logfiles, and not
> just the static ones:
> | "pg_upgrade run on %s".
>
> For the per-db logfiles with OIDs in their names, changing open() from "append"
> mode to truncate mode doesn't work, since they're written to in parallel.
> They have to be removed/truncated in advance.
>
> This is one possible fix. You can test its effect by deliberately breaking one
> of the calls to exec_prog(), like this:
>
> - "\"%s/pg_restore\" %s %s --exit-on-error --verbose "
> + "\"%s/pg_restore\" %s %s --exit-on-error --verboose "

Uh, the database server doesn't erase its logs on crash/failure, so why
should pg_upgrade do that?

-- 
Bruce Momjian <bruce@momjian.us>  https://momjian.us
EDB                               https://enterprisedb.com

If only the physical world exists, free will is an illusion.
On Wed, Dec 15, 2021 at 04:09:16PM -0500, Bruce Momjian wrote:
> On Sat, Dec 11, 2021 at 08:50:17PM -0600, Justin Pryzby wrote:
> > I have seen this numerous times but had not dug into it, until now.
> >
> > If pg_upgrade fails and is re-run, it appends to its logfiles, which is
> > confusing since, if it fails again, it then looks like the original error
> > recurred and wasn't fixed. The "append" behavior dates back to 717f6d608.
> >
> > I think it should either truncate the logfiles, or error early if any of the
> > files exist. Or it could put all its output files into a newly-created
> > subdirectory. Or this message could be output to the per-db logfiles, and not
> > just the static ones:
> > | "pg_upgrade run on %s".
> >
> > For the per-db logfiles with OIDs in their names, changing open() from "append"
> > mode to truncate mode doesn't work, since they're written to in parallel.
> > They have to be removed/truncated in advance.
> >
> > This is one possible fix. You can test its effect by deliberately breaking one
> > of the calls to exec_prog(), like this:
> >
> > - "\"%s/pg_restore\" %s %s --exit-on-error --verbose "
> > + "\"%s/pg_restore\" %s %s --exit-on-error --verboose "
>
> Uh, the database server doesn't erase its logs on crash/failure, so why
> should pg_upgrade do that?

To avoid the presence of irrelevant errors from the previous invocation of
pg_upgrade.

Maybe you would prefer one of my other ideas, like "put all its output files
into a newly-created subdirectory"?

-- 
Justin
Bruce Momjian <bruce@momjian.us> writes:
> On Sat, Dec 11, 2021 at 08:50:17PM -0600, Justin Pryzby wrote:
>> If pg_upgrade fails and is re-run, it appends to its logfiles, which is
>> confusing since, if it fails again, it then looks like the original error
>> recurred and wasn't fixed. The "append" behavior dates back to 717f6d608.

> Uh, the database server doesn't erase its logs on crash/failure, so why
> should pg_upgrade do that?

The server emits enough information so that it's not confusing: there are
timestamps, and there's an identifiable startup line. pg_upgrade does
neither. If you don't want to truncate as Justin suggests, you should do
that instead.

Personally I like the idea of making a timestamped subdirectory and
dropping all the files in that, because the thing that most annoys *me*
about pg_upgrade is the litter it leaves behind in $CWD. A subdirectory
would make it far easier to mop up the mess.

			regards, tom lane
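Tom's timestamped-subdirectory idea could look something like the following sketch; the name format and function name are assumptions for illustration, not what any patch in this thread actually implements:

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Hedged sketch: create a timestamped subdirectory to collect all of
 * pg_upgrade's output files, instead of littering $CWD.  Each run gets
 * its own directory, so a re-run can never mix its logs with a previous
 * failed run's logs.
 */
static int
make_timestamped_outdir(char *buf, size_t buflen)
{
	time_t		now = time(NULL);
	struct tm  *tm = localtime(&now);

	strftime(buf, buflen, "pg_upgrade_output_%Y%m%dT%H%M%S", tm);
	/* returns 0 on success, -1 with errno set on failure */
	return mkdir(buf, 0700);
}
```

Every logfile path would then be built relative to `buf`, and cleanup after a successful run is a single recursive remove of that one directory. The downside, raised just below, is that the name is not predictable for scripts.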
On 12/15/21 16:23, Bruce Momjian wrote:
> On Wed, Dec 15, 2021 at 04:17:23PM -0500, Tom Lane wrote:
>> Bruce Momjian <bruce@momjian.us> writes:
>>> On Sat, Dec 11, 2021 at 08:50:17PM -0600, Justin Pryzby wrote:
>>>> If pg_upgrade fails and is re-run, it appends to its logfiles, which is
>>>> confusing since, if it fails again, it then looks like the original error
>>>> recurred and wasn't fixed. The "append" behavior dates back to 717f6d608.
>>> Uh, the database server doesn't erase its logs on crash/failure, so why
>>> should pg_upgrade do that?
>> The server emits enough information so that it's not confusing:
>> there are timestamps, and there's an identifiable startup line.
>> pg_upgrade does neither. If you don't want to truncate as
>> Justin suggests, you should do that instead.
>>
>> Personally I like the idea of making a timestamped subdirectory
>> and dropping all the files in that, because the thing that most
>> annoys *me* about pg_upgrade is the litter it leaves behind in
>> $CWD. A subdirectory would make it far easier to mop up the mess.
> Yes, a lot of litter. Putting it in a subdirectory makes a lot of sense.
> Justin, do you want to work on that patch, since you had an earlier
> version to fix this?

The directory name needs to be predictable somehow, or maybe optionally
set as a parameter. Having just a timestamped directory name would make
life annoying for a poor buildfarm maintainer. Also, please don't change
anything before I have a chance to adjust the buildfarm code to what is
going to be done.

cheers

andrew

-- 
Andrew Dunstan
EDB: https://www.enterprisedb.com
On Wed, Dec 15, 2021 at 04:13:10PM -0600, Justin Pryzby wrote:
> On Wed, Dec 15, 2021 at 05:04:54PM -0500, Andrew Dunstan wrote:
>> The directory name needs to be predictable somehow, or maybe optionally
>> set as a parameter. Having just a timestamped directory name would make
>> life annoying for a poor buildfarm maintainer. Also, please don't change
>> anything before I have a chance to adjust the buildfarm code to what is
>> going to be done.
>
> Feel free to suggest the desirable behavior.
> It could write to pg_upgrade.log/* and refuse to run if the dir already exists.

Andrew's point looks rather sensible to me. So, this stuff should have a
predictable name (pg_upgrade.log, pg_upgrade_log or upgrade_log would be
fine). But I would also add an option to be able to define a custom log
path. The latter would be useful for the regression tests, so that
everything could get redirected to a path that is already filtered out.

-- 
Michael
[attachment: patch]
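Justin's "write to a fixed directory and refuse to run if it already exists" behavior falls out naturally from `mkdir()` semantics, since `mkdir()` on an existing path fails with EEXIST. A minimal sketch, with an illustrative directory name (the thread has not settled on one) and a hypothetical function name:

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Sketch of the "predictable directory, refuse to re-run" idea: try to
 * create the output directory, and treat EEXIST as "a previous run's
 * output is still there", erroring out early instead of appending to or
 * clobbering old logs.  Directory name and message are illustrative.
 */
static int
claim_output_dir(const char *path)
{
	if (mkdir(path, 0700) != 0)
	{
		if (errno == EEXIST)
			fprintf(stderr,
					"%s already exists; remove it or specify another log path\n",
					path);
		return -1;
	}
	return 0;					/* directory created; this run owns it */
}
```

The atomicity of `mkdir()` also means two concurrent pg_upgrade invocations can't both claim the same directory, which a plain existence check with `stat()` would not guarantee.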
On 16.12.21 02:39, Michael Paquier wrote:
> On Wed, Dec 15, 2021 at 04:13:10PM -0600, Justin Pryzby wrote:
>> On Wed, Dec 15, 2021 at 05:04:54PM -0500, Andrew Dunstan wrote:
>>> The directory name needs to be predictable somehow, or maybe optionally
>>> set as a parameter. Having just a timestamped directory name would make
>>> life annoying for a poor buildfarm maintainer. Also, please don't change
>>> anything before I have a chance to adjust the buildfarm code to what is
>>> going to be done.
>>
>> Feel free to suggest the desirable behavior.
>> It could write to pg_upgrade.log/* and refuse to run if the dir already exists.
>
> Andrew's point looks rather sensible to me. So, this stuff should have a
> predictable name (pg_upgrade.log, pg_upgrade_log or upgrade_log would be
> fine). But I would also add an option to be able to define a custom log
> path. The latter would be useful for the regression tests, so that
> everything could get redirected to a path that is already filtered out.

Could we make it write just one log file? Is having multiple log files
better?
> On 16 Dec 2021, at 12:11, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
>
> Could we make it write just one log file? Is having multiple log files
> better?

Having individual <checkname>.txt files from checks, with additional
information on how to handle the error, is quite convenient when writing
wrappers around pg_upgrade (speaking from experience of having written
multiple pg_upgrade frontends). Parsing a single logfile is more work, and
will break existing scripts.

I'm in favor of a logpath that is predictable by default, with a parameter
to override it, as mentioned upthread.

-- 
Daniel Gustafsson		https://vmware.com/
On 19.01.22 09:13, Michael Paquier wrote:
> - Renaming of the option from --logdir to --outputdir, as this does
> not include only logs. That matches also better with default value
> assigned in previous patches, aka pg_upgrade_output.d.

I'm afraid that is too easily confused with the target directory.
Generally, a tool processes data from input to output or from source to
target or something like that, whereas a log is more clearly something
separate from this main processing stream. The desired "output" of
pg_upgrade is the upgraded cluster, after all.

A wildcard idea is to put the log output into the target cluster.
On 1/28/22 08:42, Michael Paquier wrote:
> On Wed, Jan 26, 2022 at 11:00:28AM +0900, Michael Paquier wrote:
>> Bleh. This would point to the old data directory, so this needs to be
>> "$self->{pgsql}/src/bin/pg_upgrade/tmp_check/data/pg_upgrade_output.d/log/*.log"
>> to point to the upgraded cluster.
>
> Please note that I have sent a patch to merge this change in the
> buildfarm code. Comments are welcome.

I have committed this. But it will take time to get every buildfarm owner
to upgrade. I will try to make a new release ASAP.

cheers

andrew

-- 
Andrew Dunstan
EDB: https://www.enterprisedb.com
Michael Paquier <michael@paquier.xyz> writes:
> So, it took me some time to get back to this thread, and looked at it
> for the last couple of days... The buildfarm client v14 has been
> released on the 29th of January, which means that we are good to go.

As already mentioned, there's been no notice to buildfarm owners ...
so has Andrew actually made a release?

			regards, tom lane
Michael Paquier <michael@paquier.xyz> writes:
> On Sun, Feb 06, 2022 at 01:58:21AM -0500, Tom Lane wrote:
>> As already mentioned, there's been no notice to buildfarm owners ...
>> so has Andrew actually made a release?

> There has been one as of 8 days ago:
> https://github.com/PGBuildFarm/client-code/releases

[ scrapes buildfarm logs ... ]

Not even Andrew's own buildfarm critters are using it, so permit me
to doubt that he thinks it's fully baked.

Andrew?

			regards, tom lane
On 2/6/22 02:17, Tom Lane wrote:
> Michael Paquier <michael@paquier.xyz> writes:
>> On Sun, Feb 06, 2022 at 01:58:21AM -0500, Tom Lane wrote:
>>> As already mentioned, there's been no notice to buildfarm owners ...
>>> so has Andrew actually made a release?
>> There has been one as of 8 days ago:
>> https://github.com/PGBuildFarm/client-code/releases
> [ scrapes buildfarm logs ... ]
>
> Not even Andrew's own buildfarm critters are using it, so permit me
> to doubt that he thinks it's fully baked.
>
> Andrew?

*sigh* Sometimes I have a mind like a sieve. I prepped the release a few
days ago and meant to come back the next morning and send out emails
announcing it, as well as rolling it out to my animals, and got diverted,
so that didn't happen and it slipped my mind. I'll go and do those things
now.

But the commit really shouldn't have happened until we know that most
buildfarm owners have installed it. It should have waited not just for
the release but for widespread deployment. Otherwise we will just lose
any logging for an error that might appear.

cheers

andrew

-- 
Andrew Dunstan
EDB: https://www.enterprisedb.com
Michael Paquier <michael@paquier.xyz> writes:
> On Sun, Feb 06, 2022 at 08:32:59AM -0500, Andrew Dunstan wrote:
>> But the commit really shouldn't have happened until we know that most
>> buildfarm owners have installed it. It should have waited not just
>> for the release but for widespread deployment. Otherwise we will just
>> lose any logging for an error that might appear.

> Would it be better if I just revert the change for now then and do it
> again in one/two weeks?

I don't see a need to revert it.

I note, though, that there's still not been any email to the buildfarm
owners list about this update.

			regards, tom lane
> On Feb 6, 2022, at 7:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Michael Paquier <michael@paquier.xyz> writes:
>>> On Sun, Feb 06, 2022 at 08:32:59AM -0500, Andrew Dunstan wrote:
>>> But the commit really shouldn't have happened until we know that most
>>> buildfarm owners have installed it. It should have waited not just
>>> for the release but for widespread deployment. Otherwise we will just
>>> lose any logging for an error that might appear.
>
>> Would it be better if I just revert the change for now then and do it
>> again in one/two weeks?
>
> I don't see a need to revert it.
>
> I note, though, that there's still not been any email to the buildfarm
> owners list about this update.

It's stuck in moderation.

Cheers

Andrew
On 2/6/22 19:39, Tom Lane wrote:
> Michael Paquier <michael@paquier.xyz> writes:
>> On Sun, Feb 06, 2022 at 08:32:59AM -0500, Andrew Dunstan wrote:
>>> But the commit really shouldn't have happened until we know that most
>>> buildfarm owners have installed it. It should have waited not just
>>> for the release but for widespread deployment. Otherwise we will just
>>> lose any logging for an error that might appear.
>> Would it be better if I just revert the change for now then and do it
>> again in one/two weeks?
> I don't see a need to revert it.
>
> I note, though, that there's still not been any email to the buildfarm
> owners list about this update.

The announcement was held up in list moderation for 20 hours or so.

cheers

andrew

-- 
Andrew Dunstan
EDB: https://www.enterprisedb.com