Thread: Is analyze_new_cluster.sh still useful?
Hi, now that we have vacuumdb --all --analyze-in-stages in 9.4, wouldn't it make sense to get rid of the analyze_new_cluster.sh file which pg_upgrade writes? The net content is a single line which could as well be printed by pg_upgrade itself. Instead of an lengthy explanation how to invoke that manually, there should be a short note and a pointer to some manual section. I think the chances of people reading that would even be increased. Similary, I don't really see the usefulness of delete_old_cluster.sh as a file, when "rm -rf" could just be presented on the console for the admin to execute by cut-and-paste. Christoph -- Senior Berater, Tel.: +49 (0)21 61 / 46 43-187 credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209 Hohenzollernstr. 133, 41061 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer pgp fingerprint: 5C48 FE61 57F4 9179 5970 87C6 4C5A 6BAB 12D2 A7AE
Christoph Berg <christoph.berg@credativ.de> writes: > now that we have vacuumdb --all --analyze-in-stages in 9.4, wouldn't > it make sense to get rid of the analyze_new_cluster.sh file which > pg_upgrade writes? The net content is a single line which could as > well be printed by pg_upgrade itself. Instead of an lengthy > explanation how to invoke that manually, there should be a short note > and a pointer to some manual section. I think the chances of people > reading that would even be increased. > Similary, I don't really see the usefulness of delete_old_cluster.sh > as a file, when "rm -rf" could just be presented on the console for > the admin to execute by cut-and-paste. There are contexts where pg_upgrade is executed by some wrapper script and the user doesn't normally see its output directly. This is the case in the Red Hat packaging (unless Honza changed it since I left ;-)) and I think Debian might be similar. I generally don't like the amount of cruft pg_upgrade leaves lying around, so I'd be glad to see these script files go away if possible; but we need to think about how this will play when there's a wrapper script between pg_upgrade and the human user. In the Red Hat wrapper script, the pg_upgrade output is dumped into a log file, which the user can look at if he wants, but I'd bet the average user doesn't read it --- that was certainly the expectation. Of course, said user probably never notices the separate shell scripts either, so maybe it's a wash. Another angle is that some folks might have tried to automate things even more, with a wrapper script that starts up the new postmaster and runs analyze_new_cluster.sh all by itself. I guess they could make the wrapper do "vacuumdb --all --analyze-in-stages" directly, though, so maybe that's not a fatal objection either. regards, tom lane
On 2014-06-18 12:51:43 -0400, Tom Lane wrote: > Another angle is that some folks might have tried to automate things > even more, with a wrapper script that starts up the new postmaster > and runs analyze_new_cluster.sh all by itself. I guess they could > make the wrapper do "vacuumdb --all --analyze-in-stages" directly, > though, so maybe that's not a fatal objection either. Wouldn't that be quite counterproductive? The reason we don't normally do that and why --analyze-in-stages exists is that the cluster should be started up as fast as possible. Restarting it after ANALYZE went through would be defeating that purpose, no? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-06-18 12:51:43 -0400, Tom Lane wrote: >> Another angle is that some folks might have tried to automate things >> even more, with a wrapper script that starts up the new postmaster >> and runs analyze_new_cluster.sh all by itself. I guess they could >> make the wrapper do "vacuumdb --all --analyze-in-stages" directly, >> though, so maybe that's not a fatal objection either. > Wouldn't that be quite counterproductive? The reason we don't normally > do that and why --analyze-in-stages exists is that the cluster should be > started up as fast as possible. Restarting it after ANALYZE went through > would be defeating that purpose, no? How so? Once you've started the postmaster, you're open for business, no? regards, tom lane
On 2014-06-18 13:24:14 -0400, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-06-18 12:51:43 -0400, Tom Lane wrote: > >> Another angle is that some folks might have tried to automate things > >> even more, with a wrapper script that starts up the new postmaster > >> and runs analyze_new_cluster.sh all by itself. I guess they could > >> make the wrapper do "vacuumdb --all --analyze-in-stages" directly, > >> though, so maybe that's not a fatal objection either. > > > Wouldn't that be quite counterproductive? The reason we don't normally > > do that and why --analyze-in-stages exists is that the cluster should be > > started up as fast as possible. Restarting it after ANALYZE went through > > would be defeating that purpose, no? > > How so? Once you've started the postmaster, you're open for business, > no? Wasn't there lots of talk about making the server inaccessible while pg_upgrade is doing its thing? Also, many people are desparately unhappy if postgres has to be restarted (to return to be being OS managed) after their application already has connected. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-06-18 13:24:14 -0400, Tom Lane wrote: >> Andres Freund <andres@2ndquadrant.com> writes: >>> On 2014-06-18 12:51:43 -0400, Tom Lane wrote: >>>> Another angle is that some folks might have tried to automate things >>>> even more, with a wrapper script that starts up the new postmaster >>>> and runs analyze_new_cluster.sh all by itself. I guess they could >>>> make the wrapper do "vacuumdb --all --analyze-in-stages" directly, >>>> though, so maybe that's not a fatal objection either. >>> Wouldn't that be quite counterproductive? The reason we don't normally >>> do that and why --analyze-in-stages exists is that the cluster should be >>> started up as fast as possible. Restarting it after ANALYZE went through >>> would be defeating that purpose, no? >> How so? Once you've started the postmaster, you're open for business, >> no? > Wasn't there lots of talk about making the server inaccessible while > pg_upgrade is doing its thing? Also, many people are desparately unhappy > if postgres has to be restarted (to return to be being OS managed) after > their application already has connected. I think we're not on the same page. My point is that someone might want to automate the whole sequence: stop old postmaster, run pg_upgrade, start the updated postmaster normally (hence it *is* open for business), kick off the analyze runs. If you're concerned about minimal downtime you would not want to be waiting around for the admin to issue a perfectly predictable series of commands. regards, tom lane
On 2014-06-18 13:54:16 -0400, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-06-18 13:24:14 -0400, Tom Lane wrote: > >> Andres Freund <andres@2ndquadrant.com> writes: > >>> On 2014-06-18 12:51:43 -0400, Tom Lane wrote: > >>>> Another angle is that some folks might have tried to automate things > >>>> even more, with a wrapper script that starts up the new postmaster > >>>> and runs analyze_new_cluster.sh all by itself. I guess they could > >>>> make the wrapper do "vacuumdb --all --analyze-in-stages" directly, > >>>> though, so maybe that's not a fatal objection either. > > >>> Wouldn't that be quite counterproductive? The reason we don't normally > >>> do that and why --analyze-in-stages exists is that the cluster should be > >>> started up as fast as possible. Restarting it after ANALYZE went through > >>> would be defeating that purpose, no? > > >> How so? Once you've started the postmaster, you're open for business, > >> no? > > > Wasn't there lots of talk about making the server inaccessible while > > pg_upgrade is doing its thing? Also, many people are desparately unhappy > > if postgres has to be restarted (to return to be being OS managed) after > > their application already has connected. > > I think we're not on the same page. My point is that someone might want > to automate the whole sequence: stop old postmaster, run pg_upgrade, start > the updated postmaster normally (hence it *is* open for business), kick > off the analyze runs. If you're concerned about minimal downtime you > would not want to be waiting around for the admin to issue a perfectly > predictable series of commands. Oh, yea. Definitely. I think that's what I've seen happen in pretty much *all* usages of pg_upgrade. I somehow misread that you wanted to add that into pg_upgrade. Not really sure how, sorry. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Jun 18, 2014 at 10:58 AM, Andres Freund <andres@2ndquadrant.com> wrote: > On 2014-06-18 13:54:16 -0400, Tom Lane wrote: >> I think we're not on the same page. My point is that someone might want >> to automate the whole sequence: stop old postmaster, run pg_upgrade, start >> the updated postmaster normally (hence it *is* open for business), kick >> off the analyze runs. If you're concerned about minimal downtime you >> would not want to be waiting around for the admin to issue a perfectly >> predictable series of commands. > > Oh, yea. Definitely. I think that's what I've seen happen in pretty much > *all* usages of pg_upgrade. I think it is a popular way to do it not because it is a particularly good way, but because the better alternatives are not readily available. If your database needs statistics badly enough that you want to do a coarse pre-pass with default_statistics_target=1, why would you want that pass to be done on an open database? Surely you don't want 100 open connections all doing giant seq scans (that should be single-row look up, but without stats they are not) competing with the analyze. Having a database which is "open" to queries but they have such deranged execution plans that they never actually finish is not truly open, and the attempts to service those futile queries just delays the true opening even further. If you really need a multi pass ANALYZE, you probably need the first pass to be before the database opens because otherwise the open will be a disaster, and the 2nd pass to be after the database opens but before your bulk queries (mass deletes, EOM reports, etc.) kick in. Having both passes be on the same "side" of the opening seems unlikely to do much good for most use cases. Fortunately it probably doesn't do much harm to most people, either, simple because most databases are not terribly sensitive to the issue. Cheers, Jeff
Re: Tom Lane 2014-06-18 <21034.1403110303@sss.pgh.pa.us> > Christoph Berg <christoph.berg@credativ.de> writes: > > now that we have vacuumdb --all --analyze-in-stages in 9.4, wouldn't > > it make sense to get rid of the analyze_new_cluster.sh file which > > pg_upgrade writes? The net content is a single line which could as > > well be printed by pg_upgrade itself. Instead of an lengthy > > explanation how to invoke that manually, there should be a short note > > and a pointer to some manual section. I think the chances of people > > reading that would even be increased. > > > Similary, I don't really see the usefulness of delete_old_cluster.sh > > as a file, when "rm -rf" could just be presented on the console for > > the admin to execute by cut-and-paste. > > There are contexts where pg_upgrade is executed by some wrapper script > and the user doesn't normally see its output directly. This is the > case in the Red Hat packaging (unless Honza changed it since I left ;-)) > and I think Debian might be similar. pg_upgradecluster shows the full pg_upgrade output in Debian. (But pg_createcluster hides the initdb output, so it the other way round here... It'd be nice if initdb would just output the interesting parts and omit most of the chatter.) > I generally don't like the amount of cruft pg_upgrade leaves lying > around, so I'd be glad to see these script files go away if possible; > but we need to think about how this will play when there's a wrapper > script between pg_upgrade and the human user. > > In the Red Hat wrapper script, the pg_upgrade output is dumped into a > log file, which the user can look at if he wants, but I'd bet the > average user doesn't read it --- that was certainly the expectation. > Of course, said user probably never notices the separate shell > scripts either, so maybe it's a wash. > > Another angle is that some folks might have tried to automate things > even more, with a wrapper script that starts up the new postmaster > and runs analyze_new_cluster.sh all by itself. I guess they could > make the wrapper do "vacuumdb --all --analyze-in-stages" directly, > though, so maybe that's not a fatal objection either. Yeah that was my point - that's a single static command, that could be executed by the wrapper, and it would be much less opaque. (Same for the delete script - before looking into the file you'd think it would do all sorts of cleanup, but then issues a simple rm -rf.) Mit freundlichen Grüßen, Christoph Berg -- Senior Berater, Tel.: +49 (0)21 61 / 46 43-187 credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209 Hohenzollernstr. 133, 41061 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer pgp fingerprint: 5C48 FE61 57F4 9179 5970 87C6 4C5A 6BAB 12D2 A7AE
On Wed, Jun 18, 2014 at 8:41 AM, Christoph Berg <christoph.berg@credativ.de> wrote: > Hi, > > now that we have vacuumdb --all --analyze-in-stages in 9.4, wouldn't > it make sense to get rid of the analyze_new_cluster.sh file which > pg_upgrade writes? The net content is a single line which could as > well be printed by pg_upgrade itself. Instead of an lengthy > explanation how to invoke that manually, there should be a short note > and a pointer to some manual section. I think the chances of people > reading that would even be increased. That one line was longer in the past, it could become longer again in the future. I don't think we should toggle the presentation back and forth from version to version depending how long it happens to be. > Similary, I don't really see the usefulness of delete_old_cluster.sh > as a file, when "rm -rf" could just be presented on the console for > the admin to execute by cut-and-paste. I certainly would not want to run rm -rf commands copied off the console window. A slip of the mouse (or the paste buffer) and suddenly you are removing entirely the wrong level of the directory tree. But I wouldn't mind an option to suppress the creation of those files. Cheers, Jeff
Re: Jeff Janes 2014-06-20 <CAMkU=1z3Edq+CNRo4F=jBEzXNMidSskdm=cPcAZnOgdY2sivXw@mail.gmail.com> > On Wed, Jun 18, 2014 at 8:41 AM, Christoph Berg > <christoph.berg@credativ.de> wrote: > > Hi, > > > > now that we have vacuumdb --all --analyze-in-stages in 9.4, wouldn't > > it make sense to get rid of the analyze_new_cluster.sh file which > > pg_upgrade writes? The net content is a single line which could as > > well be printed by pg_upgrade itself. Instead of an lengthy > > explanation how to invoke that manually, there should be a short note > > and a pointer to some manual section. I think the chances of people > > reading that would even be increased. > > That one line was longer in the past, it could become longer again in > the future. I don't think we should toggle the presentation back and > forth from version to version depending how long it happens to be. I doubt that would happen. If there's more than analyze to do after the upgrade, the file would get a new name, so the "API" would change anyway. > > Similary, I don't really see the usefulness of delete_old_cluster.sh > > as a file, when "rm -rf" could just be presented on the console for > > the admin to execute by cut-and-paste. > > I certainly would not want to run rm -rf commands copied off the > console window. A slip of the mouse (or the paste buffer) and > suddenly you are removing entirely the wrong level of the directory > tree. Well I don't like shell scripts containing rm -rf commands sitting in the filesystem either. > But I wouldn't mind an option to suppress the creation of those files. Another nitpick here: What pg_upgrade outputs doesn't even work on most systems, you need to ./analyze_new_cluster.sh or "sh analyze_new_cluster.sh". Mit freundlichen Grüßen, Christoph Berg -- Senior Berater, Tel.: +49 (0)21 61 / 46 43-187 credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209 Hohenzollernstr. 133, 41061 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer pgp fingerprint: 5C48 FE61 57F4 9179 5970 87C6 4C5A 6BAB 12D2 A7AE
On Wed, Jun 18, 2014 at 05:41:06PM +0200, Christoph Berg wrote: > Hi, > > now that we have vacuumdb --all --analyze-in-stages in 9.4, wouldn't > it make sense to get rid of the analyze_new_cluster.sh file which > pg_upgrade writes? The net content is a single line which could as > well be printed by pg_upgrade itself. Instead of an lengthy > explanation how to invoke that manually, there should be a short note > and a pointer to some manual section. I think the chances of people > reading that would even be increased. I was not a big fan of keeping analyze_new_cluster.sh with one command in it, but it does maintain the same user API, so I guess that is why people wanted it kept. > Similary, I don't really see the usefulness of delete_old_cluster.sh > as a file, when "rm -rf" could just be presented on the console for > the admin to execute by cut-and-paste. Uh, that could be hard because delete_old_cluster.sh also can also delete the old major-version-specific subdirectories in tablespaces, so I do think we need to keep that. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
On Fri, Jun 20, 2014 at 05:15:05PM +0200, Christoph Berg wrote: > Another nitpick here: What pg_upgrade outputs doesn't even work on > most systems, you need to ./analyze_new_cluster.sh or "sh > analyze_new_cluster.sh". Well, the output is: Optimizer statistics are not transferred by pg_upgrade so,once you start the new server, consider running: analyze_new_cluster.shRunningthis script will delete the old cluster's data files: delete_old_cluster.sh It is not really telling you _how_ to run them. Would adding a ./ prefix help? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: Bruce Momjian 2014-07-29 <20140729094234.GC13958@momjian.us> > On Fri, Jun 20, 2014 at 05:15:05PM +0200, Christoph Berg wrote: > > Another nitpick here: What pg_upgrade outputs doesn't even work on > > most systems, you need to ./analyze_new_cluster.sh or "sh > > analyze_new_cluster.sh". > > Well, the output is: > > Optimizer statistics are not transferred by pg_upgrade so, > once you start the new server, consider running: > analyze_new_cluster.sh > > Running this script will delete the old cluster's data files: > delete_old_cluster.sh > > It is not really telling you _how_ to run them. Would adding a ./ > prefix help? I think it would help in hinting the user that these are not system-wide commands in $PATH but rather scripts in the current directory. There's also a case for prefixing them with the full path, Debian's pg_upgradecluster wrapper drops these scripts in a new subdirectory in /var/log/postgresql/ along with the log files. Possibly other automation frameworks do likewise. (Though the frameworks will likely make delete_old_cluster.sh redundant.) Christoph -- Senior Berater, Tel.: +49 (0)21 61 / 46 43-187 credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209 Hohenzollernstr. 133, 41061 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer pgp fingerprint: 5C48 FE61 57F4 9179 5970 87C6 4C5A 6BAB 12D2 A7AE
On Mon, Aug 4, 2014 at 10:17:40AM +0200, Christoph Berg wrote: > Re: Bruce Momjian 2014-07-29 <20140729094234.GC13958@momjian.us> > > On Fri, Jun 20, 2014 at 05:15:05PM +0200, Christoph Berg wrote: > > > Another nitpick here: What pg_upgrade outputs doesn't even work on > > > most systems, you need to ./analyze_new_cluster.sh or "sh > > > analyze_new_cluster.sh". > > > > Well, the output is: > > > > Optimizer statistics are not transferred by pg_upgrade so, > > once you start the new server, consider running: > > analyze_new_cluster.sh > > > > Running this script will delete the old cluster's data files: > > delete_old_cluster.sh > > > > It is not really telling you _how_ to run them. Would adding a ./ > > prefix help? > > I think it would help in hinting the user that these are not > system-wide commands in $PATH but rather scripts in the current > directory. > > There's also a case for prefixing them with the full path, Debian's > pg_upgradecluster wrapper drops these scripts in a new subdirectory in > /var/log/postgresql/ along with the log files. Possibly other > automation frameworks do likewise. (Though the frameworks will likely > make delete_old_cluster.sh redundant.) I have applied a patch to 9.5 to output "./" as a prefix for Unix script file names. While this also works on Windows, it is likely to be confusing. The new Unix output is: Upgrade Complete----------------Optimizer statistics are not transferred by pg_upgrade so,once you start the new server,consider running: ./analyze_new_cluster.shRunning this script will delete the old cluster's data files: ./delete_old_cluster.sh -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: Bruce Momjian 2014-10-12 <20141011224002.GM21267@momjian.us> > I have applied a patch to 9.5 to output "./" as a prefix for Unix script > file names. While this also works on Windows, it is likely to be > confusing. The new Unix output is: > > Upgrade Complete > ---------------- > Optimizer statistics are not transferred by pg_upgrade so, > once you start the new server, consider running: > ./analyze_new_cluster.sh > > Running this script will delete the old cluster's data files: > ./delete_old_cluster.sh I like this, thanks! Mit freundlichen Grüßen, Christoph Berg -- Senior Berater, Tel.: +49 (0)21 61 / 46 43-187 credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209 Hohenzollernstr. 133, 41061 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer pgp fingerprint: 5C48 FE61 57F4 9179 5970 87C6 4C5A 6BAB 12D2 A7AE