Thread: add timing information to pg_upgrade
I've been looking into some options for reducing the amount of downtime required for pg_upgrade, and $SUBJECT seemed like something that would be worthwhile independent of that effort. The attached work-in-progress patch adds the elapsed time spent in each step, which looks like this: Performing Consistency Checks ----------------------------- Checking cluster versions ok (took 0 ms) Checking database user is the install user ok (took 3 ms) Checking database connection settings ok (took 4 ms) Checking for prepared transactions ok (took 2 ms) Checking for system-defined composite types in user tables ok (took 82 ms) Checking for reg* data types in user tables ok (took 55 ms) ... This information can be used to better understand where the time is going and to validate future improvements. I'm open to suggestions on formatting the timing information, assuming folks are interested in this idea. Thoughts? -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
On Fri, Jul 28, 2023 at 5:21 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > > This information can be used to better understand where the time is going > and to validate future improvements. I'm open to suggestions on formatting > the timing information, assuming folks are interested in this idea. > > Thoughts? +1 for adding time taken. Some comments on the patch: 1. + gettimeofday(&step_start, NULL); + gettimeofday(&step_end, NULL); + start_ms = (step_start.tv_sec * 1000L) + (step_start.tv_usec / 1000L); + end_ms = (step_end.tv_sec * 1000L) + (step_end.tv_usec / 1000L); + elapsed_ms = end_ms - start_ms; How about using INSTR_TIME_SET_CURRENT, INSTR_TIME_SUBTRACT and INSTR_TIME_GET_MILLISEC macros instead of gettimeofday and explicit calculations? 2. > Checking database user is the install user ok (took 3 ms) > Checking database connection settings ok (took 4 ms) + report_status(PG_REPORT, "ok (took %ld ms)", elapsed_ms); I think it's okay to just leave it with "ok \t %ld ms", elapsed_ms); without "took". FWIW, pg_regress does that way, see below: ok 2 + boolean 50 ms ok 3 + char 32 ms ok 4 + name 33 ms > Performing Consistency Checks > ----------------------------- > Checking cluster versions ok (took 0 ms) > Checking database user is the install user ok (took 3 ms) > Checking database connection settings ok (took 4 ms) > Checking for prepared transactions ok (took 2 ms) > Checking for system-defined composite types in user tables ok (took 82 ms) > Checking for reg* data types in user tables ok (took 55 ms) Just curious, out of all the reported pg_upgrade prep_status()-es, which ones are taking more time? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Fri, Jul 28, 2023 at 01:10:14PM +0530, Bharath Rupireddy wrote: > How about using INSTR_TIME_SET_CURRENT, INSTR_TIME_SUBTRACT and > INSTR_TIME_GET_MILLISEC macros instead of gettimeofday and explicit > calculations? That seems like a good idea. > + report_status(PG_REPORT, "ok (took %ld ms)", elapsed_ms); > > I think it's okay to just leave it with "ok \t %ld ms", elapsed_ms); > without "took". FWIW, pg_regress does that way, see below: I'm okay with simply adding the time. However, I wonder if we want to switch to seconds, minutes, hours, etc. if the step takes longer. This would add a bit of complexity, but it would improve human readability. Alternatively, we could keep the code simple and machine readable by always using milliseconds. > Just curious, out of all the reported pg_upgrade prep_status()-es, > which ones are taking more time? I haven't done any testing on meaningful workloads yet, but the following show up on an empty cluster: "creating dump of database schemas", "analyzing all rows in the new cluster", "setting next transaction ID and epoch for new cluster", "restoring datbase schemas in the new cluster", and "sync data directory to disk". I imagine the dump, restore, and file-copying steps will stand out once you start adding data. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Fri, Jul 28, 2023 at 10:38:14AM -0700, Nathan Bossart wrote: > I'm okay with simply adding the time. However, I wonder if we want to > switch to seconds, minutes, hours, etc. if the step takes longer. This > would add a bit of complexity, but it would improve human readability. > Alternatively, we could keep the code simple and machine readable by always > using milliseconds. ... or maybe we show both like psql does. Attached іs a new version of the patch in which I've made use of the INSTR_TIME_* macros and enhanced the output formatting (largely borrowed from PrintTiming() in src/bin/psql/common.c). -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
On Sat, Jul 29, 2023 at 12:17 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > > On Fri, Jul 28, 2023 at 10:38:14AM -0700, Nathan Bossart wrote: > > I'm okay with simply adding the time. However, I wonder if we want to > > switch to seconds, minutes, hours, etc. if the step takes longer. This > > would add a bit of complexity, but it would improve human readability. > > Alternatively, we could keep the code simple and machine readable by always > > using milliseconds. > > ... or maybe we show both like psql does. Attached іs a new version of the > patch in which I've made use of the INSTR_TIME_* macros and enhanced the > output formatting (largely borrowed from PrintTiming() in > src/bin/psql/common.c). The v2 patch LGTM. I tested it with an upgrade of the 22GB database, the output is here [1]. While on this, I noticed a thing unrelated to your patch that there's no space between the longest status message of size 60 bytes and ok - 'Checking for incompatible "aclitem" data type in user tablesok 23.932 ms'. I think MESSAGE_WIDTH needs to be bumped up - 64 or more. [1] Performing Consistency Checks ----------------------------- Checking cluster versions ok 0.000 ms Checking database user is the install user ok 1.767 ms Checking database connection settings ok 2.210 ms Checking for prepared transactions ok 1.411 ms Checking for system-defined composite types in user tables ok 29.471 ms Checking for reg* data types in user tables ok 26.251 ms Checking for contrib/isn with bigint-passing mismatch ok 0.000 ms Checking for incompatible "aclitem" data type in user tablesok 23.932 ms Checking for user-defined encoding conversions ok 8.350 ms Checking for user-defined postfix operators ok 8.229 ms Checking for incompatible polymorphic functions ok 15.271 ms Checking for tables WITH OIDS ok 6.120 ms Checking for invalid "sql_identifier" user columns ok 24.509 ms Creating dump of global objects ok 14.007 ms Creating dump of database schemas ok 176.690 ms Checking for presence of required libraries ok 0.011 ms Checking database user is the install user ok 2.566 ms Checking for prepared transactions ok 2.065 ms Checking for new cluster tablespace directories ok 0.000 ms If pg_upgrade fails after this point, you must re-initdb the new cluster before continuing. Performing Upgrade ------------------ Setting locale and encoding for new cluster ok 3.014 ms Analyzing all rows in the new cluster ok 373.270 ms Freezing all rows in the new cluster ok 81.064 ms Deleting files from new pg_xact ok 0.089 ms Copying old pg_xact to new server ok 2.204 ms Setting oldest XID for new cluster ok 38.314 ms Setting next transaction ID and epoch for new cluster ok 111.503 ms Deleting files from new pg_multixact/offsets ok 0.078 ms Copying old pg_multixact/offsets to new server ok 1.790 ms Deleting files from new pg_multixact/members ok 0.050 ms Copying old pg_multixact/members to new server ok 1.532 ms Setting next multixact ID and offset for new cluster ok 36.770 ms Resetting WAL archives ok 37.182 ms Setting frozenxid and minmxid counters in new cluster ok 47.879 ms Restoring global objects in the new cluster ok 11.615 ms Restoring database schemas in the new cluster ok 141.839 ms Copying user relation files ok 151308.601 ms (02:31.309) Setting next OID for new cluster ok 44.800 ms Sync data directory to disk ok 4461.213 ms (00:04.461) Creating script to delete old cluster ok 0.059 ms Checking for extension updates ok 66.899 ms Upgrade Complete ---------------- Optimizer statistics are not transferred by pg_upgrade. Once you start the new server, consider running: /home/ubuntu/postgres/HEAD/bin/vacuumdb --all --analyze-in-stages Running this script will delete the old cluster's data files: ./delete_old_cluster.sh real 2m38.133s user 0m0.151s sys 0m21.556s -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Sat, Jul 29, 2023 at 12:17:40PM +0530, Bharath Rupireddy wrote: > While on this, I noticed a thing unrelated to your patch that there's > no space between the longest status message of size 60 bytes and ok - > 'Checking for incompatible "aclitem" data type in user tablesok > 23.932 ms'. I think MESSAGE_WIDTH needs to be bumped up - 64 or more. Good catch. I think I'd actually propose just removing "in user tables" or the word "incompatible" from these messages to keep them succinct. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Sun, Jul 30, 2023 at 2:44 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > > On Sat, Jul 29, 2023 at 12:17:40PM +0530, Bharath Rupireddy wrote: > > While on this, I noticed a thing unrelated to your patch that there's > > no space between the longest status message of size 60 bytes and ok - > > 'Checking for incompatible "aclitem" data type in user tablesok > > 23.932 ms'. I think MESSAGE_WIDTH needs to be bumped up - 64 or more. > > Good catch. I think I'd actually propose just removing "in user tables" or > the word "incompatible" from these messages to keep them succinct. Either of "Checking for \"aclitem\" data type usage" or "Checking for \"aclitem\" data type in user tables" seems okay to me, however, I prefer the second wording. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Mon, Jul 31, 2023 at 11:34:57AM +0530, Bharath Rupireddy wrote: > Either of "Checking for \"aclitem\" data type usage" or "Checking for > \"aclitem\" data type in user tables" seems okay to me, however, I > prefer the second wording. Okay. I used the second wording for all the data type checks in v3. I also marked the timing strings for translation. I considered trying to extract psql's PrintTiming() so that it could be reused in other utilities, but the small differences would likely make translation difficult, and the logic isn't terribly long or sophisticated. My only remaining concern is that this timing information could cause pg_upgrade's output to exceed 80 characters per line. I don't know if this is something that folks are very worried about in 2023, but it still seemed worth bringing up. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
On 28.07.23 01:51, Nathan Bossart wrote: > I've been looking into some options for reducing the amount of downtime > required for pg_upgrade, and $SUBJECT seemed like something that would be > worthwhile independent of that effort. The attached work-in-progress patch > adds the elapsed time spent in each step, which looks like this: > > Performing Consistency Checks > ----------------------------- > Checking cluster versions ok (took 0 ms) > Checking database user is the install user ok (took 3 ms) > Checking database connection settings ok (took 4 ms) > Checking for prepared transactions ok (took 2 ms) > Checking for system-defined composite types in user tables ok (took 82 ms) > Checking for reg* data types in user tables ok (took 55 ms) > ... > > This information can be used to better understand where the time is going > and to validate future improvements. But who would use that, other than, you know, you, right now? I think the pg_upgrade output is already too full with not-really-actionable information (like most of the above "Checking ..." are not really interesting for a regular user).
On 31.07.23 20:37, Nathan Bossart wrote: > - prep_status("Checking for incompatible \"aclitem\" data type in user tables"); > + prep_status("Checking for \"aclitem\" data type in user tables"); Why these changes? I think this is losing precision about what it's doing.
> On 1 Aug 2023, at 09:45, Peter Eisentraut <peter@eisentraut.org> wrote: > On 28.07.23 01:51, Nathan Bossart wrote: >> This information can be used to better understand where the time is going >> and to validate future improvements. > > But who would use that, other than, you know, you, right now? > > I think the pg_upgrade output is already too full with not-really-actionable information (like most of the above "Checking..." are not really interesting for a regular user). Maybe if made opt-in with a --debug option, or even a compiler option for enabling only in specialized debugging builds? -- Daniel Gustafsson
On Tue, Aug 01, 2023 at 09:46:02AM +0200, Peter Eisentraut wrote: > On 31.07.23 20:37, Nathan Bossart wrote: >> - prep_status("Checking for incompatible \"aclitem\" data type in user tables"); >> + prep_status("Checking for \"aclitem\" data type in user tables"); > > Why these changes? I think this is losing precision about what it's doing. The message is too long, so there's no space between it and the "ok" message: Checking for incompatible "aclitem" data type in user tablesok Instead of altering the messages, we could bump MESSAGE_WIDTH from 60 to 62 or 64. Do you prefer that approach? (BTW this probably needs to be back-patched to v16.) -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Tue, Aug 01, 2023 at 09:58:24AM +0200, Daniel Gustafsson wrote: >> On 1 Aug 2023, at 09:45, Peter Eisentraut <peter@eisentraut.org> wrote: >> On 28.07.23 01:51, Nathan Bossart wrote: > >>> This information can be used to better understand where the time is going >>> and to validate future improvements. >> >> But who would use that, other than, you know, you, right now? One of the main purposes of this thread is to gauge interest. I'm hoping there are other developers who are interested in reducing pg_upgrade-related downtime, and it seemed like it'd be nice to have built-in functionality for measuring the step times instead of requiring folks to apply this patch every time. And I think users might also be interested in this information, if for no other reason than to help us pinpoint which steps are taking longer for various workloads. >> I think the pg_upgrade output is already too full with not-really-actionable information (like most of the above "Checking..." are not really interesting for a regular user). Perhaps. But IMO it's nice to know that it's doing things and making progress, even if you don't understand exactly what it's doing all the time. That being said, I wouldn't be opposed to hiding some of this output behind a --verbose or --debug option or consolidating some of the steps into fewer status messages. > Maybe if made opt-in with a --debug option, or even a compiler option for > enabling only in specialized debugging builds? I'm totally okay with making the timing information an opt-in feature. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Tue, Aug 1, 2023 at 9:00 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > >> On 1 Aug 2023, at 09:45, Peter Eisentraut <peter@eisentraut.org> wrote: > >> But who would use that, other than, you know, you, right now? /me raises hand Or at least, me back when I was hacking on pg_upgrade performance. This, or something like it, would have been fantastic. > >> I think the pg_upgrade output is already too full with not-really-actionable information (like most of the above "Checking..." are not really interesting for a regular user). > > Perhaps. But IMO it's nice to know that it's doing things and making > progress, even if you don't understand exactly what it's doing all the > time. +1. One of our findings at $prevjob was that some users *really* want some indication, anything at all, that things are progressing and aren't stuck. There was a lot of anxiety around upgrades. (There are probably _better_ ways to indicate progress than the current step divisions... But even poor progress indicators seemed to lower blood pressures, IIRC.) > That being said, I wouldn't be opposed to hiding some of this output > behind a --verbose or --debug option or consolidating some of the steps > into fewer status messages. I agree that millisecond-level timing should probably be opt-in. --Jacob
On 01.08.23 18:00, Nathan Bossart wrote: > One of the main purposes of this thread is to gauge interest. I'm hoping > there are other developers who are interested in reducing > pg_upgrade-related downtime, and it seemed like it'd be nice to have > built-in functionality for measuring the step times instead of requiring > folks to apply this patch every time. And I think users might also be > interested in this information, if for no other reason than to help us > pinpoint which steps are taking longer for various workloads. If it's just for developers and expert users, perhaps existing shell-level functionality like "pg_upgrade | ts" would suffice?
On 01.08.23 17:45, Nathan Bossart wrote: > The message is too long, so there's no space between it and the "ok" > message: > > Checking for incompatible "aclitem" data type in user tablesok > > Instead of altering the messages, we could bump MESSAGE_WIDTH from 60 to > 62 or 64. Do you prefer that approach? I think we should change the output format to be more like initdb, like Doing something ... ok without horizontally aligning all the "ok"s.
On Wed, Aug 2, 2023 at 12:45 PM Peter Eisentraut <peter@eisentraut.org> wrote: > > On 01.08.23 17:45, Nathan Bossart wrote: > > The message is too long, so there's no space between it and the "ok" > > message: > > > > Checking for incompatible "aclitem" data type in user tablesok > > > > Instead of altering the messages, we could bump MESSAGE_WIDTH from 60 to > > 62 or 64. Do you prefer that approach? > > I think we should change the output format to be more like initdb, like > > Doing something ... ok > > without horizontally aligning all the "ok"s. While this looks simple, we might end up with a lot of diff and changes after removing MESSAGE_WIDTH. There's a significant part of pg_upgrade code that deals with MESSAGE_WIDTH. I don't think it's worth the effort. Therefore, I'd prefer the simplest possible fix - change the message to '"Checking for \"aclitem\" data type in user tables". It may be an overkill, but we can consider adding Assert(sizeof(message) < MESSAGE_WIDTH) in progress report functions to not encourage new messages to end up in the same formatting issue. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Wed, Aug 2, 2023 at 12:44 PM Peter Eisentraut <peter@eisentraut.org> wrote: > > On 01.08.23 18:00, Nathan Bossart wrote: > > One of the main purposes of this thread is to gauge interest. I'm hoping > > there are other developers who are interested in reducing > > pg_upgrade-related downtime, and it seemed like it'd be nice to have > > built-in functionality for measuring the step times instead of requiring > > folks to apply this patch every time. And I think users might also be > > interested in this information, if for no other reason than to help us > > pinpoint which steps are taking longer for various workloads. > > If it's just for developers and expert users, perhaps existing > shell-level functionality like "pg_upgrade | ts" would suffice? Interesting. I had to install moreutils package to get ts. And, something like ts command may or may not be available on all platforms. Moreover, the ts command gives me the timestamps for each of the messages printed, so an extra step is required to calculate the time taken for an operation. I think it'd be better if pg_upgrade can calculate the time taken for each operation, and I'm okay if it is an opt-in feature with --verbose option. [1] Aug 02 07:44:17 Sync data directory to disk ok Aug 02 07:44:17 Creating script to delete old cluster ok Aug 02 07:44:17 Checking for extension updates ok -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On 02.08.23 10:30, Bharath Rupireddy wrote: > Moreover, the ts command gives me the timestamps for each > of the messages printed, so an extra step is required to calculate the > time taken for an operation. There is "ts -i" for that.
On Wed, Aug 02, 2023 at 09:14:06AM +0200, Peter Eisentraut wrote: > On 01.08.23 18:00, Nathan Bossart wrote: >> One of the main purposes of this thread is to gauge interest. I'm hoping >> there are other developers who are interested in reducing >> pg_upgrade-related downtime, and it seemed like it'd be nice to have >> built-in functionality for measuring the step times instead of requiring >> folks to apply this patch every time. And I think users might also be >> interested in this information, if for no other reason than to help us >> pinpoint which steps are taking longer for various workloads. > > If it's just for developers and expert users, perhaps existing shell-level > functionality like "pg_upgrade | ts" would suffice? Sure, we could just leave it at that unless anyone sees a reason to bake it in. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Wed, Aug 02, 2023 at 01:02:53PM +0530, Bharath Rupireddy wrote: > On Wed, Aug 2, 2023 at 12:45 PM Peter Eisentraut <peter@eisentraut.org> wrote: >> I think we should change the output format to be more like initdb, like >> >> Doing something ... ok >> >> without horizontally aligning all the "ok"s. > > While this looks simple, we might end up with a lot of diff and > changes after removing MESSAGE_WIDTH. There's a significant part of > pg_upgrade code that deals with MESSAGE_WIDTH. I don't think it's > worth the effort. Therefore, I'd prefer the simplest possible fix - > change the message to '"Checking for \"aclitem\" data type in user > tables". It may be an overkill, but we can consider adding > Assert(sizeof(message) < MESSAGE_WIDTH) in progress report functions > to not encourage new messages to end up in the same formatting issue. I don't think it's that difficult. ІMO the bigger question is whether we want to back-patch such a change to v16 at this point. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Wed, Aug 02, 2023 at 09:09:14AM -0700, Nathan Bossart wrote: > On Wed, Aug 02, 2023 at 01:02:53PM +0530, Bharath Rupireddy wrote: >> On Wed, Aug 2, 2023 at 12:45 PM Peter Eisentraut <peter@eisentraut.org> wrote: >>> I think we should change the output format to be more like initdb, like >>> >>> Doing something ... ok >>> >>> without horizontally aligning all the "ok"s. >> >> While this looks simple, we might end up with a lot of diff and >> changes after removing MESSAGE_WIDTH. There's a significant part of >> pg_upgrade code that deals with MESSAGE_WIDTH. I don't think it's >> worth the effort. Therefore, I'd prefer the simplest possible fix - >> change the message to '"Checking for \"aclitem\" data type in user >> tables". It may be an overkill, but we can consider adding >> Assert(sizeof(message) < MESSAGE_WIDTH) in progress report functions >> to not encourage new messages to end up in the same formatting issue. > > I don't think it's that difficult. ІMO the bigger question is whether we > want to back-patch such a change to v16 at this point. Here is a work-in-progress patch that seems to get things pretty close to what Peter is suggesting. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
On 01.08.23 17:45, Nathan Bossart wrote: > On Tue, Aug 01, 2023 at 09:46:02AM +0200, Peter Eisentraut wrote: >> On 31.07.23 20:37, Nathan Bossart wrote: >>> - prep_status("Checking for incompatible \"aclitem\" data type in user tables"); >>> + prep_status("Checking for \"aclitem\" data type in user tables"); >> >> Why these changes? I think this is losing precision about what it's doing. > > The message is too long, so there's no space between it and the "ok" > message: > > Checking for incompatible "aclitem" data type in user tablesok > > Instead of altering the messages, we could bump MESSAGE_WIDTH from 60 to > 62 or 64. Do you prefer that approach? (BTW this probably needs to be > back-patched to v16.) Let's change MESSAGE_WIDTH to 62 in v16, and then pursue the larger restructuring leisurely.
On Tue, Aug 22, 2023 at 11:49:33AM +0200, Peter Eisentraut wrote: > Let's change MESSAGE_WIDTH to 62 in v16, and then pursue the larger > restructuring leisurely. Sounds good. I plan to commit this within the next couple of days. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Tue, Aug 22, 2023 at 07:06:23AM -0700, Nathan Bossart wrote: > On Tue, Aug 22, 2023 at 11:49:33AM +0200, Peter Eisentraut wrote: >> Let's change MESSAGE_WIDTH to 62 in v16, and then pursue the larger >> restructuring leisurely. > > Sounds good. I plan to commit this within the next couple of days. Here's the patch. I'm going to run a couple of tests before committing, but I don't think anything else is required. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
On Wed, Aug 23, 2023 at 09:35:20AM -0700, Nathan Bossart wrote: > On Tue, Aug 22, 2023 at 07:06:23AM -0700, Nathan Bossart wrote: >> On Tue, Aug 22, 2023 at 11:49:33AM +0200, Peter Eisentraut wrote: >>> Let's change MESSAGE_WIDTH to 62 in v16, and then pursue the larger >>> restructuring leisurely. >> >> Sounds good. I plan to commit this within the next couple of days. > > Here's the patch. I'm going to run a couple of tests before committing, > but I don't think anything else is required. committed -- Nathan Bossart Amazon Web Services: https://aws.amazon.com