Thread: Avoiding inadvertent debugging mode for pgbench
Attached please find a patch to adjust the behavior of the pgbench program and make it behave like the other programs that connect to a database (namely, psql and pg_dump). Specifically, add support for using -d and --dbname to specify the name of the database. This means that -d can no longer be used to turn on debugging mode, and the long option --debug must be used instead.
This removes a long-standing footgun, in which people assume that the -d option behaves the same as other programs. Indeed, because it takes no arguments, and because the first non-option argument is the database name, it still appears to work. However, people then wonder why pgbench is so darn verbose all the time! :)
This is a breaking change, but fixing it this way seems to have the least total impact, as the number of people using the debug mode of pgbench is likely quite small. Further, those already using the long option are unaffected, and those using the short one simply need to replace '-d' with '--debug', arguably making their scripts a little more self-documenting in the process.
Cheers,
Greg
This removes a long-standing footgun, in which people assume that the -d option behaves the same as other programs. Indeed, because it takes no arguments, and because the first non-option argument is the database name, it still appears to work. However, people then wonder why pgbench is so darn verbose all the time! :)
This is a breaking change, but fixing it this way seems to have the least total impact, as the number of people using the debug mode of pgbench is likely quite small. Further, those already using the long option are unaffected, and those using the short one simply need to replace '-d' with '--debug', arguably making their scripts a little more self-documenting in the process.
Cheers,
Greg
Attachment
On Thu, Feb 29, 2024 at 07:05:13PM -0500, Greg Sabino Mullane wrote: > Attached please find a patch to adjust the behavior of the pgbench program > and make it behave like the other programs that connect to a database > (namely, psql and pg_dump). Specifically, add support for using -d and > --dbname to specify the name of the database. This means that -d can no > longer be used to turn on debugging mode, and the long option --debug must > be used instead. > > This removes a long-standing footgun, in which people assume that the -d > option behaves the same as other programs. Indeed, because it takes no > arguments, and because the first non-option argument is the database name, > it still appears to work. However, people then wonder why pgbench is so > darn verbose all the time! :) > > This is a breaking change, but fixing it this way seems to have the least > total impact, as the number of people using the debug mode of pgbench is > likely quite small. Further, those already using the long option are > unaffected, and those using the short one simply need to replace '-d' with > '--debug', arguably making their scripts a little more self-documenting in > the process. I think this is a generally reasonable proposal, except I don't know whether this breakage is acceptable. AFAICT there are two fundamental behavior changes folks would observe: * "-d <database_name>" would cease to emit the debugging output, and while enabling debug mode might've been unintentional in most cases, it might actually have been intentional in others. * "-d" with no argument or with a following option would begin failing, and users would need to replace "-d" with "--debug". Neither of these seems particularly severe to me, especially for a benchmarking program, but I'd be curious to hear what others think. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On 3/1/24 23:41, Nathan Bossart wrote: > On Thu, Feb 29, 2024 at 07:05:13PM -0500, Greg Sabino Mullane wrote: >> Attached please find a patch to adjust the behavior of the pgbench program >> and make it behave like the other programs that connect to a database >> (namely, psql and pg_dump). Specifically, add support for using -d and >> --dbname to specify the name of the database. This means that -d can no >> longer be used to turn on debugging mode, and the long option --debug must >> be used instead. >> >> This removes a long-standing footgun, in which people assume that the -d >> option behaves the same as other programs. Indeed, because it takes no >> arguments, and because the first non-option argument is the database name, >> it still appears to work. However, people then wonder why pgbench is so >> darn verbose all the time! :) >> >> This is a breaking change, but fixing it this way seems to have the least >> total impact, as the number of people using the debug mode of pgbench is >> likely quite small. Further, those already using the long option are >> unaffected, and those using the short one simply need to replace '-d' with >> '--debug', arguably making their scripts a little more self-documenting in >> the process. > > I think this is a generally reasonable proposal, except I don't know > whether this breakage is acceptable. AFAICT there are two fundamental > behavior changes folks would observe: > > * "-d <database_name>" would cease to emit the debugging output, and while > enabling debug mode might've been unintentional in most cases, it might > actually have been intentional in others. > I think this is the more severe of the two issues, because it's a silent change. Everything will seem to work, but the user won't get the debug info (if they actually wanted it). > * "-d" with no argument or with a following option would begin failing, and > users would need to replace "-d" with "--debug". > I think this is fine. > Neither of these seems particularly severe to me, especially for a > benchmarking program, but I'd be curious to hear what others think. > I agree the -d option may be confusing, but is it worth it? I don't know, it depends on how often people actually get confused by it, and I don't recall hitting this (nor hearing about others). To be honest I didn't even realize pgbench even has a debug switch ... But I'd like to mention this is far from our only tool using "-d" to enable debug mode. A quick git-grep shows postgres, initdb, pg_archivecleanup and pg_combinebackup do the same thing. So maybe it's not that inconsistent overall. (Note: I didn't check if the other cases may lead to the same confusion where people enable debug accidentally. Maybe not.) regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Mar 1, 2024, at 8:07 PM, Tomas Vondra wrote:
On 3/1/24 23:41, Nathan Bossart wrote:>> I think this is a generally reasonable proposal, except I don't know> whether this breakage is acceptable. AFAICT there are two fundamental> behavior changes folks would observe:>> * "-d <database_name>" would cease to emit the debugging output, and while> enabling debug mode might've been unintentional in most cases, it might> actually have been intentional in others.>I think this is the more severe of the two issues, because it's a silentchange. Everything will seem to work, but the user won't get the debuginfo (if they actually wanted it).
Indeed. Hopefully the user will notice soon when inspecting the standard error
output.
> * "-d" with no argument or with a following option would begin failing, and> users would need to replace "-d" with "--debug".>I think this is fine.
Yeah. It will force the user to fix it immediately.
> Neither of these seems particularly severe to me, especially for a> benchmarking program, but I'd be curious to hear what others think.>I agree the -d option may be confusing, but is it worth it? I don'tknow, it depends on how often people actually get confused by it, and Idon't recall hitting this (nor hearing about others). To be honest Ididn't even realize pgbench even has a debug switch ...
I'm the one that has a habit to use -d to specify the database name. I
generally include -d for pgbench and then realized that I don't need the debug
information because it is not for database specification.
But I'd like to mention this is far from our only tool using "-d" toenable debug mode. A quick git-grep shows postgres, initdb,pg_archivecleanup and pg_combinebackup do the same thing. So maybe it'snot that inconsistent overall.
As Greg said none of these programs connects to the database.
I don't like to break backward compatibility but in this case I suspect that it
is ok. I don't recall the last time I saw a script that makes use of -d option.
How often do you need a pgbench debug information?
On 2024-Mar-01, Euler Taveira wrote: > I don't like to break backward compatibility but in this case I suspect that it > is ok. I don't recall the last time I saw a script that makes use of -d option. > How often do you need a pgbench debug information? I wondered what the difference actually is, so I checked. In -i mode, the only difference is that if the tables don't exist before hand, we receive the NOTICE that it doesn't. In normal mode, the -d switch emits so much junk that I would believe if somebody told me that passing -d distorted the benchmark results; and it's hard to believe that such output is valuable for anything other than debugging pgbench itself. All in all, I support the original patch. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "I love the Postgres community. It's all about doing things _properly_. :-)" (David Garamond)
On Mon, Mar 11, 2024 at 04:59:36PM +0100, Alvaro Herrera wrote: > All in all, I support the original patch. I'll commit this in a few days if there are no objections. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Tue, Mar 19, 2024 at 01:47:53PM -0500, Nathan Bossart wrote: > On Mon, Mar 11, 2024 at 04:59:36PM +0100, Alvaro Herrera wrote: >> All in all, I support the original patch. > > I'll commit this in a few days if there are no objections. Actually, I just took a look at the patch and it appears to need a rebase as well as additional documentation updates for the new -d/--dbname option. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Rebased version attached (v2), with another sentence in the sgml to explain the optional use of -d
Cheers,
Greg
Attachment
On Tue, Mar 19, 2024 at 09:15:22PM -0400, Greg Sabino Mullane wrote: > Rebased version attached (v2), with another sentence in the sgml to explain > the optional use of -d cfbot seems quite unhappy with this: https://cirrus-ci.com/build/6429518263484416 -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
My mistake. Attached please find version 3, which should hopefully make cfbot happy again.
Attachment
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: not tested Documentation: tested, passed Did a quick review of this one; CFbot is now happy, local regression tests all pass. I think the idea here is sane; it's particularly confusing for some tools included in the main distribution to have differentsemantics, and this seems low on the breakage risk here, so worth the tradeoffs IMO. The new status of this patch is: Ready for Committer
On Wed, Mar 20, 2024 at 11:57:25AM -0400, Greg Sabino Mullane wrote: > My mistake. Attached please find version 3, which should hopefully make > cfbot happy again. Here is what I have staged for commit. I plan to commit this within the next few days. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
Committed. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com