Thread: postgres_fdw hint messages
The postgres_fdw tests contain this (as amended by patch 0001): ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw'); ERROR: invalid option "password" HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, async_capable, parallel_commit, keep_connections This annoys developers who are working on libpq connection options, because any option added, removed, or changed causes this test to need to be updated. It's also questionable how useful this hint is in its current form, considering how long it is and that the options are in an implementation-dependent order. Possible changes: - Hide the hint from this particular test (done in the attached patches). - Keep the hint, but maybe make it sorted? - Remove all the hints like this from foreign data commands. - Don't show the hint when there are more than N valid options. - Do some kind of "did you mean" like we have for column names. Thoughts?
Attachment
>The postgres_fdw tests contain this (as amended by patch 0001):
>ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw');
>ERROR: invalid option "password"
>HINT: Valid options in this context are: service, passfile,
>channel_binding, connect_timeout, dbname, host, hostaddr, port, options,
>application_name, keepalives, keepalives_idle, keepalives_interval,
>keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert,
>sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer,
>ssl_min_protocol_version, ssl_max_protocol_version, gssencmode,
>krbsrvname, gsslib, target_session_attrs, use_remote_estimate,
>fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable,
>fetch_size, batch_size, async_capable, parallel_commit, keep_connections
>This annoys developers who are working on libpq connection options,
>because any option added, removed, or changed causes this test to need
>to be updated.
>It's also questionable how useful this hint is in its current form,
>considering how long it is and that the options are in an
>implementation-dependent order.
>Possible changes:
>- Hide the hint from this particular test (done in the attached patches).+1
Attachment
>The postgres_fdw tests contain this (as amended by patch 0001):
>ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw');
>ERROR: invalid option "password"
>HINT: Valid options in this context are: service, passfile,
>channel_binding, connect_timeout, dbname, host, hostaddr, port, options,
>application_name, keepalives, keepalives_idle, keepalives_interval,
>keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert,
>sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer,
>ssl_min_protocol_version, ssl_max_protocol_version, gssencmode,
>krbsrvname, gsslib, target_session_attrs, use_remote_estimate,
>fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable,
>fetch_size, batch_size, async_capable, parallel_commit, keep_connections>This annoys developers who are working on libpq connection options,
>because any option added, removed, or changed causes this test to need
>to be updated.>It's also questionable how useful this hint is in its current form,
>considering how long it is and that the options are in an
>implementation-dependent order.>Possible changes:
>- Hide the hint from this particular test (done in the attached patches).+1
I vote for this option.Less work for future developers changes, I think worth the effort.Anyway, in alphabetical order, it's a lot easier for users to read.Patch attached.
Attachment
The postgres_fdw tests contain this (as amended by patch 0001):
ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw');
ERROR: invalid option "password"
HINT: Valid options in this context are: service, passfile,
channel_binding, connect_timeout, dbname, host, hostaddr, port, options,
application_name, keepalives, keepalives_idle, keepalives_interval,
keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert,
sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer,
ssl_min_protocol_version, ssl_max_protocol_version, gssencmode,
krbsrvname, gsslib, target_session_attrs, use_remote_estimate,
fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable,
fetch_size, batch_size, async_capable, parallel_commit, keep_connections
This annoys developers who are working on libpq connection options,
because any option added, removed, or changed causes this test to need
to be updated.
It's also questionable how useful this hint is in its current form,
considering how long it is and that the options are in an
implementation-dependent order.
Possible changes:
- Hide the hint from this particular test (done in the attached patches).
- Keep the hint, but maybe make it sorted?
- Remove all the hints like this from foreign data commands.
- Don't show the hint when there are more than N valid options.
- Do some kind of "did you mean" like we have for column names.
Thoughts?
On Thu, Aug 25, 2022 at 9:42 AM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > The postgres_fdw tests contain this (as amended by patch 0001): > > ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw'); > ERROR: invalid option "password" > HINT: Valid options in this context are: service, passfile, > channel_binding, connect_timeout, dbname, host, hostaddr, port, options, > application_name, keepalives, keepalives_idle, keepalives_interval, > keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, > sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, > ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, > krbsrvname, gsslib, target_session_attrs, use_remote_estimate, > fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, > fetch_size, batch_size, async_capable, parallel_commit, keep_connections > > This annoys developers who are working on libpq connection options, > because any option added, removed, or changed causes this test to need > to be updated. > > It's also questionable how useful this hint is in its current form, > considering how long it is and that the options are in an > implementation-dependent order. I think the place to list the legal options is in the documentation, not the HINT. -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Aug 25, 2022 at 9:42 AM Peter Eisentraut > <peter.eisentraut@enterprisedb.com> wrote: >> HINT: Valid options in this context are: service, passfile, >> channel_binding, connect_timeout, dbname, host, hostaddr, port, options, >> application_name, keepalives, keepalives_idle, keepalives_interval, >> keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, >> sslkey, sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, >> ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, >> krbsrvname, gsslib, target_session_attrs, use_remote_estimate, >> fdw_startup_cost, fdw_tuple_cost, extensions, updatable, truncatable, >> fetch_size, batch_size, async_capable, parallel_commit, keep_connections >> >> This annoys developers who are working on libpq connection options, >> because any option added, removed, or changed causes this test to need >> to be updated. >> >> It's also questionable how useful this hint is in its current form, >> considering how long it is and that the options are in an >> implementation-dependent order. > I think the place to list the legal options is in the documentation, > not the HINT. I think listing them in a hint is reasonable as long as the hint doesn't get longer than a line or two. This one is entirely out of hand, so I agree with just dropping it. Note that there is essentially identical code in dblink, file_fdw, and backend/foreign/foreign.c. Do we want to nuke them all? Or maybe make a policy decision to suppress such HINTs when there are more than ~10 matches? (The latter policy would likely eventually end by always suppressing everything...) Peter also mentioned the possibility of "did you mean" with a closest match offered. That seems like a reasonable idea if someone is motivated to create the code, which I'm not. I vote for just dropping all these hints for now, while leaving the door open for anyone who wants to write closest-match-offering code. regards, tom lane
On 25.08.22 15:42, Peter Eisentraut wrote: > It's also questionable how useful this hint is in its current form, > considering how long it is and that the options are in an > implementation-dependent order. > > Possible changes: > - Remove all the hints like this from foreign data commands. It appears that there was a strong preference toward this solution, so that's what I implemented in the updated patch set. (Considering the hints that are removed in the tests cases, I don't think this loses any value. What might be useful in practice instead is something like "the option you specified on this foreign server should actually be specified on a user mapping or foreign table", but this would take a fair amount of code to cover a reasonable set of cases, so I'll leave this as a future exercise.)
Attachment
On Fri, Aug 26, 2022 at 12:35:38PM -0400, Tom Lane wrote: > Peter also mentioned the possibility of "did you mean" with a closest > match offered. That seems like a reasonable idea if someone > is motivated to create the code, which I'm not. > > I vote for just dropping all these hints for now, while leaving the > door open for anyone who wants to write closest-match-offering code. Here is a quickly-hacked-together proof-of-concept for using Levenshtein distances to determine which option to include in the hint. Would something like this suffice? If so, I will work on polishing it up a bit. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
Nathan Bossart <nathandbossart@gmail.com> writes: > On Fri, Aug 26, 2022 at 12:35:38PM -0400, Tom Lane wrote: >> I vote for just dropping all these hints for now, while leaving the >> door open for anyone who wants to write closest-match-offering code. > Here is a quickly-hacked-together proof-of-concept for using Levenshtein > distances to determine which option to include in the hint. Would > something like this suffice? If so, I will work on polishing it up a bit. Seems reasonable to me, but (1) there probably needs to be some threshold of closeness, so we don't offer "foobar" when the user wrote "huh" (2) there are several places doing this now, and there will no doubt be more later, so we need to try to abstract the logic so it can be shared. regards, tom lane
On Thu, Sep 01, 2022 at 07:08:49PM -0400, Tom Lane wrote: > (1) there probably needs to be some threshold of closeness, so we don't > offer "foobar" when the user wrote "huh" Agreed. > (2) there are several places doing this now, and there will no doubt > be more later, so we need to try to abstract the logic so it can be > shared. Will do. I'm also considering checking whether the user-provided string is longer than MAX_LEVENSHTEIN_STRLEN so that we can avoid the ERROR from varstr_levenshtein(). Or perhaps varstr_levenshtein() could indicate that the string is too long without ERROR-ing (e.g., by returning -1). If the user-provided string is too long, we'd just omit the hint. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Thu, Sep 01, 2022 at 04:31:20PM -0700, Nathan Bossart wrote: > On Thu, Sep 01, 2022 at 07:08:49PM -0400, Tom Lane wrote: >> (1) there probably needs to be some threshold of closeness, so we don't >> offer "foobar" when the user wrote "huh" > > Agreed. > >> (2) there are several places doing this now, and there will no doubt >> be more later, so we need to try to abstract the logic so it can be >> shared. > > Will do. Here is a new patch. Two notes: * I considered whether to try to unite this new functionality with the existing stuff in parse_relation.c, but the existing code looked a bit too specialized. * I chose a Levenshtein distance of 5 as the threshold of closeness for the hint messages. This felt lenient, but it should hopefully still filter out some of the more ridiculous suggestions. However, it's still little more than a wild guess, so if folks think the threshold needs to be higher or lower, I'd readily change it. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
On Fri, Sep 02, 2022 at 02:26:09PM -0700, Nathan Bossart wrote: > Here is a new patch. Two notes: > > * I considered whether to try to unite this new functionality with the > existing stuff in parse_relation.c, but the existing code looked a bit too > specialized. > > * I chose a Levenshtein distance of 5 as the threshold of closeness for the > hint messages. This felt lenient, but it should hopefully still filter out > some of the more ridiculous suggestions. However, it's still little more > than a wild guess, so if folks think the threshold needs to be higher or > lower, I'd readily change it. Hmm. FWIW I would tend toward simplifying all this code and just drop all the hints rather than increasing the dependency to more levenshtein calculations in those error code paths, which is what Peter E has posted. -- Michael
Attachment
Michael Paquier <michael@paquier.xyz> writes: > On Fri, Sep 02, 2022 at 02:26:09PM -0700, Nathan Bossart wrote: >> * I chose a Levenshtein distance of 5 as the threshold of closeness for the >> hint messages. This felt lenient, but it should hopefully still filter out >> some of the more ridiculous suggestions. However, it's still little more >> than a wild guess, so if folks think the threshold needs to be higher or >> lower, I'd readily change it. > Hmm. FWIW I would tend toward simplifying all this code and just drop > all the hints rather than increasing the dependency to more > levenshtein calculations in those error code paths, which is what > Peter E has posted. Personally I'm not a huge fan of this style of hint either. However, people seem to like the ones for misspelled column names, so I'm betting there will be a majority in favor of this one too. I think the distance limit of 5 is too loose though. I see that it accommodates examples like "passfile" for "password", which seems great at first glance; but it also allows fundamentally silly suggestions like "user" for "server" or "host" for "foo". We'd need something smarter than Levenshtein if we want to offer "passfile" for "password" without looking stupid on a whole lot of other cases --- those words seem close, but they are close semantically not textually. regards, tom lane
On Fri, Sep 02, 2022 at 10:06:54PM -0400, Tom Lane wrote: > I think the distance limit of 5 is too loose though. I see that > it accommodates examples like "passfile" for "password", which > seems great at first glance; but it also allows fundamentally > silly suggestions like "user" for "server" or "host" for "foo". > We'd need something smarter than Levenshtein if we want to offer > "passfile" for "password" without looking stupid on a whole lot > of other cases --- those words seem close, but they are close > semantically not textually. Yeah, it's really only useful for simple misspellings, but IMO even that is rather handy. I noticed that the parse_relation.c stuff excludes matches where more than half the characters are different, so I added that here and lowered the distance limit to 4. This seems to prevent the silly suggestions (e.g., "host" for "foo") while retaining the more believable ones (e.g., "passfile" for "password"), at least for the small set of examples covered in the tests. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
On 03.09.22 06:30, Nathan Bossart wrote: > On Fri, Sep 02, 2022 at 10:06:54PM -0400, Tom Lane wrote: >> I think the distance limit of 5 is too loose though. I see that >> it accommodates examples like "passfile" for "password", which >> seems great at first glance; but it also allows fundamentally >> silly suggestions like "user" for "server" or "host" for "foo". >> We'd need something smarter than Levenshtein if we want to offer >> "passfile" for "password" without looking stupid on a whole lot >> of other cases --- those words seem close, but they are close >> semantically not textually. > > Yeah, it's really only useful for simple misspellings, but IMO even that is > rather handy. > > I noticed that the parse_relation.c stuff excludes matches where more than > half the characters are different, so I added that here and lowered the > distance limit to 4. This seems to prevent the silly suggestions (e.g., > "host" for "foo") while retaining the more believable ones (e.g., > "passfile" for "password"), at least for the small set of examples covered > in the tests. I think this code is compact enough and the hints it produces are reasonable, so I think we could go with it. I notice that for column misspellings, the hint is phrased "Perhaps you meant X." whereas here we have "Did you mean X?". Let's make that uniform.
On Tue, Sep 13, 2022 at 08:32:43AM +0200, Peter Eisentraut wrote: > I notice that for column misspellings, the hint is phrased "Perhaps you > meant X." whereas here we have "Did you mean X?". Let's make that uniform. Good point. I attached a new version of the patch that uses the column phrasing. I wasn't sure whether "reference" was the right word to use in this context, but I used it for now for consistency with the column hints. I think "specify" or "use" would work as well. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Attachment
On 13.09.22 21:02, Nathan Bossart wrote: > On Tue, Sep 13, 2022 at 08:32:43AM +0200, Peter Eisentraut wrote: >> I notice that for column misspellings, the hint is phrased "Perhaps you >> meant X." whereas here we have "Did you mean X?". Let's make that uniform. > > Good point. I attached a new version of the patch that uses the column > phrasing. I wasn't sure whether "reference" was the right word to use in > this context, but I used it for now for consistency with the column hints. > I think "specify" or "use" would work as well. I don't think we need a verb there at all. I committed it without a verb.
On Fri, Sep 16, 2022 at 03:54:53PM +0200, Peter Eisentraut wrote: > I don't think we need a verb there at all. I committed it without a verb. Thanks! -- Nathan Bossart Amazon Web Services: https://aws.amazon.com