Re: [PATCH] Add hint for misspelled relations - Mailing list pgsql-hackers

From Steve Chavez
Subject Re: [PATCH] Add hint for misspelled relations
Date
Msg-id CAGRrpzZ27b8fmpyqZvssgvYGVHYtU19x1PGW340axzX6n-bHng@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Add hint for misspelled relations  (Daniel Gustafsson <daniel@yesql.se>)
List pgsql-hackers
Hi Kirill and Daniel, many thanks for your feedback.

> Why do we even do this? Can't we just pass fuzzystate->distance to
varstr_levenshtein_less_equal? It is initialized in outer func to
MAX_FUZZY_DISTANCE + 1

This was an oversight. I've corrected it on the new attached patch.

 > Worst case is that it never contains the a close enough match and
we'll spend a scan + hash + calculation of MAX_REL_HINT_CANDIDATES for each
typo.
> Are there ways you can pare down the scan, perhaps a scankey on relispartition
and only consider base relations?

I've improved the code (new patch attached) and the hash is no longer necessary.
Now the scan omits partitions as you suggested plus they're now restricted to each schema on the search_path.

It was also possible to do more scans for each relevant relkind (RELKIND_RELATION, RELKIND_VIEW, etc) but I wasn't sure if that was more expensive than doing the filtering on the loop as it's currently done.
It also made the code a bit more complicated with more nested loops, so I've left it as is.

Perhaps we should limit the amount of schema scans too in case the search_path has too many schemas?

> Hinting on relations can also give hints for relations the user does not have
permissions on which further reduce the useability.

I've tested the current column hint and it also doesn't consider privileges:

```
create role new nosuperuser;
grant usage on schema public to new;
create table subitems (id int, name text);
grant select (id) on subitems to new;

begin;
set local role to new;
select nam from subitems;
2025-12-09 18:16:38.017 -05 [542091] ERROR:  column "nam" does not exist at character 8
2025-12-09 18:16:38.017 -05 [542091] HINT:  Perhaps you meant to reference the column "subitems.name".
2025-12-09 18:16:38.017 -05 [542091] STATEMENT:  select nam from subitems ;
ERROR:  column "nam" does not exist
LINE 1: select nam from subitems ;
HINT:  Perhaps you meant to reference the column "subitems.name".
```

So perhaps that can be a separate enhancement for both hints?

> The tab-completion in psql is one tool offered to avoid misspellings which
reduce the need.

True, but it's still useful to offer a hint for other clients.

What do you think?

Best regards,
Steve Chavez

On Tue, 2 Dec 2025 at 04:27, Daniel Gustafsson <daniel@yesql.se> wrote:
> On 2 Dec 2025, at 03:46, Steve Chavez <steve@supabase.io> wrote:

> Currently misspelled columns offer a hint but not misspelled relations.

The tab-completion in psql is one tool offered to avoid misspellings which
reduce the need.

> One problem is that scanning all pg_class entries can get expensive on big catalogs, so the number of searches is capped by MAX_REL_HINT_CANDIDATES. I've set this to 4096 arbitrarily, any guidance on what would be a good number is appreciated. Personally I've seen  a catalog that contains 125K tables, with mostly auto generated names. For these cases I don't think the hint helps that much anyway, so it seemed fine to bail here.

What makes hints for columns appealing is that it's a pretty contained problem
across data we've already accumulated, relations are quite different as they
require a catalog lookup making it a lot less appealing.  The number of
relations can easily become quite large, especially when large partitioning
hierarchies are involved, and while capping prevents large scans there is no
guarantee that the MAX_REL_HINT_CANDIDATES set contains the most likely
entries.  Worst case is that it never contains the a close enough match and
we'll spend a scan + hash + calculation of MAX_REL_HINT_CANDIDATES for each
typo.

Are there ways you can pare down the scan, perhaps a scankey on relispartition
and only consider base relations?

Hinting on relations can also give hints for relations the user does not have
permissions on which further reduce the useability.

--
Daniel Gustafsson

Attachment

pgsql-hackers by date:

Previous
From: Nico Williams
Date:
Subject: Re: [oauth] SASL mechanisms
Next
From: Bertrand Drouvot
Date:
Subject: Re: Mark function arguments of type "T *" as "const T *" where possible