PG Bug reporting form <noreply@postgresql.org> writes:
> Recently, I ran into an issue when trying to put comments on objects in
> plpgsql DO block, for example the following:
>
> DO $$
> DECLARE
> "comment" text := 'This is a comment';
> BEGIN
> COMMENT ON TABLE abc IS 'This is another comment';
> END;
> $$;
>
> Generates the following error message:
> ERROR: 42601: syntax error at or near "ON"
> LINE 5: COMMENT ON TABLE abc IS 'This is another comment';
>
> Renaming the variable from "comment" to something else makes the problem go
> away,
This isn't specific to COMMENT; you can cause problems with variables
named like any statement-introducing keyword. For instance:
regression=# do $$
declare update int;
begin
update foo set bar = 42;
end$$;
ERROR: syntax error at or near "foo"
LINE 4: update foo set bar = 42;
               ^
On the other hand, such a variable works fine as long as you use it
as a variable:
regression=# do $$
declare update int;
begin
update := 42;
raise notice 'update = %', update;
end$$;
NOTICE: update = 42
DO
One idea is to get rid of the ambiguity by making all such words reserved
so far as plpgsql is concerned, but nobody would thank us for that.
It would be a maintenance problem too, because the plpgsql parser doesn't
otherwise have a list of such keywords.
I wonder whether we could improve matters by adjusting the heuristic for
such things in pl_scanner.c:
* If we are at start of statement, prefer unreserved keywords
* over variable names, unless the next token is assignment or
* '[', in which case prefer variable names. (Note we need not
* consider '.' as the next token; that case was handled above,
* and we always prefer variable names in that case.) If we are
* not at start of statement, always prefer variable names over
* unreserved keywords.
The trouble with special-casing unreserved keywords here is precisely
that those only include words that are special to plpgsql, not words
that introduce statements of the main SQL grammar.
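To make the failure concrete, here is a rough model of the rule described in
that comment. It is a sketch in Python, not the C code in pl_scanner.c: the
function name, the string spellings of the tokens, and the tiny keyword set
are all made up for illustration, and it assumes both := and = count as
assignment.

```python
# Illustrative model only; names and token spellings are hypothetical.
ASSIGNMENT_OR_SUBSCRIPT = {":=", "=", "["}  # assumes := and = both mean assignment

def classify_current(word, at_stmt_start, next_token, variables, unreserved_keywords):
    """Classify an unqualified identifier per the existing comment's rule."""
    prefer_variable = (not at_stmt_start) or next_token in ASSIGNMENT_OR_SUBSCRIPT
    if prefer_variable:
        # Variable names win over unreserved plpgsql keywords.
        if word in variables:
            return "variable"
        if word in unreserved_keywords:
            return "keyword"
    else:
        # Statement start, no assignment or '[' next: keywords win,
        # but a variable match is still tried before giving up.
        if word in unreserved_keywords:
            return "keyword"
        if word in variables:
            return "variable"
    return "word"  # left for the main SQL grammar

# Illustrative subset of plpgsql's unreserved keywords (assumption).
KEYWORDS = {"raise", "return", "perform"}

print(classify_current("update", True, "foo", {"update"}, KEYWORDS))
```

Since "update" is not in the plpgsql keyword list, the variable match wins
even at statement start, and the statement is then parsed as an assignment
that never arrives.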
Maybe, if we are at start of statement, we should not even consider the
possibility of matching an unqualified identifier to a variable name
unless the next token is assignment or '['. IOW, the logic here would
become (1) if not at statement start, or if next token is assignment or
'[', see if the identifier matches a variable name. (2) If not, see if
it matches an unreserved plpgsql keyword. (3) If not, assume it is a SQL
keyword.
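Steps (1) to (3) could be modeled roughly as follows. Again this is only a
sketch in Python, not a patch against pl_scanner.c: the function name, token
spellings, and keyword set are hypothetical, and it assumes both := and =
count as assignment.

```python
# Illustrative model only; names and token spellings are hypothetical.
ASSIGNMENT_OR_SUBSCRIPT = {":=", "=", "["}  # assumes := and = both mean assignment

def classify_proposed(word, at_stmt_start, next_token, variables, unreserved_keywords):
    """Classify an unqualified identifier per proposed steps (1)-(3)."""
    # (1) Only consider a variable match when not at statement start,
    #     or when the next token is assignment or '['.
    may_be_variable = (not at_stmt_start) or next_token in ASSIGNMENT_OR_SUBSCRIPT
    if may_be_variable and word in variables:
        return "variable"
    # (2) Otherwise, try the unreserved plpgsql keywords.
    if word in unreserved_keywords:
        return "keyword"
    # (3) Otherwise, assume it is a SQL keyword and pass it through.
    return "word"

# Illustrative subset of plpgsql's unreserved keywords (assumption).
KEYWORDS = {"raise", "return", "perform"}

print(classify_proposed("update", True, "foo", {"update"}, KEYWORDS))
print(classify_proposed("update", True, ":=", {"update"}, KEYWORDS))
```

With a variable named update in scope, "update foo ..." at statement start
falls through to (3), while "update := 42" still matches the variable in (1).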
Are there any other cases where recognizing a variable name at statement
start is the correct thing to do? Even if it's not correct, could this
result in worse error messages? I think you probably just end up with
"syntax error" whenever we guess wrong, but I might be missing something.
regards, tom lane