Re: PostgreSQL Gotchas - Mailing list pgsql-general

From Greg Stark
Subject Re: PostgreSQL Gotchas
Msg-id 87fyr4gigj.fsf@stark.xeocode.com
In response to Re: PostgreSQL Gotchas  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: PostgreSQL Gotchas  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: PostgreSQL Gotchas  (Chris Travers <chris@metatrontech.com>)
List pgsql-general
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Greg Stark <gsstark@mit.edu> writes:
> > Tom Lane <tgl@sss.pgh.pa.us> writes:
> >> and the lexer thinks that it should fold unquoted identifiers to upper
> >> case, then the catalog entries defining these names had better read
> >> PG_CLASS, RELPAGES, and MAX, not the lower-case names they contain
> >> today.
>
> > Well the case of unquoted identifiers could be finessed by having it match
> > RELPAGES first and fail over to relpages second. It could even be made to
> > match RelPages and whatever if there isn't any ambiguity.
>
> [ shrug... ]  Sure, you could invent some rule that might sort of work
> most of the time.  But then you've abandoned the sole rationale for the
> entire project, which is to *adhere to the standard*.  Any kind of funny
> business with the case folding rules will make things worse not better
> from that standpoint.

Well sure, it would only be worthwhile if you could come up with rules that
complied with the standard in every case where the standard specifies
behaviour. But if you could do that and still satisfy 99% of the
backwards-compatibility issues, including any catalog-related issues, then it
seems like it would be worthwhile.

But on further thought, if you want to have pg_dump et al output lowercase
names (which I certainly prefer) then I think what you would have to do is
have a bit that travels with every identifier that indicates whether it was
quoted or not.

So two identifiers match if either is an unquoted identifier and they match
case-insensitively, or if both are quoted and they match case-sensitively.

Actually I think you can get closer to the standard if you interpret "case
insensitively" above to mean they match after either downcasing or upcasing
the unquoted identifier(s), but not if one of them is unquoted and the other
is quoted and mixed case.
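
A rough sketch of that matching rule in Python, just to pin down the cases.
The Ident type and the matches() function are hypothetical names made up for
illustration; nothing like this exists in the Postgres source, and the quoted
flag stands in for the "bit that travels with every identifier".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ident:
    """Hypothetical identifier that remembers whether it was quoted."""
    text: str
    quoted: bool


def matches(a: Ident, b: Ident) -> bool:
    # Both quoted: the comparison is case-sensitive.
    if a.quoted and b.quoted:
        return a.text == b.text
    # Both unquoted: the comparison is case-insensitive.
    if not a.quoted and not b.quoted:
        return a.text.lower() == b.text.lower()
    # Exactly one is quoted: the quoted name must equal the unquoted name
    # folded entirely to lower case or entirely to upper case, so a
    # mixed-case quoted name never matches an unquoted one.
    quoted, unquoted = (a, b) if a.quoted else (b, a)
    return quoted.text in (unquoted.text.lower(), unquoted.text.upper())
```

With this, an unquoted RelPages matches both "relpages" and "RELPAGES" but not
"RelPages".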

There's still some funny business if you create both "foo" and "FOO" and then
refer to one of them with an unquoted foo: since it's an unquoted identifier,
it matches both. Giving preference to "FOO" in that case would follow the
standard.

Also, if you try to create both a "foo" and an unquoted foo, then this method
would say that's a conflict, whereas the standard would say it was acceptable.
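
The two corner cases above can be sketched the same way. This is only an
illustration under assumptions: identifiers are (text, quoted) pairs, the
catalog is a plain list, and matches(), resolve() and create() are
hypothetical helpers, not anything in Postgres.

```python
def matches(a, b):
    """a and b are (text, quoted) pairs; the matching rule proposed above."""
    (a_text, a_quoted), (b_text, b_quoted) = a, b
    if a_quoted and b_quoted:
        return a_text == b_text
    if not a_quoted and not b_quoted:
        return a_text.lower() == b_text.lower()
    quoted, unquoted = (a_text, b_text) if a_quoted else (b_text, a_text)
    return quoted in (unquoted.lower(), unquoted.upper())


def resolve(ref, catalog):
    """Return the catalog entry that ref refers to, or None."""
    candidates = [entry for entry in catalog if matches(ref, entry)]
    if not candidates:
        return None
    ref_text, ref_quoted = ref
    if not ref_quoted:
        # Prefer the quoted upper-case entry, which is what the
        # standard's fold-to-upper-case rule would pick.
        for entry in candidates:
            if entry == (ref_text.upper(), True):
                return entry
    return candidates[0]


def create(new, catalog):
    """Add new to the catalog, rejecting names that match an existing one."""
    if any(matches(new, entry) for entry in catalog):
        raise ValueError(f"identifier {new[0]!r} conflicts with an existing name")
    catalog.append(new)
```

Creating "foo" and "FOO" both succeeds, and an unquoted foo then resolves to
"FOO". Creating an unquoted foo in a catalog that already holds "foo" raises
the conflict, which is the case where this scheme departs from the standard.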

I think this is an improvement on what's there now, though. It lets Postgres
have case-preserving, case-insensitive unquoted identifiers, which people have
asked about multiple times, and it solves most of the inter-database
compatibility problems with case sensitivity.

--
greg
