Re: PostgreSQL Gotchas - Mailing list pgsql-general
From | Chris Travers |
---|---|
Subject | Re: PostgreSQL Gotchas |
Date | |
Msg-id | 434ECBE7.7010608@travelamericas.com Whole thread Raw |
In response to | Re: PostgreSQL Gotchas (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-general |
Ok. here are some indepth thoughts after reviewing as many prior threads as I could find on the archives. Tom Lane wrote: >Chris Travers <chris@travelamericas.com> writes: > > >>Tom Lane wrote: >> >> >>>Since the end reward for all this work would be having to read CATALOGS >>>WRITTEN IN ALL UPPER CASE, none of the key developers seem very >>>interested ... >>> >>> >>> >>Why would this be required? >> >> > >If you write, say, > > select max(relpages) from pg_class; > > I have gone back and read the previous discussion and I still do not understand what the real impediment is (to at least putting it on the TODO list, at least). I see a lot of reasons that don't make any sense to me, and have come across one substantial obstacle that has not been mentioned yet to my knowledge. I understood this exmaple from your previous comment. But aside from the aggregate issue, I fail to see why fixing it is a requirement. Perhaps I am being unclear in my thoughts and we are talking past eachother. If I write that statement, and it gets back an error saying that no table is named PG_CLASS, that is my fault as an individual developer. After all I set the configuration option to fold to uppercase, right? The relevant question is very simple. Where does one draw the line between between the responsibility of the programmer and the responsibility of the DBA? Personally I think it is important to offer modes that offer as much standards-compliance as possible. From the previous discussion, it was mentioned that the backend treats identifiers as quoted. It seems to me that this should make it *easier* rather than *harder* to impliment because this is largely a change regarding what a given SQL statement means. I.e. most of the work I had foreseen has already been done. (My proposal would have been to have the backend treat identifiers used internally as already double-quoted.) If you are folding to lower case, your example, select max(relpages) from pg_class, is the same as SELECT MAX("relpages") FROM "pg_class"; If you are folding to upper case, your example is the same as SELECT MAX("RELPAGES") FROM "PG_CLASS"; Of course, we don't expect this to give us any results today. In essence, I don't see why we would expect this to return any results. If you issue an incorrect SQL statement, whose fault is that? Now, the one place where this might create a problem is in the information_schema. The problem here is not the same as any issue I have seen discussed before, but the fact that case folding could create non-standard behavior here absent other changes. The only option I see here is to create a second INFORMATION_SCHEMA with upper case view names. >and the lexer thinks that it should fold unquoted identifiers to upper >case, then the catalog entries defining these names had better read >PG_CLASS, RELPAGES, and MAX, not the lower-case names they contain >today. > Ok. so the only problem out of these three that I see is with MAX. PG_CLASS and RELPAGES are the responsibility of the developer, IMHO. Also, with functions and aggregates, this is not the problem that it is with tables (as the name isn't usually sent back to the client), so I don't know how much logic it would be to differentiate between table/column names which might need to be folded and function/aggregates which could continue the way that they are currently done at least for now. It might also be possible to create duplicate entries to the catelogs for builtin functions/aggregates in the catalogs so that this case folding is not causing the same type of problem. However, for builtin functions/aggregates, I am not sure if this is likely to have any significant performance hit. > So this wouldn't be something you could flip on-the-fly --- at >the latest, an installation would have to commit to upper or lower case >at initdb time, because the initial contents of all the system catalogs >would need to match the choice. > > Ok, so I see the objection basically being that changing the semantics of the SQL statement would need to be done in a way that prevents user-issued queries from having to know which way this is done. There is no way around this objection because it is inconsistant with the very idea of semantic changes to the SQL parser. Yet we have done this in the past in areas I have previously mentioned. So the question really is what is really required to make this work in a semantically clean way. IMO, the following requirement is acceptable: SELECT MAX("relpages") FROM "pg_class" But the following is not: SELECT "max"("relpages") FROM "pg_class" However, I am sure that there will be people who don't see that as OK. So, I would suggest that if they neeed to avoid quoting relpages and pg_class, then they can create a PG_CATALOG schema and a PG_CLASS view. Maybe this could be a pgfoundry project even. But I don't see it as a requirement of the core team. The bigger issue is with "MAX" v. "max" and in "INFORMATION_SCHEMA" v. "information_schema." This might require duplicate entries in the system catalogs. >Please read the previous discussions on the topic, if you want to >pontificate about it. > > Is there still any specific reason why this does not belong on the TODO list? I am not arguing that this should be a priority in development. I am however arguing that since the behavior is non-standard, it might be worth acknowledging this and at least suggesting that it should be fixed at some indefinite point in the future.... If this was a requirement for me, I would hire someone to make the changes and submit a patch... It is just an attempt to ensure that it is on the roadmap at this time. Best Wishes, Chris Travers Metatron Technology Consulting
pgsql-general by date: