Thread: PostgreSQL Gotchas

PostgreSQL Gotchas

From
"Aly S.P Dharshi"
Date:
http://sql-info.de/postgresql/postgres-gotchas.html

Any comments from folks on the list ?

Cheers,

Aly.

--
Aly S.P Dharshi
aly.dharshi@telus.net

      "A good speech is like a good dress
       that's short enough to be interesting
       and long enough to cover the subject"

Re: PostgreSQL Gotchas

From
Alex Turner
Date:
Compared to MySQL ditching referential integrity because of a typo, I would consider these 'gotchas' extremely minor, hence the reason I use Postgresql not MySQL.  Postgresql does what you expect from an RDBMS system out of the box in 99.99% of cases.  I don't have to toggle things on special like, or install additional modules, or check every time I create a foreign key that the system will actualy enforce it.

In short Postgresql makes less work for a DBA than MySQL when you actualy care about your data.  It's cleaner, faster and easier to deal with in all cases of real world database applications.

I'm not even gonna talk about the query planner in MySQL trying to deal with an eight way join, let's just say that if you plan on using anything even near third normal, MySQL will dog.  Oh - yeah - and the fact that it doesn't scale...

If those 'gotchas' are all one has against Postgresql... I don't know why people are still even using Oracle or DB2 ;) MySQL you say, I thought we were talking about RDBMSes here.

MySQL is to linux, what Jet  is to Windows IMHO, oh wait - Jet has foreign keys by default...

Alex

On 10/6/05, Aly S.P Dharshi <aly.dharshi@telus.net> wrote:
http://sql-info.de/postgresql/postgres-gotchas.html

Any comments from folks on the list ?

Cheers,

Aly.

--
Aly S.P Dharshi
aly.dharshi@telus.net

         "A good speech is like a good dress
          that's short enough to be interesting
          and long enough to cover the subject"

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

Re: PostgreSQL Gotchas

From
"Gavin M. Roy"
Date:
This sure sounds like a flamewar bait email?

On Oct 6, 2005, at 11:07 AM, Aly S.P Dharshi wrote:

> http://sql-info.de/postgresql/postgres-gotchas.html
>
> Any comments from folks on the list ?
>
> Cheers,
>
> Aly.
>
> --
> Aly S.P Dharshi
> aly.dharshi@telus.net
>
>      "A good speech is like a good dress
>       that's short enough to be interesting
>       and long enough to cover the subject"
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>       choose an index scan if your joining column's datatypes do not
>       match
>

Gavin M. Roy
800 Pound Gorilla
gmr@ehpg.net



Re: PostgreSQL Gotchas

From
"Aly S.P Dharshi"
Date:
No flamewar here, I am just trying to see if opinions of others on this,
as Jim had posted a MySQL one, and that there was a PostgreSQL one, so I
wanted to see if these are valid, if they aren't then that site should be
updated to reflect this.

Cheers,

Aly.

On Thu, 6 Oct 2005, Gavin M. Roy wrote:

> This sure sounds like a flamewar bait email?
>
> On Oct 6, 2005, at 11:07 AM, Aly S.P Dharshi wrote:
>
>> http://sql-info.de/postgresql/postgres-gotchas.html
>>
>> Any comments from folks on the list ?
>>
>> Cheers,
>>
>> Aly.
>>
>> -- Aly S.P Dharshi
>> aly.dharshi@telus.net
>>
>>     "A good speech is like a good dress
>>      that's short enough to be interesting
>>      and long enough to cover the subject"
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 9: In versions below 8.0, the planner will ignore your desire to
>>      choose an index scan if your joining column's datatypes do not
>>      match
>>
>
> Gavin M. Roy
> 800 Pound Gorilla
> gmr@ehpg.net
>
>

--
Aly S.P Dharshi
aly.dharshi@telus.net

      "A good speech is like a good dress
       that's short enough to be interesting
       and long enough to cover the subject"

Re: PostgreSQL Gotchas

From
"Gavin M. Roy"
Date:
Then wouldn't it be more appropriate to ask:

What's the status of the things listed on the PostgreSQL gotchas.
Are they bugs?  Are they valid?  Have the been resolved?  What does
the community thing of those gotchas?

I personally don't take an open ended question like "URL: Any
comments?" as trying to be productive.  But I could also just be
burnt out on the whole PgSQL vs MySQL flame war or reading too much
into what I appraised to be terse email.  If so I apologize.

Regards,

Gavin

On Oct 6, 2005, at 12:01 PM, Aly S.P Dharshi wrote:

> No flamewar here, I am just trying to see if opinions of others on
> this, as Jim had posted a MySQL one, and that there was a
> PostgreSQL one, so I wanted to see if these are valid, if they
> aren't then that site should be updated to reflect this.
>
> Cheers,
>
> Aly.
>
> On Thu, 6 Oct 2005, Gavin M. Roy wrote:
>
>
>> This sure sounds like a flamewar bait email?
>>
>> On Oct 6, 2005, at 11:07 AM, Aly S.P Dharshi wrote:
>>
>>
>>> http://sql-info.de/postgresql/postgres-gotchas.html
>>> Any comments from folks on the list ?
>>> Cheers,
>>> Aly.
>>> -- Aly S.P Dharshi
>>> aly.dharshi@telus.net
>>>
>>>     "A good speech is like a good dress
>>>      that's short enough to be interesting
>>>      and long enough to cover the subject"
>>> ---------------------------(end of
>>> broadcast)---------------------------
>>> TIP 9: In versions below 8.0, the planner will ignore your desire to
>>>      choose an index scan if your joining column's datatypes do not
>>>      match
>>>
>>
>> Gavin M. Roy
>> 800 Pound Gorilla
>> gmr@ehpg.net
>>
>>
>>
>
> --
> Aly S.P Dharshi
> aly.dharshi@telus.net
>
>      "A good speech is like a good dress
>       that's short enough to be interesting
>       and long enough to cover the subject"
>


Re: PostgreSQL Gotchas

From
Scott Marlowe
Date:
They're all valid, but most apply to versions < 8.0 or < 7.4 even, and
the others are pretty esoteric issues that you don't see often.  The
missing from clause thing is likely to be the biggest surprise most
folks run into.

I find the supposed bad performance of aggregates is bunk.  On my
workstation (not even a server, just my personal workstation with all
kinds of other crap running on it) I get a response time of about 1
second for a count(*) in version 7.4 of postgresql:

select count(*) from locators;
  count
---------
 1000000
(1 row)

takes anywhere from .75 to 1.0 second.

The same dataset, on oracle takes 0.75 seconds on a very fast oracle
server, and anywhere from 5 to 20 seconds on a slower server, like one
equivalent to my workstation.


On Thu, 2005-10-06 at 14:01, Aly S.P Dharshi wrote:
> No flamewar here, I am just trying to see if opinions of others on this,
> as Jim had posted a MySQL one, and that there was a PostgreSQL one, so I
> wanted to see if these are valid, if they aren't then that site should be
> updated to reflect this.
>
> Cheers,
>
> Aly.
>
> On Thu, 6 Oct 2005, Gavin M. Roy wrote:
>
> > This sure sounds like a flamewar bait email?
> >
> > On Oct 6, 2005, at 11:07 AM, Aly S.P Dharshi wrote:
> >
> >> http://sql-info.de/postgresql/postgres-gotchas.html
> >>
> >> Any comments from folks on the list ?
> >>
> >> Cheers,
> >>
> >> Aly.
> >>
> >> -- Aly S.P Dharshi
> >> aly.dharshi@telus.net
> >>
> >>     "A good speech is like a good dress
> >>      that's short enough to be interesting
> >>      and long enough to cover the subject"
> >>
> >> ---------------------------(end of broadcast)---------------------------
> >> TIP 9: In versions below 8.0, the planner will ignore your desire to
> >>      choose an index scan if your joining column's datatypes do not
> >>      match
> >>
> >
> > Gavin M. Roy
> > 800 Pound Gorilla
> > gmr@ehpg.net
> >
> >

Re: PostgreSQL Gotchas

From
"Aly S.P Dharshi"
Date:
Okay that is a fair statement to make, hence restating:

- What is the status of those items listed on the PostgreSQL gotchas
- Are they bugs ?
- Are they valid statements ?
- If they are bugs are they resolved ?
- What does the PG community thing of this list of gotchas ?

http://sql-info.de/postgresql/postgres-gotchas.html


On Thu, 6 Oct 2005, Gavin M. Roy wrote:

> Then wouldn't it be more appropriate to ask:
>
> What's the status of the things listed on the PostgreSQL gotchas.  Are they
> bugs?  Are they valid?  Have the been resolved?  What does the community thing
> of those gotchas?
>
> I personally don't take an open ended question like "URL: Any comments?" as
> trying to be productive.  But I could also just be burnt out on the whole
> PgSQL vs MySQL flame war or reading too much into what I appraised to be terse
> email.  If so I apologize.
>
> Regards,
>
> Gavin

--
Aly S.P Dharshi
aly.dharshi@telus.net

      "A good speech is like a good dress
       that's short enough to be interesting
       and long enough to cover the subject"

Re: PostgreSQL Gotchas

From
Steve Atkins
Date:
On Thu, Oct 06, 2005 at 12:07:12PM -0600, Aly S.P Dharshi wrote:

> http://sql-info.de/postgresql/postgres-gotchas.html
>
> Any comments from folks on the list ?

The implicit from and mildly non-standard case folding are the only
two that approach being real issues people may stumble over, IMO. If
you look at the others that may be problematic in real use they're
mostly for archaic versions of PG.

The others that are still valid with 8.0 are... not so much problems
with PG as differences in the way PG works that users should be aware
of.

Cheers,
  Steve

Re: PostgreSQL Gotchas

From
"Dann Corbit"
Date:
Don't think so.
The author sounds like a PostgreSQL proponent to me.

It also sounds like most of the issues have been addressed with recent
builds.

> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
> owner@postgresql.org] On Behalf Of Gavin M. Roy
> Sent: Thursday, October 06, 2005 11:40 AM
> To: Aly S.P Dharshi
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] PostgreSQL Gotchas
>
> This sure sounds like a flamewar bait email?
>
> On Oct 6, 2005, at 11:07 AM, Aly S.P Dharshi wrote:
>
> > http://sql-info.de/postgresql/postgres-gotchas.html
> >
> > Any comments from folks on the list ?
> >
> > Cheers,
> >
> > Aly.
> >
> > --
> > Aly S.P Dharshi
> > aly.dharshi@telus.net
> >
> >      "A good speech is like a good dress
> >       that's short enough to be interesting
> >       and long enough to cover the subject"
> >
> > ---------------------------(end of
> > broadcast)---------------------------
> > TIP 9: In versions below 8.0, the planner will ignore your desire to
> >       choose an index scan if your joining column's datatypes do not
> >       match
> >
>
> Gavin M. Roy
> 800 Pound Gorilla
> gmr@ehpg.net
>
>
>
> ---------------------------(end of
broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster

Re: PostgreSQL Gotchas

From
Alex Turner
Date:
Sorry.

Alex

On 10/6/05, Gavin M. Roy <gmr@ehpg.net> wrote:
This sure sounds like a flamewar bait email?

On Oct 6, 2005, at 11:07 AM, Aly S.P Dharshi wrote:

> http://sql-info.de/postgresql/postgres-gotchas.html
>
> Any comments from folks on the list ?
>
> Cheers,
>
> Aly.
>
> --
> Aly S.P Dharshi
> aly.dharshi@telus.net
>
>      "A good speech is like a good dress
>       that's short enough to be interesting
>       and long enough to cover the subject"
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>       choose an index scan if your joining column's datatypes do not
>       match
>

Gavin M. Roy
800 Pound Gorilla
gmr@ehpg.net



---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Re: PostgreSQL Gotchas

From
CSN
Date:
> On 10/6/05, Aly S.P Dharshi <aly ( dot ) dharshi (
at ) telus ( dot ) net> wrote:
>
>
http://sql-info.de/postgresql/postgres-gotchas.html
>
>     Any comments from folks on the list ?

- It's a lot shorter than MySQL's gotchas list.
- 8 of the 13 are for versions of PostgreSQL <= 8.1
- Of the remaining, I consider "select as" to be
really trivial (and it appears a work-around can be
hacked).
- lowercase folding. I DO sometimes wish I could use
fieldID, etc. without quoting it.
- I've never found count(*) to be slow.
- I don't know enough about the "UNICODE means
"UTF-8"" and "RANDOM() failures" to comment.

CSN



__________________________________
Yahoo! Mail - PC Magazine Editors' Choice 2005
http://mail.yahoo.com

Re: PostgreSQL Gotchas

From
David Fetter
Date:
On Thu, Oct 06, 2005 at 12:07:12PM -0600, Aly S.P Dharshi wrote:
> http://sql-info.de/postgresql/postgres-gotchas.html
>
> Any comments from folks on the list ?

Several.

First, it looks to me like this is a pretty transparent attempt to
troll, so I'm not going to go there.

Second, if you actually go through that list, you find:

* Problems fixed in the last version or three of PostgreSQL.  This is
  an incentive to upgrade, not a gotcha.

* Problems that will be fixed in the next version of PostgreSQL.
  This means that problems get on developers' radar and get fixed.  I
  suppose by some extremely un-generous method of assessment, this
  could be a gotcha.

* Incomplete information, e.g. not mentioning Slony-I as an upgrade
  path.

* One deviation from the standard that won't be changed: fold-to-lower.

* Nits so minor as not to be worth addressing (non-optional AS in FROM
  clauses vs. SQL standard's making that AS optional there)

Third, there are real gotchas, but those are generally reported as
bugs on pgsql-bugs and acted upon as such.

Cheers,
D
--
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!

Re: PostgreSQL Gotchas

From
CSN
Date:

--- CSN <cool_screen_name90001@yahoo.com> wrote:
> - 8 of the 13 are for versions of PostgreSQL <= 8.1

Doh!

- 8 of the 13 are for versions of PostgreSQL < 8.1!




__________________________________
Yahoo! Mail - PC Magazine Editors' Choice 2005
http://mail.yahoo.com

Re: PostgreSQL Gotchas

From
Neil Conway
Date:
On Thu, 2005-06-10 at 12:07 -0600, Aly S.P Dharshi wrote:
> http://sql-info.de/postgresql/postgres-gotchas.html
>
> Any comments from folks on the list ?

"SELECT column alias, ...": this is a known issue. AFAIK it is not easy
to solve.

"Unquoted object names fold to lower case": this is intentional, both
because the developers prefer this behavior and because it is consistent
with the behavior of prior PostgreSQL versions.

"Implicit FROM item and unintended cross joins": fixed in 8.1, as the
gotcha notes.

"COUNT(*) very slow": this is a known issue -- see the -hackers archives
for many prior discussions. MVCC makes this hard to solve effectively
(whether applications should actually be using COUNT(*) on large tables
with no WHERE clause is another matter...)

-Neil



Re: PostgreSQL Gotchas

From
Ben
Date:
CSN wrote:

>- I don't know enough about the "UNICODE means
>"UTF-8"" and "RANDOM() failures" to comment.
>
>
I'm hardly an expert, but I've done enough with unicode to know that you
can easily convert utf-8 to any other flavor of unicode you might want
to use. Though, why you'd want to use something other than utf-8 in the
first place I don't understand.

Re: PostgreSQL Gotchas

From
Stephen Frost
Date:
* Aly S.P Dharshi (aly.dharshi@telus.net) wrote:
> - What is the status of those items listed on the PostgreSQL gotchas
> - Are they bugs ?
> - Are they valid statements ?
> - If they are bugs are they resolved ?
> - What does the PG community thing of this list of gotchas ?
>
> http://sql-info.de/postgresql/postgres-gotchas.html

There's only 6 things on there which apply to the current version of
PostgreSQL, and one of those will be fixed in 8.1.

The remaining items are:

SELECT column alias,...
  This issue looks like it has probably been hashed over a few times and
  at the end of the day the answer is "Best to use AS for clarity
  anyway, and it is documented to be necessary, and Postgres upstream
  has decided to require it instead of making alot of things reserved
  which wouldn't otherwise need to be."

Unquoted object names fold to lower case
  I don't really see this as too much of an issue, personally, but I do
  know some people have run into it.  The example they give seems a bit
  off tho, as I thought Oracle just folded to upper-case (in which case
  "table" != table in Oracle, but "table" = table in Postgres, kind of
  depends on how you look at it).  I'm not sure the spec says one way or
  another.  I've had problems with this when people were building things
  against a MySQL database w/o proper quoting and assuming it was
  case-sensitive when under Postgres it wasn't.

RANDOM() failures
  This looks like an interesting problem, and might be a bug to be
  fixed.  I think it depends on the what the SQL specification says for
  this case, if it says anything at all, as to if it should be
  considered a bug or not.  On the surface it certainly does look like
  an oddity though.

COUNT(*) very slow
  As someone else has pointed out, it's only slow if you've got a large
  dataset.  There's certainly workarounds for this issue (generally
  involving a couple of functions for keeping track of the number of
  rows).  In this way, Postgres actually gives the user the flexibility
  to choose the performance loss on insert/deletes which is necessary to
  track the number of rows seperately, or not to.

UNICODE means "UTF-8"
  This is an interesting point.  To be perfectly honest, it seems like
  Postgres' UNICODE/UTF-8/etc support could stand to be somewhat better.
  I've not used it much myself but I do see comments about it on the
  lists from time to time.  I think there was some work done on this
  for 8.1 and I expect support will only continue to improve in these
  areas.

  Just my 2c.

      Thanks,

        Stephen

Attachment

Re: PostgreSQL Gotchas

From
Bruce Momjian
Date:
Stephen Frost wrote:
> UNICODE means "UTF-8"
>   This is an interesting point.  To be perfectly honest, it seems like
>   Postgres' UNICODE/UTF-8/etc support could stand to be somewhat better.
>   I've not used it much myself but I do see comments about it on the
>   lists from time to time.  I think there was some work done on this
>   for 8.1 and I expect support will only continue to improve in these
>   areas.

Yes, this is completely fixed in 8.1, except we allow UNICODE to be a
synonym for UTF-8.  We also added Win32 support for UTF8 in 8.1.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: PostgreSQL Gotchas

From
Andrew Sullivan
Date:
On Thu, Oct 06, 2005 at 01:19:33PM -0600, Aly S.P Dharshi wrote:
> Okay that is a fair statement to make, hence restating:
>
> - What is the status of those items listed on the PostgreSQL gotchas

Some of them are for sure gotchas.  For users of
tranditionally-non-toy (as opposed to recently-non-toy) systems, I'd
think that there are three serious issues: the column alias
(because it strictly violates the SQL definition), the folding to
lower case (same reason), and the UNICODE==UTF-8 issue (because it
violates the UNICODE definition).

> - Are they valid statements ?
> - If they are bugs are they resolved ?

Some of them are valid and resolved (the site actually notes as much
in many cases).

One thing I do note is that there's a workaround for the have-to-dump
issue: you can use Slony to upgrade (by design).  It won't work for
everyone, but it's a big step up in many cases.

A

--
Andrew Sullivan  | ajs@crankycanuck.ca
"The year's penultimate month" is not in truth a good way of saying
November.
        --H.W. Fowler

Re: PostgreSQL Gotchas

From
"Jim C. Nasby"
Date:
On Thu, Oct 06, 2005 at 12:29:51PM -0700, David Fetter wrote:
> * Problems that will be fixed in the next version of PostgreSQL.
>   This means that problems get on developers' radar and get fixed.  I
>   suppose by some extremely un-generous method of assessment, this
>   could be a gotcha.
>
> * Incomplete information, e.g. not mentioning Slony-I as an upgrade
>   path.

From the few times I've emailed the author he seems happy to accept
better text. I'm sure he'd be happy to mention slony as an upgrade
option (though it would be best if there was a good doc describing this
we could point to). Likewise I believe he'd note things that will be
fixed in a given version (such as 8.1 or 8.2).

> * One deviation from the standard that won't be changed: fold-to-lower.

Does the standard even specify which case you fold to?

I agree, this is only a gotcha if you're used to the very non-standard
behavior in MySQL.

> * Nits so minor as not to be worth addressing (non-optional AS in FROM
>   clauses vs. SQL standard's making that AS optional there)

*shrug* I think it's useful to be able to see why something you might
expect to work doesn't.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: PostgreSQL Gotchas

From
"Jim C. Nasby"
Date:
On Thu, Oct 06, 2005 at 04:18:03PM -0400, Stephen Frost wrote:
> COUNT(*) very slow
>   As someone else has pointed out, it's only slow if you've got a large
>   dataset.  There's certainly workarounds for this issue (generally
>   involving a couple of functions for keeping track of the number of
>   rows).  In this way, Postgres actually gives the user the flexibility
>   to choose the performance loss on insert/deletes which is necessary to
>   track the number of rows seperately, or not to.

Actually, I think the real issue here is that unlike every other
database I've used, PostgreSQL can't do covering index scans, where only
the index is read and not the base table. This is due to not having MVCC
visibility info in the index. But there has been discussion on -hackers
about a way to get close to this behavior; something along the lines of
setting a bit in the index once a tuple is visible to all running
transactions. This would be a win on larger indexes that don't have a
lot of insert/update/delete activity.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: PostgreSQL Gotchas

From
"Jim C. Nasby"
Date:
On Thu, Oct 06, 2005 at 02:30:53PM -0400, Alex Turner wrote:
> MySQL is to linux, what Jet is to Windows IMHO, oh wait - Jet has foreign
> keys by default...

MySQL is the WindowsME of databases <- first hit searching for MySQL on
CafePress. :)
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: PostgreSQL Gotchas

From
David Fetter
Date:
On Thu, Oct 06, 2005 at 05:29:14PM -0500, Jim C. Nasby wrote:
> On Thu, Oct 06, 2005 at 12:29:51PM -0700, David Fetter wrote:
> > * Problems that will be fixed in the next version of PostgreSQL.
> > This means that problems get on developers' radar and get fixed.
> > I suppose by some extremely un-generous method of assessment, this
> > could be a gotcha.
> >
> > * Incomplete information, e.g. not mentioning Slony-I as an
> > upgrade path.
>
> From the few times I've emailed the author he seems happy to accept
> better text.  I'm sure he'd be happy to mention slony as an upgrade
> option (though it would be best if there was a good doc describing
> this we could point to).  Likewise I believe he'd note things that
> will be fixed in a given version (such as 8.1 or 8.2).

I'll draft something up :)

> > * One deviation from the standard that won't be changed: fold-to-lower.
>
> Does the standard even specify which case you fold to?

It specifies fold-to-upper, and that's just wrong. :P

> I agree, this is only a gotcha if you're used to the very
> non-standard behavior in MySQL.
>
> > * Nits so minor as not to be worth addressing (non-optional AS in
> > FROM clauses vs. SQL standard's making that AS optional there)
>
> *shrug* I think it's useful to be able to see why something you
> might expect to work doesn't.

I suppose it's good to have those.  There are other things I think of
as bugs, and would like to have addressed internally before I send
them off to gotchas land.  PG's incomplete support for DOMAINs, for
example, or the not-quite-it handling of composite types, or things
that are missing (imho) important features...

Cheers,
D
--
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!

Re: PostgreSQL Gotchas

From
Jeffrey Melloy
Date:
Neil Conway wrote:

>
>"COUNT(*) very slow": this is a known issue -- see the -hackers archives
>for many prior discussions. MVCC makes this hard to solve effectively
>(whether applications should actually be using COUNT(*) on large tables
>with no WHERE clause is another matter...)
>
>-Neil
>
>
And it's not like a count(*) on an Oracle database of any decently-sized
dataset is blazing fast, or even in blazing's ballpark.

The only thing I could see actually being an issue is the random() one
and add missing from.  The rest are trivial.  The random() thing is
interesting, esoteric, and probably has never been a problem in a real
situation.  (Or has exactly once, when he wrote that gotcha)

Jeff

Re: PostgreSQL Gotchas

From
"Jim C. Nasby"
Date:
On Thu, Oct 06, 2005 at 12:54:43PM -0700, CSN wrote:
> - lowercase folding. I DO sometimes wish I could use
> fieldID, etc. without quoting it.

I believe that may be against ANSI SQL. In any case, the only databases
I can think of that don't fold-case in some form are MySQL and Access.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: PostgreSQL Gotchas

From
CSN
Date:
Yep, I think the SQL spec says fold to uppercase. I'm
not sure why PostgreSQL folds to lowercase instead,
but if folding has to occur, I prefer lowercase.

CSN


--- "Jim C. Nasby" <jnasby@pervasive.com> wrote:

> On Thu, Oct 06, 2005 at 12:54:43PM -0700, CSN wrote:
> > - lowercase folding. I DO sometimes wish I could
> use
> > fieldID, etc. without quoting it.
>
> I believe that may be against ANSI SQL. In any case,
> the only databases
> I can think of that don't fold-case in some form are
> MySQL and Access.
> --
> Jim C. Nasby, Sr. Engineering Consultant
> jnasby@pervasive.com
> Pervasive Software      http://pervasive.com
> work: 512-231-6117
> vcard: http://jim.nasby.net/pervasive.vcf
> cell: 512-569-9461
>




__________________________________
Yahoo! Mail - PC Magazine Editors' Choice 2005
http://mail.yahoo.com

Re: PostgreSQL Gotchas

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> Unquoted object names fold to lower case
>   I don't really see this as too much of an issue, personally, but I do
>   know some people have run into it.  The example they give seems a bit
>   off tho, as I thought Oracle just folded to upper-case (in which case
>   "table" != table in Oracle, but "table" = table in Postgres, kind of
>   depends on how you look at it).  I'm not sure the spec says one way or
>   another.

Yeah, it does, and the spec sides with Oracle.  This is explained
exactly in the manual:

: Quoting an identifier also makes it case-sensitive, whereas unquoted
: names are always folded to lower case. For example, the identifiers FOO,
: foo, and "foo" are considered the same by PostgreSQL, but "Foo" and
: "FOO" are different from these three and each other. (The folding of
: unquoted names to lower case in PostgreSQL is incompatible with the SQL
: standard, which says that unquoted names should be folded to upper
: case. Thus, foo should be equivalent to "FOO" not "foo" according to the
: standard. If you want to write portable applications you are advised to
: always quote a particular name or never quote it.)


> RANDOM() failures
>   This looks like an interesting problem, and might be a bug to be
>   fixed.  I think it depends on the what the SQL specification says for
>   this case, if it says anything at all, as to if it should be
>   considered a bug or not.  On the surface it certainly does look like
>   an oddity though.

Well, it's surely legitimate material as a "gotcha".  The example is
taken from
http://archives.postgresql.org/pgsql-general/2004-11/msg01375.php
and the "previous discussion" referred to is this thread:
http://archives.postgresql.org/pgsql-hackers/2004-10/msg00082.php

As you can see, there wasn't a lot of agreement that we ought to change
it.  Arguably, if we did change it we'd get ripped for the "gotcha" of
poor optimization when the user forgets to mark nonvolatile functions
properly.  (Personally, though, I'm in favor of tightening it up.)

            regards, tom lane

Re: PostgreSQL Gotchas

From
Michael Fuhr
Date:
On Thu, Oct 06, 2005 at 05:47:36PM -0500, Jeffrey Melloy wrote:
> The only thing I could see actually being an issue is the random() one
> and add missing from.  The rest are trivial.  The random() thing is
> interesting, esoteric, and probably has never been a problem in a real
> situation.  (Or has exactly once, when he wrote that gotcha)

The random() issue has a workaround that the gotchas page doesn't
mention:

SELECT id, is_true
FROM (SELECT id, RANDOM() < 0.5 AS is_true FROM some_table OFFSET 0) AS t_tmp
WHERE is_true;

Tom Lane brought up the OFFSET 0 trick a couple of days ago in the
"Avoiding evaluating functions twice" thread:

http://archives.postgresql.org/pgsql-general/2005-10/msg00107.php

--
Michael Fuhr

Re: PostgreSQL Gotchas

From
Ian Barwick
Date:
On 10/7/05, Michael Fuhr <mike@fuhr.org> wrote:
> On Thu, Oct 06, 2005 at 05:47:36PM -0500, Jeffrey Melloy wrote:
> > The only thing I could see actually being an issue is the random() one
> > and add missing from.  The rest are trivial.  The random() thing is
> > interesting, esoteric, and probably has never been a problem in a real
> > situation.  (Or has exactly once, when he wrote that gotcha)
>
> The random() issue has a workaround that the gotchas page doesn't
> mention:
>
> SELECT id, is_true
> FROM (SELECT id, RANDOM() < 0.5 AS is_true FROM some_table OFFSET 0) AS t_tmp
> WHERE is_true;
>
> Tom Lane brought up the OFFSET 0 trick a couple of days ago in the
> "Avoiding evaluating functions twice" thread:
>
> http://archives.postgresql.org/pgsql-general/2005-10/msg00107.php

Thanks, I've updated the list.

I'd like to take the opportunity to point out that following the
original, unexpected "success" of the MySQL gotchas list, I created
one for PostgreSQL for the sake of "balance". Though I'm really having
to scrape the barrel for material ;-).

The MySQL list is a little outdated; I'm going through it with updates
for version 5.

Ian Barwick

Re: PostgreSQL Gotchas

From
Richard Huxton
Date:
CSN wrote:
> Yep, I think the SQL spec says fold to uppercase. I'm
> not sure why PostgreSQL folds to lowercase instead,
> but if folding has to occur, I prefer lowercase.

I think preference was why lowercase was chosen many moons ago. It's
stayed that way because otherwise existing users would be storming the
-hackers list with pitchforks and flaming torches.

--
   Richard Huxton
   Archonet Ltd

Re: PostgreSQL Gotchas

From
"Jim C. Nasby"
Date:
On Thu, Oct 06, 2005 at 07:29:44PM -0400, Tom Lane wrote:
> Well, it's surely legitimate material as a "gotcha".  The example is
> taken from
> http://archives.postgresql.org/pgsql-general/2004-11/msg01375.php
> and the "previous discussion" referred to is this thread:
> http://archives.postgresql.org/pgsql-hackers/2004-10/msg00082.php
>
> As you can see, there wasn't a lot of agreement that we ought to change
> it.  Arguably, if we did change it we'd get ripped for the "gotcha" of
> poor optimization when the user forgets to mark nonvolatile functions
> properly.  (Personally, though, I'm in favor of tightening it up.)

Aren't there a good number of performance issues if you mis-mark a
function?

In any case, ISTM it'd be much better to perform poorly rather than give
bad/wrong results. I don't really see much dissention there, so I'd vote
for making the change for 8.2.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: PostgreSQL Gotchas

From
"Jim C. Nasby"
Date:
On Fri, Oct 07, 2005 at 08:44:34AM +0100, Richard Huxton wrote:
> CSN wrote:
> >Yep, I think the SQL spec says fold to uppercase. I'm
> >not sure why PostgreSQL folds to lowercase instead,
> >but if folding has to occur, I prefer lowercase.
>
> I think preference was why lowercase was chosen many moons ago. It's
> stayed that way because otherwise existing users would be storming the
> -hackers list with pitchforks and flaming torches.

If any change was made I'm sure it would be to allow the user to decide
which way case was folded. But IMHO, anyone messing around with object
names that won't fold is asking for trouble anyway.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: PostgreSQL Gotchas

From
Nikolay Samokhvalov
Date:
I use PostgeSQL less than year. Before I worked with MS SQL Server
2000, MySQL 3&4, Oracle8i and Interbase. Also, I studied standards
SQL:1999 and SQL:2003. So, after switching to PostgreSQL I've
encountered with several things that seem strange to me. Of course,
several of them are simply not implemented yet and are in the list of
unsopported features:
http://www.postgresql.org/docs/8.0/interactive/unsupported-features-sql-standard.html.
But some seem to be 'old diseases'.

Here is my list of the issues:
http://chernowiki.ru/Dev/PostgreSQLComparedWithSQL2003AndOracleSQLServerDB2Etc

Perhaps I'm wrong with some issues - any comments are welcome.

On 08/10/05, Jim C. Nasby <jnasby@pervasive.com> wrote:
> On Fri, Oct 07, 2005 at 08:44:34AM +0100, Richard Huxton wrote:
> > CSN wrote:
> > >Yep, I think the SQL spec says fold to uppercase. I'm
> > >not sure why PostgreSQL folds to lowercase instead,
> > >but if folding has to occur, I prefer lowercase.
> >
> > I think preference was why lowercase was chosen many moons ago. It's
> > stayed that way because otherwise existing users would be storming the
> > -hackers list with pitchforks and flaming torches.
>
> If any change was made I'm sure it would be to allow the user to decide
> which way case was folded. But IMHO, anyone messing around with object
> names that won't fold is asking for trouble anyway.
> --
> Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
> Pervasive Software      http://pervasive.com    work: 512-231-6117
> vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match
>


--
Best regards,
Nikolay

Re: PostgreSQL Gotchas

From
Martijn van Oosterhout
Date:
On Sat, Oct 08, 2005 at 03:32:00PM +0400, Nikolay Samokhvalov wrote:
> http://chernowiki.ru/Dev/PostgreSQLComparedWithSQL2003AndOracleSQLServerDB2Etc
>
> Perhaps I'm wrong with some issues - any comments are welcome.

The only thing I can comment on is updatable views. You can make
updatable views using RULEs. The only thing is that they're not
*automatically* updateable.

In theory, if someone came up will a program that from given <view
definition> produced the appropriate INSERT, UPDATE and DELETE rules,
it might be incorporated. Currently you just have to do it manually...

Case-insensetive text comparisons can be acheived using the citext
module on gborg.

http://gborg.postgresql.org/project/citext/projdisplay.php

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: PostgreSQL Gotchas

From
Nikolay Samokhvalov
Date:
On 08/10/05, Martijn van Oosterhout <kleptog@svana.org> wrote:
> On Sat, Oct 08, 2005 at 03:32:00PM +0400, Nikolay Samokhvalov wrote:
> > http://chernowiki.ru/Dev/PostgreSQLComparedWithSQL2003AndOracleSQLServerDB2Etc
> >
> > Perhaps I'm wrong with some issues - any comments are welcome.
>
> The only thing I can comment on is updatable views. You can make
> updatable views using RULEs. The only thing is that they're not
> *automatically* updateable.

OK, I'll make this correction. But for me, updatable views are views
for which DBMS supports  insert/update/delete operations as for
tables. Ideally, people shouldn't distinguish table and view - that's
what theory stands for (see Date's thoutghs about it:
http://www.dbmsmag.com/int9410.html, he also has a cycle of articles
on this theme: http://www.dbdebunk.citymax.com/page/page/622302.htm).
PostgreSQL doesn't support updates even for simple views such as
select-with-restriction. What it does support - not updatable views,
but some kind of INSTEAD OFF triggers (another form of).

>
> In theory, if someone came up will a program that from given <view
> definition> produced the appropriate INSERT, UPDATE and DELETE rules,
> it might be incorporated. Currently you just have to do it manually...

There is a good theory (Date), but it cannot be implemented for any
practical DBMS. The cause lies in differences between theory and
practice. And the major difference is possibility to define tables w/o
PK (in other words, possible duplicate rows). Nevertheless, all major
commercial RDMSs support some subset of views that can be updated..
SQL:2003 defines a quite large subset, but the definition is pretty
mazy...

>
> Case-insensetive text comparisons can be acheived using the citext
> module on gborg.
>
> http://gborg.postgresql.org/project/citext/projdisplay.php

Thanks for the link, I'll try it. However, I suppose that such basic
feature as support collations should be implemented in core.

>
> Have a nice day,
> --
> Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> > tool for doing 5% of the work and then sitting around waiting for someone
> > else to do the other 95% so you can sue them.
>
>
>


--
Best regards,
Nikolay

Re: PostgreSQL Gotchas

From
Martijn van Oosterhout
Date:
On Sat, Oct 08, 2005 at 06:05:29PM +0400, Nikolay Samokhvalov wrote:
> On 08/10/05, Martijn van Oosterhout <kleptog@svana.org> wrote:
> > The only thing I can comment on is updatable views. You can make
> > updatable views using RULEs. The only thing is that they're not
> > *automatically* updateable.
>
> OK, I'll make this correction. But for me, updatable views are views
> for which DBMS supports  insert/update/delete operations as for
> tables. Ideally, people shouldn't distinguish table and view - that's
> what theory stands for (see Date's thoutghs about it:
> http://www.dbmsmag.com/int9410.html, he also has a cycle of articles
> on this theme: http://www.dbdebunk.citymax.com/page/page/622302.htm).
> PostgreSQL doesn't support updates even for simple views such as
> select-with-restriction. What it does support - not updatable views,
> but some kind of INSTEAD OFF triggers (another form of).

You've got me confused. What are INSTEAD OF triggers?

PostgreSQL does support views that look exactly like tables. You can
decide on INSERT what to do whith fields not in the view, which columns
you allow UPDATE and what the semantics should be for DELETE if the
view is a join on multiple tables. All PostgreSQL doesn't do is create
these rules for you.

For an example see here, all the way at the bottom. RULEs are not
TRIGGERs:

http://www.varlena.com/varlena/GeneralBits/82.php

<snip>
> PK (in other words, possible duplicate rows). Nevertheless, all major
> commercial RDMSs support some subset of views that can be updated..
> SQL:2003 defines a quite large subset, but the definition is pretty
> mazy...

PostgreSQL allows any view to be updatable, no matter how complex it
is. You just have to create the rules yourself.

There have been attempts to automate the process, they just havn't been
clean enough to pass muster. And people who really want updateable
views can make them already.

Hope this clarifies things for you,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: PostgreSQL Gotchas

From
Peter Eisentraut
Date:
Neil Conway wrote:
> "SELECT column alias, ...": this is a known issue. AFAIK it is not
> easy to solve.

It's easy to solve.  There's even a patch linked from the page.

I seem to remember that some (versions of some?) other SQL
implementation did not *allow* the AS there, complicating matters for
people moving to PostgreSQL.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: PostgreSQL Gotchas

From
Ian Harding
Date:
On 10/8/05, Nikolay Samokhvalov <samokhvalov@gmail.com> wrote:
> I use PostgeSQL less than year. Before I worked with MS SQL Server
> 2000, MySQL 3&4, Oracle8i and Interbase. Also, I studied standards
> SQL:1999 and SQL:2003. So, after switching to PostgreSQL I've
> encountered with several things that seem strange to me. Of course,
> several of them are simply not implemented yet and are in the list of
> unsopported features:
> http://www.postgresql.org/docs/8.0/interactive/unsupported-features-sql-standard.html.
> But some seem to be 'old diseases'.
>
> Here is my list of the issues:
> http://chernowiki.ru/Dev/PostgreSQLComparedWithSQL2003AndOracleSQLServerDB2Etc
>
> Perhaps I'm wrong with some issues - any comments are welcome.
>
From the referenced page...

"We can insert into column of type TIME the value such as '12:15' and
then obtain '12:15:00' making select, but we couldn't do the same with
dates."

Huh?  Minutes and seconds have a valid value of zero.  Months and days
don't.  Date types need to contain a valid date, and time types need
to contain a valid time.  Anything else seems like, well, MySQL.

Re: PostgreSQL Gotchas

From
Chris Travers
Date:
Tom Lane wrote:

>Stephen Frost <sfrost@snowman.net> writes:
>
>
>>Unquoted object names fold to lower case
>>  I don't really see this as too much of an issue, personally, but I do
>>  know some people have run into it.  The example they give seems a bit
>>  off tho, as I thought Oracle just folded to upper-case (in which case
>>  "table" != table in Oracle, but "table" = table in Postgres, kind of
>>  depends on how you look at it).  I'm not sure the spec says one way or
>>  another.
>>
>>
>
>Yeah, it does, and the spec sides with Oracle.  This is explained
>exactly in the manual:
>
>: Quoting an identifier also makes it case-sensitive, whereas unquoted
>: names are always folded to lower case. For example, the identifiers FOO,
>: foo, and "foo" are considered the same by PostgreSQL, but "Foo" and
>: "FOO" are different from these three and each other. (The folding of
>: unquoted names to lower case in PostgreSQL is incompatible with the SQL
>: standard, which says that unquoted names should be folded to upper
>: case. Thus, foo should be equivalent to "FOO" not "foo" according to the
>: standard. If you want to write portable applications you are advised to
>: always quote a particular name or never quote it.)
>
>
Any chance of adding a configuration option for future versions in order
to set case folding behavior?  It seems that PostgreSQL has really
attempted to be as standards-compliant as possible and this is one area
where improvement could be made without breaking backward compatibility...

We already have configuration options for setting whether NULL = NULL
returns TRUE or NULL, and a few other legacy options for preventing past
non-standard behavior from breaking compatibility with current versions.

>
>
>
>>RANDOM() failures
>>  This looks like an interesting problem, and might be a bug to be
>>  fixed.  I think it depends on the what the SQL specification says for
>>  this case, if it says anything at all, as to if it should be
>>  considered a bug or not.  On the surface it certainly does look like
>>  an oddity though.
>>
>>
>
>Well, it's surely legitimate material as a "gotcha".  The example is
>taken from
>http://archives.postgresql.org/pgsql-general/2004-11/msg01375.php
>and the "previous discussion" referred to is this thread:
>http://archives.postgresql.org/pgsql-hackers/2004-10/msg00082.php
>
>As you can see, there wasn't a lot of agreement that we ought to change
>it.  Arguably, if we did change it we'd get ripped for the "gotcha" of
>poor optimization when the user forgets to mark nonvolatile functions
>properly.  (Personally, though, I'm in favor of tightening it up.)
>
>
It all depends on which is worse?  Incorrect results or bad performance
on poorly written functions?  I would side with bad performance any day
if it was user error and could be fixed with the help of the people on
the PERFORM list.  If we didn't believe that, we would all be using
MySQL, right?

Best Wishes,
Chris Travers
Metatron Technology Consulting

Re: PostgreSQL Gotchas

From
Scott Marlowe
Date:
On Thu, 2005-10-13 at 13:09, Chris Travers wrote:

> Any chance of adding a configuration option for future versions in order
> to set case folding behavior?  It seems that PostgreSQL has really
> attempted to be as standards-compliant as possible and this is one area
> where improvement could be made without breaking backward compatibility...
>
> We already have configuration options for setting whether NULL = NULL
> returns TRUE or NULL, and a few other legacy options for preventing past
> non-standard behavior from breaking compatibility with current versions.

The problem, as explained to me when I floated this idea, is that the
CATALOGS are all in lower case, and many references to them are behind
the case folding mechanism (i.e. they get accessed by their lower case
names directly by backend software).

So, one would also have to have some way of working around that.

Re: PostgreSQL Gotchas

From
Chris Travers
Date:
Nikolay Samokhvalov wrote:

>On 08/10/05, Martijn van Oosterhout <kleptog@svana.org> wrote:
>
>
>>On Sat, Oct 08, 2005 at 03:32:00PM +0400, Nikolay Samokhvalov wrote:
>>
>>
>>>http://chernowiki.ru/Dev/PostgreSQLComparedWithSQL2003AndOracleSQLServerDB2Etc
>>>
>>>Perhaps I'm wrong with some issues - any comments are welcome.
>>>
>>>
>>The only thing I can comment on is updatable views. You can make
>>updatable views using RULEs. The only thing is that they're not
>>*automatically* updateable.
>>
>>
>
>OK, I'll make this correction. But for me, updatable views are views
>for which DBMS supports  insert/update/delete operations as for
>tables. Ideally, people shouldn't distinguish table and view - that's
>what theory stands for (see Date's thoutghs about it:
>http://www.dbmsmag.com/int9410.html, he also has a cycle of articles
>on this theme: http://www.dbdebunk.citymax.com/page/page/622302.htm).
>
>
Who do you mean by "people?"  Users?  DBA's?

Now, obviously the DBA will always be able to distinguish between them.
Otherwise he/she wouldn't be much of a DBA would he/she?

>PostgreSQL doesn't support updates even for simple views such as
>select-with-restriction. What it does support - not updatable views,
>but some kind of INSTEAD OFF triggers (another form of).
>
>
Ok...  Your complaint is that PostgreSQL doesn't support automatically
generating insert/update/delete rules for views.  Ultimately, however,
this may not be done simply because there are other priorities which
people are devoting their time and energy to instead.  I know some
people have put in some work on this item but I don't know what the
status is.

Also, what SQL Server 2003 calles a trigger, we call a rule.  What we
call a trigger is different and I am not sure SQL Server 2003 has such
an equivalent.

>
>
>>In theory, if someone came up will a program that from given <view
>>definition> produced the appropriate INSERT, UPDATE and DELETE rules,
>>it might be incorporated. Currently you just have to do it manually...
>>
>>
>
>There is a good theory (Date), but it cannot be implemented for any
>practical DBMS. The cause lies in differences between theory and
>practice. And the major difference is possibility to define tables w/o
>PK (in other words, possible duplicate rows). Nevertheless, all major
>commercial RDMSs support some subset of views that can be updated..
>SQL:2003 defines a quite large subset, but the definition is pretty
>mazy...
>
>

Again, this is likely doable.  There are hidden fields that I suppose
could be extended in a view to reference unique rows (maybe ctid since
the whole thing is expanded in a single SQL statement, but I haven't
tried it).

Best Wishes,
Chris Travers
Metatron Technology Consulting

Re: PostgreSQL Gotchas

From
Tom Lane
Date:
Chris Travers <chris@travelamericas.com> writes:
> Any chance of adding a configuration option for future versions in order
> to set case folding behavior?  It seems that PostgreSQL has really
> attempted to be as standards-compliant as possible and this is one area
> where improvement could be made without breaking backward compatibility...

Fixing the case issue is not *anywhere* near that easy.  See past discussions.

>> http://archives.postgresql.org/pgsql-general/2004-11/msg01375.php
>> and the "previous discussion" referred to is this thread:
>> http://archives.postgresql.org/pgsql-hackers/2004-10/msg00082.php
>>
>> As you can see, there wasn't a lot of agreement that we ought to change
>> it.  Arguably, if we did change it we'd get ripped for the "gotcha" of
>> poor optimization when the user forgets to mark nonvolatile functions
>> properly.  (Personally, though, I'm in favor of tightening it up.)
>
> It all depends on which is worse?  Incorrect results or bad performance
> on poorly written functions?  I would side with bad performance any day
> if it was user error and could be fixed with the help of the people on
> the PERFORM list.

Well, as I said, I'm personally in favor of tightening up the
query-flattening rules, but that will not in itself make volatile
functions in subselects 100% gotcha-free.

            regards, tom lane

Re: PostgreSQL Gotchas

From
Chris Travers
Date:
Scott Marlowe wrote:

>On Thu, 2005-10-13 at 13:09, Chris Travers wrote:
>
>The problem, as explained to me when I floated this idea, is that the
>CATALOGS are all in lower case, and many references to them are behind
>the case folding mechanism (i.e. they get accessed by their lower case
>names directly by backend software).
>
>So, one would also have to have some way of working around that.
>
>
>

So basically, the problem is that any fix for case folding would touch a
fair bit of code and possibly cause other problems.  However, I haven't
seen anyone worry about performance issues in such a fix, just that it
might be a fair bit of work.

However, if it is not on the TODO list already, it seems that it should
be there as it really is necessary for standards-compliance.

Best Wishes,
Chris Travers
Metatron Technology Consulting

Re: PostgreSQL Gotchas

From
Tom Lane
Date:
Chris Travers <chris@travelamericas.com> writes:
> So basically, the problem is that any fix for case folding would touch a
> fair bit of code and possibly cause other problems.  However, I haven't
> seen anyone worry about performance issues in such a fix, just that it
> might be a fair bit of work.

More that it's likely to create serious forwards, backwards, and
sideways compatibility problems :-(.  The work involved is mostly
in figuring out how to deal with that.

Since the end reward for all this work would be having to read CATALOGS
WRITTEN IN ALL UPPER CASE, none of the key developers seem very
interested ...

            regards, tom lane

Re: PostgreSQL Gotchas

From
Chris Travers
Date:
Tom Lane wrote:

>Chris Travers <chris@travelamericas.com> writes:
>
>
>>So basically, the problem is that any fix for case folding would touch a
>>fair bit of code and possibly cause other problems.  However, I haven't
>>seen anyone worry about performance issues in such a fix, just that it
>>might be a fair bit of work.
>>
>>
>
>More that it's likely to create serious forwards, backwards, and
>sideways compatibility problems :-(.  The work involved is mostly
>in figuring out how to deal with that.
>
>
Just thinking based on this conversation....  Obviously, you and others
are the authorities on the code that I am not.  However, here are the
thoughts that occur to me....

First, I am not sure that it is necessary to make things that bad.  It
seems that the main objection has to do with table and column names in
system catalogs.  Otherwise, whichever case a table name is in, it is
in.  I don't see anyone running out and saying "we should match either
upper or lower cases in a single session without knowing which one a
table is in" though I admit that the usability issues might lead to such
a request in the future.  Perhaps this is the biggest sideways
compatibility issue you were mentioning.

Otherwise, it might be sufficient for the backend to quote  the names of
the catalogs anytime it uses them.  Again, this means touching a lot of
code and possibly creating bugs, etc.  But I haven't heard any reason to
think that it would be a (significant) performance hit.

>Since the end reward for all this work would be having to read CATALOGS
>WRITTEN IN ALL UPPER CASE, none of the key developers seem very
>interested ...
>
>
Why would this be required?  If an individual developer wants to have a
db with folding to uppercase, that developer can easily quote any
references to system tables himself.  I don't see why this is a
requirement.  "Case sensitivity" is "Case sensitivity" is "Case
sensitivity" ...

Furthermore, if you are going to go there, then you have to deal with
the inevitable question of what we do when people request the same
behavior for other tables.  See above.  However, none of this is
required to be standards-compatibile, unless I am missing something else....

Best Wishes,
Chris Travers
Metatron Technology Consulting

Re: PostgreSQL Gotchas

From
Tom Lane
Date:
Chris Travers <chris@travelamericas.com> writes:
> Tom Lane wrote:
>> Since the end reward for all this work would be having to read CATALOGS
>> WRITTEN IN ALL UPPER CASE, none of the key developers seem very
>> interested ...
>>
> Why would this be required?

If you write, say,

    select max(relpages) from pg_class;

and the lexer thinks that it should fold unquoted identifiers to upper
case, then the catalog entries defining these names had better read
PG_CLASS, RELPAGES, and MAX, not the lower-case names they contain
today.  So this wouldn't be something you could flip on-the-fly --- at
the latest, an installation would have to commit to upper or lower case
at initdb time, because the initial contents of all the system catalogs
would need to match the choice.

Please read the previous discussions on the topic, if you want to
pontificate about it.

            regards, tom lane

Re: PostgreSQL Gotchas

From
Martijn van Oosterhout
Date:
On Thu, Oct 13, 2005 at 12:21:14PM -0700, Chris Travers wrote:
> >Since the end reward for all this work would be having to read CATALOGS
> >WRITTEN IN ALL UPPER CASE, none of the key developers seem very
> >interested ...
> >
> >
> Why would this be required?  If an individual developer wants to have a
> db with folding to uppercase, that developer can easily quote any
> references to system tables himself.  I don't see why this is a
> requirement.  "Case sensitivity" is "Case sensitivity" is "Case
> sensitivity" ...

I think you're slightly missing the point. Most people won't need to
change any queries because unquoted identifiers are case-folded anyway.
Which way doesn't matter, as long as it's consistant. No the main
objection would be this (see below). Does anyone really want this? I'll
take lowercase anyday... (Note, it's not exact, some things aren't SQL
identifiers so would be left alone. But idea is the same.)

TEST=# \l
        List of databases
   Name    |  Owner   | Encoding
-----------+----------+----------
 TEMPLATE0 | POSTGRES | UNICODE
 TEMPLATE1 | POSTGRES | UNICODE
<snip>
(6 rows)

TEST=# \df+ bit
                                                        List of functions
 Result data type |   Schema   | Name |  Argument data types  |  Owner   | Language | Source code |
Description

------------------+------------+------+-----------------------+----------+----------+-------------+------------------------------
 BIT              | PG_CATALOG | BIT  | BIGINT                | POSTGRES | INTERNAL | BITFROMINT8 | INT8 To BITSTRING
 BIT              | PG_CATALOG | BIT  | BIT, INTEGER, BOOLEAN | POSTGRES | INTERNAL | BIT         | Adjust BIT() to
typomodlength 
 BIT              | PG_CATALOG | BIT  | INTEGER               | POSTGRES | INTERNAL | BITFROMINT4 | INT4 To BITSTRING

(3 rows)

TEST=# \d website
     Table "PUBLIC.WEBSITE"
  Column   |  Type   | Modifiers
-----------+---------+-----------
 FILENAME  | TEXT    |
 TITLE     | TEXT    |
 PARENT    | TEXT    |
 HITS      | INTEGER |
 HIDDEN    | BOOLEAN |
 HITSMONTH | INTEGER |
Indexes:
    "WEBSITE_FILENAME" UNIQUE, BTREE (FILENAME)


--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: PostgreSQL Gotchas

From
Chris Travers
Date:
Ok. here are some indepth thoughts after reviewing as many prior threads
as I could find on the archives.

Tom Lane wrote:

>Chris Travers <chris@travelamericas.com> writes:
>
>
>>Tom Lane wrote:
>>
>>
>>>Since the end reward for all this work would be having to read CATALOGS
>>>WRITTEN IN ALL UPPER CASE, none of the key developers seem very
>>>interested ...
>>>
>>>
>>>
>>Why would this be required?
>>
>>
>
>If you write, say,
>
>    select max(relpages) from pg_class;
>
>
I have gone back and read the previous discussion and I still do not
understand what the real impediment is (to at least putting it on the
TODO list, at least).  I see a lot of reasons that don't make any sense
to me, and have come across one substantial obstacle that has not been
mentioned yet to my knowledge.

I understood this exmaple from your previous comment.  But aside from
the aggregate issue, I fail to see why fixing it is a requirement.
Perhaps I am being unclear in my thoughts and we are talking past eachother.

If I write that statement, and it gets back an error saying that no
table is named PG_CLASS, that is my fault as an individual developer.
After all I set the configuration option to fold to uppercase, right?

The relevant question is very simple.  Where does one draw the line
between between the responsibility of the programmer and the
responsibility of the DBA?  Personally I think it is important to offer
modes that offer as much standards-compliance as possible.

 From the previous discussion, it was mentioned that the backend treats
identifiers as quoted.  It seems to me that this should make it *easier*
rather than *harder* to impliment because this is largely a change
regarding what a given SQL statement means.  I.e. most of the work I had
foreseen has already been done.  (My proposal would have been to have
the backend treat identifiers used internally as already double-quoted.)

If you are folding to lower case, your example, select max(relpages)
from pg_class, is the same as SELECT MAX("relpages") FROM "pg_class";

If you are folding to upper case, your example is the same as SELECT
MAX("RELPAGES") FROM "PG_CLASS";  Of course, we don't expect this to
give us any results today.  In essence, I don't see why we would expect
this to return any results.  If you issue an incorrect SQL statement,
whose fault is that?

Now, the one place where this might create a problem is in the
information_schema.  The problem here is not the same as any issue I
have seen discussed before, but the fact that case folding could create
non-standard behavior here absent other changes.  The only option I see
here is to create a second INFORMATION_SCHEMA with upper case view names.

>and the lexer thinks that it should fold unquoted identifiers to upper
>case, then the catalog entries defining these names had better read
>PG_CLASS, RELPAGES, and MAX, not the lower-case names they contain
>today.
>
Ok. so the only problem out of these three that I see is with MAX.
PG_CLASS and RELPAGES are the responsibility of the developer, IMHO.
Also, with functions and aggregates, this is not the problem that it is
with tables (as the name isn't usually sent back to the client), so I
don't know how much logic it would be to differentiate between
table/column names which might need to be folded  and
function/aggregates which could continue the way that they are currently
done at least for now.  It might also be possible to create duplicate
entries to the catelogs for builtin functions/aggregates in the catalogs
so that this case folding is not causing the same type of problem.
However, for builtin functions/aggregates, I am not sure if this is
likely to have any significant performance hit.

>  So this wouldn't be something you could flip on-the-fly --- at
>the latest, an installation would have to commit to upper or lower case
>at initdb time, because the initial contents of all the system catalogs
>would need to match the choice.
>
>
Ok, so I see the objection basically being that changing the semantics
of the SQL statement would need to be done in a way that prevents
user-issued queries from having to know which way this is done.  There
is no way around this objection because it is inconsistant with the very
idea of semantic changes to the SQL parser.  Yet we have done this in
the past in areas I have previously mentioned.  So the question really
is what is really required to make this work in a semantically clean
way.  IMO, the following requirement is acceptable:

SELECT MAX("relpages") FROM "pg_class"

But the following is not:

SELECT "max"("relpages") FROM "pg_class"

However, I am sure that there will be people who don't see that as OK.
So, I would suggest that if they neeed to avoid quoting relpages and
pg_class, then they can create a PG_CATALOG schema and a PG_CLASS view.
Maybe this could be a pgfoundry project even.  But I don't see it as a
requirement of the core team.  The bigger issue is with "MAX" v. "max"
and in "INFORMATION_SCHEMA" v. "information_schema."   This might
require duplicate entries in the system catalogs.

>Please read the previous discussions on the topic, if you want to
>pontificate about it.
>
>
Is there still any specific reason why this does not belong on the TODO
list?

I am not arguing that this should be a priority in development.  I am
however arguing that since the behavior is non-standard, it might be
worth acknowledging this and at least suggesting that it should be fixed
at some indefinite point in the future....  If this was a requirement
for me, I would hire someone to make the changes and submit a patch...
It is just an attempt to ensure that it is on the roadmap at this time.

Best Wishes,
Chris Travers
Metatron Technology Consulting

Re: PostgreSQL Gotchas

From
Chris Travers
Date:
Martijn van Oosterhout wrote:

>On Thu, Oct 13, 2005 at 12:21:14PM -0700, Chris Travers wrote:
>
>
>>>Since the end reward for all this work would be having to read CATALOGS
>>>WRITTEN IN ALL UPPER CASE, none of the key developers seem very
>>>interested ...
>>>
>>>
>>>
>>>
>
>I think you're slightly missing the point. Most people won't need to
>change any queries because unquoted identifiers are case-folded anyway.
>
>
Right.  That is part of the case for at least putting this on the TODO list.

>Which way doesn't matter, as long as it's consistant. No the main
>objection would be this (see below). Does anyone really want this? I'll
>take lowercase anyday... (Note, it's not exact, some things aren't SQL
>identifiers so would be left alone. But idea is the same.)
>
>
Ok.  If someone wanted it, we would see a patch already.  I understand
this.  Probably making AS optional would be a bigger priority because
such a patch exists already :-)

However.....

This is not only non-standard behavior but can also cause headaches in
certain porting circumstances. Adding it to the TODO list would at least
invite others to take this up *if* they (or their customers) required
it.  The work I see required would minimally be:

1)  Add a GUC variable to allow for changing the case folding behavior.
2)  Add duplicate entries for builtin aggregates and functions in the
catalogs but with upper-case names
3)  Adding a duplicate INFORMATION_SCHEMA with views off information_schema

Additionally if people depend on case folding for catalogs, you would
want to create a PG_CATALOG schema with views off pg_catalog.  However,
this seems to be peripheral and not central to the issues of managing
the data given that identifiers are treated as already double-quoted.

I *could* see a case where a company that does portability work (like
EnterpriseDB, which specializes in Oracle portability) might find such a
feature as important.  Adding the item to the TODO list would invite
patches where they are needed.

Best Wishes,
Chris Travers
Metatron Technology Consulting

Re: PostgreSQL Gotchas

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> If you write, say,
>
>     select max(relpages) from pg_class;
>
> and the lexer thinks that it should fold unquoted identifiers to upper
> case, then the catalog entries defining these names had better read
> PG_CLASS, RELPAGES, and MAX, not the lower-case names they contain
> today.

Well the case of unquoted identifiers could be finessed by having it match
RELPAGES first and fail over to relpages second. It could even be made to
match RelPages and whatever if there isn't any ambiguity.

I think the problem case arises when you have code that has a quoted
"relpages". In that case the code is just not going to work unless the column
really is named "relpages" not "RELPAGES".

So to be completely backward compatible you need "relpages" as well as an
unquoted relpages to work. And to be spec compliant you need "RELPAGES" to
work too. That makes things a bit sticky.

--
greg

Re: PostgreSQL Gotchas

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> and the lexer thinks that it should fold unquoted identifiers to upper
>> case, then the catalog entries defining these names had better read
>> PG_CLASS, RELPAGES, and MAX, not the lower-case names they contain
>> today.

> Well the case of unquoted identifiers could be finessed by having it match
> RELPAGES first and fail over to relpages second. It could even be made to
> match RelPages and whatever if there isn't any ambiguity.

[ shrug... ]  Sure, you could invent some rule that might sort of work
most of the time.  But then you've abandoned the sole rationale for the
entire project, which is to *adhere to the standard*.  Any kind of funny
business with the case folding rules will make things worse not better
from that standpoint.

            regards, tom lane

Re: PostgreSQL Gotchas

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Greg Stark <gsstark@mit.edu> writes:
> > Tom Lane <tgl@sss.pgh.pa.us> writes:
> >> and the lexer thinks that it should fold unquoted identifiers to upper
> >> case, then the catalog entries defining these names had better read
> >> PG_CLASS, RELPAGES, and MAX, not the lower-case names they contain
> >> today.
>
> > Well the case of unquoted identifiers could be finessed by having it match
> > RELPAGES first and fail over to relpages second. It could even be made to
> > match RelPages and whatever if there isn't any ambiguity.
>
> [ shrug... ]  Sure, you could invent some rule that might sort of work
> most of the time.  But then you've abandoned the sole rationale for the
> entire project, which is to *adhere to the standard*.  Any kind of funny
> business with the case folding rules will make things worse not better
> from that standpoint.

Well sure, it would only be worthwhile if you could come up with rules that
complied with the standard 100% of the time that the standard specifies
behaviour. But if you could do that and satisfy 99% of the backwards
compatibility issues including any catalog related issues then it seems like
it would be worthwhile.

But on further thought, if you want to have pg_dump et al output lowercase
names (which I certainly prefer) then I think what you would have to do is
have a bit that travels with every identifier that indicates whether it was
quoted or not.

So two identifiers match if either is an unquoted identifier and they match
case insensitively. Or if both are quoted and they match case sensitively.

Actually I think you can get closer to the standard if you interpret "case
insensitively" above to mean they match after downcasing the unquoted
identifier(s) or upcasing the unquoted identifier(s) but not if one of them is
unquoted and the other is quoted and mixed case.

There's still some funny business: if you create both "foo" and "FOO" and then
refer to one of them with an unquoted foo. Since it's an unquoted identifier
it matches both. If you give preference to the "FOO" then that would follow
the standard.

Also if you try to create both a "foo" and an unquoted foo then this method
would say that's a conflict whereas the standard would say it was acceptable.

I think this is an improvement on what's there now though. It lets Postgres
have case-preserving case-insensitive unquoted identifiers which people have
asked about multiple times and it solves most of the inter-database
compatibility problems with case sensitivity.

--
greg

Re: PostgreSQL Gotchas

From
Jan Wieck
Date:
On 10/13/2005 2:40 PM, Tom Lane wrote:
> Chris Travers <chris@travelamericas.com> writes:
>> So basically, the problem is that any fix for case folding would touch a
>> fair bit of code and possibly cause other problems.  However, I haven't
>> seen anyone worry about performance issues in such a fix, just that it
>> might be a fair bit of work.
>
> More that it's likely to create serious forwards, backwards, and
> sideways compatibility problems :-(.  The work involved is mostly
> in figuring out how to deal with that.
>
> Since the end reward for all this work would be having to read CATALOGS
> WRITTEN IN ALL UPPER CASE, none of the key developers seem very
> interested ...

Just an idea ...

if the release that offers UPPER case folding as an option also makes
sure that all internal and tool references to system catalog objects are
properly quoted (as they IMHO should be anyway), then it would be
reduced to a third party tool/application problem accessing the system
catalog in a database that has this new config option selected.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

Re: PostgreSQL Gotchas

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> So two identifiers match if either is an unquoted identifier and they match
> case insensitively. Or if both are quoted and they match case sensitively.

Which part of "adhere to the standard" are you failing to get?  The
standard is 100% clear about what it means by case folding, and the
above is not it.

            regards, tom lane

Re: PostgreSQL Gotchas

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Greg Stark <gsstark@mit.edu> writes:
> > So two identifiers match if either is an unquoted identifier and they match
> > case insensitively. Or if both are quoted and they match case sensitively.
>
> Which part of "adhere to the standard" are you failing to get?  The
> standard is 100% clear about what it means by case folding, and the
> above is not it.

I guess I had a few unstated assumptions:

1) That we don't care what algorithm the standard specifies as long as we're
   functionally equivalent.

2) That we don't care about producing errors in all the cases where following
   the standard would produce an error. Ie, that as long as we accept any code
   that the standard says should work we don't care if we sometimes accept
   code that the standard says should produce errors like "column not found".

Also my position somewhat evolved between the beginning of the mail and the
end. Specifically I think we can aim to be compatible with specific common
standards-compliant idioms even if we don't follow the letter of the standard.

For example I think there's a ton of code out there that is nigh impossible to
port to Postgres only because it defines tables with columns like "FOO" and
then refers to them as foo or FOO. And vice versa, code that defines columns
as foo or FOO and then refers to them in queries as "FOO". There's no reason
to say it's useless to support that and make lots of code work fine under
Postgres even if that doesn't mean we fit the standard 100%.

--
greg

Re: PostgreSQL Gotchas

From
Martijn van Oosterhout
Date:
On Fri, Oct 14, 2005 at 10:34:53AM -0400, Jan Wieck wrote:
> On 10/13/2005 2:40 PM, Tom Lane wrote:
> >Since the end reward for all this work would be having to read CATALOGS
> >WRITTEN IN ALL UPPER CASE, none of the key developers seem very
> >interested ...
>
> Just an idea ...
>
> if the release that offers UPPER case folding as an option also makes
> sure that all internal and tool references to system catalog objects are
> properly quoted (as they IMHO should be anyway), then it would be
> reduced to a third party tool/application problem accessing the system
> catalog in a database that has this new config option selected.

The main problem I see is that the most people who are complaining are
people who create their tables as "MixedCaseQuoted" and then refer to
them as Unquoted. Even following the SQL standard precisely will not
fix this, given that behaviour is non-standard also.

Seems to me we'd be better off creating an option
"lowercase_quoted_anyway" which solves everything, at the expense of
being even less compliant.

Once you've loaded the schema, you pg_dump it again and then you can
reload it without the option. All fixed...
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: PostgreSQL Gotchas

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> The main problem I see is that the most people who are complaining are
> people who create their tables as "MixedCaseQuoted" and then refer to
> them as Unquoted. Even following the SQL standard precisely will not
> fix this, given that behaviour is non-standard also.

> Seems to me we'd be better off creating an option
> "lowercase_quoted_anyway" which solves everything, at the expense of
> being even less compliant.

Hmm.  That solves the problem for people who are used to MySQL or Access
or whatever it is that behaves like that, but that problem hasn't really
been on the radar screen that I've noticed.  The push-back I've been
paying attention to is from people who say "this application expects the
standard-mandated behavior, why aren't you following the standard?"

Although ... it's true that there seem to be very few apps relying on
case sensitivity per se, ie, expecting "Foo" and "foo" to be different.
The complaints that I can remember were about programs that expected
"FOO" and FOO (not quoted) to be the same.  So always-smash-to-lower-
case might indeed solve most of the real-world problem for the Oracle
camp as well.

Comments anyone?

            regards, tom lane

Re: PostgreSQL Gotchas

From
Peter Eisentraut
Date:
Tom Lane wrote:
> Although ... it's true that there seem to be very few apps relying on
> case sensitivity per se, ie, expecting "Foo" and "foo" to be
> different. The complaints that I can remember were about programs
> that expected "FOO" and FOO (not quoted) to be the same.  So
> always-smash-to-lower- case might indeed solve most of the real-world
> problem for the Oracle camp as well.

That seems to be the right compromise.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: PostgreSQL Gotchas

From
Chris Travers
Date:
Greg Stark wrote:

>Tom Lane <tgl@sss.pgh.pa.us> writes:
>
>
>
>>If you write, say,
>>
>>    select max(relpages) from pg_class;
>>
>>and the lexer thinks that it should fold unquoted identifiers to upper
>>case, then the catalog entries defining these names had better read
>>PG_CLASS, RELPAGES, and MAX, not the lower-case names they contain
>>today.
>>
>>
>
>Well the case of unquoted identifiers could be finessed by having it match
>RELPAGES first and fail over to relpages second. It could even be made to
>match RelPages and whatever if there isn't any ambiguity.
>
>
Personally I don't see why relpages and pg_class are problems.  After
all, the backend treats identifiers as already quoted internally.  If
you change what the language means, you are going to get backward
compatibility issues.  End of story.  But we have done this sort of
thing before so it is not the end of the world as long as we provide a
way to get back.  Indeed treating identifiers as already quoted should
prevent a huge number of bugs that might otherwise be problematic.  But
because there are no guarantees that the system catalogs won't change
with each new major release, this isn't a problem.  I.e. it isn't the
responsibility of the core development team (IMO) to make this work.
The information_schema is a bigger problem, but I don't see why you
can't have two information schemas with cases to match each behavior.
Sure it means a little more disk space, but I don't really see why it
would be a major issue.

You do have an issue with MAX v. max which could get nasty.  I have
proposed solving this one with a duplicate entry in the system catalogs
with upper names for aggregates and builtin functions.

Basically, if what I have read is accurate, this seems to be the least
amount of work to make the following statement work:
SELECT max("relpages") FROM "pg_class";

What we don't want to see is the requirement for a statement even less
standard, like:
SELECT "max"("relpage") FROM "pg_class";

Note that if you allow system catalogs to be created in upper case, and
the backend treats identifiers as already double-quoted, then you have
to go through and case fold these again, which is problematic, impacts
performance, and adds the potential for many bugs.

>I think the problem case arises when you have code that has a quoted
>"relpages". In that case the code is just not going to work unless the column
>really is named "relpages" not "RELPAGES".
>
>So to be completely backward compatible you need "relpages" as well as an
>unquoted relpages to work. And to be spec compliant you need "RELPAGES" to
>work too. That makes things a bit sticky.
>
>
Is this really a requirement given that the system catalogs are not
guaranteed to be backwards compatible and have been changed in the past?

Best Wishes,
Chris Travers

Re: PostgreSQL Gotchas

From
Lincoln Yeoh
Date:
At 11:43 AM 10/15/2005 -0400, Tom Lane wrote:

>Martijn van Oosterhout <kleptog@svana.org> writes:
> > Seems to me we'd be better off creating an option
> > "lowercase_quoted_anyway" which solves everything, at the expense of
> > being even less compliant.

I think that'll be a good option to have.

>paying attention to is from people who say "this application expects the
>standard-mandated behavior, why aren't you following the standard?"

Because sometimes too many people don't follow the official standard. And
sometimes the standard sucks ;).

It have been normal to have everything in upper case decades ago, but
nowadays the new norm appears to be everything in lower case (ever seen
what kids nowadays type/txt out?).

Not saying that's right, but I find everything in lower case easier to read
than everything in upper case.

>Although ... it's true that there seem to be very few apps relying on
>case sensitivity per se, ie, expecting "Foo" and "foo" to be different.
>The complaints that I can remember were about programs that expected
>"FOO" and FOO (not quoted) to be the same.  So always-smash-to-lower-
>case might indeed solve most of the real-world problem for the Oracle
>camp as well.
>
>Comments anyone?

If it's an option and doesn't cost very much I'm for it.

It must be a very small subset who rely on "Foo" and "foo" to be different
AND also rely on "FOO" and FOO to be the same. Wonder what's the term for
the even smaller subset who intentionally would want that.

Link.


Re: PostgreSQL Gotchas

From
Martijn van Oosterhout
Date:
On Sun, Oct 16, 2005 at 10:00:25PM +0800, Lincoln Yeoh wrote:
> >Martijn van Oosterhout <kleptog@svana.org> writes:
> >> Seems to me we'd be better off creating an option
> >> "lowercase_quoted_anyway" which solves everything, at the expense of
> >> being even less compliant.
>
> I think that'll be a good option to have.

Actually, perhaps an even more restricted version would be better.
Lowercase quoted identifiers only if they are all uppercase. So then:

Foo == FOO = foo == "foo" == "FOO" != "Foo"

assuming it can be implemented with some mediocum of efficiency.

> It must be a very small subset who rely on "Foo" and "foo" to be different
> AND also rely on "FOO" and FOO to be the same. Wonder what's the term for
> the even smaller subset who intentionally would want that.

The strictly standards compliant group? But then, they'd probably want
us to remove LIMIT/OFFSET so I'm not sure we need to worry about that.

Note to implementor: In 'SELECT 1 as "Title"', the quoted string should
not be lowercased, even if you are lowercasing everything else...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: PostgreSQL Gotchas

From
Chris Travers
Date:
For the record, I am trying to flesh out my thoughts on this issue.  I
am not asking Tom to do this BTW :-) because I am sure he has more
pressing contributions to make.  However, because it can be an issue for
those migrating from, say, Oracle, it would be good to have a discussion
on what would be minimally required to make this happen.

>
>Well sure, it would only be worthwhile if you could come up with rules that
>complied with the standard 100% of the time that the standard specifies
>behaviour.
>
I doubt that the standard says anything about system catalogs.  If our
system catalogs are in lower case, but case is normally folded to upper
case, then this can create some problems, but if the backend is already
treating identifiers as already quoted when it looks in the catalogs, I
see this as a non-issue.  The system catalogs are not guaranteed to be
backwards compatible anyway.  For those cases of aggregates and
functions where we get problems like not finding the MAX aggregate, we
can solve that by having two MAX aggregates which are identical, but one
is in upper case and the other in lower case.

The same would be required

> But if you could do that and satisfy 99% of the backwards
>compatibility issues including any catalog related issues then it seems like
>it would be worthwhile.
>
>
What you are asking here is functionally impossible if you expect user
queries to operate as is against the catalogs when the semantics of the
language have changed.  When you have semantic issues regarding
standards-compliance, you are going to have to choose between
standards-compliance and backwards-compatibility.  Hence I was
suggesting that an option be provided to the user to choose between
these options.

Scratches head....  I guess if you really wanted to allow the queries to
really go without modifications, I guess you could create a second
(duplicate) PG_CATALOG schema with views like PG_CLASS....  This seems
to me to be iceing on the cake, however, and not part of the minimal
requirements.  Gotta love VIEWs :-)

I guess that would solve every case of backwards-compatibility that I
can find where the problem is not bad application design (of the sort
that currently causes problems).

>But on further thought, if you want to have pg_dump et al output lowercase
>names (which I certainly prefer) then I think what you would have to do is
>have a bit that travels with every identifier that indicates whether it was
>quoted or not.
>
>
If this is what you want, then leave folding to lower case.  Simple :-)

>So two identifiers match if either is an unquoted identifier and they match
>case insensitively. Or if both are quoted and they match case sensitively.
>
>
Sorry, not standards-compliant, as Tom pointed out....

Here are the basic assumptions I would make in such a solution:

1)  We want the option of choosing standards-compliance over
backwards-compatibility.
2)  When in doubt, throw and error and abort the transaction.
3)  NEVER EVER guess as to what the user really meant when the exact
instructions are ambiguous.

If we didn't make these assumptions, we would be using MySQL :-)

Best Wishes,
Chris Travers
Metatron Technology COnsulting

Attachment

Re: PostgreSQL Gotchas

From
Chris Travers
Date:
Jan Wieck wrote:

> On 10/13/2005 2:40 PM, Tom Lane wrote:
>
>> Chris Travers <chris@travelamericas.com> writes:
>>
>>> So basically, the problem is that any fix for case folding would
>>> touch a fair bit of code and possibly cause other problems.
>>> However, I haven't seen anyone worry about performance issues in
>>> such a fix, just that it might be a fair bit of work.
>>
>>
>> More that it's likely to create serious forwards, backwards, and
>> sideways compatibility problems :-(.  The work involved is mostly
>> in figuring out how to deal with that.
>>
>> Since the end reward for all this work would be having to read CATALOGS
>> WRITTEN IN ALL UPPER CASE, none of the key developers seem very
>> interested ...
>
>
> Just an idea ...
>
> if the release that offers UPPER case folding as an option also makes
> sure that all internal and tool references to system catalog objects
> are properly quoted (as they IMHO should be anyway), then it would be
> reduced to a third party tool/application problem accessing the system
> catalog in a database that has this new config option selected.

This is largely what I have been suggesting.  However, you still have
two (small) problems that could be solved fairly easily I think.  Take
Tom's example:

SELECT MAX(relpages) FROM pg_class;

Now, I am perfectly happy to require relpages and pg_class to be quoted
(if you don't want to, there are always views....), however, MAX is also
an identifier, which if I understand Tom correctly is also folded to
lower case.  If you fold MAX to upper case, you may not find the
aggregate.  I don't see an easy answer aside from having separate
entries for "max" and "MAX" which are functionally the same.

Secondly, you have one other area where you could get non-standard
behavior if you require the user to quote lower-case system
identifiers.  This is the information_schema.  Solving this would simply
require a separate INFORMATION_SCHEMA, maybe composed as views off the
views and tables in information_schema.

Finally, if this was such a pressing issue, I am sure it would already
be on the TODO list.  My suggestion is that we put it on the TODO list
simply as a way of ensuring that when people run into this problem, we
are showing that we are open to contributions in this area.  I am sure
that there are firms out there which will run into it eventually because
Oracle is standards-compliant in this area and there are many
applications that people are trying to port over which could run into
these sort of issues.  This is open source and I see no problem with
pushing the work in an area like this back to the contributor who is
helping his/her customer.  After all, this helps companies which do
consulting make money...

If someone can make a case for preserving case in identifiers (which I
doubt can be made given the fact that not all identifiers are created
equal, so is MAX() the same as Max() the same as mAx()?), then we can
put that in the TODO list as well.  The TODO list is already long, so I
don't see any reason not to put an entry there.

Best Wishes,
Chris Travers
Metatron Technology Consulting

Attachment

Re: PostgreSQL Gotchas

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> Actually, perhaps an even more restricted version would be better.
> Lowercase quoted identifiers only if they are all uppercase. So then:

No, I think the original proposal was better.  This one doesn't fix
things for the lusers who think "MixedCase" and unquoted MixedCase
should be the same.

> Note to implementor: In 'SELECT 1 as "Title"', the quoted string should
> not be lowercased, even if you are lowercasing everything else...

You don't get to have that, I think, because the lexer is not context
aware.  It's not clear to me why it's a good idea anyway.

            regards, tom lane

Re: PostgreSQL Gotchas

From
Martijn van Oosterhout
Date:
On Sat, Oct 15, 2005 at 05:46:03PM -0700, Chris Travers wrote:
> This is largely what I have been suggesting.  However, you still have
> two (small) problems that could be solved fairly easily I think.  Take
> Tom's example:
>
> SELECT MAX(relpages) FROM pg_class;
>
> Now, I am perfectly happy to require relpages and pg_class to be quoted
> (if you don't want to, there are always views....), however, MAX is also
> an identifier, which if I understand Tom correctly is also folded to
> lower case.  If you fold MAX to upper case, you may not find the
> aggregate.  I don't see an easy answer aside from having separate
> entries for "max" and "MAX" which are functionally the same.

I think you're certainly waving away a lot of potential issues with saying
that "as long as system identifiers are quoted we're fine". Tom's
example is just the tip of the iceberg.

- consider functions like nextval/currval/substring. Some of these are
created and used internally too, should they always be quoted? Would
these functions all appear twice in the functions list?
- consider oid,xmin,xmax,etc are they system identifiers? You can't
provide two copies of those, since each column can only appear once in
the catalogs.
- consider pg_dump, psql and other such tools with intimate knowledge
of the catalogs, they would all need to updated.
- consider the number of third party apps that uses these "system
identifiers". Updating all of those is a massive task, for what gain?

There is some sympathy around here for the view that in this case the
standard is simply wrong. As said on this thread already, uppercase
only was appropriate for twenty years ago but not now. Older
programming languages tended to have their keywords in uppercase, yet
no modern language would consider such an act.

So, I think in the long run there is going to be some sympathy for
helping people with "QUOTEDUPPERCASE" identifiers to port their stuff
but I don't think PostgreSQL itself will ever default to folding up.
Hence the discussion about folding quoted identifiers also, which
solves the issue without masses of changes everywhere.

Curiously, MySQL does as we do, although they don't accept standard
quoting at all (using backquotes). Do people complain about them too?

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: PostgreSQL Gotchas

From
Martijn van Oosterhout
Date:
On Sun, Oct 16, 2005 at 12:40:25PM -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > Actually, perhaps an even more restricted version would be better.
> > Lowercase quoted identifiers only if they are all uppercase. So then:
>
> No, I think the original proposal was better.  This one doesn't fix
> things for the lusers who think "MixedCase" and unquoted MixedCase
> should be the same.

Okay, how do we refer to the "RI_FKey_cascade_del" function then? Or
trigger "RI_ConstraintTrigger_2204908". Using $$ quoting? Invent
another quoting style?

> > Note to implementor: In 'SELECT 1 as "Title"', the quoted string should
> > not be lowercased, even if you are lowercasing everything else...
>
> You don't get to have that, I think, because the lexer is not context
> aware.  It's not clear to me why it's a good idea anyway.

Depends at what level you do the conversion. But lowercasing here means
that people producing HTML tables using psql and providing multi word
titles are going to be mightily disappointed if their titles are
lowercased on them.

Then again, if it's an optional switch, maybe it doesn't matter...
Lowercasing only full uppercased identifiers is more something you
might have turned on all the time.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: PostgreSQL Gotchas

From
Matthew
Date:
Tom Lane wrote:
> Although ... it's true that there seem to be very few apps relying on
> case sensitivity per se, ie, expecting "Foo" and "foo" to be different.
> The complaints that I can remember were about programs that expected
> "FOO" and FOO (not quoted) to be the same.  So always-smash-to-lower-
> case might indeed solve most of the real-world problem for the Oracle
> camp as well.

Is it worth having a GUC variable that enables / disable this?   I hate
to add GUCs but this seems like a fairly significant change in expected
behavior.  Not sure it's a good idea, but I thought I should ask the
question anyway.

Matt


Re: PostgreSQL Gotchas

From
Peter Eisentraut
Date:
Tom Lane wrote:
> > Note to implementor: In 'SELECT 1 as "Title"', the quoted string
> > should not be lowercased, even if you are lowercasing everything
> > else...
>
> You don't get to have that, I think, because the lexer is not context
> aware.  It's not clear to me why it's a good idea anyway.

This is used to create pretty column headers in table output.  We'd
surely get criticism if this no longer worked.

I guess you could get around that if you leave the case-folding in the
lexer as is but instead make the nameeq function case insensitive.
This is sort of the way the Windows file system works (I think).  (And
we know that *never* leads to problems... :-/)

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: PostgreSQL Gotchas

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> Okay, how do we refer to the "RI_FKey_cascade_del" function then? Or
> trigger "RI_ConstraintTrigger_2204908". Using $$ quoting? Invent
> another quoting style?

Yeah, I was wondering about those myself.  Probably we could just change
their names to something that could still be spelled.  Or do nothing ---
I can't see any obvious reason why the average user would need to
reference those functions/triggers anyway, and if he did, he could turn
off the force-lowercase flag.

> Depends at what level you do the conversion. But lowercasing here means
> that people producing HTML tables using psql and providing multi word
> titles are going to be mightily disappointed if their titles are
> lowercased on them.

If they care about case, they wouldn't be turning this flag on anyway.

What we are looking for here is a simple, understandable hack that
solves 95% of the problem for legacy applications.  Not something that
we'd encourage people to have on at all times.

            regards, tom lane

Re: PostgreSQL Gotchas

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> I guess you could get around that if you leave the case-folding in the
> lexer as is but instead make the nameeq function case insensitive.

If you do that then you have to nail down the behavior at initdb time
(else flipping the switch will corrupt all your system catalog indexes).

To be selective about which occurrences of identifiers get smashed to
lower, I think we would have to postpone the smashing into analyze.c.
This would be a lot messier than doing it once in the lexer --- but on
the other hand, it would avoid violating the rule stated in gram.y,
namely that lexer and grammar behavior must not vary depending on GUC
variable settings.  On the third hand, we could get around that
restriction by not allowing the GUC variable to change within a session,
which might be a good idea anyway to avoid confusing applications.

            regards, tom lane

Re: PostgreSQL Gotchas

From
Chris Travers
Date:
Matthew wrote:

> Tom Lane wrote:
>
>> Although ... it's true that there seem to be very few apps relying on
>> case sensitivity per se, ie, expecting "Foo" and "foo" to be different.
>> The complaints that I can remember were about programs that expected
>> "FOO" and FOO (not quoted) to be the same.  So always-smash-to-lower-
>> case might indeed solve most of the real-world problem for the Oracle
>> camp as well.
>
>
> Is it worth having a GUC variable that enables / disable this?   I
> hate to add GUCs but this seems like a fairly significant change in
> expected behavior.  Not sure it's a good idea, but I thought I should
> ask the question anyway.

It is a good idea for a very simple reason.  When given a choice between
standards-compliance and backwards-compatibility with prior versions, it
is important to give the option to make this change.  The only other
option is to make it a initdb time decision.

Autocommit is a bad example.  A better option is the GUC variable that
allows you to go back to the way we used to do things and allow NULL =
NULL to return TRUE instead of NULL.  In both these cases, the
difference is a semantic change in what a given SQL statement means.

Best Wishes,
Chris Travers
Metatron Technology Consulting

Re: PostgreSQL Gotchas

From
Tom Lane
Date:
Matthew <matthew@zeut.net> writes:
> Is it worth having a GUC variable that enables / disable this?

That's a given, I think.  We're certainly not going to make smash-to-
lower-case the only available behavior.

One issue we might want to reflect on is how easy it should be to change
the variable's setting.  Our bad experience with autocommit leaves me a
bit wary of fundamental behavioral changes that can be flicked on and
off at whim.  However, we have pretty much the same problem staring us
in the face for backslash-in-literal-strings behavior, so maybe we can
find an answer we like for both issues at the same time.

            regards, tom lane

Re: PostgreSQL Gotchas

From
Chris Travers
Date:
Martijn van Oosterhout wrote:

>On Sat, Oct 15, 2005 at 05:46:03PM -0700, Chris Travers wrote:
>
>
>>This is largely what I have been suggesting.  However, you still have
>>two (small) problems that could be solved fairly easily I think.  Take
>>Tom's example:
>>
>>SELECT MAX(relpages) FROM pg_class;
>>
>>Now, I am perfectly happy to require relpages and pg_class to be quoted
>>(if you don't want to, there are always views....), however, MAX is also
>>an identifier, which if I understand Tom correctly is also folded to
>>lower case.  If you fold MAX to upper case, you may not find the
>>aggregate.  I don't see an easy answer aside from having separate
>>entries for "max" and "MAX" which are functionally the same.
>>
>>
>
>I think you're certainly waving away a lot of potential issues with saying
>that "as long as system identifiers are quoted we're fine". Tom's
>example is just the tip of the iceberg.
>
>- consider functions like nextval/currval/substring. Some of these are
>created and used internally too, should they always be quoted? Would
>these functions all appear twice in the functions list?
>
>
I think I mentioned that the case issue is of particular importance for
functions and aggregates because we don't want to type:

SELECT "max"("relpages") FROM "pg_class" and this is definitely not
standard anyway.

I think any functions that are shipped with PostgreSQL should come in
both an upper case and lower case form.  I am willing to commit to
building these in the system catalogs.  Same with basic types (like
"INT" and "int") for example.  Or another option would just be to lower
case all reserved SQL words though this would lead to a wide range of
possible issues.

>- consider oid,xmin,xmax,etc are they system identifiers? You can't
>provide two copies of those, since each column can only appear once in
>the catalogs.
>
>
No you can't but if you need to refer to oid, xmin, and xmax from
outside the backend, you can always quote them.   Besides if you really
nead that level of compatibility, consider the following....

CREATE VIEW "PG_CLASS" AS
SELECT "oid" AS "OID", "relname" as "RELNAME", .....
FROM "pg_class";
and then create associated rules.  So yes, we can have two entries in
the system catalogs if we need to because views are wonderful things
:-)  The only real exception is with functions and agregates, and these
are bigger issues.

BTW, not to split hairs here too much, but for those customers that need
this now, I suppose one could always use this same trick.

CREATE SCHEMA "PUBLIC";
CREATE VIEW "PUBLIC"."TABLE1" AS
SELECT "field1" AS "FIELD1"....
FROM "public"."table1";
CREATE RULE irule AS ON INSERT INTO "TABLE1"
DO INSTEAD INSERT INTO "public"."table1" ("field1"...) VALUES
(NEW."FIELD1"...);
etc....

Could be a lot of work, but it should be possible to write a Perl script
to generate these for you :-)

Sounds like a project for me :-P

>- consider pg_dump, psql and other such tools with intimate knowledge
>of the catalogs, they would all need to updated.
>
>
If you need to create backward compatibility via views, my proposal
would allow for that.  Also if this is a GUC variable, there is no
reason it can't be set by the application where needed.  Also, as far as
I am concerned, I would rather leave the system catalogs alone and
change them as *little* as possible for these reasons.

>- consider the number of third party apps that uses these "system
>identifiers". Updating all of those is a massive task, for what gain?
>
>
See above regarding views.  The only difference between shipping it with
this mode enabled is that it provides an easy means for customers to
avoid the problem and we have a centralized solution.

>There is some sympathy around here for the view that in this case the
>standard is simply wrong. As said on this thread already, uppercase
>only was appropriate for twenty years ago but not now. Older
>programming languages tended to have their keywords in uppercase, yet
>no modern language would consider such an act.
>
>So, I think in the long run there is going to be some sympathy for
>helping people with "QUOTEDUPPERCASE" identifiers to port their stuff
>but I don't think PostgreSQL itself will ever default to folding up.
>
>
Who ever said anything about defaulting.  I simply said an option should
be present.  FWIW I prefer the way PostgreSQL does it, but I have to
recognize that it will bite people migrating from RDBMS's that do follow
the standards.

>Hence the discussion about folding quoted identifiers also, which
>solves the issue without masses of changes everywhere.
>
>Curiously, MySQL does as we do, although they don't accept standard
>quoting at all (using backquotes). Do people complain about them too?
>
>
MySQL's unquoted table names being case sensitive is even more broken
than ours ;-)  Besides if people using MySQL cared about standards, I
don't think they would still be using MySQL...

It seems to me that the first step in any solution is going to be
creating a tool that automatically generates views in appropriate schema
using upper case (or maybe optionally lower case) table and column
names.  This should be fairly simple to do.  Then such a tool could be
used to provide whatever sort of compatibility we need.  Indeed this may
in fact be more flexible in helping people migrate to
BrokenSystemsLikeMySQL because it would allow you to create views off
tables using lower case names as well.

A second step will be to ensure that the backend really does treat
identifiers as already double-quoted.

However, this is only a partial fix.  It doesn't solve the larger issue
of (albeit badly written but standards-compliant) queries like:

SELECT "MYCOLUMN" FROM MYTABLE;

My fundamental point is that the above statement means something
specific according to the SQL standard, and it means something
*different* in PostgreSQL.  The time may not be ripe for a solution but
if it is at least on the TODO list, then maybe people will be less bit
by it if and when we actually do support this behavior.

Best Wishes,
Chris Travers
Metatron Technology Consulting

Re: PostgreSQL Gotchas

From
Tom Lane
Date:
Chris Travers <chris@verkiel.metatrontech.com> writes:
> I make the following assumptions:

> 1)  All backend references to identifiers are treated as quoted by the
> backend as mentioned in a previous thread.
> 2)  Same with official clients like psql.
> 3)  We don't guarantee complete backward compatibility in the system
> catalogs anyway (note that this is a trivial problem to solve if 1 above
> is solved-- schemas, views, and rules can accomplish some cool tricks
> :-) ).

None of these assumptions are acceptable, particularly not #2.  In
reality #2 is a proxy for "I'm willing to break all existing clients
that look at the system catalogs".  Which we're not.

            regards, tom lane

Re: PostgreSQL Gotchas

From
Chris Travers
Date:
Since nobody has told me why my proposal doesn't work, I figure I will
point out that it does solve this problem nicely.  However, it makes the
assumptions that we treat the identifiers as quoted when the backend
uses them to look them up in the system catalogs.  Also I am rethinking
a few details....  I make the following assumptions:

1)  All backend references to identifiers are treated as quoted by the
backend as mentioned in a previous thread.
2)  Same with official clients like psql.
3)  We don't guarantee complete backward compatibility in the system
catalogs anyway (note that this is a trivial problem to solve if 1 above
is solved-- schemas, views, and rules can accomplish some cool tricks
:-) ).

If these are not done, then it would be too much work to do at the
moment even if a customer required it.  However, if this is the case, we
need to get these areas done first (they probably should be done anyway).

Quite frankly if system catalog backwards-compatibility is that much of
an issue for people, if others commit to the rest of the work, I will be
happy to create a PG_CATALOG schema with views off pg_catalog....  All
then that we need to do is to put it in the search path :-)  Oh, and the
same for information_schema and INFORMATION_SCHEMA tool.

>
>Okay, how do we refer to the "RI_FKey_cascade_del" function then? Or
>trigger "RI_ConstraintTrigger_2204908". Using $$ quoting? Invent
>another quoting style?
>
>
Right, and when we start reqiring non-standard quoting styles, then we
are not standards-compliant either :-)

Also it occurs to me that we that there are three options for dealing
with case issues.  These do *not* need to be done at the parser level as
long as the backend treats catalog identifiers as previously quoted.
But there are really two options...

1)  Try to differentiate between function names and aggregates in the
parser.  This seems like a non-starter to me.  But if I am missing
something I am open to suggestions....

2)  My original idea of duplicate catalog entries.  I.e. one for "MAX"
and one for "max"   Again, I am willing to do this work if people decide
that they are going to go ahead and try to get this in the next release.

Best Wishes,
Chris Travers
Metatron Technology Consulting

Attachment

Re: PostgreSQL Gotchas

From
Jan Wieck
Date:
On 10/16/2005 12:40 PM, Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
>> Actually, perhaps an even more restricted version would be better.
>> Lowercase quoted identifiers only if they are all uppercase. So then:
>
> No, I think the original proposal was better.  This one doesn't fix
> things for the lusers who think "MixedCase" and unquoted MixedCase
> should be the same.

What is bad about leaving pg_catalog all lower case and expect everyone
to query the catalog quoted? We would be totally free as to the
flexibility of the new option. One setting could even act 100% according
to the standard ... a feature I'm missing so far in all the proposals.


Jan

>
>> Note to implementor: In 'SELECT 1 as "Title"', the quoted string should
>> not be lowercased, even if you are lowercasing everything else...
>
> You don't get to have that, I think, because the lexer is not context
> aware.  It's not clear to me why it's a good idea anyway.
>
>             regards, tom lane


--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

Re: PostgreSQL Gotchas

From
Tom Lane
Date:
Jan Wieck <JanWieck@Yahoo.com> writes:
> What is bad about leaving pg_catalog all lower case and expect everyone
> to query the catalog quoted?

The fact that it will break every nontrivial client currently in
existence.  Those quotes aren't there in the clients and we can't
suddenly mandate them to become there.  Also, that approach does not
work for built-in functions (MAX, etc)

            regards, tom lane

Re: PostgreSQL Gotchas

From
Jan Wieck
Date:
On 10/17/2005 10:16 AM, Tom Lane wrote:

> Jan Wieck <JanWieck@Yahoo.com> writes:
>> What is bad about leaving pg_catalog all lower case and expect everyone
>> to query the catalog quoted?
>
> The fact that it will break every nontrivial client currently in
> existence.  Those quotes aren't there in the clients and we can't
> suddenly mandate them to become there.  Also, that approach does not
> work for built-in functions (MAX, etc)

Ugh ... got me on the MAX part.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

Re: PostgreSQL Gotchas

From
Chris Travers
Date:
Tom Lane wrote:

>Chris Travers <chris@verkiel.metatrontech.com> writes:
>
>
>>I make the following assumptions:
>>
>>
>
>
>
>>1)  All backend references to identifiers are treated as quoted by the
>>backend as mentioned in a previous thread.
>>2)  Same with official clients like psql.
>>3)  We don't guarantee complete backward compatibility in the system
>>catalogs anyway (note that this is a trivial problem to solve if 1 above
>>is solved-- schemas, views, and rules can accomplish some cool tricks
>>:-) ).
>>
>>
>
>None of these assumptions are acceptable, particularly not #2.  In
>reality #2 is a proxy for "I'm willing to break all existing clients
>that look at the system catalogs".  Which we're not.
>
>
Ok then.

#1  Is the biggie.  If it is not done, it should be done before one can
go any further.  If it is solved, however, then it is trivial to fix 2
and 3.  This week I am going to begin work on a "shadow upper case
views" tool that could be used to automatically solve #'s 2 and 3.  This
tool would be used to automatically create separate schemas and views
with all upper case names (including OID.  If CTID, XMAX, and XMIN are
needed, they could be added easily enough.

I.e. #1 is the only assumption which will fully *break* my proposal and
is the only clear prerequisite.  #2 and #3 were voiced with the idea of
making the change minimalist.  However, again, there are tricks with
views that can be used to solve these problems and it is largely an
issue of automating this process so we don't have to create views by
hand for every table in the pg_catalog.

On second thought, this automation would be needed anyway because the
information schema tables and views would also be effected.  And this
automation may be good enough to resolve many of the issues in existing
versions that people may have with mixed quoting environments.

Best Wishes,
Chris Travers
Metatron Technology Consulting


Re: PostgreSQL Gotchas

From
Chris Travers
Date:
Jan Wieck wrote:

> On 10/17/2005 10:16 AM, Tom Lane wrote:
>
>> Jan Wieck <JanWieck@Yahoo.com> writes:
>>
>>> What is bad about leaving pg_catalog all lower case and expect
>>> everyone to query the catalog quoted?
>>
>>
>> The fact that it will break every nontrivial client currently in
>> existence.  Those quotes aren't there in the clients and we can't
>> suddenly mandate them to become there.  Also, that approach does not
>> work for built-in functions (MAX, etc)
>
>
> Ugh ... got me on the MAX part.

For built-in functions and aggregates, I see no option other than having
duplicate entries in the catalogs.  I.e. have an aggregate called "max"
and a duplicate aggregate called "MAX" (nobody has explained to me why
this is a bad idea aside from the single point of truth concerns).

The rest can be solved with clever uses of views :-)  Furthermore the
creation of these views should be easily automated.  I am willing to
write a tool to automate this.

Re: PostgreSQL Gotchas

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Matthew <matthew@zeut.net> writes:
> > Is it worth having a GUC variable that enables / disable this?
>
> That's a given, I think.  We're certainly not going to make smash-to-
> lower-case the only available behavior.

A GUC variable for this would be quite nice..  I had some difficulty
porting a MySQL application to PostgreSQL because it used PHP and PEAR,
which allows you to create a database 'object' and then reference things
inside it which are actually column names, ie:

$db = ...;
$myvar = $db->Column1;

That kind of stuff, where it did quoting on the actual DB request and
made moving it from MySQL to PostgreSQL a big pain, even though the rest
of the application was pretty good SQL.  What made it that much more
difficult was that there were places in the application which used
direct SQL statements and the identifiers in those weren't quoted. :/

Or maybe that was the original problem...  Anyhow, I *think* PEAR has
been changed now to not quote but I'm not 100% sure; I just fixed the
application, noted the issue, and moved on.

> One issue we might want to reflect on is how easy it should be to change
> the variable's setting.  Our bad experience with autocommit leaves me a
> bit wary of fundamental behavioral changes that can be flicked on and
> off at whim.  However, we have pretty much the same problem staring us
> in the face for backslash-in-literal-strings behavior, so maybe we can
> find an answer we like for both issues at the same time.

That'd be very nice...

    Thanks,

        Stephen

Attachment