Thread: Case insensitive selects?
Does pgsql support this and how would I do it? david
It is in the list archives several times. All you need is to use some basic SQL. select * from mytable where upper('my criteria') = upper(mytable.info); Adam Lang Systems Engineer Rutgers Casualty Insurance Company http://www.rutgersinsurance.com ----- Original Message ----- From: "David Reid" <dreid@jetnet.co.uk> To: <pgsql-general@postgresql.org> Sent: Wednesday, February 14, 2001 9:58 AM Subject: [GENERAL] Case insensitive selects? > Does pgsql support this and how would I do it? > > david
Hi use it if u absolutly need it.. Using a function on a column name doesnt use the index associated with that column,.. So exercise this option with some amount of thinking.. Anand On Wed, Feb 14, 2001 at 11:39:47AM -0500, Adam Lang wrote: >It is in the list archives several times. > >All you need is to use some basic SQL. > >select * from mytable where upper('my criteria') = upper(mytable.info); > >Adam Lang >Systems Engineer >Rutgers Casualty Insurance Company >http://www.rutgersinsurance.com >----- Original Message ----- >From: "David Reid" <dreid@jetnet.co.uk> >To: <pgsql-general@postgresql.org> >Sent: Wednesday, February 14, 2001 9:58 AM >Subject: [GENERAL] Case insensitive selects? > > >> Does pgsql support this and how would I do it? >> >> david
Indexes *can* and *will* be used if you create the appropiate functional indexes, i.e: CREATE INDEX idx_table_field_upper ON table(upper(field)); SELECT field FROM table WHERE upper(field) LIKE upper('some string'); Michael Fork - CCNA - MCP - A+ Network Support - Toledo Internet Access - Toledo Ohio On Thu, 15 Feb 2001, Anand Raman wrote: > Hi > use it if u absolutly need it.. Using a function on a column name > doesnt use the index associated with that column,.. So exercise this > option with some amount of thinking.. > > Anand > On Wed, Feb 14, 2001 at 11:39:47AM -0500, Adam Lang wrote: > >It is in the list archives several times. > > > >All you need is to use some basic SQL. > > > >select * from mytable where upper('my criteria') = upper(mytable.info); > > > >Adam Lang > >Systems Engineer > >Rutgers Casualty Insurance Company > >http://www.rutgersinsurance.com > >----- Original Message ----- > >From: "David Reid" <dreid@jetnet.co.uk> > >To: <pgsql-general@postgresql.org> > >Sent: Wednesday, February 14, 2001 9:58 AM > >Subject: [GENERAL] Case insensitive selects? > > > > > >> Does pgsql support this and how would I do it? > >> > >> david >
On Thu, 15 Feb 2001, Michael Fork wrote: > Indexes *can* and *will* be used if you create the appropiate > functional indexes, i.e: > > CREATE INDEX idx_table_field_upper ON table(upper(field)); > > SELECT field FROM table WHERE upper(field) LIKE upper('some string'); Hmmm...I'd hate to have two indexes on every field I query like this, one case-senstive, one case-insensitve (like the one you create here). Is there a configuration option or something that will tell pgsql to do case-insensitive comparisons (kinda like MS SQL Server has)? That could save us on indexing overhead, since we want all of our WHERE comparisons to be case-insensitive, anyway. I should also not that we're also using --with-multibyte and having all of our databases use Unicode exclusively. Thanks! David
David Wheeler <david@wheeler.net> writes: >> Indexes *can* and *will* be used if you create the appropiate >> functional indexes, i.e: >> >> CREATE INDEX idx_table_field_upper ON table(upper(field)); >> >> SELECT field FROM table WHERE upper(field) LIKE upper('some string'); > Hmmm...I'd hate to have two indexes on every field I query like this, one > case-senstive, one case-insensitve (like the one you create here). Is > there a configuration option or something that will tell pgsql to do > case-insensitive comparisons (kinda like MS SQL Server has)? That could > save us on indexing overhead, since we want all of our WHERE comparisons > to be case-insensitive, anyway. Then why are you bothering to maintain a case-sensitive index? There's no free lunch available here; if you think there is, then you are misunderstanding what an index is. Either the index is in case-sensitive order, or it's not. regards, tom lane
On Thu, 15 Feb 2001, Tom Lane wrote: > Then why are you bothering to maintain a case-sensitive index? Because while some queries do a case-insensitive query, others do not, in the sense that I do not everywhere convert the string to compare to lower case. > There's no free lunch available here; if you think there is, then you > are misunderstanding what an index is. Either the index is in > case-sensitive order, or it's not. Well, I think I understand pretty well what an index is. But I don't get that the earlier example was of a case-insensitive index, but of an index where all the entries were forced into lower case (or upper case, as the case may be [pun not intended]). Thus, if I have this index: CREATE INDEX idx_mime_type__name ON mime_type(LOWER(name)); and I execute this query: SELECT * FROM mime_type WHERE name = 'text/HTML'; Will it use the index I created above or not? I'm assuming not unless I rewrite the query like this: SELECT * FROM mime_type WHERE name = LOWER('text/HTML'); But then I wouldn't call the index I created "case-insensitive." But I would be happy to know if I'm missing something here. Thanks, David
If you are going to be only doing case-insensitive compares, why would you have two indexes on the field? Although I am no guru on PostgreSQL internals or database theory, a case insensitive select on a mixed case index would not work for the following reason (correct me if i am wrong): 1) becuase of ASCII values and the way btree indexes are ordered, 'A' and 'a' are not store next to each other, meaning that you cannot map all the caracters of the index to the same case on the fly w/o missing a chunk of index (unless you wanted to make multiple passes through the index, which would negate any speed gains of *not* having multiple indexes becuase of the exponential growth, i.e. searching for 'that' would require 16 passes thru -- what, What, wHat, whAt, whaT, etc.) Michael Fork - CCNA - MCP - A+ Network Support - Toledo Internet Access - Toledo Ohio On Thu, 15 Feb 2001, David Wheeler wrote: > > On Thu, 15 Feb 2001, Michael Fork wrote: > > > Indexes *can* and *will* be used if you create the appropiate > > functional indexes, i.e: > > > > CREATE INDEX idx_table_field_upper ON table(upper(field)); > > > > SELECT field FROM table WHERE upper(field) LIKE upper('some string'); > > Hmmm...I'd hate to have two indexes on every field I query like this, one > case-senstive, one case-insensitve (like the one you create here). Is > there a configuration option or something that will tell pgsql to do > case-insensitive comparisons (kinda like MS SQL Server has)? That could > save us on indexing overhead, since we want all of our WHERE comparisons > to be case-insensitive, anyway. > > I should also not that we're also using --with-multibyte and having all of > our databases use Unicode exclusively. > > Thanks! > > David >
David Wheeler <david@wheeler.net> writes: > Thus, if I have this index: > > CREATE INDEX idx_mime_type__name ON mime_type(LOWER(name)); > > and I execute this query: > > SELECT * > FROM mime_type > WHERE name = 'text/HTML'; > > Will it use the index I created above or not? I'm assuming not unless I > rewrite the query like this: > > SELECT * > FROM mime_type > WHERE name = LOWER('text/HTML'); Not then either; you'd need to write SELECT * FROM mime_type WHERE LOWER(name) = LOWER('text/HTML'); or equivalently SELECT * FROM mime_type WHERE LOWER(name) = 'text/html'; which is what will result from constant-folding anyway. The details of invocation seem beside the point, however. The point is that a btree index is all about sort order, and the sort order of data viewed case-sensitively is quite different from the sort order of monocased data. Perhaps in an ASCII universe you could play some tricks to make the same index serve both purposes, but it'll never work in non-ASCII locales ... regards, tom lane
About a lifetime ago I used to do quite a bit of work w/ Oracle. It's command line sql tool had some pretty nice features that I haven't been able to find in psql. I was wondering if any of the following existed.... I think the first was called break on which altered in output from something like name date qty ------------------------- Fred 01-JAN-2000 10 Fred 10-JAN-2000 13 Fred 01-JUL-2000 1 Fred 01-DEC-2000 100 Bob 01-JAN-2000 5 Bob 10-MAY-2000 10 to name date qty ------------------------- Fred 01-JAN-2000 10 10-JAN-2000 13 01-JUL-2000 1 01-DEC-2000 100 Bob 01-JAN-2000 5 10-MAY-2000 10 it also allowed for things like compute sum which would activate on breaks but I don't recall how they worked. The other thing I'd love to be able to do is get user input while running a sql file. I don't recall how this worked exactly but the script would either accept variables calling the script or prompt for them. So if I had a sql script in a file named contact_report. And I did prod=> \i contact_report 01-JAN-2000 31-DEC-2001 then it would load the script and replace IIRC &1 and &2 with the respective dates listed on command line. It also had an ACCEPT command that would cause it to prompt for input from user and assign to a varable name. like accept amount prompt 'Enter the amount to search for: ' select * from foo where quantity = &amount These made it very easy to build simple reports and scripts for less technical end users. Is any of this possible with psql? Thanks, James ->->->->->->->->->->->->->->->->->->---<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-< James Thompson 138 Cardwell Hall Manhattan, Ks 66506 785-532-0561 Kansas State University Department of Mathematics ->->->->->->->->->->->->->->->->->->---<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<-<
James Thompson writes: > About a lifetime ago I used to do quite a bit of work w/ Oracle. > > It's command line sql tool had some pretty nice features that I haven't > been able to find in psql. I was wondering if any of the following > existed.... > > I think the first was called break on which altered in output from > something like > > name date qty > ------------------------- > Fred 01-JAN-2000 10 > Fred 10-JAN-2000 13 > Fred 01-JUL-2000 1 > Fred 01-DEC-2000 100 > Bob 01-JAN-2000 5 > Bob 10-MAY-2000 10 > > to > > name date qty > ------------------------- > Fred 01-JAN-2000 10 > 10-JAN-2000 13 > 01-JUL-2000 1 > 01-DEC-2000 100 > Bob 01-JAN-2000 5 > 10-MAY-2000 10 > > it also allowed for things like compute sum which would activate on breaks > but I don't recall how they worked. This seems to be a thing for a report generator. Try pgaccess. > The other thing I'd love to be able to do is get user input while running > a sql file. I don't recall how this worked exactly but the script would > either accept variables calling the script or prompt for them. > > So if I had a sql script in a file named contact_report. And I did > > prod=> \i contact_report 01-JAN-2000 31-DEC-2001 > > then it would load the script and replace IIRC &1 and &2 with the > respective dates listed on command line. You can use \set to set variables. > It also had an ACCEPT command that would cause it to prompt for input from > user and assign to a varable name. Try \echo -n 'Prompt: ' \set varname `read input; echo $input` -- Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/
On Thu, 15 Feb 2001, Tom Lane wrote: > Not then either; you'd need to write > > SELECT * > FROM mime_type > WHERE LOWER(name) = LOWER('text/HTML'); > > or equivalently > > SELECT * > FROM mime_type > WHERE LOWER(name) = 'text/html'; > > which is what will result from constant-folding anyway. Yes, of course; my oversight. > The details of invocation seem beside the point, however. The point is > that a btree index is all about sort order, and the sort order of data > viewed case-sensitively is quite different from the sort order of > monocased data. Perhaps in an ASCII universe you could play some tricks > to make the same index serve both purposes, but it'll never work in > non-ASCII locales ... Hmmm...somehow, MS gets it to work in SQL Server. Lord knows how (or if it's effective or fast), but I won't worry about it (since the last thing I want to do is switch to NT!). I'll just code more carefully per the examples above to ensure proper index use. Thanks, David
Hmmm... I'm trying to create an index, CREATE INDEX idx_server__host_name ON server(LOWER(host_name)); But it won't create. Here's the error: ERROR: DefineIndex: function 'upper(varchar)' does not exist Anyone know what's up with that? The table does have the host_name column of type VARCHAR. Thanks, David
Forgot to mention, I'm using 7.03. Thanks, David On Thu, 15 Feb 2001, David Wheeler wrote: > Hmmm... I'm trying to create an index, > > CREATE INDEX idx_server__host_name ON server(LOWER(host_name)); > > But it won't create. Here's the error: > > ERROR: DefineIndex: function 'upper(varchar)' does not exist > > Anyone know what's up with that? The table does have the host_name column > of type VARCHAR. > > Thanks, > > David
On Thu, 15 Feb 2001, Michael Fork wrote: > Indexes *can* and *will* be used if you create the appropiate > functional indexes, i.e: > > CREATE INDEX idx_table_field_upper ON table(upper(field)); > > SELECT field FROM table WHERE upper(field) LIKE upper('some string'); Hmmm...I'd hate to have two indexes on every field I query like this, one case-senstive, one case-insensitve (like the one you create here). Is there a configuration option or something that will tell pgsql to do case-insensitive comparisons (kinda like MS SQL Server has)? That could save us on indexing overhead, since we want all of our WHERE comparisons to be case-insensitive, anyway. I should also not that we're also using --with-multibyte and having all of our databases use Unicode exclusively. Thanks! David
> Hmmm...I'd hate to have two indexes on every field I query like this, one > case-senstive, one case-insensitve (like the one you create here). Is > there a configuration option or something that will tell pgsql to do > case-insensitive comparisons (kinda like MS SQL Server has)? That could > save us on indexing overhead, since we want all of our WHERE comparisons > to be case-insensitive, anyway. If you want all of them to be case insensitive then make the upper ( or lower() ) index and don't make any case sensitive queries! :-) Make sure all your queries use upper() or lower() around the field and value you're comparing and you're golden.. Unless I've misunderstood you, I don't see the problem.. SELECT * FROM whatever WHERE lower(myfield) = lower('myvalue'); -- and make your index on lower(myfield)... Viola! -Mitch
> On Thu, 15 Feb 2001, Michael Fork wrote: > > > Indexes *can* and *will* be used if you create the appropiate > > functional indexes, i.e: > > > > CREATE INDEX idx_table_field_upper ON table(upper(field)); > > > > SELECT field FROM table WHERE upper(field) LIKE upper('some string'); > > Hmmm...I'd hate to have two indexes on every field I query like this, one > case-senstive, one case-insensitve (like the one you create here). Is > there a configuration option or something that will tell pgsql to do > case-insensitive comparisons (kinda like MS SQL Server has)? That could > save us on indexing overhead, since we want all of our WHERE comparisons > to be case-insensitive, anyway. I was wondering if we could do case-insensitive index waking by doing looking for CAR as: CAR CAr CaR Car cAR cAr caR car Basically you look for CAR, then back up in the btree, to CA and look for r instead of R. I relized the number of tests would exponentially explode, but isn't it just like btree walking where we back up to test the lowercase of the letter. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
> On Fri, 16 Feb 2001, Bruce Momjian wrote: > > > Yes, our CREATE INDEX lower(col) already does that, but you do have to > > use lower(col) when doing the query. > > Right, that's what I'm suggesting a configuration that automates the > lower(col) bit in CREATE INDEX and that automates the lower(col) in > queries. > > BTW, I've run into some snags with CREATE INDEX lower(col). First it > wouldn't work because my col was varchar (fixec by creating a new > function) and then because I tried to combine columns: > > CREATE UNIQUE INDEX idx_name ON server(lower(col1), col2); Ewe, that is a tough one. We don't support multi-column functional indexes, do we? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
On Fri, 16 Feb 2001, Michael Fork wrote: > This is the function Tom Lane told me to use in < 7.1 (IIRC, this will > cause problems in >= 7.1, so you have to remember to remove from your > dump) > > CREATE FUNCTION "upper" (varchar ) RETURNS text AS 'upper' LANGUAGE > 'INTERNAL'; Yeah, I found that function posted to the list last May. Thanks, Michael. David
This is the function Tom Lane told me to use in < 7.1 (IIRC, this will cause problems in >= 7.1, so you have to remember to remove from your dump) CREATE FUNCTION "upper" (varchar ) RETURNS text AS 'upper' LANGUAGE 'INTERNAL'; Michael Fork - CCNA - MCP - A+ Network Support - Toledo Internet Access - Toledo Ohio On Thu, 15 Feb 2001, David Wheeler wrote: > Hmmm... I'm trying to create an index, > > CREATE INDEX idx_server__host_name ON server(LOWER(host_name)); > > But it won't create. Here's the error: > > ERROR: DefineIndex: function 'upper(varchar)' does not exist > > Anyone know what's up with that? The table does have the host_name column > of type VARCHAR. > > Thanks, > > David >
On Fri, 16 Feb 2001, Bruce Momjian wrote: > Yes, our CREATE INDEX lower(col) already does that, but you do have to > use lower(col) when doing the query. Right, that's what I'm suggesting a configuration that automates the lower(col) bit in CREATE INDEX and that automates the lower(col) in queries. BTW, I've run into some snags with CREATE INDEX lower(col). First it wouldn't work because my col was varchar (fixec by creating a new function) and then because I tried to combine columns: CREATE UNIQUE INDEX idx_name ON server(lower(col1), col2); But I see that either they all have to be inside the function or for there be be no function. Will 7.1 support mixing like this? Thanks for your prompt responses, Bruce. David
On Fri, 16 Feb 2001, Bruce Momjian wrote: > I was wondering if we could do case-insensitive index waking by doing > looking for CAR as: > > CAR > CAr > CaR > Car > cAR > cAr > caR > car > > Basically you look for CAR, then back up in the btree, to CA and look > for r instead of R. I relized the number of tests would exponentially > explode, but isn't it just like btree walking where we back up to test > the lowercase of the letter. Wouldn't it be more efficient to just have a single, case-insensitive index, and then have the query engine automagically compare to the index in a case-insensitive way? I'm assuming that this is the sort of approach MS takes, which is why one has to choose the sort order at installation time. If I choose case-insensitive Unicode, then I would expect the server to do these things for me behind the scenes: * When I create an index, automatically convert all char/varchar/text fields with lower(). * When I do a query, automatically use lower() on all fields and values queried against. The result would be the same as Mitch describes, only I don't have to do the work in my queries. The database would assume I want case-insensitive matching based on some configuration I set, and do all the lower()s for me. Perhaps the configuration could be set on a per-database basis (like character set now is with multibyte). Does that make sense? Best David
> Wouldn't it be more efficient to just have a single, case-insensitive > index, and then have the query engine automagically compare to the index > in a case-insensitive way? I'm assuming that this is the sort of approach > MS takes, which is why one has to choose the sort order at installation > time. If I choose case-insensitive Unicode, then I would expect the server > to do these things for me behind the scenes: > Yes, our CREATE INDEX lower(col) already does that, but you do have to use lower(col) when doing the query. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
On Fri, 16 Feb 2001, Bruce Momjian wrote: > Ewe, that is a tough one. We don't support multi-column functional > indexes, do we? Too bad, because I could use that. I'm getting around it for now by making sure the records in col1 are always lower case, anyway. Thanks, David
Tom Lane wrote: [snip] > > Hmmm...I'd hate to have two indexes on every field I query like this, one > > case-senstive, one case-insensitve (like the one you create here). Is > > there a configuration option or something that will tell pgsql to do > > case-insensitive comparisons (kinda like MS SQL Server has)? That could > > save us on indexing overhead, since we want all of our WHERE comparisons > > to be case-insensitive, anyway. > > Then why are you bothering to maintain a case-sensitive index? > > There's no free lunch available here; if you think there is, then you > are misunderstanding what an index is. Either the index is in > case-sensitive order, or it's not. I've actually been thinking about this and maybe this is possible with some smarts in the query parser. If you have an index on lower(fieldname) then consider the following query: select * from table1, table2 where table1.a = table2.b; (the index is on lower(table1.a). Now, it should be true that a = b implies lower(a) = lower(b), so the above query is equivalent to: select * from table1, table2 where table1.a = table2.b and lower(table1.a) = lower(table2.b); This query can use the index and produce the correct result. Am I missing anything? -- Martijn van Oosterhout <kleptog@cupid.suninternet.com> http://cupid.suninternet.com/~kleptog/
On Sun, 18 Feb 2001, Martijn van Oosterhout wrote: > Tom Lane wrote: > > [snip] > > > > Then why are you bothering to maintain a case-sensitive index? > > > > There's no free lunch available here; if you think there is, then you > > are misunderstanding what an index is. Either the index is in > > case-sensitive order, or it's not. > > I've actually been thinking about this and maybe this is possible with > some smarts in the query parser. If you have an index on > lower(fieldname) then consider the following query: > > select * > from table1, table2 > where table1.a = table2.b; > > (the index is on lower(table1.a). > > Now, it should be true that a = b implies lower(a) = lower(b), so the > above query is equivalent to: > > select * > from table1, table2 > where table1.a = table2.b > and lower(table1.a) = lower(table2.b); > > This query can use the index and produce the correct result. Am I > missing anything? This is almost exactly what I was thinking of. David