Thread: select fails on indexed varchars.
Hi, First off I'm running: links=# select version() ; version --------------------------------------------------------------------- PostgreSQL 7.0.3 on i686-pc-linux-gnu, compiled by gcc egcs-2.91.66 (1 row) Now, if I have a table with an index, I'm not able to do some selects on it. To reproduce: links=# create table foo ( a char(25) ); CREATE links=# create index foodx on foo (a); CREATE links=# insert into foo values ('Test/Test'); INSERT 29689 1 links=# select * from foo; a --------------------------- Test/Test (1 row) links=# select * from foo where a like 'Test/%' links-# ; a --- (0 rows) # Strange result, why 0 rows, the select failed. links=# select * from foo where a like 'Test%'; a --------------------------- Test/Test (1 row) # This one's fine. links=# select * from foo where a like 'Test/T%'; a --------------------------- Test/Test (1 row) # And so is this one. It seems the /% causes a problem. If I drop the index, the selects work fine. Is this a bug, am I missing something? If you need any other system info, please let me know. I did an RPM install on a pretty plain Redhat 6.2 system. Please reply to alex@gossamer-threads.com with any ideas. Thanks! Alex -------------------- Gossamer Threads Inc. ---------------------- Alex Krohn Email: alex@gossamer-threads.com Internet Consultant Phone: (604) 687-5804 http://www.gossamer-threads.com Fax : (604) 687-5806
Alex Krohn <alex@gossamer-threads.com> writes: > links=# select * from foo where a like 'Test/%' > links-# ; > a > --- > (0 rows) This looks like an artifact of the known problems with LIKE index optimization in non-ASCII locales. What locale are you running the postmaster in? regards, tom lane
Alex Krohn <alex@gossamer-threads.com> writes: > [postgres@penguin pgsql]$ locale > LANG=en_US > LC_CTYPE="en_US" > LC_NUMERIC="en_US" > LC_TIME="en_US" > LC_COLLATE="en_US" > LC_MONETARY="en_US" > LC_MESSAGES="en_US" > LC_ALL=en_US > [postgres@penguin pgsql]$ > Postmaster is running as user pgsql. Any ideas on workarounds? Use locale "C" unless you have a really good reason why you need en_US sorting order. Beware of changing the postmaster's locale on the fly, however, since that will leave you with corrupted (out-of-order) indexes. Safest to dump/initdb in new locale/reload. regards, tom lane
Alex Krohn <alex@gossamer-threads.com> writes: >> Beware of changing the postmaster's locale on the fly, however, >> since that will leave you with corrupted (out-of-order) indexes. >> Safest to dump/initdb in new locale/reload. > How would I go about changing that? Setting LANG and LC_ALL in the pgsql > users home directory .bashrc? Or do I need to edit the startup file? I'd recommend setting LANG/LC_xxx directly in the script you use to fire up the postmaster. This ensures it will be right no matter whether the postmaster is launched by a boot script, by hand by someone logged in as pgsql, by hand by someone su'd from another account with different locale, yadda yadda. regards, tom lane
Alex Krohn <alex@gossamer-threads.com> writes: > So I added: > LANG=C > LC_ALL=C > to the /etc/rc.d/init.d/postgres file and stop, restarted the server. I > then dropped and recreated the database. However still same results. LC_COLLATE overrides LC_ALL, I think --- didn't you previously show us that all the LC_xxx family variables were set in your default environment? You may need to set (or unset if you prefer) all of 'em. Also, I'd really recommend an initdb, not the above half-baked approach, because the above will not fix any problems that the template1 indexes might have with a changed sort order. regards, tom lane
Alex Krohn <alex@gossamer-threads.com> writes: > I added to the startup file: > LANG=C > LC_CTYPE=C > LC_NUMERIC=C > LC_TIME=C > LC_COLLATE=C > LC_MONETARY=C > LC_MESSAGES=C > LC_ALL=C Seems reasonable. It's possible you needed "export" commands in there too, but I wouldn't have thought so (anything coming in from the outer environment should be exported already). > After this, my create test and select still produced the same error. Ugh. Hm, maybe I'm barking up the wrong tree. Let's try a direct test. What do you get from select 'a_b'::text < 'ac'::text; select 'A_B'::text < 'ac'::text; On my machine, these produce 't' in C locale, but 'f' in en_US locale. regards, tom lane
Hi, > Alex Krohn <alex@gossamer-threads.com> writes: > >> Beware of changing the postmaster's locale on the fly, however, > >> since that will leave you with corrupted (out-of-order) indexes. > >> Safest to dump/initdb in new locale/reload. > > > How would I go about changing that? Setting LANG and LC_ALL in the pgsql > > users home directory .bashrc? Or do I need to edit the startup file? > > I'd recommend setting LANG/LC_xxx directly in the script you use to fire > up the postmaster. This ensures it will be right no matter whether the > postmaster is launched by a boot script, by hand by someone logged in as > pgsql, by hand by someone su'd from another account with different > locale, yadda yadda. So I added: LANG=C LC_ALL=C to the /etc/rc.d/init.d/postgres file and stop, restarted the server. I then dropped and recreated the database. However still same results. Is this only my installation that has this problem? It's just a very plain RedHat 6.2 with rpm'd install of Postgres, so I'm a little wary about how many other people will be having this problem as well. Cheers, Alex -------------------- Gossamer Threads Inc. ---------------------- Alex Krohn Email: alex@gossamer-threads.com Internet Consultant Phone: (604) 687-5804 http://www.gossamer-threads.com Fax : (604) 687-5806
Hi, > > So I added: > > LANG=C > > LC_ALL=C > > to the /etc/rc.d/init.d/postgres file and stop, restarted the server. I > > then dropped and recreated the database. However still same results. > > LC_COLLATE overrides LC_ALL, I think --- didn't you previously show us > that all the LC_xxx family variables were set in your default > environment? You may need to set (or unset if you prefer) all of 'em. > > Also, I'd really recommend an initdb, not the above half-baked approach, > because the above will not fix any problems that the template1 indexes > might have with a changed sort order. I added to the startup file: LANG=C LC_CTYPE=C LC_NUMERIC=C LC_TIME=C LC_COLLATE=C LC_MONETARY=C LC_MESSAGES=C LC_ALL=C as well as to the postgres users default environment. I then shut down postmaster, and as user postgres ran `initdb /var/lib/pgsql`. I then ran /etc/rc.d/init.d/postgres start as root, and then as user postgres ran `createdb mytest`. After this, my create test and select still produced the same error. Ugh. Cheers, Alex
Hi Tom, > > [postgres@penguin pgsql]$ locale > > LANG=en_US > > LC_CTYPE="en_US" > > LC_NUMERIC="en_US" > > LC_TIME="en_US" > > LC_COLLATE="en_US" > > LC_MONETARY="en_US" > > LC_MESSAGES="en_US" > > LC_ALL=en_US > > [postgres@penguin pgsql]$ > > > Postmaster is running as user pgsql. Any ideas on workarounds? > > Use locale "C" unless you have a really good reason why you need > en_US sorting order. > > Beware of changing the postmaster's locale on the fly, however, > since that will leave you with corrupted (out-of-order) indexes. > Safest to dump/initdb in new locale/reload. How would I go about changing that? Setting LANG and LC_ALL in the pgsql users home directory .bashrc? Or do I need to edit the startup file? Cheers, Alex
Hi Tom, > Alex Krohn <alex@gossamer-threads.com> writes: > > links=# select * from foo where a like 'Test/%' > > links-# ; > > a > > --- > > (0 rows) > > This looks like an artifact of the known problems with LIKE index > optimization in non-ASCII locales. What locale are you running the > postmaster in? Is this what you are looking for: [postgres@penguin pgsql]$ locale LANG=en_US LC_CTYPE="en_US" LC_NUMERIC="en_US" LC_TIME="en_US" LC_COLLATE="en_US" LC_MONETARY="en_US" LC_MESSAGES="en_US" LC_ALL=en_US [postgres@penguin pgsql]$ Postmaster is running as user pgsql. Any ideas on workarounds? Cheers, Alex
Alex Krohn <alex@gossamer-threads.com> writes: >> On my machine, these produce 't' in C locale, but 'f' in en_US locale. > Seem to be in C locale: So it does. Okay, what was the complete test case again? I'm afraid I didn't save your original message because I wrote it off as a known problem ... regards, tom lane
Hi Tom, > > I added to the startup file: > > LANG=C > > LC_CTYPE=C > > LC_NUMERIC=C > > LC_TIME=C > > LC_COLLATE=C > > LC_MONETARY=C > > LC_MESSAGES=C > > LC_ALL=C > > Seems reasonable. It's possible you needed "export" commands in there > too, but I wouldn't have thought so (anything coming in from the outer > environment should be exported already). > > > After this, my create test and select still produced the same error. Ugh. > > Hm, maybe I'm barking up the wrong tree. Let's try a direct test. > What do you get from > > select 'a_b'::text < 'ac'::text; > > select 'A_B'::text < 'ac'::text; > > On my machine, these produce 't' in C locale, but 'f' in en_US locale. Seem to be in C locale: links=# select 'a_b'::text < 'ac'::text; ?column? ---------- t (1 row) links=# select 'A_B'::text < 'ac'::text; ?column? ---------- t (1 row) links=# Cheers, Alex -------------------- Gossamer Threads Inc. ---------------------- Alex Krohn Email: alex@gossamer-threads.com Internet Consultant Phone: (604) 687-5804 http://www.gossamer-threads.com Fax : (604) 687-5806
Alex Krohn <alex@gossamer-threads.com> writes: >>> Seem to be in C locale: >> >> So it does. Okay, what was the complete test case again? >> I'm afraid I didn't save your original message because I wrote it off >> as a known problem ... > Here it is: > links=# create table foo ( a char(25) ); > CREATE > links=# create index foodx on foo (a); > CREATE > links=# insert into foo values ('Test/Test'); > INSERT 29689 1 > links=# select * from foo; > a > --------------------------- > Test/Test > (1 row) > links=# select * from foo where a like 'Test/%'; > a > --- > (0 rows) How odd. I get 'Test/Test' from the last select, under both 7.0.2 and current sources, when using C locale. The query certainly looks like the kind that would suffer from the LIKE-optimization problem in non-C locales ... but we seem to have established that you've gotten the postmaster switched into C locale. What does EXPLAIN VERBOSE select * from foo where a like 'Test/%'; show? regards, tom lane
Alex Krohn <alex@gossamer-threads.com> writes: >> What does EXPLAIN VERBOSE select * from foo where a like 'Test/%'; >> show? Well, the indexqual is just what it should be for C locale: :indxqual (( { EXPR :typeOid 16 :opType op :oper { OPER :opno 1061 :opid 1052 :opresulttype 16 } :args ( { VAR :varno 1 :varattno 1 :vartype 1042 :vartypmod 29 :varlevelsup 0 :varnoold 1 :varoattno 1} { CONST :consttype 1042 :constlen -1 :constisnull false :constvalue 9 [ 9 0 0 0 84 101 115 116 47 ] :constbyval false })} { EXPR :typeOid 16 :opType op :oper { OPER :opno 1058 :opid 1049 :opresulttype 16 } :args ( { VAR :varno 1 :varattno 1 :vartype 1042 :vartypmod 29 :varlevelsup 0 :varnoold 1 :varoattno 1} { CONST :consttype 1042 :constlen -1 :constisnull false :constvalue 9 [ 9 0 0 0 84 101 115 116 48 ] :constbyval false })})) This mess translates as a >= 'Test/'::bpchar AND a < 'Test0'::bpchar which is what the LIKE index optimizer is supposed to generate. I infer that one or the other of these conditions yields false on your machine, which should not be happening if the thing is in C locale. regards, tom lane
Hi Tom, > Alex Krohn <alex@gossamer-threads.com> writes: > >> On my machine, these produce 't' in C locale, but 'f' in en_US locale. > > > Seem to be in C locale: > > So it does. Okay, what was the complete test case again? > I'm afraid I didn't save your original message because I wrote it off > as a known problem ... Here it is: links=# create table foo ( a char(25) ); CREATE links=# create index foodx on foo (a); CREATE links=# insert into foo values ('Test/Test'); INSERT 29689 1 links=# select * from foo; a --------------------------- Test/Test (1 row) links=# select * from foo where a like 'Test/%'; a --- (0 rows) Cheers, Alex
Hi Tom, > >> So it does. Okay, what was the complete test case again? > >> I'm afraid I didn't save your original message because I wrote it off > >> as a known problem ... > > > Here it is: > > > links=# create table foo ( a char(25) ); > > CREATE > > links=# create index foodx on foo (a); > > CREATE > > links=# insert into foo values ('Test/Test'); > > INSERT 29689 1 > > links=# select * from foo; > > a > > --------------------------- > > Test/Test > > (1 row) > > > links=# select * from foo where a like 'Test/%'; > > a > > --- > > (0 rows) > > How odd. I get 'Test/Test' from the last select, under both 7.0.2 > and current sources, when using C locale. The query certainly looks > like the kind that would suffer from the LIKE-optimization problem in > non-C locales ... but we seem to have established that you've gotten > the postmaster switched into C locale. > > What does EXPLAIN VERBOSE select * from foo where a like 'Test/%'; > show? Here's a cut and paste, not sure if there is a nicer way to output it: links=# EXPLAIN VERBOSE select * from foo where a like 'Test/%'; NOTICE: QUERY DUMP: { INDEXSCAN :startup_cost 0.00 :total_cost 8.14 :rows 10 :width 12 :state <> :qptargetlist ({ TARGETENTRY :resdom { RESDOM:resno 1 :restype 1042 :restypmod 29 :resname a :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR:varno 1 :varattno 1 :vartype 1042 :vartypmod 29 :varlevelsup 0 :varnoold 1 :varoattno 1}}) :qpqual ({ EXPR :typeOid16 :opType op :oper { OPER :opno 1211 :opid 850 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 1 :vartype 1042:vartypmod 29 :varlevelsup 0 :varnoold 1 :varoattno 1} { CONST :consttype 25 :constlen -1 :constisnull false :constvalue 10 [ 10 0 0 0 84 101 115 116 47 37 ] :constbyval false })}) :lefttree <> :righttree <> :extprm () :locprm ():initplan <> :nprm 0 :scanrelid 1 :indxid ( 18825) :indxqual (({ EXPR :typeOid 16 :opType op :oper { OPER :opno 1061:opid 1052 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 1 :vartype 1042 :vartypmod 29 :varlevelsup 0 :varnoold1 :varoattno 1} { CONST :consttype 1042 :constlen -1 :constisnull false :constvalue 9 [ 9 0 0 0 84 101 115 11647 ] :constbyval false })} { EXPR :typeOid 16 :opType op :oper { OPER :opno 1058 :opid 1049 :opresulttype 16 } :args({ VAR :varno 1 :varattno 1 :vartype 1042 :vartypmod 29 :varlevelsup 0 :varnoold 1 :varoattno 1} { CONST :consttype1042 :constlen -1 :constisnull false :constvalue 9 [ 9 0 0 0 84 101 115 116 48 ] :constbyval false })})) :indxqualorig(({ EXPR :typeOid 16 :opType op :oper { OPER :opno 1061 :opid 1052 :opresulttype 16 } :args ({ VAR :varno 1:varattno 1 :vartype 1042 :vartypmod 29 :varlevelsup 0 :varnoold 1 :varoattno 1} { CONST :consttype 1042 :constlen -1 :constisnullfalse :constvalue 9 [ 9 0 0 0 84 101 115 116 47 ] :constbyval false })} { EXPR :typeOid 16 :opType op :oper{ OPER :opno 1058 :opid 1049 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 1 :vartype 1042 :vartypmod 29 :varlevelsup0 :varnoold 1 :varoattno 1} { CONST :consttype 1042 :constlen -1 :constisnull false :constvalue 9 [ 9 0 0 084 101 115 116 48 ] :constbyval false })})) :ind NOTICE: QUERY PLAN: Index Scan using foodx on foo (cost=0.00..8.14 rows=10 width=12) EXPLAIN links=# Cheers, Alex
> links=# select * from foo where a < 'Test0'::bpchar; > a > --- > (0 rows) > links=# > Are you saying the second test should have returned true under C locale? Yes. You are not really in C locale, or at least your postmaster isn't. This looks like ISO sorting rules to me --- perhaps you are in en_US locale. > Is this a version dependant bug? Will downgrading to 6.x get me going? No. Fix your locale. I'm afraid I don't have any more advice to offer on that than I gave already ... regards, tom lane
Hi Tom, > >> What does EXPLAIN VERBOSE select * from foo where a like 'Test/%'; > >> show? > > Well, the indexqual is just what it should be for C locale: > > :indxqual (( > { EXPR :typeOid 16 :opType op :oper > { OPER :opno 1061 :opid 1052 :opresulttype 16 } :args ( > { VAR :varno 1 :varattno 1 :vartype 1042 :vartypmod 29 :varlevelsup 0 :varnoold 1 :varoattno 1} > { CONST :consttype 1042 :constlen -1 :constisnull false :constvalue 9 [ 9 0 0 0 84 101 115 116 47 ] :constbyval false})} > { EXPR :typeOid 16 :opType op :oper > { OPER :opno 1058 :opid 1049 :opresulttype 16 } :args ( > { VAR :varno 1 :varattno 1 :vartype 1042 :vartypmod 29 :varlevelsup 0 :varnoold 1 :varoattno 1} > { CONST :consttype 1042 :constlen -1 :constisnull false :constvalue 9 [ > 9 0 0 0 84 101 115 116 48 ] :constbyval false })})) > > This mess translates as > > a >= 'Test/'::bpchar AND a < 'Test0'::bpchar > > which is what the LIKE index optimizer is supposed to generate. > I infer that one or the other of these conditions yields false on your > machine, which should not be happening if the thing is in C locale. Here's what I get: links=# select * from foo where a >= 'Test/'::bpchar; a --------------------------- Test/Test (1 row) links=# select * from foo where a < 'Test0'::bpchar; a --- (0 rows) links=# Are you saying the second test should have returned true under C locale? Is this a version dependant bug? Will downgrading to 6.x get me going? Cheers, Alex