Thread: like-operator on index-scan
Hello, I'd like to know if anyone can reproduce a strange error in V7.02 (debian i386). It happens when an index-scan is done in an index with more than 80 entries, using a like-match, where the %-wildcard directly follows a '/'. A simple example (which creates a database testdb) is appended. In my installation the first count is 99, the second 0. ciao Andreas #! /bin/sh set -e psql <<. \set ON_ERROR_STOP 1 create database testdb; \c testdb create table test (a text); . i=1 while test $i -lt 100; do echo "insert into test values('/a');" i=`expr $i + 1` done | psql -d testdb psql -d testdb <<. select count(*) from test where a like '/%'; create index test1 on test(a); select count(*) from test where a like '/%'; \c template1 drop database testdb;
Andreas Degert <ad@papyrus-gmbh.de> writes: > I'd like to know if anyone can reproduce a strange error in V7.02 > (debian i386). It happens when an index-scan is done in an index with > more than 80 entries, using a like-match, where the %-wildcard > directly follows a '/'. Someone else just reported this a couple days ago. Very odd. I suspect the problem is locale-related; what LOCALE do you run the postmaster in? Also, it might help to look at the output of EXPLAIN VERBOSE for the misbehaving query. That would let us see what indexscan limits are being generated. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > Andreas Degert <ad@papyrus-gmbh.de> writes: > > I'd like to know if anyone can reproduce a strange error in V7.02 > > (debian i386). It happens when an index-scan is done in an index with > > more than 80 entries, using a like-match, where the %-wildcard > > directly follows a '/'. > > Someone else just reported this a couple days ago. Very odd. I suspect > the problem is locale-related; what LOCALE do you run the postmaster in? yes, i just found that out myself (should check my mail more often :)). It seems it doesn't make any difference which locale it is; I tried de_DE (that's what I'm using normally), en_US, en_GB, ... > > Also, it might help to look at the output of EXPLAIN VERBOSE for > the misbehaving query. That would let us see what indexscan limits > are being generated. This is the output of explain verbose select count(*) from test where a like '/%'; (hand-formatted.. first and last time i've done that :)) { AGG :startup_cost 2.02 :total_cost 2.02 :rows 1 :width 4 :state <> :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 23 :restypmod -1 :resname count :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { AGGREG :aggname count :basetype 0 :aggtype 23 :target { CONST :consttype 23 :constlen 4 :constisnull false :constvalue 4 [ 1 0 0 0 ] :constbyval true } :usenulls false :aggstartrue :aggdistinct false } }) :qpqual <> :lefttree { INDEXSCAN :startup_cost 0.00 :total_cost 2.01:rows 1 :width 4 :state <> :qptargetlist <> :qpqual ({ EXPR :typeOid 16 :opType op :oper { OPER :opno1209 :opid 850 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 1 :vartype 25 :vartypmod -1 :varlevelsup 0 :varnoold 1 :varoattno 1} { CONST :consttype 25 :constlen -1 :constisnull false :constvalue 6 [ 6 0 0 0 47 37 ] :constbyval false } ) }) :lefttree <> :righttree <> :extprm () :locprm () :initplan<> :nprm 0 :scanrelid 1 :indxid ( 30208) :indxqual (({ EXPR :typeOid 16 :opType op :oper { OPER:opno 667 :opid 743 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 1 :vartype 25 :vartypmod -1 :varlevelsup 0 :varnoold 1 :varoattno 1} { CONST :consttype 25 :constlen -1 :constisnull false :constvalue 5 [ 5 0 0 0 47 ] :constbyval false } )} { EXPR :typeOid 16 :opType op :oper { OPER :opno 664 :opid 740 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 1 :vartype 25 :vartypmod-1 :varlevelsup 0 :varnoold 1 :varoattno 1} { CONST :consttype 25 :constlen -1:constisnull false :constvalue 5 [ 5 0 0 0 48 ] :constbyval false } ) } )) :indxqualorig (({ EXPR :typeOid 16 :opType op :oper { OPER :opno 667 :opid 743 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 1 :vartype 25 :vartypmod -1 :varlevelsup 0 :varnoold 1 :varoattno 1} { CONST :consttype 25 :constlen -1 :constisnull false :constvalue 5 [ 5 0 0 0 47 ] :constbyval false } ) } { EXPR :typeOid16 :opType op :oper { OPER :opno 664 :opid 740 :opresulttype 16 } :args ({ VAR :varno 1 :varattno1 :vartype 25 :vartypmod -1 :varlevelsup 0 :varnoold 1 :varoattno 1} {CONST :consttype 25 :constlen -1 :constisnull false :constvalue 5 [ 5 0 0 0 48 ] :constbyval false } ) } )) :indxorderdir 1 } :righttree <> :extprm () :locprm () :initplan<> :nprm 0 } It looks like the like-match is converted into a "x >= a and x < b" form of expression (opno 664 is text_lt and opno 667 is text_ge, and 47 == ascii(/)), which doesn't work with collating order in most locales? But it must be more compicated, because the query works when there are less then 80 entries in the index. How many go onto a page? ciao Andreas
Andreas Degert <ad@papyrus-gmbh.de> writes: > Tom Lane <tgl@sss.pgh.pa.us> writes: >> Also, it might help to look at the output of EXPLAIN VERBOSE for >> the misbehaving query. That would let us see what indexscan limits >> are being generated. > This is the output of > explain verbose select count(*) from test where a like '/%'; > (hand-formatted.. first and last time i've done that :)) > :indxqual > (({ EXPR > :typeOid 16 :opType op > :oper { OPER :opno 667 :opid 743 :opresulttype 16 } > :args ({ VAR :varno 1 :varattno 1 :vartype 25 :vartypmod -1 > :varlevelsup 0 :varnoold 1 :varoattno 1} > { CONST :consttype 25 :constlen -1 :constisnull false > :constvalue 5 [ 5 0 0 0 47 ] :constbyval false } > )} > { EXPR > :typeOid 16 :opType op > :oper { OPER :opno 664 :opid 740 :opresulttype 16 } > :args ({ VAR :varno 1 :varattno 1 :vartype 25 :vartypmod -1 > :varlevelsup 0 :varnoold 1 :varoattno 1} > { CONST :consttype 25 :constlen -1 :constisnull false > :constvalue 5 [ 5 0 0 0 48 ] :constbyval false } > ) > } > )) > It looks like the like-match is converted into a "x >= a and x < b" > form of expression (opno 664 is text_lt and opno 667 is text_ge, and > 47 == ascii(/)), which doesn't work with collating order in most > locales? What we've got here is x >= '/' AND x < '0', which should work as far as I can see, unless your machine uses a really peculiar collation order. What happens if you try the query in the form select count(*) from test where a >= '/' and a < '0' Do you get the same behavior? If so, try changing the index bounds to see where it works and stops working. > But it must be more compicated, because the query works when > there are less then 80 entries in the index. How many go onto a page? More than that, I'd think, at least for strings as short as you showed in your example. An index item only has about a dozen bytes of overhead, so for short strings you ought to get three or four hundred per index page. You can check this by looking to see if the index file has grown past its minimum size of 2 pages (16K). The whole thing is quite peculiar. Your example works fine for me; can anyone else duplicate the failure, and if so on what platform? regards, tom lane