Thread: like-operator on index-scan

like-operator on index-scan

From
Andreas Degert
Date:
Hello,

I'd like to know if anyone can reproduce a strange error in V7.02
(debian i386). It happens when an index-scan is done in an index with
more than 80 entries, using a like-match, where the %-wildcard
directly follows a '/'. A simple example (which creates a database
testdb) is appended. In my installation the first count is 99, the
second 0.

ciao
Andreas

#! /bin/sh
set -e
psql <<.
\set ON_ERROR_STOP 1
create database testdb;
\c testdb
create table test (a text);
.
i=1
while test $i -lt 100; do echo "insert into test values('/a');" i=`expr $i + 1`
done | psql -d testdb
psql -d testdb <<.
select count(*) from test where a like '/%';
create index test1 on test(a);
select count(*) from test where a like '/%';
\c template1
drop database testdb;


Re: like-operator on index-scan

From
Tom Lane
Date:
Andreas Degert <ad@papyrus-gmbh.de> writes:
> I'd like to know if anyone can reproduce a strange error in V7.02
> (debian i386). It happens when an index-scan is done in an index with
> more than 80 entries, using a like-match, where the %-wildcard
> directly follows a '/'.

Someone else just reported this a couple days ago.  Very odd.  I suspect
the problem is locale-related; what LOCALE do you run the postmaster in?

Also, it might help to look at the output of EXPLAIN VERBOSE for
the misbehaving query.  That would let us see what indexscan limits
are being generated.
        regards, tom lane


Re: like-operator on index-scan

From
Andreas Degert
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Andreas Degert <ad@papyrus-gmbh.de> writes:
> > I'd like to know if anyone can reproduce a strange error in V7.02
> > (debian i386). It happens when an index-scan is done in an index with
> > more than 80 entries, using a like-match, where the %-wildcard
> > directly follows a '/'.
> 
> Someone else just reported this a couple days ago.  Very odd.  I suspect
> the problem is locale-related; what LOCALE do you run the postmaster in?

yes, i just found that out myself (should check my mail more often :)).

It seems it doesn't make any difference which locale it is; I tried
de_DE (that's what I'm using normally), en_US, en_GB, ...

> 
> Also, it might help to look at the output of EXPLAIN VERBOSE for
> the misbehaving query.  That would let us see what indexscan limits
> are being generated.

This is the output of

explain verbose select count(*) from test where a like '/%';

(hand-formatted.. first and last time i've done that :))

{ AGG :startup_cost 2.02 :total_cost 2.02 :rows 1 :width 4 :state <> :qptargetlist   ({ TARGETENTRY      :resdom {
RESDOM               :resno 1 :restype 23 :restypmod -1                :resname count :reskey 0 :reskeyop 0
  :ressortgroupref 0 :resjunk false }      :expr   { AGGREG                :aggname count :basetype 0 :aggtype 23
        :target { CONST :consttype 23 :constlen 4                                :constisnull false
          :constvalue  4 [ 1 0 0 0 ]                                :constbyval true }                :usenulls false
:aggstartrue                :aggdistinct false }   }) :qpqual <> :lefttree { INDEXSCAN   :startup_cost 0.00 :total_cost
2.01:rows 1 :width 4   :state <> :qptargetlist <>   :qpqual   ({ EXPR      :typeOid 16  :opType op      :oper { OPER
:opno1209 :opid 850 :opresulttype 16 }      :args      ({ VAR :varno 1 :varattno 1 :vartype 25 :vartypmod -1
:varlevelsup 0 :varnoold 1 :varoattno 1}       { CONST :consttype 25 :constlen -1 :constisnull false
:constvalue 6 [ 6 0 0 0 47 37 ]  :constbyval false }      )   })   :lefttree <> :righttree <> :extprm () :locprm ()
:initplan<>   :nprm 0  :scanrelid 1 :indxid ( 30208)   :indxqual   (({ EXPR       :typeOid 16  :opType op       :oper {
OPER:opno 667 :opid 743 :opresulttype 16 }       :args ({ VAR :varno 1 :varattno 1 :vartype 25 :vartypmod -1
       :varlevelsup 0 :varnoold 1 :varoattno 1}              { CONST :consttype 25 :constlen -1 :constisnull false
               :constvalue  5 [ 5 0 0 0 47 ]  :constbyval false }             )}     { EXPR       :typeOid 16  :opType
op      :oper { OPER :opno 664 :opid 740 :opresulttype 16 }       :args ({ VAR :varno 1 :varattno 1 :vartype 25
:vartypmod-1                    :varlevelsup 0 :varnoold 1 :varoattno 1}              { CONST :consttype 25 :constlen
-1:constisnull false                      :constvalue  5 [ 5 0 0 0 48 ]  :constbyval false }             )     }   ))
:indxqualorig  (({ EXPR :typeOid 16  :opType op            :oper { OPER :opno 667 :opid 743 :opresulttype 16 }
 :args ({ VAR :varno 1 :varattno 1 :vartype 25                         :vartypmod -1  :varlevelsup 0 :varnoold 1
                :varoattno 1}                   { CONST :consttype 25 :constlen -1 :constisnull false
       :constvalue  5 [ 5 0 0 0 47 ]                           :constbyval false }                  )     }     { EXPR
:typeOid16  :opType op            :oper { OPER :opno 664 :opid 740 :opresulttype 16 }            :args ({ VAR :varno 1
:varattno1 :vartype 25 :vartypmod -1                         :varlevelsup 0 :varnoold 1 :varoattno 1}
{CONST :consttype 25 :constlen -1 :constisnull false                           :constvalue  5 [ 5 0 0 0 48 ]
              :constbyval false }                  )     }   ))   :indxorderdir 1 } :righttree <> :extprm () :locprm ()
:initplan<> :nprm 0
 
}

It looks like the like-match is converted into a "x >= a and x < b"
form of expression (opno 664 is text_lt and opno 667 is text_ge, and
47 == ascii(/)), which doesn't work with collating order in most
locales? But it must be more compicated, because the query works when
there are less then 80 entries in the index. How many go onto a page?

ciao
Andreas


Re: like-operator on index-scan

From
Tom Lane
Date:
Andreas Degert <ad@papyrus-gmbh.de> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> Also, it might help to look at the output of EXPLAIN VERBOSE for
>> the misbehaving query.  That would let us see what indexscan limits
>> are being generated.

> This is the output of
> explain verbose select count(*) from test where a like '/%';
> (hand-formatted.. first and last time i've done that :))

>     :indxqual
>     (({ EXPR
>         :typeOid 16  :opType op
>         :oper { OPER :opno 667 :opid 743 :opresulttype 16 }
>         :args ({ VAR :varno 1 :varattno 1 :vartype 25 :vartypmod -1
>                      :varlevelsup 0 :varnoold 1 :varoattno 1}
>                { CONST :consttype 25 :constlen -1 :constisnull false
>                        :constvalue  5 [ 5 0 0 0 47 ]  :constbyval false }
>               )}
>       { EXPR
>         :typeOid 16  :opType op
>         :oper { OPER :opno 664 :opid 740 :opresulttype 16 }
>         :args ({ VAR :varno 1 :varattno 1 :vartype 25 :vartypmod -1
>                      :varlevelsup 0 :varnoold 1 :varoattno 1}
>                { CONST :consttype 25 :constlen -1 :constisnull false
>                        :constvalue  5 [ 5 0 0 0 48 ]  :constbyval false }
>               )
>       }
>     ))

> It looks like the like-match is converted into a "x >= a and x < b"
> form of expression (opno 664 is text_lt and opno 667 is text_ge, and
> 47 == ascii(/)), which doesn't work with collating order in most
> locales?

What we've got here is x >= '/' AND x < '0', which should work as far
as I can see, unless your machine uses a really peculiar collation
order.

What happens if you try the query in the form

select count(*) from test where a >= '/' and a < '0'

Do you get the same behavior?  If so, try changing the index bounds to
see where it works and stops working.

> But it must be more compicated, because the query works when
> there are less then 80 entries in the index. How many go onto a page?

More than that, I'd think, at least for strings as short as you showed
in your example.  An index item only has about a dozen bytes of
overhead, so for short strings you ought to get three or four hundred
per index page.  You can check this by looking to see if the index
file has grown past its minimum size of 2 pages (16K).

The whole thing is quite peculiar.  Your example works fine for me;
can anyone else duplicate the failure, and if so on what platform?
        regards, tom lane