LIKE, leading percent, bind parameters and indexes - Mailing list pgsql-hackers

From Rodrigo Hjort
Subject LIKE, leading percent, bind parameters and indexes
Date
Msg-id 731083980605231611p5189ea98j762c76debf3006af@mail.gmail.com
Whole thread Raw
Responses Re: LIKE, leading percent, bind parameters and indexes
List pgsql-hackers
PG-Hackers,<br /><br />I got the following picture:<br /><br />detran=# \d sa_dut.tb_usuario<br />                   
Table"sa_dut.tb_usuario"<br />         Column          |            Type             | Modifiers<br
/>-------------------------+-----------------------------+-----------<br /> numprocesso             |
bigint                     | not null<br /> nome                    | character varying(44)       |<br
/> nomemae                | character varying(44)       |<br /> datanascimento          | date                        |
<br/>Indexes:<br />   "tb_usuario_pkey" PRIMARY KEY, btree (numprocesso)<br />   "ix_usuario_11" btree (nome
varchar_pattern_ops,nomemae varchar_pattern_ops)<br />   "ix_usuario_13" btree (datanascimento, nome
varchar_pattern_ops)<br /><br />As I do not use C locale, I created indexes based on "varchar_pattern_ops".<br />The
issueI'm having is based on the following queries:<br /><br />select * from TB_USUARIO where nome like 'TATIANA
CRISTINAG%'; <br />select * from TB_USUARIO where nome like '%TATIANA CRISTINA G%';<br /><br />For some reasons, I'm
notusing text-search engines, like TSearch2, but only the LIKE operator.<br />Here are the query plans involved:<br
/><br/><br /> detran=# explain analyze select count(*) as x0_0_ from sa_dut.TB_PROCESSO processo0_, sa_dut.TB_USUARIO
usuario1_where (usuario1_.NOME like 'TATIANA CRISTINA G%'  and processo0_.NUMPROCESSO=usuario1_.NUMPROCESSO);<br /><br
/>QUERY PLAN<br
/>--------------------------------------------------------------------------------------------------------------------------------------------------------<br
/> Aggregate (cost=11.94..11.95 rows=1 width=0) (actual time= 143.970..143.972 rows=1 loops=1)<br />  ->  Nested
Loop (cost=0.00..11.94 rows=1 width=0) (actual time=143.935..143.949 rows=1 loops=1)<br />        ->  Index Scan
usingix_usuario_11 on tb_usuario usuario1_ (cost=0.00..6.01 rows=1 width=8) (actual time=93.884..93.889 rows=1
loops=1)<br/>              Index Cond: (((nome)::text ~>=~ 'TATIANA CRISTINA G'::character varying) AND
((nome)::text~<~ 'TATIANA CRISTINA H'::character varying))<br />               Filter: ((nome)::text ~~ 'TATIANA
CRISTINAG%'::text)<br />        ->  Index Scan using tb_processo_pkey on tb_processo processo0_  (cost=0.00..5.91
rows=1width=8) (actual time=50.041..50.044 rows=1 loops=1) <br />              Index Cond: (processo0_.numprocesso =
"outer".numprocesso)<br/> Total runtime: 144.176 ms<br /><br />detran=# explain analyze select count(*) as x0_0_ from
sa_dut.TB_PROCESSOprocesso0_, sa_dut.TB_USUARIO usuario1_ where <br />(usuario1_.NOME like '%TATIANA CRISTINA G%'  and
processo0_.NUMPROCESSO=usuario1_.NUMPROCESSO);<br/><br />QUERY PLAN<br
/>-----------------------------------------------------------------------------------------------------------------------------------------------------
<br/> Aggregate  (cost=67534.55..67534.56 rows=1 width=0) (actual time=8101.957..8101.959 rows=1 loops=1)<br />  -> 
NestedLoop  (cost=0.00..67534.55 rows=1 width=0) (actual time=5404.106..8101.923 rows=1 loops=1)<br />        -> 
SeqScan on tb_usuario usuario1_  (cost= 0.00..67528.62 rows=1 width=8) (actual time=5404.056..8101.862 rows=1
loops=1)<br/>              Filter: ((nome)::text ~~ '%TATIANA CRISTINA G%'::text)<br />        ->  Index Scan using
tb_processo_pkeyon tb_processo<br /> processo0_  (cost=0.00..5.91 rows=1 width=8) (actual time=0.034..0.037 rows=1
loops=1)<br/>              Index Cond: (processo0_.numprocesso = "outer".numprocesso)<br /> Total runtime: 8102.105
ms<br/><br /><br />We use Java, and recently we made an effort in order to avoid the leading '%' on LIKE expressions.
<br/>The problem is that it wasn't solved, and then I made the following Java code to verify it.<br /><br />What
happensis that only the "004" block uses the index! The "002" code, which also has no leading percent, does a
sequentialscan. The difference between them is that "002" uses bind parameters. <br /><br />Is it concerned to the JDBC
Driveror PostgreSQL itself? What could be done in order to fix it?<br />I could use static parameters, but then the
querieswould have to be reparsed each time on the backend, missing cache advantages. <br /><br
/>****************************************************************************************************<br/>package
db;<br/><br />import java.sql.Connection;<br />import java.sql.DriverManager;<br />import java.sql.PreparedStatement
;<br/>import java.sql.ResultSet;<br />import java.sql.SQLException;<br /><br />public class SelectLike {<br /><br />  
publicSelectLike() {<br />       long qtd = 0L, inicio = 0L, tempo[] = {0,0,0,0};<br /><br />       try {<br
/>          Class.forName("org.postgresql.Driver");<br />       } catch (ClassNotFoundException e) {<br />          
e.printStackTrace();<br/>       }<br /><br />       Connection con = null;<br />       String dbURL =
"jdbc:postgresql://10.15.61.6/database";<br />       try {<br />           con = DriverManager.getConnection(dbURL,
"user","password");<br /><br />           String sql = "select count(*) as x0_0_ from<br />sa_dut.TB_PROCESSO
processo0_,sa_dut.TB_USUARIO usuario1_ where <br />(usuario1_.NOME like ? and
processo0_.NUMPROCESSO=usuario1_.NUMPROCESSO)";<br/>           String nome = "TATIANA CRISTINA G";<br /><br
/>          PreparedStatement ps = null;<br />           ResultSet rs = null; <br /><br />           //001 - '%NAME%'
binded<br/>           if (ps != null) ps.close();<br />           ps = con.prepareStatement(sql);<br />          
ps.setString(1,"%" + nome + "%");<br />           inicio = System.currentTimeMillis();<br />           rs =
ps.executeQuery();<br/>           rs.next();<br />           qtd = rs.getLong(1);<br />           rs.close();<br
/>          tempo[0] = System.currentTimeMillis() - inicio;<br /><br />            //002 - 'NAME%' binded<br
/>          if (ps != null) ps.close();<br />           ps = con.prepareStatement(sql);<br />           ps.setString(1,
nome+ "%");<br />           inicio = System.currentTimeMillis ();<br />           rs = ps.executeQuery();<br
/>          rs.next();<br />           qtd = rs.getLong(1);<br />           rs.close();<br />           tempo[1] =
System.currentTimeMillis()- inicio;<br /><br />           //003 - '%NAME%' static <br />           if (ps != null)
ps.close();<br/>           String sql1 = sql.replaceFirst("\\?", "'%" + nome + "%'");<br />           ps =
con.prepareStatement(sql1);<br/>           inicio = System.currentTimeMillis ();<br />           rs =
ps.executeQuery();<br/>           rs.next();<br />           qtd = rs.getLong(1);<br />           rs.close();<br
/>          tempo[2] = System.currentTimeMillis() - inicio;<br /><br />           //004 - 'NAME%' static <br
/>          if (ps != null) ps.close();<br />           String sql2 = sql.replaceFirst("\\?", "'" + nome + "%'");<br
/>          ps = con.prepareStatement(sql2);<br />           inicio = System.currentTimeMillis ();<br />           rs =
ps.executeQuery();<br/>           rs.next();<br />           qtd = rs.getLong(1);<br />           rs.close();<br
/>          tempo[3] = System.currentTimeMillis() - inicio;<br /><br />           ps.close();<br />           
con.close();<br/>       } catch (SQLException e) {<br />           e.printStackTrace();<br />       }<br /><br />      
System.out.println("QTD:" + qtd + "\n\n");<br />       for (int ii = 0; ii < tempo.length; ii++)<br />          
System.out.println(ii+ ": " + tempo[ii]);<br />   }<br /><br />   public static void main(String[] args) {<br />      
newSelectLike();<br />   }<br /><br />}<br
/>****************************************************************************************************<br /><br />--
<br/>Regards,<br /><br />Rodrigo Hjort<br /><a href="http://icewall.org/~hjort">http://icewall.org/~hjort</a><br /><br
/>

pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: SQL compliant interval implementation
Next
From: Josh Berkus
Date:
Subject: Re: SQL compliant interval implementation