Re: the IN clause saga - Mailing list pgsql-jdbc

From Tom Lane
Subject Re: the IN clause saga
Date
Msg-id 6765.1058885511@sss.pgh.pa.us
Whole thread Raw
In response to Re: the IN clause saga  (Dmitry Tkach <dmitry@openratings.com>)
Responses Re: the IN clause saga  (Dmitry Tkach <dmitry@openratings.com>)
Re: the IN clause saga  (Oliver Jowett <oliver@opencloud.com>)
List pgsql-jdbc
Dmitry Tkach <dmitry@openratings.com> writes:
> Fernando Nasser wrote:
>> The feature 2 in 7.4 backends is of limited use as the planner does
>> not know about the list, so the generated plan will not be as good as
>> if you pass the list with fixed values since the beginning.

> This is the same problem, as it generally exists with x=? - the query
> plan is generally not as good as x=1, because the planner doesn't know
> the value to use with statistics.

No, it's not the same thing --- the planner can generate an indexscan
plan when scalar params are involved, although it might choose not to.
The planner is simply not aware that any comparable optimization might
be possible when using the new array syntax.  Let me attach the example
I sent Fernando last night ...


Fernando Nasser <fnasser@redhat.com> writes:
> PREPARE tststmt (integer[]) AS SELECT * from testmetadata where id IN (?);
> PREPARE tststmt (integer, integer) AS SELECT * from testmetadata where id IN (?, ?);
> all give parsing errors.

The second case works fine from the command line:

regression=# prepare z(int,int) as select * from tenk1 where unique1 in ($1,$2);
PREPARE
regression=# execute z(42,66);
 unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even |
stringu1| stringu2 | string4  

---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
      42 |    5530 |   0 |    2 |   2 |      2 |      42 |       42 |          42 |        42 |       42 |  84 |   85 |
QBAAAA  | SEIAAA   | OOOOxx 
      66 |    6723 |   0 |    2 |   6 |      6 |      66 |       66 |          66 |        66 |       66 | 132 |  133 |
OCAAAA  | PYJAAA   | VVVVxx 
(2 rows)

Perhaps JDBC has some problem with it?

I would not expect the first case to work, since it violates the plain
meaning of IN.  But Joe Conway has implemented some non-SQL syntax to
handle that in 7.4:

regression=# prepare zz(int[]) as select * from tenk1 where unique1 = ANY ($1);
PREPARE
regression=# execute zz(ARRAY[42,66]);
 unique1 | unique2 | two | four | ten | twenty | hundred | thousand | twothousand | fivethous | tenthous | odd | even |
stringu1| stringu2 | string4  

---------+---------+-----+------+-----+--------+---------+----------+-------------+-----------+----------+-----+------+----------+----------+---------
      42 |    5530 |   0 |    2 |   2 |      2 |      42 |       42 |          42 |        42 |       42 |  84 |   85 |
QBAAAA  | SEIAAA   | OOOOxx 
      66 |    6723 |   0 |    2 |   6 |      6 |      66 |       66 |          66 |        66 |       66 | 132 |  133 |
OCAAAA  | PYJAAA   | VVVVxx 
(2 rows)

I should warn you though that this is not yet executed efficiently; the
planner has no idea about reducing it to a set of indexscans.  Compare

regression=# explain analyze execute z(42,66);
                                                             QUERY PLAN
             

-------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using tenk1_unique1, tenk1_unique1 on tenk1  (cost=0.00..12.02 rows=2 width=244) (actual time=0.28..0.48
rows=2loops=1) 
   Index Cond: ((unique1 = $1) OR (unique1 = $2))
 Total runtime: 1.35 msec
(3 rows)

regression=# explain analyze execute zz(ARRAY[42,66]);
                                              QUERY PLAN
-------------------------------------------------------------------------------------------------------
 Seq Scan on tenk1  (cost=0.00..708.00 rows=5000 width=244) (actual time=70.03..126.16 rows=2 loops=1)
   Filter: (unique1 = ANY ($1))
 Total runtime: 126.78 msec
(3 rows)


Perhaps we can make it work better in 7.5.

            regards, tom lane

pgsql-jdbc by date:

Previous
From: Darin Ohashi
Date:
Subject: Re: the IN clause saga
Next
From: Dmitry Tkach
Date:
Subject: Re: the IN clause saga