Thread: Mistake in my query or Index Scan on subquery failure? (7.4)

Mistake in my query or Index Scan on subquery failure? (7.4)

From
"Gellert, Andre"
Date:
Hello,
I have a problem with this simple query :

explain select * from ref_artikel a where a.artnr in (  351275 , 351346 ,
293082 ) LIMIT 20 OFFSET 0 ;
                                                            QUERY PLAN

----------------------------------------------------------------------------
------------------------------------------------------
 Limit  (cost=0.00..9.06 rows=3 width=299)
   ->  Index Scan using ref_artikel_pkey, ref_artikel_pkey, \
       ref_artikel_pkey on ref_artikel a  (cost=0.00..9.06 rows=3 width=299)
         Index Cond: ((artnr = 351275) OR (artnr = 351346) OR (artnr =
293082))
(3 rows)

Is it okay , that the word ref_artikel_pkey will be repeated for each
condition ?

How can i optimize this, i have a number of up to 3000 "artnr" cond. to
check for equality
( and more tables joined over this query which i left away for better
understanding ) ,
is a  "(x=1) or (x=2)..." faster than "x in 1,2... " at this large number of
checks ?

Maybe this is the hash-joined bug marked for this 7.4 postgresql....
I consider upgrading to 7.41 on another testing server....

Thx for ideas,
Andre

Re: Mistake in my query or Index Scan on subquery failure?

From
Nick Barr
Date:
Gellert, Andre wrote:

>Hello,
>I have a problem with this simple query :
>
>explain select * from ref_artikel a where a.artnr in (  351275 , 351346 ,
>293082 ) LIMIT 20 OFFSET 0 ;
>                                                            QUERY PLAN
>
>----------------------------------------------------------------------------
>------------------------------------------------------
> Limit  (cost=0.00..9.06 rows=3 width=299)
>   ->  Index Scan using ref_artikel_pkey, ref_artikel_pkey, \
>       ref_artikel_pkey on ref_artikel a  (cost=0.00..9.06 rows=3 width=299)
>         Index Cond: ((artnr = 351275) OR (artnr = 351346) OR (artnr =
>293082))
>(3 rows)
>
>Is it okay , that the word ref_artikel_pkey will be repeated for each
>condition ?
>
>How can i optimize this, i have a number of up to 3000 "artnr" cond. to
>check for equality
>( and more tables joined over this query which i left away for better
>understanding ) ,
>is a  "(x=1) or (x=2)..." faster than "x in 1,2... " at this large number of
>checks ?
>
>Maybe this is the hash-joined bug marked for this 7.4 postgresql....
>I consider upgrading to 7.41 on another testing server....
>
>Thx for ideas,
>Andre
>
>---------------------------(end of broadcast)---------------------------
>TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>
As far as I can see this is fine, especially for a small number of
values in the IN clause. If I understand you correctly the number of
values in the IN clause might extend to 3000. This would not be
particularly efficient and might end in a sequential scan.

For example,

explain select * from ref_artikel a where a.artnr in ( 351275 , 351346 ,
293082, ... <2997 more values> ) LIMIT 20 OFFSET 0 ;


A better method if you are going to use a lot of numbers would be something like the following:

-----
BEGIN;

CREATE TEMPORARY TABLE temp_num_2004_03_03 (temp_num int4);

INSERT INTO temp_num_2004_03_03 (351275);
...
INSERT INTO temp_num_2004_03_03 (293082);

CREATE INDEX temp_num_2004_03_03_temp_num_idx ON temp_num_2004_03_03 (temp_num);

SELECT * FROM ref_artikel a, temp_num_2004_03_03 b WHERE a.artnr=b.temp_num ORDER BY a.artnr LIMIT 20 OFFSET 0;

DROP INDEX temp_num_2004_03_03_temp_num_idx;
DROP TABLE temp_num_2004_03_03;

COMMIT/ABORT;
-----

This is especially true if you are going to use the set of numbers again in other queries that follow.

BTW, you will need to put an ORDER BY clause in your SELECT to guarentee the order of the rows that come back,
especiallywhen using the LIMIT/OFFSET clauses. 



Nick



Re: Mistake in my query or Index Scan on subquery failure? (7.4)

From
Tom Lane
Date:
"Gellert, Andre" <AGellert@ElectronicPartner.de> writes:
> Is it okay , that the word ref_artikel_pkey will be repeated for each
> condition ?

That's what it's supposed to do.  EXPLAIN is trying to tell you that
this query involves three independent index probes, which should not be
very surprising.

            regards, tom lane