Performance difference in similar queries - Mailing list pgsql-performance

From Kiran Mukhyala
Subject Performance difference in similar queries
Date
Msg-id 1102113071.31836.107.camel@scifi.gene.com
Whole thread Raw
List pgsql-performance
Hi Folks,

        I have two queries that are of the form :
select ... from ... where ... in (list1) AND ... in (list2). The two
queries differ only in the size of list2 by 1, but their performances
are quite different. Query2 runs much faster than Query1. The queries
are:

Query 1:
SELECT svm,pmodel_id,pseq_id FROM paprospect2 WHERE pseq_id in
(8880,10507,10600,10605,10724,10852 ...) AND pmodel_id in
(4757,8221,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0);

Query 2:
SELECT svm,pmodel_id,pseq_id FROM paprospect2 WHERE pseq_id in
(8880,10507,10600,10605,10724,10852 ...) AND pmodel_id in
(4757,8221,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0);

=>Notice the extra zero at the end of query2. The size of list1 is 800
and size of list2 is 49 in case of query1 and 50 in case of query2

The Query Plans are:

QUERY PLAN 1:

Index Scan using paprospect2_search1, paprospect2_search1,
paprospect2_search1, paprospect2_search1, paprospect2_search1,
paprospect2_search1, paprospect2_search1, paprospect2_search1 ..........
(cost=0.00..10959914.42 rows=45760 width=12)

   Index Cond: ((pmodel_id = 4757) OR (pmodel_id = 8221) OR (pmodel_id =
0) OR (pmodel_id = 0) OR (pmodel_id = 0) OR (pmodel_id = 0) OR
(pmodel_id = 0) OR (pmodel_id = 0) OR ...)

   Filter: ((pseq_id = 0) OR (pseq_id = 8880) OR (pseq_id = 10507) OR
(pseq_id = 10600) OR (pseq_id = 10605) OR ...)

QUERY PLAN 2:

Index Scan using
paprospect2_pseq_id_params_id,paprospect2_pseq_id_params_id,
paprospect2_pseq_id_params_id, paprospect2_pseq_id_params_id,
paprospect2_pseq_id_params_id, paprospect2_pseq_id_params_id,
paprospect2_pseq_id_params_id, paprospect2_pseq_id_params_id,
paprospect2_pseq_id_params_id, paprospect2_pseq_id_params_id,
paprospect2_pseq_id_params_id, paprospect2_pseq_id_params_id .......
(cost=0.00..11050741.64 rows=46520 width=12)

   Index Cond: ((pseq_id = 0) OR (pseq_id = 8880) OR (pseq_id = 10507)
OR (pseq_id = 10600) OR (pseq_id = 10605) OR (pseq_id = 10724) OR
(pseq_id = 10852) OR (pseq_id = 10905) OR (pseq_id = 10945) OR (pseq_id
= 10964)....)

Filter: ((pmodel_id = 4757) OR (pmodel_id = 8221) OR (pmodel_id = 0) OR
(pmodel_id = 0) OR (pmodel_id = 0) OR ...)

=> Notice that the Index, Index Cond. and Filter are different in the
two plans.
In short the query plans and performance are quite different although
the queries are similar. Can you please explain the difference in
performance? Thank you,

-Kiran





pgsql-performance by date:

Previous
From: Josh Berkus
Date:
Subject: Re: DB2 feature
Next
From: Pailloncy Jean-Gérard
Date:
Subject: Re: DB2 feature