Thread: Horribly slow query/ sequential scan

Horribly slow query/ sequential scan

From
"Gregory S. Williamson"
Date:
This is a query migrated from postgres. In postgres it runs about 10,000 times *slower* than on informix on somewhat
newerhardware. The problem is entirely due to the planner. This  PostgreSQL 8.1.4 on linux, 2 gigs of ram. 

The table:
          Table "reporting.bill_rpt_work"
    Column     |         Type          | Modifiers
---------------+-----------------------+-----------
 report_id     | integer               |
 client_id     | character varying(10) |
 contract_id   | integer               | not null
 rate          | numeric               | not null
 appid         | character varying(10) | not null
 userid        | text                  | not null
 collection_id | integer               | not null
 client_name   | character varying(60) |
 use_sius      | integer               | not null
 is_subscribed | integer               | not null
 hits          | numeric               | not null
 sius          | numeric               | not null
 total_amnt    | numeric               | not null
 royalty_total | numeric               |
Indexes:
    "billrptw_ndx" UNIQUE, btree (report_id, client_id, contract_id, rate, appid, userid, collection_id)
    "billrpt_cntrct_ndx" btree (report_id, contract_id, client_id)
    "billrpt_collid_ndx" btree (report_id, collection_id, client_id, contract_id)
Foreign-key constraints:
    "$1" FOREIGN KEY (report_id) REFERENCES billing_reports(report_id)
    "$2" FOREIGN KEY (client_id) REFERENCES "work".clients(client_id)


The query:
explain analyze select
w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS IUs,
sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits,
sum(w.sius) * w.rate AS BYIUS
from bill_rpt_work w, billing_reports b
where w.report_id in
(select b.report_id from billing_reports where b.report_s_date = '2006-09-30')
and (w.client_id = '227400001' or w.client_id = '2274000010')
group by 1,2,3
order by 1,2,3;
                                                                  QUERY PLAN
----------------------------------------------------------------------------------------------------------------
------------------------------
 GroupAggregate  (cost=298061335.44..298259321.62 rows=26 width=58) (actual time=372213.673..372213.689 rows=2 loops=1)
   ->  Sort  (cost=298061335.44..298083333.83 rows=8799354 width=58) (actual time=372213.489..372213.503 rows=37
loops=1)
         Sort Key: w.appid, w.rate, w.is_subscribed
         ->  Nested Loop  (cost=0.00..296121313.45 rows=8799354 width=58) (actual time=286628.486..372213.053 rows=37
loops=1)
               Join Filter: (subplan)
               ->  Seq Scan on bill_rpt_work w  (cost=0.00..85703.20 rows=11238 width=62) (actual time=1.239..1736.746
rows=61020loops=1) 
                     Filter: (((client_id)::text = '227400001'::text) OR ((client_id)::text = '2274000010'::text))
               ->  Seq Scan on billing_reports b  (cost=0.00..29.66 rows=1566 width=8) (actual time=0.001..0.879
rows=1566loops=61020) 
               SubPlan
                 ->  Result  (cost=0.00..29.66 rows=1566 width=0) (actual time=0.000..0.002 rows=1 loops=95557320)
                       One-Time Filter: ($1 = '2006-09-30'::date)
                       ->  Seq Scan on billing_reports  (cost=0.00..29.66 rows=1566 width=0) (actual time=0.001..0.863
rows=1565loops=61020) 
 Total runtime: 372214.085 ms


Informix uses report id/client id as an index, thus eliminating a huge number of rows. The table has 2280545 rows
currently;slightly fewer when the above analyze was run. Informix has about 5 times as much data. 

select count(*) from bill_rpt_work where report_id in (select report_id from billing_reports where report_s_date =
'2006-09-30')and (client_id = '227400001' or client_id = '2274000010'); 
 count
-------
    37
(1 row)

So scanning everything seems particularly senseless.

I had some success adding client id and report id to the initial select list, but that causes all sorts of problems in
callingprocedures that expect different data grouping. 

Any suggestion would be welcome because this is a horrible show stopper.

Thanks,

Greg Williamson
DBA
GlobeXplorer LLC




Re: Horribly slow query/ sequential scan

From
db@zigo.dhs.org
Date:
I don't think I understand the idea behind this query. Do you really need
billing_reports twice?

> The query:
> explain analyze select
> w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS IUs,
> sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits,
> sum(w.sius) * w.rate AS BYIUS
> from bill_rpt_work w, billing_reports b
> where w.report_id in
> (select b.report_id from billing_reports where b.report_s_date =
> '2006-09-30')
> and (w.client_id = '227400001' or w.client_id = '2274000010')
> group by 1,2,3
> order by 1,2,3;

Maybe this is the query you want instead?

select w.appid,
       w.rate,
       w.is_subscribed,
       sum(w.hits) AS Hits,
       sum(w.sius) AS IUs,
       sum(w.total_amnt) AS Total,
       sum(w.hits) * w.rate AS ByHits,
       sum(w.sius) * w.rate AS BYIUS
  from bill_rpt_work w
 where w.report_id in
       (select b.report_id from billing_reports b where b.report_s_date =
'2006-09-30')
   and (w.client_id = '227400001' or w.client_id = '2274000010')
group by 1,2,3
order by 1,2,3;

/Dennis


Re: Horribly slow query/ sequential scan

From
"Gregory S. Williamson"
Date:
Voila ! You da man !

& other expressions of awe and appreciation ...

HAving burdened others with my foolishness too often, I hesitate to ask, but could someone either point me to a
referenceor explain what the difference might be ... I can see it with the eyes but I am having trouble understanding
whatInformix might have been doing to my (bad ?) SQL to "fix" the query. Seeing a redundancy and eliminating it ? 

The explain analyze for "db"'s sql (slightly faster than Informix on an older Sun machine ... about 20%):
 GroupAggregate  (cost=64.35..64.75 rows=8 width=58) (actual time=0.612..0.629 rows=2 loops=1)
   ->  Sort  (cost=64.35..64.37 rows=8 width=58) (actual time=0.463..0.476 rows=37 loops=1)
         Sort Key: w.appid, w.rate, w.is_subscribed
         ->  Nested Loop  (cost=8.11..64.23 rows=8 width=58) (actual time=0.130..0.211 rows=37 loops=1)
               Join Filter: ("inner".report_id = "outer".report_id)
               ->  HashAggregate  (cost=3.95..3.96 rows=1 width=4) (actual time=0.035..0.035 rows=1 loops=1)
                     ->  Index Scan using billrpt_sdate_ndx on billing_reports b  (cost=0.00..3.94 rows=1 width=4)
(actualtime=0.021..0.023 rows=1 loops=1) 
                           Index Cond: (report_s_date = '2006-09-30'::date)
               ->  Bitmap Heap Scan on bill_rpt_work w  (cost=4.17..59.92 rows=28 width=62) (actual time=0.084..0.111
rows=37loops=1) 
                     Recheck Cond: (((w.report_id = "outer".report_id) AND ((w.client_id)::text = '227400001'::text))
OR((w.report_id = "outer".report_id) AND ((w.client_id)::text = '2274000010'::text))) 
                     ->  BitmapOr  (cost=4.17..4.17 rows=28 width=0) (actual time=0.078..0.078 rows=0 loops=1)
                           ->  Bitmap Index Scan on billrptw_ndx  (cost=0.00..2.08 rows=14 width=0) (actual
time=0.053..0.053rows=22 loops=1) 
                                 Index Cond: ((w.report_id = "outer".report_id) AND ((w.client_id)::text =
'227400001'::text))
                           ->  Bitmap Index Scan on billrptw_ndx  (cost=0.00..2.08 rows=14 width=0) (actual
time=0.024..0.024rows=15 loops=1) 
                                 Index Cond: ((w.report_id = "outer".report_id) AND ((w.client_id)::text =
'2274000010'::text))
 Total runtime: 6.110 ms
(16 rows)

Thanks again (and sorry for the top-posting but this particular interface is ungainly)

G

-----Original Message-----
From:    db@zigo.dhs.org [mailto:db@zigo.dhs.org]
Sent:    Tue 1/9/2007 4:35 AM
To:    Gregory S. Williamson
Cc:    pgsql-performance@postgresql.org
Subject:    Re: [PERFORM] Horribly slow query/ sequential scan

I don't think I understand the idea behind this query. Do you really need
billing_reports twice?

> The query:
> explain analyze select
> w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS IUs,
> sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits,
> sum(w.sius) * w.rate AS BYIUS
> from bill_rpt_work w, billing_reports b
> where w.report_id in
> (select b.report_id from billing_reports where b.report_s_date =
> '2006-09-30')
> and (w.client_id = '227400001' or w.client_id = '2274000010')
> group by 1,2,3
> order by 1,2,3;

Maybe this is the query you want instead?

select w.appid,
       w.rate,
       w.is_subscribed,
       sum(w.hits) AS Hits,
       sum(w.sius) AS IUs,
       sum(w.total_amnt) AS Total,
       sum(w.hits) * w.rate AS ByHits,
       sum(w.sius) * w.rate AS BYIUS
  from bill_rpt_work w
 where w.report_id in
       (select b.report_id from billing_reports b where b.report_s_date =
'2006-09-30')
   and (w.client_id = '227400001' or w.client_id = '2274000010')
group by 1,2,3
order by 1,2,3;

/Dennis



-------------------------------------------------------
Click link below if it is SPAM gsw@globexplorer.com

"https://mailscanner.globexplorer.com/dspam/dspam.cgi?signatureID=45a38b1548991076418835&user=gsw@globexplorer.com&retrain=spam&template=history&history_page=1"
!DSPAM:45a38b1548991076418835!
-------------------------------------------------------




Voi

Re: Horribly slow query/ sequential scan

From
Nörder-Tuitje, Marcus
Date:
Forget abount "IN". Its horribly slow.

try :

select w.appid,
       w.rate,
       w.is_subscribed,
       sum(w.hits) AS Hits,
       sum(w.sius) AS IUs,
       sum(w.total_amnt) AS Total,
       sum(w.hits) * w.rate AS ByHits,
       sum(w.sius) * w.rate AS BYIUS
  from bill_rpt_work w
 where (select b.report_id from billing_reports b where b.report_s_date = '2006-09-30' and w.report_id = b.report_id)
   and w.client_id IN ('227400001','2274000010')
group by 1,2,3
order by 1,2,3;



should by faster;

assuming : index on report_id in b; index on report_id, client_id in w

to enforce useage of indexes on grouping (depends on result size), consider extending w with cols 1,2,3.


regards,
marcus

-----Ursprüngliche Nachricht-----
Von: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org]Im Auftrag von
db@zigo.dhs.org
Gesendet: Dienstag, 9. Januar 2007 13:36
An: Gregory S. Williamson
Cc: pgsql-performance@postgresql.org
Betreff: Re: [PERFORM] Horribly slow query/ sequential scan


I don't think I understand the idea behind this query. Do you really need
billing_reports twice?

> The query:
> explain analyze select
> w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS IUs,
> sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits,
> sum(w.sius) * w.rate AS BYIUS
> from bill_rpt_work w, billing_reports b
> where w.report_id in
> (select b.report_id from billing_reports where b.report_s_date =
> '2006-09-30')
> and (w.client_id = '227400001' or w.client_id = '2274000010')
> group by 1,2,3
> order by 1,2,3;

Maybe this is the query you want instead?

select w.appid,
       w.rate,
       w.is_subscribed,
       sum(w.hits) AS Hits,
       sum(w.sius) AS IUs,
       sum(w.total_amnt) AS Total,
       sum(w.hits) * w.rate AS ByHits,
       sum(w.sius) * w.rate AS BYIUS
  from bill_rpt_work w
 where w.report_id in
       (select b.report_id from billing_reports b where b.report_s_date =
'2006-09-30')
   and (w.client_id = '227400001' or w.client_id = '2274000010')
group by 1,2,3
order by 1,2,3;

/Dennis


---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

                http://www.postgresql.org/about/donate



Re: Horribly slow query/ sequential scan

From
"Gregory S. Williamson"
Date:
Thanks for the suggestion ... I will try it when I've had some sleep and the server is quiet again ... the IN seems to
haveimproved markedly since the 7.4 release, as advertised, so I will be interested in trying this. 

GSW

-----Original Message-----
From:    Nörder-Tuitje, Marcus [mailto:noerder-tuitje@technology.de]
Sent:    Tue 1/9/2007 4:50 AM
To:    db@zigo.dhs.org; Gregory S. Williamson
Cc:    pgsql-performance@postgresql.org
Subject:    AW: [PERFORM] Horribly slow query/ sequential scan

Forget abount "IN". Its horribly slow.

try :

select w.appid,
       w.rate,
       w.is_subscribed,
       sum(w.hits) AS Hits,
       sum(w.sius) AS IUs,
       sum(w.total_amnt) AS Total,
       sum(w.hits) * w.rate AS ByHits,
       sum(w.sius) * w.rate AS BYIUS
  from bill_rpt_work w
 where (select b.report_id from billing_reports b where b.report_s_date = '2006-09-30' and w.report_id = b.report_id)
   and w.client_id IN ('227400001','2274000010')
group by 1,2,3
order by 1,2,3;



should by faster;

assuming : index on report_id in b; index on report_id, client_id in w

to enforce useage of indexes on grouping (depends on result size), consider extending w with cols 1,2,3.


regards,
marcus

-----Ursprüngliche Nachricht-----
Von: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org]Im Auftrag von
db@zigo.dhs.org
Gesendet: Dienstag, 9. Januar 2007 13:36
An: Gregory S. Williamson
Cc: pgsql-performance@postgresql.org
Betreff: Re: [PERFORM] Horribly slow query/ sequential scan


I don't think I understand the idea behind this query. Do you really need
billing_reports twice?

> The query:
> explain analyze select
> w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS IUs,
> sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits,
> sum(w.sius) * w.rate AS BYIUS
> from bill_rpt_work w, billing_reports b
> where w.report_id in
> (select b.report_id from billing_reports where b.report_s_date =
> '2006-09-30')
> and (w.client_id = '227400001' or w.client_id = '2274000010')
> group by 1,2,3
> order by 1,2,3;

Maybe this is the query you want instead?

select w.appid,
       w.rate,
       w.is_subscribed,
       sum(w.hits) AS Hits,
       sum(w.sius) AS IUs,
       sum(w.total_amnt) AS Total,
       sum(w.hits) * w.rate AS ByHits,
       sum(w.sius) * w.rate AS BYIUS
  from bill_rpt_work w
 where w.report_id in
       (select b.report_id from billing_reports b where b.report_s_date =
'2006-09-30')
   and (w.client_id = '227400001' or w.client_id = '2274000010')
group by 1,2,3
order by 1,2,3;

/Dennis


---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

                http://www.postgresql.org/about/donate




-------------------------------------------------------
Click link below if it is SPAM gsw@globexplorer.com

"https://mailscanner.globexplorer.com/dspam/dspam.cgi?signatureID=45a38ea050372117817174&user=gsw@globexplorer.com&retrain=spam&template=history&history_page=1"
!DSPAM:45a38ea050372117817174!
-------------------------------------------------------






Re: Horribly slow query/ sequential scan

From
Dave Cramer
Date:
On 9-Jan-07, at 7:50 AM, Nörder-Tuitje, Marcus wrote:

> Forget abount "IN". Its horribly slow.

I think that statement above was historically correct, but is now
incorrect.  IN has been optimized quite significantly since 7.4

Dave
>
> try :
>
> select w.appid,
>        w.rate,
>        w.is_subscribed,
>        sum(w.hits) AS Hits,
>        sum(w.sius) AS IUs,
>        sum(w.total_amnt) AS Total,
>        sum(w.hits) * w.rate AS ByHits,
>        sum(w.sius) * w.rate AS BYIUS
>   from bill_rpt_work w
>  where (select b.report_id from billing_reports b where
> b.report_s_date = '2006-09-30' and w.report_id = b.report_id)
>    and w.client_id IN ('227400001','2274000010')
> group by 1,2,3
> order by 1,2,3;
>
>
>
> should by faster;
>
> assuming : index on report_id in b; index on report_id, client_id in w
>
> to enforce useage of indexes on grouping (depends on result size),
> consider extending w with cols 1,2,3.
>
>
> regards,
> marcus
>
> -----Ursprüngliche Nachricht-----
> Von: pgsql-performance-owner@postgresql.org
> [mailto:pgsql-performance-owner@postgresql.org]Im Auftrag von
> db@zigo.dhs.org
> Gesendet: Dienstag, 9. Januar 2007 13:36
> An: Gregory S. Williamson
> Cc: pgsql-performance@postgresql.org
> Betreff: Re: [PERFORM] Horribly slow query/ sequential scan
>
>
> I don't think I understand the idea behind this query. Do you
> really need
> billing_reports twice?
>
>> The query:
>> explain analyze select
>> w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS
>> IUs,
>> sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits,
>> sum(w.sius) * w.rate AS BYIUS
>> from bill_rpt_work w, billing_reports b
>> where w.report_id in
>> (select b.report_id from billing_reports where b.report_s_date =
>> '2006-09-30')
>> and (w.client_id = '227400001' or w.client_id = '2274000010')
>> group by 1,2,3
>> order by 1,2,3;
>
> Maybe this is the query you want instead?
>
> select w.appid,
>        w.rate,
>        w.is_subscribed,
>        sum(w.hits) AS Hits,
>        sum(w.sius) AS IUs,
>        sum(w.total_amnt) AS Total,
>        sum(w.hits) * w.rate AS ByHits,
>        sum(w.sius) * w.rate AS BYIUS
>   from bill_rpt_work w
>  where w.report_id in
>        (select b.report_id from billing_reports b where
> b.report_s_date =
> '2006-09-30')
>    and (w.client_id = '227400001' or w.client_id = '2274000010')
> group by 1,2,3
> order by 1,2,3;
>
> /Dennis
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
>
>                 http://www.postgresql.org/about/donate
>
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>


Re: Horribly slow query/ sequential scan

From
Tom Lane
Date:
"Gregory S. Williamson" <gsw@globexplorer.com> writes:
> HAving burdened others with my foolishness too often, I hesitate to
> ask, but could someone either point me to a reference or explain what
> the difference might be ... I can see it with the eyes but I am having
> trouble understanding what Informix might have been doing to my (bad
> ?) SQL to "fix" the query.

Me too.  Does informix have anything EXPLAIN-like to show what it's
doing?

            regards, tom lane

Re: Horribly slow query/ sequential scan

From
"Plugge, Joe R."
Date:
 Yes it does:

SET EXPLAIN ON;

It writes the file to  sqexplain.out

-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Tom Lane
Sent: Tuesday, January 09, 2007 9:13 AM
To: Gregory S. Williamson
Cc: db@zigo.dhs.org; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Horribly slow query/ sequential scan

"Gregory S. Williamson" <gsw@globexplorer.com> writes:
> HAving burdened others with my foolishness too often, I hesitate to
> ask, but could someone either point me to a reference or explain what
> the difference might be ... I can see it with the eyes but I am having
> trouble understanding what Informix might have been doing to my (bad
> ?) SQL to "fix" the query.

Me too.  Does informix have anything EXPLAIN-like to show what it's
doing?

            regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

                http://www.postgresql.org/about/donate

Re: Horribly slow query/ sequential scan

From
"Gregory S. Williamson"
Date:
As Joe indicated, there is indeed an Informix explain, appended below my signature ...

This table has 5565862 total rows, and 37 target rows. So about twice the total data, but all of the "extra" data in
infomrixis much older. 

Thanks for the help, one and all!

Greg W.

QUERY:
------
SELECT collection_id,client_id,client_name,appid,SUM(hits),SUM(sius),SUM(royalty_total)
FROM bill_rpt_work WHERE report_id in
 (SELECT report_id FROM billing_reports WHERE report_s_date = '2004-09-10')
 GROUP BY collection_id, client_id,client_name,appid
 ORDER BY collection_id,client_id,appid

Estimated Cost: 2015
Estimated # of Rows Returned: 481
Temporary Files Required For: Order By  Group By

  1) informix.bill_rpt_work: INDEX PATH

    (1) Index Keys: report_id   (Serial, fragments: ALL)
        Lower Index Filter: informix.bill_rpt_work.report_id = ANY <subquery>

    Subquery:
    ---------
    Estimated Cost: 44
    Estimated # of Rows Returned: 1

      1) informix.billing_reports: SEQUENTIAL SCAN

            Filters: informix.billing_reports.report_s_date = datetime(2004-09-10) year to day



QUERY:
------
select count(*) from informix.systables;

Estimated Cost: 1
Estimated # of Rows Returned: 1

  1) informix.systables: INDEX PATH

    (1) Index Keys: (count)


QUERY:
------
 select tabname , tabid , owner from informix . systables where tabname != 'ANSI' and tabtype != 'P' order by tabname

Estimated Cost: 30
Estimated # of Rows Returned: 196

  1) informix.systables: INDEX PATH

        Filters: informix.systables.tabtype != 'P'

    (1) Index Keys: tabname owner   (Key-First)
        Key-First Filters:  (informix.systables.tabname != 'ANSI' )


QUERY:
------
select tabid, tabtype, tabname, owner from informix.systables where (tabname = ? and owner like ?)

Estimated Cost: 2
Estimated # of Rows Returned: 1

  1) informix.systables: INDEX PATH

    (1) Index Keys: tabname owner   (Key-First)
        Lower Index Filter: informix.systables.tabname = 'bill_rpt_work'
        Key-First Filters:  (informix.systables.owner LIKE '%' )


QUERY:
------
select count(*) from             informix.systables where tabname = 'sysindices';

Estimated Cost: 2
Estimated # of Rows Returned: 1

  1) informix.systables: INDEX PATH

    (1) Index Keys: tabname owner
        Lower Index Filter: informix.systables.tabname = 'sysindices'


QUERY:
------
select colno, colname, coltype, collength, informix.syscolumns.extended_id, name from informix.syscolumns,
informix.systables,outer informix.sysxtdtypes where  informix.syscolumns.tabid = informix.systables.tabid and
informix.syscolumns.extended_id= informix.sysxtdtypes.extended_id and tabname = ? and informix.systables.owner = ?
orderby informix.syscolumns.colno; 

Estimated Cost: 9
Estimated # of Rows Returned: 7
Temporary Files Required For: Order By

  1) informix.systables: INDEX PATH

    (1) Index Keys: tabname owner
        Lower Index Filter: (informix.systables.tabname = 'bill_rpt_work' AND informix.systables.owner = 'informix
                 ' )  

  2) informix.syscolumns: INDEX PATH

    (1) Index Keys: tabid colno
        Lower Index Filter: informix.syscolumns.tabid = informix.systables.tabid
NESTED LOOP JOIN

  3) informix.sysxtdtypes: INDEX PATH

    (1) Index Keys: extended_id
        Lower Index Filter: informix.syscolumns.extended_id = informix.sysxtdtypes.extended_id
NESTED LOOP JOIN


QUERY:
------
select tabid, tabtype, tabname, owner from informix.systables where (tabname = ? and owner like ?)

Estimated Cost: 2
Estimated # of Rows Returned: 1

  1) informix.systables: INDEX PATH

    (1) Index Keys: tabname owner   (Key-First)
        Lower Index Filter: informix.systables.tabname = 'bill_rpt_work'
        Key-First Filters:  (informix.systables.owner LIKE '%' )


QUERY:
------
select count(*) from             informix.systables where tabname = 'sysindices';

Estimated Cost: 2
Estimated # of Rows Returned: 1

  1) informix.systables: INDEX PATH

    (1) Index Keys: tabname owner
        Lower Index Filter: informix.systables.tabname = 'sysindices'


QUERY:
------
select idxtype, clustered,idxname, informix.sysindices.owner, indexkeys::lvarchar, amid, am_name from
informix.sysindices,informix.systables, informix.sysams where informix.systables.tabname = ? and
informix.systables.tabid = informix.sysindices.tabid and informix.systables.owner like ? and informix.sysindices.amid =
informix.sysams.am_id;

Estimated Cost: 6
Estimated # of Rows Returned: 2

  1) informix.systables: INDEX PATH

    (1) Index Keys: tabname owner
        Lower Index Filter: (informix.systables.tabname = 'bill_rpt_work' AND informix.systables.owner = 'informix' )

  2) informix.sysindices: INDEX PATH

    (1) Index Keys: tabid
        Lower Index Filter: informix.systables.tabid = informix.sysindices.tabid
NESTED LOOP JOIN

  3) informix.sysams: INDEX PATH

    (1) Index Keys: am_id
        Lower Index Filter: informix.sysindices.amid = informix.sysams.am_id
NESTED LOOP JOIN

UDRs in query:
--------------
    UDR id  :    1
    UDR name:    indexkeyarray_out

QUERY:
------
select tabid, tabtype, tabname, owner from informix.systables where (tabname = ? and owner like ?)

Estimated Cost: 2
Estimated # of Rows Returned: 1

  1) informix.systables: INDEX PATH

    (1) Index Keys: tabname owner   (Key-First)
        Lower Index Filter: informix.systables.tabname = 'bill_rpt_work'
        Key-First Filters:  (informix.systables.owner LIKE '%' )


QUERY:
------
select count(*) from             informix.systables where tabname = 'sysindices';

Estimated Cost: 2
Estimated # of Rows Returned: 1

  1) informix.systables: INDEX PATH

    (1) Index Keys: tabname owner
        Lower Index Filter: informix.systables.tabname = 'sysindices'


QUERY:
------
select colno, colname, coltype, collength, informix.syscolumns.extended_id, name from informix.syscolumns,
informix.systables,outer informix.sysxtdtypes where  informix.syscolumns.tabid = informix.systables.tabid and
informix.syscolumns.extended_id= informix.sysxtdtypes.extended_id and tabname = ? and informix.systables.owner = ?
orderby informix.syscolumns.colno; 

Estimated Cost: 9
Estimated # of Rows Returned: 7
Temporary Files Required For: Order By

  1) informix.systables: INDEX PATH

    (1) Index Keys: tabname owner
        Lower Index Filter: (informix.systables.tabname = 'bill_rpt_work' AND informix.systables.owner = 'informix
                 ' )  

  2) informix.syscolumns: INDEX PATH

    (1) Index Keys: tabid colno
        Lower Index Filter: informix.syscolumns.tabid = informix.systables.tabid
NESTED LOOP JOIN

  3) informix.sysxtdtypes: INDEX PATH

    (1) Index Keys: extended_id
        Lower Index Filter: informix.syscolumns.extended_id = informix.sysxtdtypes.extended_id
NESTED LOOP JOIN



QUERY:
------
select
w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS IUs,
sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits,
sum(w.sius) * w.rate AS BYIUS
from bill_rpt_work w, billing_reports b
where w.report_id in
(select b.report_id from billing_reports where b.report_s_date = '2006-09-30')
and (w.client_id = '227400001' or w.client_id = '2274000010')
group by 1,2,3
order by 1,2,3

Estimated Cost: 3149
Estimated # of Rows Returned: 1
Temporary Files Required For: Order By  Group By

  1) informix.b: INDEX PATH

    (1) Index Keys: report_s_date   (Serial, fragments: ALL)
        Lower Index Filter: informix.b.report_s_date = datetime(2006-09-30) year to day

  2) informix.w: INDEX PATH

        Filters: (informix.w.client_id = '227400001' OR informix.w.client_id = '2274000010' )

    (1) Index Keys: report_id   (Serial, fragments: ALL)
        Lower Index Filter: informix.w.report_id = informix.b.report_id
NESTED LOOP JOIN

  3) informix.billing_reports: SEQUENTIAL SCAN  (First Row)
NESTED LOOP JOIN  (Semi Join)




-----Original Message-----
From:    pgsql-performance-owner@postgresql.org on behalf of Plugge, Joe R.
Sent:    Tue 1/9/2007 7:36 AM
To:    pgsql-performance@postgresql.org
Cc:
Subject:    Re: [PERFORM] Horribly slow query/ sequential scan

 Yes it does:

SET EXPLAIN ON;

It writes the file to  sqexplain.out

-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Tom Lane
Sent: Tuesday, January 09, 2007 9:13 AM
To: Gregory S. Williamson
Cc: db@zigo.dhs.org; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Horribly slow query/ sequential scan

"Gregory S. Williamson" <gsw@globexplorer.com> writes:
> HAving burdened others with my foolishness too often, I hesitate to
> ask, but could someone either point me to a reference or explain what
> the difference might be ... I can see it with the eyes but I am having
> trouble understanding what Informix might have been doing to my (bad
> ?) SQL to "fix" the query.

Me too.  Does informix have anything EXPLAIN-like to show what it's
doing?

            regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

                http://www.postgresql.org/about/donate

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings


-------------------------------------------------------
Click link below if it is SPAM gsw@globexplorer.com

"https://mailscanner.globexplorer.com/dspam/dspam.cgi?signatureID=45a3b93d75271019119885&user=gsw@globexplorer.com&retrain=spam&template=history&history_page=1"
!DSPAM:45a3b93d75271019119885!
-------------------------------------------------------






Re: Horribly slow query/ sequential scan

From
Tom Lane
Date:
"Gregory S. Williamson" <gsw@globexplorer.com> writes:
> As Joe indicated, there is indeed an Informix explain, appended below my signature ...

> select
> w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS IUs,
> sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits,
> sum(w.sius) * w.rate AS BYIUS
> from bill_rpt_work w, billing_reports b
> where w.report_id in
> (select b.report_id from billing_reports where b.report_s_date = '2006-09-30')
> and (w.client_id = '227400001' or w.client_id = '2274000010')
> group by 1,2,3
> order by 1,2,3

> Estimated Cost: 3149
> Estimated # of Rows Returned: 1
> Temporary Files Required For: Order By  Group By

>   1) informix.b: INDEX PATH

>     (1) Index Keys: report_s_date   (Serial, fragments: ALL)
>         Lower Index Filter: informix.b.report_s_date = datetime(2006-09-30) year to day

>   2) informix.w: INDEX PATH

>         Filters: (informix.w.client_id = '227400001' OR informix.w.client_id = '2274000010' )

>     (1) Index Keys: report_id   (Serial, fragments: ALL)
>         Lower Index Filter: informix.w.report_id = informix.b.report_id
> NESTED LOOP JOIN

>   3) informix.billing_reports: SEQUENTIAL SCAN  (First Row)
> NESTED LOOP JOIN  (Semi Join)

Interesting!  "Semi join" is the two-dollar technical term for what our
code calls an "IN join", viz a join that returns at most one copy of a
left-hand row even when there's more than one right-hand join candidate
for it.  So I think there's not any execution mechanism here that we
don't have.  What seems to be happening is that Informix is willing to
flatten the sub-SELECT into an IN join even though the sub-SELECT is
correlated to the outer query (that is, it contains outer references).
I'm not sure whether we're just being paranoid by not doing that, or
whether there are special conditions to check before allowing it, or
whether Informix is wrong ...

            regards, tom lane

Re: Horribly slow query/ sequential scan

From
Tom Lane
Date:
I wrote:
> ... What seems to be happening is that Informix is willing to
> flatten the sub-SELECT into an IN join even though the sub-SELECT is
> correlated to the outer query (that is, it contains outer references).

I did some googling this morning and found confirmation that recent
versions of Informix have pretty extensive support for optimizing
correlated subqueries:
http://www.iiug.org/waiug/archive/iugnew83/FeaturesIDS73.htm

This is something we've not really spent much time on for Postgres,
but it might be interesting to look at someday.  Given that the problem
with your query was really a mistake anyway, I'm not sure that your
example is compelling evidence for making it a high priority.

            regards, tom lane