Thread: Horribly slow query/ sequential scan
This is a query migrated from postgres. In postgres it runs about 10,000 times *slower* than on informix on somewhat newerhardware. The problem is entirely due to the planner. This PostgreSQL 8.1.4 on linux, 2 gigs of ram. The table: Table "reporting.bill_rpt_work" Column | Type | Modifiers ---------------+-----------------------+----------- report_id | integer | client_id | character varying(10) | contract_id | integer | not null rate | numeric | not null appid | character varying(10) | not null userid | text | not null collection_id | integer | not null client_name | character varying(60) | use_sius | integer | not null is_subscribed | integer | not null hits | numeric | not null sius | numeric | not null total_amnt | numeric | not null royalty_total | numeric | Indexes: "billrptw_ndx" UNIQUE, btree (report_id, client_id, contract_id, rate, appid, userid, collection_id) "billrpt_cntrct_ndx" btree (report_id, contract_id, client_id) "billrpt_collid_ndx" btree (report_id, collection_id, client_id, contract_id) Foreign-key constraints: "$1" FOREIGN KEY (report_id) REFERENCES billing_reports(report_id) "$2" FOREIGN KEY (client_id) REFERENCES "work".clients(client_id) The query: explain analyze select w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS IUs, sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits, sum(w.sius) * w.rate AS BYIUS from bill_rpt_work w, billing_reports b where w.report_id in (select b.report_id from billing_reports where b.report_s_date = '2006-09-30') and (w.client_id = '227400001' or w.client_id = '2274000010') group by 1,2,3 order by 1,2,3; QUERY PLAN ---------------------------------------------------------------------------------------------------------------- ------------------------------ GroupAggregate (cost=298061335.44..298259321.62 rows=26 width=58) (actual time=372213.673..372213.689 rows=2 loops=1) -> Sort (cost=298061335.44..298083333.83 rows=8799354 width=58) (actual time=372213.489..372213.503 rows=37 loops=1) Sort Key: w.appid, w.rate, w.is_subscribed -> Nested Loop (cost=0.00..296121313.45 rows=8799354 width=58) (actual time=286628.486..372213.053 rows=37 loops=1) Join Filter: (subplan) -> Seq Scan on bill_rpt_work w (cost=0.00..85703.20 rows=11238 width=62) (actual time=1.239..1736.746 rows=61020loops=1) Filter: (((client_id)::text = '227400001'::text) OR ((client_id)::text = '2274000010'::text)) -> Seq Scan on billing_reports b (cost=0.00..29.66 rows=1566 width=8) (actual time=0.001..0.879 rows=1566loops=61020) SubPlan -> Result (cost=0.00..29.66 rows=1566 width=0) (actual time=0.000..0.002 rows=1 loops=95557320) One-Time Filter: ($1 = '2006-09-30'::date) -> Seq Scan on billing_reports (cost=0.00..29.66 rows=1566 width=0) (actual time=0.001..0.863 rows=1565loops=61020) Total runtime: 372214.085 ms Informix uses report id/client id as an index, thus eliminating a huge number of rows. The table has 2280545 rows currently;slightly fewer when the above analyze was run. Informix has about 5 times as much data. select count(*) from bill_rpt_work where report_id in (select report_id from billing_reports where report_s_date = '2006-09-30')and (client_id = '227400001' or client_id = '2274000010'); count ------- 37 (1 row) So scanning everything seems particularly senseless. I had some success adding client id and report id to the initial select list, but that causes all sorts of problems in callingprocedures that expect different data grouping. Any suggestion would be welcome because this is a horrible show stopper. Thanks, Greg Williamson DBA GlobeXplorer LLC
I don't think I understand the idea behind this query. Do you really need billing_reports twice? > The query: > explain analyze select > w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS IUs, > sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits, > sum(w.sius) * w.rate AS BYIUS > from bill_rpt_work w, billing_reports b > where w.report_id in > (select b.report_id from billing_reports where b.report_s_date = > '2006-09-30') > and (w.client_id = '227400001' or w.client_id = '2274000010') > group by 1,2,3 > order by 1,2,3; Maybe this is the query you want instead? select w.appid, w.rate, w.is_subscribed, sum(w.hits) AS Hits, sum(w.sius) AS IUs, sum(w.total_amnt) AS Total, sum(w.hits) * w.rate AS ByHits, sum(w.sius) * w.rate AS BYIUS from bill_rpt_work w where w.report_id in (select b.report_id from billing_reports b where b.report_s_date = '2006-09-30') and (w.client_id = '227400001' or w.client_id = '2274000010') group by 1,2,3 order by 1,2,3; /Dennis
Voila ! You da man ! & other expressions of awe and appreciation ... HAving burdened others with my foolishness too often, I hesitate to ask, but could someone either point me to a referenceor explain what the difference might be ... I can see it with the eyes but I am having trouble understanding whatInformix might have been doing to my (bad ?) SQL to "fix" the query. Seeing a redundancy and eliminating it ? The explain analyze for "db"'s sql (slightly faster than Informix on an older Sun machine ... about 20%): GroupAggregate (cost=64.35..64.75 rows=8 width=58) (actual time=0.612..0.629 rows=2 loops=1) -> Sort (cost=64.35..64.37 rows=8 width=58) (actual time=0.463..0.476 rows=37 loops=1) Sort Key: w.appid, w.rate, w.is_subscribed -> Nested Loop (cost=8.11..64.23 rows=8 width=58) (actual time=0.130..0.211 rows=37 loops=1) Join Filter: ("inner".report_id = "outer".report_id) -> HashAggregate (cost=3.95..3.96 rows=1 width=4) (actual time=0.035..0.035 rows=1 loops=1) -> Index Scan using billrpt_sdate_ndx on billing_reports b (cost=0.00..3.94 rows=1 width=4) (actualtime=0.021..0.023 rows=1 loops=1) Index Cond: (report_s_date = '2006-09-30'::date) -> Bitmap Heap Scan on bill_rpt_work w (cost=4.17..59.92 rows=28 width=62) (actual time=0.084..0.111 rows=37loops=1) Recheck Cond: (((w.report_id = "outer".report_id) AND ((w.client_id)::text = '227400001'::text)) OR((w.report_id = "outer".report_id) AND ((w.client_id)::text = '2274000010'::text))) -> BitmapOr (cost=4.17..4.17 rows=28 width=0) (actual time=0.078..0.078 rows=0 loops=1) -> Bitmap Index Scan on billrptw_ndx (cost=0.00..2.08 rows=14 width=0) (actual time=0.053..0.053rows=22 loops=1) Index Cond: ((w.report_id = "outer".report_id) AND ((w.client_id)::text = '227400001'::text)) -> Bitmap Index Scan on billrptw_ndx (cost=0.00..2.08 rows=14 width=0) (actual time=0.024..0.024rows=15 loops=1) Index Cond: ((w.report_id = "outer".report_id) AND ((w.client_id)::text = '2274000010'::text)) Total runtime: 6.110 ms (16 rows) Thanks again (and sorry for the top-posting but this particular interface is ungainly) G -----Original Message----- From: db@zigo.dhs.org [mailto:db@zigo.dhs.org] Sent: Tue 1/9/2007 4:35 AM To: Gregory S. Williamson Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Horribly slow query/ sequential scan I don't think I understand the idea behind this query. Do you really need billing_reports twice? > The query: > explain analyze select > w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS IUs, > sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits, > sum(w.sius) * w.rate AS BYIUS > from bill_rpt_work w, billing_reports b > where w.report_id in > (select b.report_id from billing_reports where b.report_s_date = > '2006-09-30') > and (w.client_id = '227400001' or w.client_id = '2274000010') > group by 1,2,3 > order by 1,2,3; Maybe this is the query you want instead? select w.appid, w.rate, w.is_subscribed, sum(w.hits) AS Hits, sum(w.sius) AS IUs, sum(w.total_amnt) AS Total, sum(w.hits) * w.rate AS ByHits, sum(w.sius) * w.rate AS BYIUS from bill_rpt_work w where w.report_id in (select b.report_id from billing_reports b where b.report_s_date = '2006-09-30') and (w.client_id = '227400001' or w.client_id = '2274000010') group by 1,2,3 order by 1,2,3; /Dennis ------------------------------------------------------- Click link below if it is SPAM gsw@globexplorer.com "https://mailscanner.globexplorer.com/dspam/dspam.cgi?signatureID=45a38b1548991076418835&user=gsw@globexplorer.com&retrain=spam&template=history&history_page=1" !DSPAM:45a38b1548991076418835! ------------------------------------------------------- Voi
Forget abount "IN". Its horribly slow. try : select w.appid, w.rate, w.is_subscribed, sum(w.hits) AS Hits, sum(w.sius) AS IUs, sum(w.total_amnt) AS Total, sum(w.hits) * w.rate AS ByHits, sum(w.sius) * w.rate AS BYIUS from bill_rpt_work w where (select b.report_id from billing_reports b where b.report_s_date = '2006-09-30' and w.report_id = b.report_id) and w.client_id IN ('227400001','2274000010') group by 1,2,3 order by 1,2,3; should by faster; assuming : index on report_id in b; index on report_id, client_id in w to enforce useage of indexes on grouping (depends on result size), consider extending w with cols 1,2,3. regards, marcus -----Ursprüngliche Nachricht----- Von: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org]Im Auftrag von db@zigo.dhs.org Gesendet: Dienstag, 9. Januar 2007 13:36 An: Gregory S. Williamson Cc: pgsql-performance@postgresql.org Betreff: Re: [PERFORM] Horribly slow query/ sequential scan I don't think I understand the idea behind this query. Do you really need billing_reports twice? > The query: > explain analyze select > w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS IUs, > sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits, > sum(w.sius) * w.rate AS BYIUS > from bill_rpt_work w, billing_reports b > where w.report_id in > (select b.report_id from billing_reports where b.report_s_date = > '2006-09-30') > and (w.client_id = '227400001' or w.client_id = '2274000010') > group by 1,2,3 > order by 1,2,3; Maybe this is the query you want instead? select w.appid, w.rate, w.is_subscribed, sum(w.hits) AS Hits, sum(w.sius) AS IUs, sum(w.total_amnt) AS Total, sum(w.hits) * w.rate AS ByHits, sum(w.sius) * w.rate AS BYIUS from bill_rpt_work w where w.report_id in (select b.report_id from billing_reports b where b.report_s_date = '2006-09-30') and (w.client_id = '227400001' or w.client_id = '2274000010') group by 1,2,3 order by 1,2,3; /Dennis ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Thanks for the suggestion ... I will try it when I've had some sleep and the server is quiet again ... the IN seems to haveimproved markedly since the 7.4 release, as advertised, so I will be interested in trying this. GSW -----Original Message----- From: Nörder-Tuitje, Marcus [mailto:noerder-tuitje@technology.de] Sent: Tue 1/9/2007 4:50 AM To: db@zigo.dhs.org; Gregory S. Williamson Cc: pgsql-performance@postgresql.org Subject: AW: [PERFORM] Horribly slow query/ sequential scan Forget abount "IN". Its horribly slow. try : select w.appid, w.rate, w.is_subscribed, sum(w.hits) AS Hits, sum(w.sius) AS IUs, sum(w.total_amnt) AS Total, sum(w.hits) * w.rate AS ByHits, sum(w.sius) * w.rate AS BYIUS from bill_rpt_work w where (select b.report_id from billing_reports b where b.report_s_date = '2006-09-30' and w.report_id = b.report_id) and w.client_id IN ('227400001','2274000010') group by 1,2,3 order by 1,2,3; should by faster; assuming : index on report_id in b; index on report_id, client_id in w to enforce useage of indexes on grouping (depends on result size), consider extending w with cols 1,2,3. regards, marcus -----Ursprüngliche Nachricht----- Von: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org]Im Auftrag von db@zigo.dhs.org Gesendet: Dienstag, 9. Januar 2007 13:36 An: Gregory S. Williamson Cc: pgsql-performance@postgresql.org Betreff: Re: [PERFORM] Horribly slow query/ sequential scan I don't think I understand the idea behind this query. Do you really need billing_reports twice? > The query: > explain analyze select > w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS IUs, > sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits, > sum(w.sius) * w.rate AS BYIUS > from bill_rpt_work w, billing_reports b > where w.report_id in > (select b.report_id from billing_reports where b.report_s_date = > '2006-09-30') > and (w.client_id = '227400001' or w.client_id = '2274000010') > group by 1,2,3 > order by 1,2,3; Maybe this is the query you want instead? select w.appid, w.rate, w.is_subscribed, sum(w.hits) AS Hits, sum(w.sius) AS IUs, sum(w.total_amnt) AS Total, sum(w.hits) * w.rate AS ByHits, sum(w.sius) * w.rate AS BYIUS from bill_rpt_work w where w.report_id in (select b.report_id from billing_reports b where b.report_s_date = '2006-09-30') and (w.client_id = '227400001' or w.client_id = '2274000010') group by 1,2,3 order by 1,2,3; /Dennis ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate ------------------------------------------------------- Click link below if it is SPAM gsw@globexplorer.com "https://mailscanner.globexplorer.com/dspam/dspam.cgi?signatureID=45a38ea050372117817174&user=gsw@globexplorer.com&retrain=spam&template=history&history_page=1" !DSPAM:45a38ea050372117817174! -------------------------------------------------------
On 9-Jan-07, at 7:50 AM, Nörder-Tuitje, Marcus wrote: > Forget abount "IN". Its horribly slow. I think that statement above was historically correct, but is now incorrect. IN has been optimized quite significantly since 7.4 Dave > > try : > > select w.appid, > w.rate, > w.is_subscribed, > sum(w.hits) AS Hits, > sum(w.sius) AS IUs, > sum(w.total_amnt) AS Total, > sum(w.hits) * w.rate AS ByHits, > sum(w.sius) * w.rate AS BYIUS > from bill_rpt_work w > where (select b.report_id from billing_reports b where > b.report_s_date = '2006-09-30' and w.report_id = b.report_id) > and w.client_id IN ('227400001','2274000010') > group by 1,2,3 > order by 1,2,3; > > > > should by faster; > > assuming : index on report_id in b; index on report_id, client_id in w > > to enforce useage of indexes on grouping (depends on result size), > consider extending w with cols 1,2,3. > > > regards, > marcus > > -----Ursprüngliche Nachricht----- > Von: pgsql-performance-owner@postgresql.org > [mailto:pgsql-performance-owner@postgresql.org]Im Auftrag von > db@zigo.dhs.org > Gesendet: Dienstag, 9. Januar 2007 13:36 > An: Gregory S. Williamson > Cc: pgsql-performance@postgresql.org > Betreff: Re: [PERFORM] Horribly slow query/ sequential scan > > > I don't think I understand the idea behind this query. Do you > really need > billing_reports twice? > >> The query: >> explain analyze select >> w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS >> IUs, >> sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits, >> sum(w.sius) * w.rate AS BYIUS >> from bill_rpt_work w, billing_reports b >> where w.report_id in >> (select b.report_id from billing_reports where b.report_s_date = >> '2006-09-30') >> and (w.client_id = '227400001' or w.client_id = '2274000010') >> group by 1,2,3 >> order by 1,2,3; > > Maybe this is the query you want instead? > > select w.appid, > w.rate, > w.is_subscribed, > sum(w.hits) AS Hits, > sum(w.sius) AS IUs, > sum(w.total_amnt) AS Total, > sum(w.hits) * w.rate AS ByHits, > sum(w.sius) * w.rate AS BYIUS > from bill_rpt_work w > where w.report_id in > (select b.report_id from billing_reports b where > b.report_s_date = > '2006-09-30') > and (w.client_id = '227400001' or w.client_id = '2274000010') > group by 1,2,3 > order by 1,2,3; > > /Dennis > > > ---------------------------(end of > broadcast)--------------------------- > TIP 7: You can help support the PostgreSQL project by donating at > > http://www.postgresql.org/about/donate > > > > ---------------------------(end of > broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings >
"Gregory S. Williamson" <gsw@globexplorer.com> writes: > HAving burdened others with my foolishness too often, I hesitate to > ask, but could someone either point me to a reference or explain what > the difference might be ... I can see it with the eyes but I am having > trouble understanding what Informix might have been doing to my (bad > ?) SQL to "fix" the query. Me too. Does informix have anything EXPLAIN-like to show what it's doing? regards, tom lane
Yes it does: SET EXPLAIN ON; It writes the file to sqexplain.out -----Original Message----- From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Tom Lane Sent: Tuesday, January 09, 2007 9:13 AM To: Gregory S. Williamson Cc: db@zigo.dhs.org; pgsql-performance@postgresql.org Subject: Re: [PERFORM] Horribly slow query/ sequential scan "Gregory S. Williamson" <gsw@globexplorer.com> writes: > HAving burdened others with my foolishness too often, I hesitate to > ask, but could someone either point me to a reference or explain what > the difference might be ... I can see it with the eyes but I am having > trouble understanding what Informix might have been doing to my (bad > ?) SQL to "fix" the query. Me too. Does informix have anything EXPLAIN-like to show what it's doing? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
As Joe indicated, there is indeed an Informix explain, appended below my signature ... This table has 5565862 total rows, and 37 target rows. So about twice the total data, but all of the "extra" data in infomrixis much older. Thanks for the help, one and all! Greg W. QUERY: ------ SELECT collection_id,client_id,client_name,appid,SUM(hits),SUM(sius),SUM(royalty_total) FROM bill_rpt_work WHERE report_id in (SELECT report_id FROM billing_reports WHERE report_s_date = '2004-09-10') GROUP BY collection_id, client_id,client_name,appid ORDER BY collection_id,client_id,appid Estimated Cost: 2015 Estimated # of Rows Returned: 481 Temporary Files Required For: Order By Group By 1) informix.bill_rpt_work: INDEX PATH (1) Index Keys: report_id (Serial, fragments: ALL) Lower Index Filter: informix.bill_rpt_work.report_id = ANY <subquery> Subquery: --------- Estimated Cost: 44 Estimated # of Rows Returned: 1 1) informix.billing_reports: SEQUENTIAL SCAN Filters: informix.billing_reports.report_s_date = datetime(2004-09-10) year to day QUERY: ------ select count(*) from informix.systables; Estimated Cost: 1 Estimated # of Rows Returned: 1 1) informix.systables: INDEX PATH (1) Index Keys: (count) QUERY: ------ select tabname , tabid , owner from informix . systables where tabname != 'ANSI' and tabtype != 'P' order by tabname Estimated Cost: 30 Estimated # of Rows Returned: 196 1) informix.systables: INDEX PATH Filters: informix.systables.tabtype != 'P' (1) Index Keys: tabname owner (Key-First) Key-First Filters: (informix.systables.tabname != 'ANSI' ) QUERY: ------ select tabid, tabtype, tabname, owner from informix.systables where (tabname = ? and owner like ?) Estimated Cost: 2 Estimated # of Rows Returned: 1 1) informix.systables: INDEX PATH (1) Index Keys: tabname owner (Key-First) Lower Index Filter: informix.systables.tabname = 'bill_rpt_work' Key-First Filters: (informix.systables.owner LIKE '%' ) QUERY: ------ select count(*) from informix.systables where tabname = 'sysindices'; Estimated Cost: 2 Estimated # of Rows Returned: 1 1) informix.systables: INDEX PATH (1) Index Keys: tabname owner Lower Index Filter: informix.systables.tabname = 'sysindices' QUERY: ------ select colno, colname, coltype, collength, informix.syscolumns.extended_id, name from informix.syscolumns, informix.systables,outer informix.sysxtdtypes where informix.syscolumns.tabid = informix.systables.tabid and informix.syscolumns.extended_id= informix.sysxtdtypes.extended_id and tabname = ? and informix.systables.owner = ? orderby informix.syscolumns.colno; Estimated Cost: 9 Estimated # of Rows Returned: 7 Temporary Files Required For: Order By 1) informix.systables: INDEX PATH (1) Index Keys: tabname owner Lower Index Filter: (informix.systables.tabname = 'bill_rpt_work' AND informix.systables.owner = 'informix ' ) 2) informix.syscolumns: INDEX PATH (1) Index Keys: tabid colno Lower Index Filter: informix.syscolumns.tabid = informix.systables.tabid NESTED LOOP JOIN 3) informix.sysxtdtypes: INDEX PATH (1) Index Keys: extended_id Lower Index Filter: informix.syscolumns.extended_id = informix.sysxtdtypes.extended_id NESTED LOOP JOIN QUERY: ------ select tabid, tabtype, tabname, owner from informix.systables where (tabname = ? and owner like ?) Estimated Cost: 2 Estimated # of Rows Returned: 1 1) informix.systables: INDEX PATH (1) Index Keys: tabname owner (Key-First) Lower Index Filter: informix.systables.tabname = 'bill_rpt_work' Key-First Filters: (informix.systables.owner LIKE '%' ) QUERY: ------ select count(*) from informix.systables where tabname = 'sysindices'; Estimated Cost: 2 Estimated # of Rows Returned: 1 1) informix.systables: INDEX PATH (1) Index Keys: tabname owner Lower Index Filter: informix.systables.tabname = 'sysindices' QUERY: ------ select idxtype, clustered,idxname, informix.sysindices.owner, indexkeys::lvarchar, amid, am_name from informix.sysindices,informix.systables, informix.sysams where informix.systables.tabname = ? and informix.systables.tabid = informix.sysindices.tabid and informix.systables.owner like ? and informix.sysindices.amid = informix.sysams.am_id; Estimated Cost: 6 Estimated # of Rows Returned: 2 1) informix.systables: INDEX PATH (1) Index Keys: tabname owner Lower Index Filter: (informix.systables.tabname = 'bill_rpt_work' AND informix.systables.owner = 'informix' ) 2) informix.sysindices: INDEX PATH (1) Index Keys: tabid Lower Index Filter: informix.systables.tabid = informix.sysindices.tabid NESTED LOOP JOIN 3) informix.sysams: INDEX PATH (1) Index Keys: am_id Lower Index Filter: informix.sysindices.amid = informix.sysams.am_id NESTED LOOP JOIN UDRs in query: -------------- UDR id : 1 UDR name: indexkeyarray_out QUERY: ------ select tabid, tabtype, tabname, owner from informix.systables where (tabname = ? and owner like ?) Estimated Cost: 2 Estimated # of Rows Returned: 1 1) informix.systables: INDEX PATH (1) Index Keys: tabname owner (Key-First) Lower Index Filter: informix.systables.tabname = 'bill_rpt_work' Key-First Filters: (informix.systables.owner LIKE '%' ) QUERY: ------ select count(*) from informix.systables where tabname = 'sysindices'; Estimated Cost: 2 Estimated # of Rows Returned: 1 1) informix.systables: INDEX PATH (1) Index Keys: tabname owner Lower Index Filter: informix.systables.tabname = 'sysindices' QUERY: ------ select colno, colname, coltype, collength, informix.syscolumns.extended_id, name from informix.syscolumns, informix.systables,outer informix.sysxtdtypes where informix.syscolumns.tabid = informix.systables.tabid and informix.syscolumns.extended_id= informix.sysxtdtypes.extended_id and tabname = ? and informix.systables.owner = ? orderby informix.syscolumns.colno; Estimated Cost: 9 Estimated # of Rows Returned: 7 Temporary Files Required For: Order By 1) informix.systables: INDEX PATH (1) Index Keys: tabname owner Lower Index Filter: (informix.systables.tabname = 'bill_rpt_work' AND informix.systables.owner = 'informix ' ) 2) informix.syscolumns: INDEX PATH (1) Index Keys: tabid colno Lower Index Filter: informix.syscolumns.tabid = informix.systables.tabid NESTED LOOP JOIN 3) informix.sysxtdtypes: INDEX PATH (1) Index Keys: extended_id Lower Index Filter: informix.syscolumns.extended_id = informix.sysxtdtypes.extended_id NESTED LOOP JOIN QUERY: ------ select w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS IUs, sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits, sum(w.sius) * w.rate AS BYIUS from bill_rpt_work w, billing_reports b where w.report_id in (select b.report_id from billing_reports where b.report_s_date = '2006-09-30') and (w.client_id = '227400001' or w.client_id = '2274000010') group by 1,2,3 order by 1,2,3 Estimated Cost: 3149 Estimated # of Rows Returned: 1 Temporary Files Required For: Order By Group By 1) informix.b: INDEX PATH (1) Index Keys: report_s_date (Serial, fragments: ALL) Lower Index Filter: informix.b.report_s_date = datetime(2006-09-30) year to day 2) informix.w: INDEX PATH Filters: (informix.w.client_id = '227400001' OR informix.w.client_id = '2274000010' ) (1) Index Keys: report_id (Serial, fragments: ALL) Lower Index Filter: informix.w.report_id = informix.b.report_id NESTED LOOP JOIN 3) informix.billing_reports: SEQUENTIAL SCAN (First Row) NESTED LOOP JOIN (Semi Join) -----Original Message----- From: pgsql-performance-owner@postgresql.org on behalf of Plugge, Joe R. Sent: Tue 1/9/2007 7:36 AM To: pgsql-performance@postgresql.org Cc: Subject: Re: [PERFORM] Horribly slow query/ sequential scan Yes it does: SET EXPLAIN ON; It writes the file to sqexplain.out -----Original Message----- From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Tom Lane Sent: Tuesday, January 09, 2007 9:13 AM To: Gregory S. Williamson Cc: db@zigo.dhs.org; pgsql-performance@postgresql.org Subject: Re: [PERFORM] Horribly slow query/ sequential scan "Gregory S. Williamson" <gsw@globexplorer.com> writes: > HAving burdened others with my foolishness too often, I hesitate to > ask, but could someone either point me to a reference or explain what > the difference might be ... I can see it with the eyes but I am having > trouble understanding what Informix might have been doing to my (bad > ?) SQL to "fix" the query. Me too. Does informix have anything EXPLAIN-like to show what it's doing? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings ------------------------------------------------------- Click link below if it is SPAM gsw@globexplorer.com "https://mailscanner.globexplorer.com/dspam/dspam.cgi?signatureID=45a3b93d75271019119885&user=gsw@globexplorer.com&retrain=spam&template=history&history_page=1" !DSPAM:45a3b93d75271019119885! -------------------------------------------------------
"Gregory S. Williamson" <gsw@globexplorer.com> writes: > As Joe indicated, there is indeed an Informix explain, appended below my signature ... > select > w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS IUs, > sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits, > sum(w.sius) * w.rate AS BYIUS > from bill_rpt_work w, billing_reports b > where w.report_id in > (select b.report_id from billing_reports where b.report_s_date = '2006-09-30') > and (w.client_id = '227400001' or w.client_id = '2274000010') > group by 1,2,3 > order by 1,2,3 > Estimated Cost: 3149 > Estimated # of Rows Returned: 1 > Temporary Files Required For: Order By Group By > 1) informix.b: INDEX PATH > (1) Index Keys: report_s_date (Serial, fragments: ALL) > Lower Index Filter: informix.b.report_s_date = datetime(2006-09-30) year to day > 2) informix.w: INDEX PATH > Filters: (informix.w.client_id = '227400001' OR informix.w.client_id = '2274000010' ) > (1) Index Keys: report_id (Serial, fragments: ALL) > Lower Index Filter: informix.w.report_id = informix.b.report_id > NESTED LOOP JOIN > 3) informix.billing_reports: SEQUENTIAL SCAN (First Row) > NESTED LOOP JOIN (Semi Join) Interesting! "Semi join" is the two-dollar technical term for what our code calls an "IN join", viz a join that returns at most one copy of a left-hand row even when there's more than one right-hand join candidate for it. So I think there's not any execution mechanism here that we don't have. What seems to be happening is that Informix is willing to flatten the sub-SELECT into an IN join even though the sub-SELECT is correlated to the outer query (that is, it contains outer references). I'm not sure whether we're just being paranoid by not doing that, or whether there are special conditions to check before allowing it, or whether Informix is wrong ... regards, tom lane
I wrote: > ... What seems to be happening is that Informix is willing to > flatten the sub-SELECT into an IN join even though the sub-SELECT is > correlated to the outer query (that is, it contains outer references). I did some googling this morning and found confirmation that recent versions of Informix have pretty extensive support for optimizing correlated subqueries: http://www.iiug.org/waiug/archive/iugnew83/FeaturesIDS73.htm This is something we've not really spent much time on for Postgres, but it might be interesting to look at someday. Given that the problem with your query was really a mistake anyway, I'm not sure that your example is compelling evidence for making it a high priority. regards, tom lane