Re: Horribly slow query/ sequential scan - Mailing list pgsql-performance

From Tom Lane
Subject Re: Horribly slow query/ sequential scan
Date
Msg-id 374.1168408529@sss.pgh.pa.us
Whole thread Raw
In response to Re: Horribly slow query/ sequential scan  ("Gregory S. Williamson" <gsw@globexplorer.com>)
Responses Re: Horribly slow query/ sequential scan  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
"Gregory S. Williamson" <gsw@globexplorer.com> writes:
> As Joe indicated, there is indeed an Informix explain, appended below my signature ...

> select
> w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS IUs,
> sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits,
> sum(w.sius) * w.rate AS BYIUS
> from bill_rpt_work w, billing_reports b
> where w.report_id in
> (select b.report_id from billing_reports where b.report_s_date = '2006-09-30')
> and (w.client_id = '227400001' or w.client_id = '2274000010')
> group by 1,2,3
> order by 1,2,3

> Estimated Cost: 3149
> Estimated # of Rows Returned: 1
> Temporary Files Required For: Order By  Group By

>   1) informix.b: INDEX PATH

>     (1) Index Keys: report_s_date   (Serial, fragments: ALL)
>         Lower Index Filter: informix.b.report_s_date = datetime(2006-09-30) year to day

>   2) informix.w: INDEX PATH

>         Filters: (informix.w.client_id = '227400001' OR informix.w.client_id = '2274000010' )

>     (1) Index Keys: report_id   (Serial, fragments: ALL)
>         Lower Index Filter: informix.w.report_id = informix.b.report_id
> NESTED LOOP JOIN

>   3) informix.billing_reports: SEQUENTIAL SCAN  (First Row)
> NESTED LOOP JOIN  (Semi Join)

Interesting!  "Semi join" is the two-dollar technical term for what our
code calls an "IN join", viz a join that returns at most one copy of a
left-hand row even when there's more than one right-hand join candidate
for it.  So I think there's not any execution mechanism here that we
don't have.  What seems to be happening is that Informix is willing to
flatten the sub-SELECT into an IN join even though the sub-SELECT is
correlated to the outer query (that is, it contains outer references).
I'm not sure whether we're just being paranoid by not doing that, or
whether there are special conditions to check before allowing it, or
whether Informix is wrong ...

            regards, tom lane

pgsql-performance by date:

Previous
From: Andrew Lazarus
Date:
Subject: Re: group by will not use an index?
Next
From: Florian Weimer
Date:
Subject: Re: High inserts, bulk deletes - autovacuum vs scheduled vacuum