Thread: Using IN with subselect
I have a query with an in subquery like where x in (select x from y); Now the subquery is not related to the outer query so it always returns the same set. Is this subselect executed each time or just once? If it is executed each time, if I create a function would that then be only executed once? -- Dave Smith CANdata Systems Ltd 416-493-9020
Running EXPLAIN over the query will tell you... On Thu, Nov 25, 2004 at 11:49:06AM -0500, Dave Smith wrote: > I have a query with an in subquery like > > where x in (select x from y); > > Now the subquery is not related to the outer query so it always returns > the same set. Is this subselect executed each time or just once? If it > is executed each time, if I create a function would that then be only > executed once? > > -- > Dave Smith > CANdata Systems Ltd > 416-493-9020 > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Attachment
Well here is explain. I would guess that it is executed each time .. function any different? HashAggregate (cost=288.32..288.32 rows=1 width=32) -> Hash IN Join (cost=288.18..288.31 rows=1 width=32) Hash Cond: (("outer".gl_num)::text = lpad(ltrim(("inner".account_num)::text, '0'::text), 9, ' '::text)) -> Subquery Scan journal_all (cost=282.36..282.45 rows=2 width=64) -> Unique (cost=282.36..282.43 rows=2 width=159) -> Sort (cost=282.36..282.36 rows=2 width=159) Sort Key: objectid, owner_oid, source_code, posting_date, control_num, reference, gl_num, gl_amt, distributed_amt, "comment", operator_id, branch_id, company_id -> Append (cost=0.00..282.35 rows=2 width=159) -> Subquery Scan "*SELECT* 1" (cost=0.00..265.24 rows=1 width=159) -> Index Scan using journal_9 on journal (cost=0.00..265.23 rows=1 width=159) Index Cond: (company_id = 1000) Filter: ((posting_date >= '2004-01-01'::date) AND (posting_date <= '2004-01-31'::date)) -> Subquery Scan "*SELECT* 2" (cost=0.00..17.10 rows=1 width=159) -> Index Scan using journal_hist_7 on journal_hist (cost=0.00..17.09 rows=1 width=159) Index Cond: (company_id = 1000) Filter: ((posting_date >= '2004-01-01'::date) AND (posting_date <= '2004-01-31'::date)) -> Hash (cost=5.83..5.83 rows=1 width=13) -> Index Scan using glmast_index3 on glmast (cost=0.00..5.83 rows=1 width=13) Index Cond: ((company_id = 1000) AND ((control_type)::text = 'F'::text)) On Thu, 2004-11-25 at 12:11, Martijn van Oosterhout wrote: > Running EXPLAIN over the query will tell you... > > On Thu, Nov 25, 2004 at 11:49:06AM -0500, Dave Smith wrote: > > I have a query with an in subquery like > > > > where x in (select x from y); > > > > Now the subquery is not related to the outer query so it always returns > > the same set. Is this subselect executed each time or just once? If it > > is executed each time, if I create a function would that then be only > > executed once? > > > > -- > > Dave Smith > > CANdata Systems Ltd > > 416-493-9020 > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 6: Have you searched our list archives? > > > > http://archives.postgresql.org -- Dave Smith CANdata Systems Ltd 416-493-9020
On Thu, Nov 25, 2004 at 12:19:23PM -0500, Dave Smith wrote: > Well here is explain. I would guess that it is executed each time .. > function any different? Hmm, if it's only executed once I would expect it to be an InitPlan. Are you sure it's not correlated? If you want to make sure, put the subquery in the FROM clause, that it definitly will only be run once. <snip> Hope this helps, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Attachment
Dave Smith <dave.smith@candata.com> writes: > Well here is explain. I would guess that it is executed each time .. > function any different? > HashAggregate (cost=288.32..288.32 rows=1 width=32) > -> Hash IN Join (cost=288.18..288.31 rows=1 width=32) > -> Subquery Scan journal_all (cost=282.36..282.45 rows=2 width=64) > -> Hash (cost=5.83..5.83 rows=1 width=13) > -> Index Scan using glmast_index3 on glmast (cost=0.00..5.83 rows=1 width=13) No ... this plan says to scan glmast once, load the selected rows into an in-memory hash table, then scan journal_all once and probe the hash table for matches. It looks like a pretty decent choice of plan to me. regards, tom lane
That's what I wanted it to do I just did not understand how to read the explain. So is it HashAggregate that means this already loaded? On Thu, 2004-11-25 at 12:57, Tom Lane wrote: > Dave Smith <dave.smith@candata.com> writes: > > Well here is explain. I would guess that it is executed each time .. > > function any different? > > > HashAggregate (cost=288.32..288.32 rows=1 width=32) > > -> Hash IN Join (cost=288.18..288.31 rows=1 width=32) > > -> Subquery Scan journal_all (cost=282.36..282.45 rows=2 width=64) > > -> Hash (cost=5.83..5.83 rows=1 width=13) > > -> Index Scan using glmast_index3 on glmast (cost=0.00..5.83 rows=1 width=13) > > No ... this plan says to scan glmast once, load the selected rows into > an in-memory hash table, then scan journal_all once and probe the hash > table for matches. It looks like a pretty decent choice of plan to me. > > regards, tom lane -- Dave Smith CANdata Systems Ltd 416-493-9020
Dave Smith <dave.smith@candata.com> writes: > That's what I wanted it to do I just did not understand how to read the > explain. So is it HashAggregate that means this already loaded? The HashAggregate Node is doing a GROUP BY command. (or the implicit GROUP BY if you used an aggregate function without one.) The "Hash IN Join" is a particular type of join that behaves the way Tom describes. Each type of join behaves differently. A Nested Loop join would execute one side once and the other many times. A Merge join executes both sides once. You could also run "explain analyze" instead of just "explain" and look at the "loops" number that would tell you how many times the node actually was executed. > > Dave Smith <dave.smith@candata.com> writes: > > > Well here is explain. I would guess that it is executed each time .. > > > function any different? > > > > > HashAggregate (cost=288.32..288.32 rows=1 width=32) > > > -> Hash IN Join (cost=288.18..288.31 rows=1 width=32) > > > -> Subquery Scan journal_all (cost=282.36..282.45 rows=2 width=64) > > > -> Hash (cost=5.83..5.83 rows=1 width=13) > > > -> Index Scan using glmast_index3 on glmast (cost=0.00..5.83 rows=1 width=13) > > > > No ... this plan says to scan glmast once, load the selected rows into > > an in-memory hash table, then scan journal_all once and probe the hash > > table for matches. It looks like a pretty decent choice of plan to me. -- greg