Hi,
There are two parts to this mail.
1)
> We already do --- the scan nodes project out only
> the needed columns.
OK, that's great.
I looked for this in the source code (both before and
after doing a cvs update), but I am still confused by
the following:
In ExecMergeJoin() in /backend/executor/nodeMergeJoin.c,
the EXEC_MJ_JOINTUPLES state (a case of the switch
statement) still calls ExecProject(). What does this do?
(This is what prompted my incorrect assumption in the
first place.)
I was able to trace the ExecProject() call made on
behalf of the SeqScan node to the ExecScan function in
/executor/execScan.c.
(My error was in assuming that the tuple returned by
the function passed as an argument to ExecScan was
being returned directly.)
So now I see what you were saying in your mail, but I
am not clear about why we have an ExecProject() higher
up in the join nodes. It is there in the NestedLoop node
too.
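To check that I am reading the scan side correctly, here is a toy Python
model of what I understand ExecScan to do: fetch a tuple from the access
method passed in, test the qualification, then project. This is only a
sketch of my reading, not the actual C code; the names seq_next, project
and exec_scan and the sample table are made up for illustration.

```python
# Toy model only: these names are hypothetical and are not the executor API.

def seq_next(table):
    """Access method: hands back whole rows, one at a time."""
    for row in table:
        yield row

def project(row, target_list):
    """Keep only the columns named in target_list."""
    return {col: row[col] for col in target_list}

def exec_scan(access_method, table, qual, target_list):
    """Fetch a row via the access method, test the qual, then project."""
    for row in access_method(table):
        if qual(row):
            # The row from the access method is NOT returned directly;
            # it is projected first (the step I originally missed).
            yield project(row, target_list)

emp = [
    {"id": 1, "name": "a", "dept": 10, "notes": "long text"},
    {"id": 2, "name": "b", "dept": 20, "notes": "more long text"},
]
scanned = list(exec_scan(seq_next, emp, lambda r: r["dept"] == 10, ["id", "dept"]))
```

Here scanned holds only the id and dept columns of the one row that
passes the qualification.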
2)
> As you indicate, that's probably a net loss
> when there's a Sort node directly above the scan
> node. Needs more thought...
Some food for thought:
Let's ignore the attributes listed in the select
clause and work only with the where clause (join
condition) attributes, storing the tuple id of the
original whole tuple in the "working" tuple as a back
reference. At the final output stage, perform a lookup
to retrieve the select clause attributes of only the
qualifying tuples. This would let us work with really
small tuples.
Worth trying out?
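A rough Python sketch of what I have in mind, under the assumption that a
tuple can be fetched again cheaply by its id. Everything here
(narrow_scan, join_on_key, final_lookup, the list index standing in for
the tuple id, the sample tables) is hypothetical and only illustrates the
idea; it says nothing about how the executor would actually do it.

```python
# Toy model of the idea only; all names here are hypothetical.

def narrow_scan(table, key):
    """Emit small working tuples: (tuple id, join attribute) and nothing else."""
    for tid, row in enumerate(table):  # list index stands in for the tuple id
        yield (tid, row[key])

def join_on_key(outer, inner):
    """Plain nested-loop join over the narrow working tuples."""
    inner = list(inner)  # the inner side is rescanned for every outer tuple
    for o_tid, o_key in outer:
        for i_tid, i_key in inner:
            if o_key == i_key:
                yield (o_tid, i_tid)

def final_lookup(pairs, outer_table, inner_table, outer_cols, inner_cols):
    """Fetch the select-clause attributes only for the qualifying tuples."""
    for o_tid, i_tid in pairs:
        o_row, i_row = outer_table[o_tid], inner_table[i_tid]
        yield tuple(o_row[c] for c in outer_cols) + tuple(i_row[c] for c in inner_cols)

emp = [{"name": "a", "dept": 10}, {"name": "b", "dept": 20}, {"name": "c", "dept": 10}]
dept = [{"dept": 10, "title": "eng"}, {"dept": 30, "title": "ops"}]

pairs = list(join_on_key(narrow_scan(emp, "dept"), narrow_scan(dept, "dept")))
result = list(final_lookup(pairs, emp, dept, ["name"], ["title"]))
```

The join itself only ever touches (tuple id, key) pairs. The cost to weigh
against that is one extra fetch per side for every output row in the
final lookup.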
Thanks,
Anagh Lal