On Mon, Mar 14, 2011 at 12:46 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Mon, Mar 14, 2011 at 2:34 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> changes to:
>>
>> SELECT member_statistics.member_id
>> FROM member_statistics
>> WHERE EXISTS
>> (
>> SELECT mat1.tag_id
>> FROM member_all_tags_v AS mat1
>> WHERE mat1.member_id = member_statistics.member_id
>> AND mat1.tag_id
>> IN (640, 641, 3637, 3638, 637, 638, 639) AND
>> mat1.polarity >= 90
>> AND mat1.member_id IN ( <<400 ids>> )
>> )
>
> It isn't easy to get the ORM to spit that kind of queries, but I could
> try them by hand.
>
>> also, always try to compare vs straight join version:
>>
>>
>> SELECT member_statistics.member_id
>> FROM member_statistics
>> JOIN VALUES ( <<400 ids>> ) q(member_id) using (member_id)
>> JOIN
>> (
>> SELECT mat1.member_id
>> FROM member_all_tags_v AS mat1
>> WHERE mat1.tag_id IN (640, 641, 3637, 3638, 637, 638, 639)
>> AND mat1.polarity >= 90) p
>> USING(member_id)
>> ) p using(member_id);
>>
>> merlin
>
> The straight join like that was used long ago, but it replicates rows
> unacceptably: for each row in the subquery, one copy of member_id is
> output, which create an unacceptable overhead in the application and
> network side. It could be perhaps fixed with distinct, but then
> there's sorting overhead.
ah -- right. my mistake. well, you could always work around with
'distinct', although the exists version should be better (semi vs full
join). what options *do* you have in terms of coaxing the ORM to
produce particular sql? :-). This is likely 100% work-aroundable via
tweaking the SQL. I don't have the expertise to suggest a solution
with your exact sql, if there is one.
merlin