Re: fast DISTINCT or EXIST - Mailing list pgsql-performance

From Tom Lane
Subject Re: fast DISTINCT or EXIST
Date
Msg-id 9915.1175963978@sss.pgh.pa.us
Whole thread Raw
In response to Re: fast DISTINCT or EXIST  (Tilo Buschmann <mailinglist.postgresql.performance@b-n-w.org>)
List pgsql-performance
Tilo Buschmann <mailinglist.postgresql.performance@b-n-w.org> writes:
>> Arjen van der Meijden <acmmailing@tweakers.net> writes:
>>> SELECT ... FROM cd
>>> JOIN tracks ...
>>> WHERE cd.id IN (SELECT DISTINCT cd_id FROM (SELECT t.cd_id FROM tracks t
>>> WHERE t.tstitle @@ plainto_tsquery('simple','education') LIMIT 30)
>>> as foo LIMIT 10)

> Unfortunately, the query above will definitely not work correctly, if
> someone searches for "a" or "the".

Well, the "incorrectness" is only that it might deliver fewer than the
hoped-for ten CDs ... but that was a completely arbitrary cutoff anyway,
no?  I think in practice this'd give perfectly acceptable results.

> Actually, I hoped to find an alternative, that does not involve
> DISTINCT.

You could try playing around with GROUP BY rather than DISTINCT; those
are separate code paths and will probably give you different plans.
But I don't think you'll find that GROUP BY does any better on this
particular measure of yielding rows before the full input has been
scanned.

            regards, tom lane

pgsql-performance by date:

Previous
From: Tilo Buschmann
Date:
Subject: Re: fast DISTINCT or EXIST
Next
From: Arjen van der Meijden
Date:
Subject: Re: fast DISTINCT or EXIST