Re: [PERFORM] Big IN() clauses etc : feature proposal - Mailing list pgsql-hackers

From PFC
Subject Re: [PERFORM] Big IN() clauses etc : feature proposal
Date
Msg-id op.s893vzcpcigqcu@apollo13
Whole thread Raw
In response to Re: [PERFORM] Big IN() clauses etc : feature proposal  (Christian Kratzer <ck-lists@cksoft.de>)
Responses Re: [PERFORM] Big IN() clauses etc : feature proposal  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-hackers
> Additionally to your query you are already transferring the whole result
> set multiple times.  First you copy it to the result table. Then you
> read it again.   Your subsequent queries will also have to read over
> all the unneeded tuples just to get your primary key.

    Considering that the result set is not very large and will be cached in
RAM, this shouldn't be a problem.

> then why useth thy not the DISTINCT clause when building thy result
> table and thou shalt have no duplicates.

    Because the result table contains no duplicates ;)
    I need to remove duplicates in this type of queries :

-- get object owners info
SELECT * FROM users WHERE id IN (SELECT user_id FROM results);

    And in this case I find IN() easier to read than DISTINCT (what I posted
was a simplification of my real use case...)

> which is a perfect reason to use a temp table.  Another variation on the
> temp table scheme is use a result table and add a query_id.

    True. Doesn't solve my problem though : it's still complex, doesn't have
good rowcount estimation, bloats a table (I only need these records for
the duration of the transaction), etc.

> We do something like this in our web application when users submit
> complex queries.  For each query we store tuples of (query_id,result_id)
> in a result table.  It's then easy for the web application to page the
> result set.

    Yes, that is about the only sane way to page big result sets.

> A cleaner solution usually pays off in the long run whereas a hackish
> or overly complex solution will bite you in the behind for sure as
> time goes by.

    Yes, but in this case temp tables add too much overhead. I wish there
were RAM based temp tables like in mysql. However I guess the current temp
table slowness comes from the need to mark their existence in the system
catalogs or something. That's why I proposed using cursors...

pgsql-hackers by date:

Previous
From: Christian Kratzer
Date:
Subject: Re: [PERFORM] Big IN() clauses etc : feature proposal
Next
From: Martijn van Oosterhout
Date:
Subject: Re: [PERFORM] Big IN() clauses etc : feature proposal