Re: [SQL] OFFSET impact on Performance??? - Mailing list pgsql-performance

From Greg Stark
Subject Re: [SQL] OFFSET impact on Performance???
Date
Msg-id 87zmyvvxlg.fsf@stark.xeocode.com
Whole thread Raw
In response to Re: [SQL] OFFSET impact on Performance???  (Oleg Bartunov <oleg@sai.msu.su>)
List pgsql-performance
Oleg Bartunov <oleg@sai.msu.su> writes:

> On Thu, 27 Jan 2005, PFC wrote:
>
> >
> > > > beware that SELECT * FROM table WHERE id =ANY( array ) won't use an index,

> > > contrib/intarray provides index access to such queries.
> >
> > Can you provide an example of such a query ? I've looked at the operators
> > for intarray without finding it.
>
> for example,
> http://www.sai.msu.su/~megera/postgres/gist/code/7.3/README.intarray
> see OPERATIONS and EXAMPLE USAGE:
>
> SELECT * FROM table WHERE id && int[]

I don't think that helps him. He wants the join to the *other* table to use an
index. It would be nice if the IN plan used an index for =ANY(array) just like
it does for =ANY(subquery) but I'm not sure the statistics are there. It might
not be a bad plan to just assume arrays are never going to be millions of
elements long though.

There is a way to achieve this using "int_array_enum" from another contrib
module, "intagg". My current project uses something similar to this except the
arrays are precomputed. When I went to 7.4 the new array support obsoleted
everything else I was using from the "intagg" and "array" contrib moduels
except for this one instance where intagg is still necessary.

It is a bit awkward but it works:

slo=> EXPLAIN
       SELECT *
         FROM foo
         JOIN (SELECT int_array_enum(foo_ids) AS foo_id
                 FROM cache
                WHERE cache_id = 1) AS x
        USING (foo_id) ;
                                            QUERY PLAN
--------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..6.40 rows=1 width=726)
   ->  Subquery Scan x  (cost=0.00..3.18 rows=1 width=4)
         ->  Index Scan using idx_cache on cache  (cost=0.00..3.17 rows=1 width=30)
               Index Cond: (cache_id = 1)
   ->  Index Scan using foo_pkey on foo  (cost=0.00..3.21 rows=1 width=726)
         Index Cond: (foo.foo_id = "outer".foo_id)
(6 rows)


(query and plan edited for clarity and for paranoia purposes)


--
greg

pgsql-performance by date:

Previous
From: Oleg Bartunov
Date:
Subject: Re: [SQL] OFFSET impact on Performance???
Next
From: PFC
Date:
Subject: Re: [SQL] OFFSET impact on Performance???