On Mon, Aug 6, 2012 at 8:08 AM, Ioannis Anagnostopoulos
<ioannis@anatec.com> wrote:
> Hi, my query is very simple:
>
> select
> msg_id,
> msg_type,
> ship_pos_messages.pos_georef1,
> ship_pos_messages.pos_georef2,
> ship_pos_messages.pos_georef3,
> ship_pos_messages.pos_georef4,
> obj_id,
> ship_speed,
> ship_heading,
> ship_course,
> pos_point
> from
> feed_all_y2012m08.ship_pos_messages
> where
> extract('day' from msg_date_rec) = 1
> AND msg_id = any(ARRAY[7294724,14174174,22254408]);
>
> The msg_id is the primary key on the ship_pos_messages table, and in this
> example the query is fast because it uses the primary-key index to make the
> selection. The EXPLAIN ANALYZE follows:
...
>
> I think this is a pretty good plan, and quite quick given the size of the
> table (88 million rows at present). However, in real life the list of msg_id
> values I search for is not an array of 3 ids but of 300,000 or more. It is
> then that the planner abandons this plan and falls back to a sequential
> scan. Is there any way around this? Or is this the best I can get?
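[Editor's note: one workaround often suggested for very large id lists (a sketch, not from this thread; the table name `wanted_ids` and the COPY source path are hypothetical) is to load the ids into an analyzed temporary table and join against it, so the planner sees real row estimates instead of a huge literal array:]

```sql
-- Load the ~300,000 ids into a temp table instead of an ANY(ARRAY[...]) literal.
CREATE TEMP TABLE wanted_ids (msg_id bigint PRIMARY KEY);
COPY wanted_ids FROM '/tmp/ids.csv';   -- hypothetical source file, one id per line
ANALYZE wanted_ids;                    -- give the planner accurate statistics

SELECT m.msg_id, m.msg_type, m.pos_georef1, m.pos_georef2,
       m.pos_georef3, m.pos_georef4, m.obj_id, m.ship_speed,
       m.ship_heading, m.ship_course, m.pos_point
FROM feed_all_y2012m08.ship_pos_messages m
JOIN wanted_ids w USING (msg_id)
WHERE extract('day' FROM m.msg_date_rec) = 1;
```

With statistics on the temp table, the planner can choose a nested loop over the primary-key index when the id list is selective, rather than falling back to a sequential scan.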
What happens if you set "enable_seqscan=off" and run the query with
the very large list? (This is an experiment, not a recommendation for
production use.)
Cheers,
Jeff