Re: Sequential scan instead of index scan - Mailing list pgsql-performance

From Ioannis Anagnostopoulos
Subject Re: Sequential scan instead of index scan
Date
Msg-id 50213D1E.3040403@anatec.com
Whole thread Raw
In response to Re: Sequential scan instead of index scan  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: Sequential scan instead of index scan
List pgsql-performance
On 07/08/2012 17:00, Jeff Janes wrote:
> On Mon, Aug 6, 2012 at 8:08 AM, Ioannis Anagnostopoulos
> <ioannis@anatec.com> wrote:
>> Hi, my query is very simple:
>>
>> select
>>              msg_id,
>>              msg_type,
>>              ship_pos_messages.pos_georef1,
>>              ship_pos_messages.pos_georef2,
>>              ship_pos_messages.pos_georef3,
>>              ship_pos_messages.pos_georef4,
>>              obj_id,
>>              ship_speed,
>>              ship_heading,
>>              ship_course,
>>              pos_point
>>          from
>>              feed_all_y2012m08.ship_pos_messages
>>          where
>>              extract('day' from msg_date_rec) = 1
>>              AND msg_id = any(ARRAY[7294724,14174174,22254408]);
>>
>> The msg_id is the pkey on the ship_pos_messages table and in this example it
>> is working fast as it uses the pkey (primary key index) to make the
>> selection. The explain analyze follows:
> ...
>> I think this is a pretty good plan and quite quick given the size of the
>> table (88Million rows at present). However in real life the parameter where
>> I search for msg_id is not an array of 3 ids but of 300.000 or more. It is
>> then that the query forgets the plan and goes to sequential scan. Is there
>> any way around? Or is this the best I can have?
> What happens if you set "enable_seqscan=off" and run the query with
> the very large list?  (This is an experiment, not a recommendation for
> production use)
>
>
> Cheers,
>
> Jeff
As Tom said, the actual question is not valid. Seq scans are not bad in
themselves; we need to understand the way around them instead of forcing
them off. In my case, the problem was the ARRAY parameter (which, in any
case, is not well suited to holding that much data). By loading the ids
into a temporary table and performing an inner join in the query (after
analysing the temp table) you get a nice Hash Join (or a Merge Join if
you don't analyse the temp table).
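A sketch of the temp-table rewrite described above, based on the query quoted earlier (the temp table name tmp_msg_ids and the loading step are assumptions; adapt them to your schema and client):

```sql
-- Hypothetical temp table to replace the 300,000+ element ARRAY parameter
CREATE TEMPORARY TABLE tmp_msg_ids (msg_id bigint PRIMARY KEY);

-- Load the ids here, e.g. via COPY or batched INSERTs:
-- COPY tmp_msg_ids FROM STDIN;

-- Gather statistics so the planner can pick a Hash Join
ANALYZE tmp_msg_ids;

SELECT
    m.msg_id,
    m.msg_type,
    m.pos_georef1,
    m.pos_georef2,
    m.pos_georef3,
    m.pos_georef4,
    m.obj_id,
    m.ship_speed,
    m.ship_heading,
    m.ship_course,
    m.pos_point
FROM
    feed_all_y2012m08.ship_pos_messages m
    INNER JOIN tmp_msg_ids t ON t.msg_id = m.msg_id
WHERE
    extract('day' FROM m.msg_date_rec) = 1;
```

Without the ANALYZE, the planner has no statistics for the temp table and may fall back to a Merge Join, as noted above.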

cheers Yiannis
