Thread: Sequencial scan instead of using index

Sequencial scan instead of using index

From
"Harry Hehl"
Date:
There seems to be many posts on this issue but I not yet found an answer to the seq scan issue.

I am having an issue with a joins. I am using 8.0.3 on FC4

Query: select * from ommemberrelation where srcobj='somevalue' and dstobj in (select objectid from omfilesysentry where
name='dir15_file80');

Columns srcobj, dstobj & name are all indexed.

I ran test adding records to ommemberrelation and omfilesysentry up to 32K in each to simulate and measured query
times. The graph is O(n²) like.  i.e sequencial scan   

The columns in the where clauses are indexed, and yes I did VACUUM ANALYZE FULL. I even tried backup restore of the
entiredb. No difference.  

Turning sequencial scan off results in a O(n log n) like graph,

Explain analyze confirms sequencial scan. A majority (70ms) of the 91ms query is as a result of  ->  Seq Scan on
ommemberrelationTiming is on. 
                                                                                      QUERY PLAN
                                                              

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop IN Join  (cost=486.19..101533.99 rows=33989 width=177) (actual time=5.493..90.682 rows=1 loops=1)
   Join Filter: ("outer".dstobj = "inner".objectid)
   ->  Seq Scan on ommemberrelation  (cost=0.00..2394.72 rows=33989 width=177) (actual time=0.078..70.887 rows=100
loops=1)
         Filter: (srcobj = '3197a4e6-abf1-11da-a0f9-000fb05ab829'::text)
   ->  Materialize  (cost=486.19..487.48 rows=129 width=16) (actual time=0.004..0.101 rows=26 loops=100)
         ->  Append  (cost=0.00..486.06 rows=129 width=16) (actual time=0.063..1.419 rows=26 loops=1)
               ->  Index Scan using omfilesysentry_name_idx on omfilesysentry  (cost=0.00..8.30 rows=2 width=16)
(actualtime=0.019..0.019 rows=0 loops=1) 
                     Index Cond: (name = 'dir15_file80'::text)
               ->  Index Scan using omfile_name_idx on omfile omfilesysentry  (cost=0.00..393.85 rows=101 width=16)
(actualtime=0.033..0.291 rows=26 loops=1) 
                     Index Cond: (name = 'dir15_file80'::text)
               ->  Seq Scan on omdirectory omfilesysentry  (cost=0.00..24.77 rows=11 width=16) (actual
time=0.831..0.831rows=0 loops=1) 
                     Filter: (name = 'dir15_file80'::text)
               ->  Index Scan using omfilesequence_name_idx on omfilesequence omfilesysentry  (cost=0.00..8.30 rows=2
width=16)(actual time=0.014..0.014 rows=0 loops=1) 
                     Index Cond: (name = 'dir15_file80'::text)
               ->  Index Scan using omclipfile_name_idx on omclipfile omfilesysentry  (cost=0.00..8.30 rows=2 width=16)
(actualtime=0.008..0.008 rows=0 loops=1) 
                     Index Cond: (name = 'dir15_file80'::text)
               ->  Index Scan using omimagefile_name_idx on omimagefile omfilesysentry  (cost=0.00..8.30 rows=2
width=16)(actual time=0.008..0.008 rows=0 loops=1) 
                     Index Cond: (name = 'dir15_file80'::text)
               ->  Index Scan using omcollection_name_idx on omcollection omfilesysentry  (cost=0.00..8.30 rows=2
width=16)(actual time=0.008..0.008 rows=0 loops=1) 
                     Index Cond: (name = 'dir15_file80'::text)
               ->  Index Scan using omhomedirectory_name_idx on omhomedirectory omfilesysentry  (cost=0.00..8.30 rows=2
width=16)(actual time=0.007..0.007 rows=0 loops=1) 
                     Index Cond: (name = 'dir15_file80'::text)
               ->  Seq Scan on omrootdirectory omfilesysentry  (cost=0.00..1.05 rows=1 width=16) (actual
time=0.013..0.013rows=0 loops=1) 
                     Filter: (name = 'dir15_file80'::text)
               ->  Index Scan using omwarehousedirectory_name_idx on omwarehousedirectory omfilesysentry
(cost=0.00..8.30rows=2 width=16) (actual time=0.007..0.007 rows=0 loops=1) 
                     Index Cond: (name = 'dir15_file80'::text)
               ->  Index Scan using omtask_name_idx on omtask omfilesysentry  (cost=0.00..8.30 rows=2 width=16) (actual
time=0.009..0.009rows=0 loops=1) 
                     Index Cond: (name = 'dir15_file80'::text)  Total runtime: 91.019 ms
(29 rows)

So why is the planner not using the index?  Everything I have read indicates sequencial scanning should be left on and
theplanner should do the right thing.  

This is a quote from 1 web site:

"These options are pretty much only for use in query testing; frequently one sets "enable_seqscan = false" in order to
determineif the planner is unnecessarily discarding an index, for example. However, it would require very unusual
circumstancesto change any of them to false in the .conf file." 

So how do I determine why the planner is unnecessarily discarding the index?

Thanks




Re: Sequencial scan instead of using index

From
Ragnar
Date:
On mán, 2006-03-06 at 13:46 -0500, Harry Hehl wrote:
> Query: select * from ommemberrelation where srcobj='somevalue'
> and dstobj in (select objectid from omfilesysentry where name='dir15_file80');
>
> Columns srcobj, dstobj & name are all indexed.

>
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Nested Loop IN Join  (cost=486.19..101533.99 rows=33989 width=177) (actual time=5.493..90.682 rows=1 loops=1)
>    Join Filter: ("outer".dstobj = "inner".objectid)
>    ->  Seq Scan on ommemberrelation  (cost=0.00..2394.72 rows=33989 width=177) (actual time=0.078..70.887 rows=100
loops=1)
>          Filter: (srcobj = '3197a4e6-abf1-11da-a0f9-000fb05ab829'::text)
>    ->  Materialize  (cost=486.19..487.48 rows=129 width=16) (actual time=0.004..0.101 rows=26 loops=100)

Looks like the planner is expecting 33989 rows, making
an index scan a ppor choice, but in fact only 100 rows
actually match your srcobj value.

Could we see the explain analyze with enable_seqscan
= false please ?

Possibly you might want totry to increase the statistics
target for this columns , as in:
  ALTER TABLE ommemberrelation ALTER COLUMN srcobj
      SET STATISTICS 1000;
  ANALYZE;
and try again (with enable_seqscan=true)

A target of 1000 ismost probably overkill, but
start with this value, and if it improves matters,
you can experiment with lower settings.

gnari



Re: Sequencial scan instead of using index

From
Mark Kirkwood
Date:
Harry Hehl wrote:
> There seems to be many posts on this issue but I not yet found an answer to the seq scan issue.
>
> I am having an issue with a joins. I am using 8.0.3 on FC4
>
> Query: select * from ommemberrelation where srcobj='somevalue' and dstobj in (select objectid from omfilesysentry
wherename='dir15_file80'); 
>
> Columns srcobj, dstobj & name are all indexed.
>
>

The planner is over-estimating the number of rows here (33989 vs 100):

->  Seq Scan on ommemberrelation  (cost=0.00..2394.72 rows=33989
width=177) (actual time=0.078..70.887 rows=100 loops=1)

The usual way to attack this is to up the sample size for ANALYZE:

ALTER TABLE ommemberrelation ALTER COLUMN srcobj SET STATISTICS 100;
ALTER TABLE ommemberrelation ALTER COLUMN dstobj SET STATISTICS 100;
-- or even 1000.
ANALYZE ommemberrelation;

Then try EXPLAIN ANALYZE again.


If you can upgrade to 8.1.(3), then the planner can consider paths that
use *both* the indexes on srcobj and dstobj (which would probably be the
business!).

Cheers

Mark

Re: Sequencial scan instead of using index

From
markir@paradise.net.nz
Date:
Quoting Harry Hehl <Harry.Hehl@diskstream.com>:

> Mark,
>
> (snippage)However I am still getting seq scans on indexes for other queries
>
> For example:
>
> select * from omfile where ( objectid in ( select distinct(ref_oid)
> from
> ts ) );
> objectid & ref_oid are non-unique indexes
> omimagefile & omclipfile inherit from omfile
>
> --------------------------------------------------------------
> ----------
> --------
>
>  Nested Loop IN Join (cost=21432.32..951981.42 rows=204910 width=217)
>  Join Filter: ("outer".objectid = "inner".ref_oid)
>  -> Append (cost=0.00..8454.10 rows=204910 width=217)
>  -> Seq Scan on omfile (cost=0.00..8428.20 rows=204320
> width=217)
>  -> Seq Scan on omimagefile omfile (cost=0.00..12.70 rows=270
> width=217)
>  -> Seq Scan on omclipfile omfile (cost=0.00..13.20 rows=320
> width=217)
>  -> Materialize (cost=21432.32..21434.32 rows=200 width=16)
>  -> Unique (cost=20614.91..21430.12 rows=200 width=16)
>  -> Sort (cost=20614.91..21022.52 rows=163041 width=16)
>  Sort Key: ts.ref_oid
>  -> Seq Scan on ts (cost=0.00..3739.41 rows=163041
> width=16)
>
> (11 rows)
> Time: 164.232 ms
>
> BTW set enable_seqscan=off has no affect i.e still uses seq scans.
>
> If I do a simple query, it is very quick, no sequencial scans.
> So how can I get index scans to work consistently with joins?
>
> explain select * from omfile where
> objectid='65ef0be3-bf02-46b6-bae9-5bd015ffdb79';
>
> --------------------------------------------------------------------
> ----
> --------
>
>  Result (cost=2.00..7723.30 rows=102903 width=217)
>  -> Append (cost=2.00..7723.30 rows=102903 width=217)
>  -> Bitmap Heap Scan on omfile (cost=2.00..7697.60 rows=102608
> width=217)
>  Recheck Cond: (objectid =
> '65ef0be3-bf02-46b6-bae9-5bd015ffdb79'::capsa_sys.uuid)
>  -> Bitmap Index Scan on omfile_objectid_idx
> (cost=0.00..2.00 rows=102608 width=0)
>  Index Cond: (objectid =
> '65ef0be3-bf02-46b6-bae9-5bd015ffdb79'::capsa_sys.uuid)
>  -> Bitmap Heap Scan on omimagefile omfile (cost=1.00..12.69
> rows=135 width=217)
>  Recheck Cond: (objectid =
> '65ef0be3-bf02-46b6-bae9-5bd015ffdb79'::capsa_sys.uuid)
>  -> Bitmap Index Scan on omimagefile_objectid_idx
> (cost=0.00..1.00 rows=135 width=0)
>  Index Cond: (objectid =
> '65ef0be3-bf02-46b6-bae9-5bd015ffdb79'::capsa_sys.uuid)
>  -> Bitmap Heap Scan on omclipfile omfile (cost=1.00..13.00
> rows=160 width=217)
>  Recheck Cond: (objectid =
> '65ef0be3-bf02-46b6-bae9-5bd015ffdb79'::capsa_sys.uuid)
>  -> Bitmap Index Scan on omclipfile_objectid_idx
> (cost=0.00..1.00 rows=160 width=0)
>  Index Cond: (objectid =
> '65ef0be3-bf02-46b6-bae9-5bd015ffdb79'::capsa_sys.uuid)
>
> (14 rows)
> Time: 5.164
>
>

Hmm - that first query needs to do a sort, so you might want to experiment with
the sort_mem parameter. Could you show us output from explain analyze for both
the above queries?

At face value, selecting 200000 rows (assuming the estimates are accurate) may
mean that a seqscan is the best plan! But we'll know more after seeing the
explain analyze...

Cheers


Mark

Re: Sequencial scan instead of using index

From
markir@paradise.net.nz
Date:
Quoting "markir@paradise.net.nz" <markir@paradise.net.nz>:


> Hmm - that first query needs to do a sort, so you might want to
> experiment with
> the sort_mem parameter

Oops - I mean work_mem...

Re: Sequencial scan instead of using index

From
"Harry Hehl"
Date:
Mark,

>If you can upgrade to 8.1.(3), then the planner can consider paths that

>use *both* the indexes on srcobj and dstobj (which would probably be
the
>business!).

Yes, 8.1.3 resolved this issue. Thanks.

However I am still getting seq scans on indexes for other queries

For example:

select * from omfile where ( objectid in ( select distinct(ref_oid) from
ts ) );
objectid & ref_oid are non-unique indexes
omimagefile & omclipfile inherit from omfile

------------------------------------------------------------------------
--------

 Nested Loop IN Join  (cost=21432.32..951981.42 rows=204910 width=217)
   Join Filter: ("outer".objectid = "inner".ref_oid)
   ->  Append  (cost=0.00..8454.10 rows=204910 width=217)
         ->  Seq Scan on omfile  (cost=0.00..8428.20 rows=204320
width=217)
         ->  Seq Scan on omimagefile omfile  (cost=0.00..12.70 rows=270
width=217)
         ->  Seq Scan on omclipfile omfile  (cost=0.00..13.20 rows=320
width=217)
   ->  Materialize  (cost=21432.32..21434.32 rows=200 width=16)
         ->  Unique  (cost=20614.91..21430.12 rows=200 width=16)
               ->  Sort  (cost=20614.91..21022.52 rows=163041 width=16)
                     Sort Key: ts.ref_oid
                     ->  Seq Scan on ts  (cost=0.00..3739.41 rows=163041
width=16)

(11 rows)
Time: 164.232 ms

BTW set enable_seqscan=off has no affect i.e still uses seq scans.

If I do a simple query, it is very quick, no sequencial scans.
So how can I get index scans to work consistently with joins?

explain select * from omfile where
objectid='65ef0be3-bf02-46b6-bae9-5bd015ffdb79';

------------------------------------------------------------------------
--------

 Result  (cost=2.00..7723.30 rows=102903 width=217)
   ->  Append  (cost=2.00..7723.30 rows=102903 width=217)
         ->  Bitmap Heap Scan on omfile  (cost=2.00..7697.60 rows=102608
width=217)
               Recheck Cond: (objectid =
'65ef0be3-bf02-46b6-bae9-5bd015ffdb79'::capsa_sys.uuid)
               ->  Bitmap Index Scan on omfile_objectid_idx
(cost=0.00..2.00 rows=102608 width=0)
                     Index Cond: (objectid =
'65ef0be3-bf02-46b6-bae9-5bd015ffdb79'::capsa_sys.uuid)
         ->  Bitmap Heap Scan on omimagefile omfile  (cost=1.00..12.69
rows=135 width=217)
               Recheck Cond: (objectid =
'65ef0be3-bf02-46b6-bae9-5bd015ffdb79'::capsa_sys.uuid)
               ->  Bitmap Index Scan on omimagefile_objectid_idx
(cost=0.00..1.00 rows=135 width=0)
                     Index Cond: (objectid =
'65ef0be3-bf02-46b6-bae9-5bd015ffdb79'::capsa_sys.uuid)
         ->  Bitmap Heap Scan on omclipfile omfile  (cost=1.00..13.00
rows=160 width=217)
               Recheck Cond: (objectid =
'65ef0be3-bf02-46b6-bae9-5bd015ffdb79'::capsa_sys.uuid)
               ->  Bitmap Index Scan on omclipfile_objectid_idx
(cost=0.00..1.00 rows=160 width=0)
                     Index Cond: (objectid =
'65ef0be3-bf02-46b6-bae9-5bd015ffdb79'::capsa_sys.uuid)

(14 rows)
Time: 5.164



-----Original Message-----
From: Mark Kirkwood [mailto:markir@paradise.net.nz]
Sent: Tuesday, March 07, 2006 12:04 AM
To: Harry Hehl
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Sequencial scan instead of using index

Harry Hehl wrote:
> There seems to be many posts on this issue but I not yet found an
answer to the seq scan issue.
>
> I am having an issue with a joins. I am using 8.0.3 on FC4
>
> Query: select * from ommemberrelation where srcobj='somevalue' and
> dstobj in (select objectid from omfilesysentry where
> name='dir15_file80');
>
> Columns srcobj, dstobj & name are all indexed.
>
>

The planner is over-estimating the number of rows here (33989 vs 100):

->  Seq Scan on ommemberrelation  (cost=0.00..2394.72 rows=33989
width=177) (actual time=0.078..70.887 rows=100 loops=1)

The usual way to attack this is to up the sample size for ANALYZE:

ALTER TABLE ommemberrelation ALTER COLUMN srcobj SET STATISTICS 100;
ALTER TABLE ommemberrelation ALTER COLUMN dstobj SET STATISTICS 100;
-- or even 1000.
ANALYZE ommemberrelation;

Then try EXPLAIN ANALYZE again.


If you can upgrade to 8.1.(3), then the planner can consider paths that
use *both* the indexes on srcobj and dstobj (which would probably be the

business!).

Cheers

Mark

Re: Sequencial scan instead of using index

From
Tom Lane
Date:
"Harry Hehl" <Harry.Hehl@diskstream.com> writes:
>  Nested Loop IN Join  (cost=21432.32..951981.42 rows=204910 width=217)
>    Join Filter: ("outer".objectid = "inner".ref_oid)
>    ->  Append  (cost=0.00..8454.10 rows=204910 width=217)
>          ->  Seq Scan on omfile  (cost=0.00..8428.20 rows=204320
> width=217)
>          ->  Seq Scan on omimagefile omfile  (cost=0.00..12.70 rows=270
> width=217)
>          ->  Seq Scan on omclipfile omfile  (cost=0.00..13.20 rows=320
> width=217)
>    ->  Materialize  (cost=21432.32..21434.32 rows=200 width=16)
>          ->  Unique  (cost=20614.91..21430.12 rows=200 width=16)
>                ->  Sort  (cost=20614.91..21022.52 rows=163041 width=16)
>                      Sort Key: ts.ref_oid
>                      ->  Seq Scan on ts  (cost=0.00..3739.41 rows=163041
> width=16)

> (11 rows)
> Time: 164.232 ms

> So how can I get index scans to work consistently with joins?

It's not the join that's the problem, it's the inheritance.  I recently
improved the planner so that it can consider appended indexscans for an
inheritance tree on the inside of a join, but no pre-8.2 release can do
it.

            regards, tom lane

Re: Sequencial scan instead of using index

From
"Harry Hehl"
Date:
Thanks Mark,

>Hmm - that first query needs to do a sort, so you might want to
experiment with the sort_mem parameter. Could you show us output from
explain analyze for >both the above queries?

Not too concerned about the sort, more about the query performance with
seq scan as the tables size increases.

>At face value, selecting 200000 rows (assuming the estimates are
accurate) may mean that a seqscan is the best plan! But we'll know more
after seeing the >explain analyze...

200000 rows is about right.

I saw Tom's response on the planner improvement in 8.2 but I was still
going to send the explain analyze output.
However I can't show you explain analyze. The postmaster goes to 99% cpu
and stays there. The explain analyze command hangs...

It is starting to look like inheritance does help in modeling the data,
but for searches parallel flat tables that don't use inheritance is
required to get optimum query performance.

Has anyone else come to this conclusion?

Thanks