Thread: Strange behaviour of SELECT ... IN

Strange behaviour of SELECT ... IN

From

Jorge Sarmiento

Date:

26 June 2002, 11:44:54

when I do a:

explain select * from table1 where name in ('JORGE');

NOTICE:  QUERY PLAN:

Index Scan using idx_table1 on table  (cost=0.00..10901.87 rows=3184 width=48)

but if do a:

explain select * from table1 where name in ('JORGE',  'JUAN',  'JOSE');

NOTICE:  QUERY PLAN:

Seq Scan on table1  (cost=0.00..16689.73 rows=9506 width=48)


why??? is there any way to make postgresql use index in the second type of
query?

thnx!

Jorge S.

Re: Strange behaviour of SELECT ... IN

From

Stephan Szabo

Date:

26 June 2002, 14:55:19

On Wed, 26 Jun 2002, Jorge Sarmiento wrote:

> when I do a:
>
> explain select * from table1 where name in ('JORGE');
>
> NOTICE:  QUERY PLAN:
>
> Index Scan using idx_table1 on table  (cost=0.00..10901.87 rows=3184 width=48)
>
> but if do a:
>
> explain select * from table1 where name in ('JORGE',  'JUAN',  'JOSE');
>
> NOTICE:  QUERY PLAN:
>
> Seq Scan on table1  (cost=0.00..16689.73 rows=9506 width=48)
>
>
> why??? is there any way to make postgresql use index in the second type of
> query?

Well, the number of rows in the second is about 3 times greater.  Have
you done a vacuum analyze of the table in question?  How many rows does
the table actually have?

Re: Strange behaviour of SELECT ... IN

From

Jorge Sarmiento

Date:

26 June 2002, 18:20:32

uh...

the first one is an INDEX SCAN, the second one a SEQUENTIAL SCAN.

number of rows in table has nothing to do...

any more ideas?

I have been suggested to use an EXPLICIT JOIN instead of the IN statement, but
how do I change the query I have to use join instead of IN?

thnx!

Jorge S.

On Wednesday 26 June 2002 13:40, Stephan Szabo wrote:
> On Wed, 26 Jun 2002, Jorge Sarmiento wrote:
> > when I do a:
> >
> > explain select * from table1 where name in ('JORGE');
> >
> > NOTICE:  QUERY PLAN:
> >
> > Index Scan using idx_table1 on table  (cost=0.00..10901.87 rows=3184
> > width=48)
> >
> > but if do a:
> >
> > explain select * from table1 where name in ('JORGE',  'JUAN',  'JOSE');
> >
> > NOTICE:  QUERY PLAN:
> >
> > Seq Scan on table1  (cost=0.00..16689.73 rows=9506 width=48)
> >
> >
> > why??? is there any way to make postgresql use index in the second type
> > of query?
>
> Well, the number of rows in the second is about 3 times greater.  Have
> you done a vacuum analyze of the table in question?  How many rows does
> the table actually have?
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

Re: Strange behaviour of SELECT ... IN

From

Stephan Szabo

Date:

26 June 2002, 18:36:59

On Wed, 26 Jun 2002, Jorge Sarmiento wrote:

> the first one is an INDEX SCAN, the second one a SEQUENTIAL SCAN.
>
> number of rows in table has nothing to do...

Sure it does.  As the number of rows selected approaches the number
of rows in the table the cost of the index scan relative to sequential
scan can increase due to the fact that we're reading both the index
and the heap blocks (for row validity testing - and doing it in a
possibly random order) as opposed to a sequential read set of the heap.

Re: Strange behaviour of SELECT ... IN

From

Martijn van Oosterhout

Date:

26 June 2002, 20:08:21

On Wed, Jun 26, 2002 at 04:00:08PM -0400, Jorge Sarmiento wrote:
> uh...
>
> the first one is an INDEX SCAN, the second one a SEQUENTIAL SCAN.
>
> number of rows in table has nothing to do...

Wrong. The number of rows has everything to do with it. If the number of
rows exceeds 50% of the table, a sequential scan is faster than an index
scan.

You can use enable_seq_scan=off to force it. Let us know if the index scan
is actually significantly faster.

Oh, you did use VACUUM ANALYZE right?

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> There are 10 kinds of people in the world, those that can do binary
> arithmetic and those that can't.

Re: Strange behaviour of SELECT ... IN

From

Jorge Sarmiento

Date:

27 June 2002, 11:49:11

> Wrong. The number of rows has everything to do with it. If the number of
> rows exceeds 50% of the table, a sequential scan is faster than an index
> scan.

Mi database has 3 000 000 registries, my queries are usually of 50 - 100
rows... so index is faster right?

> You can use enable_seq_scan=off to force it. Let us know if the index scan
> is actually significantly faster.

I have inserted that line in postgresql.conf, and received an error.
where should it be put?

> Oh, you did use VACUUM ANALYZE right?

yep...

Re: Strange behaviour of SELECT ... IN

From

Martijn van Oosterhout

Date:

27 June 2002, 19:14:29

On Thu, Jun 27, 2002 at 09:44:13AM -0400, Jorge Sarmiento wrote:
> > Wrong. The number of rows has everything to do with it. If the number of
> > rows exceeds 50% of the table, a sequential scan is faster than an index
> > scan.
>
> Mi database has 3 000 000 registries, my queries are usually of 50 - 100
> rows... so index is faster right?

Well, you'll need to send the EXPLAIN ANALYZE output because my ESP module
doesn't appear to be working. There is no easy explanation for why a query
doesn't use an index scan.

> > You can use enable_seq_scan=off to force it. Let us know if the index scan
> > is actually significantly faster.
>
> I have inserted that line in postgresql.conf, and received an error.
> where should it be put?

In psql: set enable_seq_scan=off

or something like that.

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> There are 10 kinds of people in the world, those that can do binary
> arithmetic and those that can't.