Thread: Sequential Scan Index Bug

Sequential Scan Index Bug

From
"Gabriel Weinberg"
Date:
I have a table with an integer column with about 10M rows in it.

This column has an index (btree).

When I try to select a row using this column with an integer, e.g. select *
from table where id=4, it always uses the index.  However, if I select try
to select a row using this column with a decimal, e.g. select * from table
where id=4.343, it skips the index entirely and does a sequential scan of
the table.

I am using v7.4.2 on Freebsd 4.9.

Gabriel

_________________
Gabriel Weinberg
yegg@alum.mit.edu

Re: Sequential Scan Index Bug

From
Bruno Wolff III
Date:
On Sat, Apr 03, 2004 at 13:51:56 -0500,
  Gabriel Weinberg <yegg@alum.mit.edu> wrote:
>
> I have a table with an integer column with about 10M rows in it.
>
> This column has an index (btree).
>
> When I try to select a row using this column with an integer, e.g. select *
> from table where id=4, it always uses the index.  However, if I select try
> to select a row using this column with a decimal, e.g. select * from table
> where id=4.343, it skips the index entirely and does a sequential scan of
> the table.
>
> I am using v7.4.2 on Freebsd 4.9.

Depending on what you want to do, you probably either want to cast the
value to an int explicitly or combine that with a test (using a stable
function) to make sure the number is actually an integer.

Re: Sequential Scan Index Bug

From
"Gabriel Weinberg"
Date:
Yes, I thought I had done that, but now that I figured out what was going
on, I did it for all cases.  So it is no longer occurring for me, but it
still seems like a bug in PostgreSQL.  I would expect it to throw an error
immediately, instead of scanning the table for a value of a different type.
In my case, the table is huge, so it really put a hamper on the system.

Gabriel

_________________
Gabriel Weinberg
yegg@alum.mit.edu=20

-----Original Message-----
From: Bruno Wolff III [mailto:bruno@wolff.to]=20
Sent: Wednesday, April 07, 2004 1:38 AM
To: Gabriel Weinberg
Cc: pgsql-bugs@postgresql.org
Subject: Re: Sequential Scan Index Bug


On Sat, Apr 03, 2004 at 13:51:56 -0500,
  Gabriel Weinberg <yegg@alum.mit.edu> wrote:
>=20
> I have a table with an integer column with about 10M rows in it.
>=20
> This column has an index (btree).
>=20
> When I try to select a row using this column with an integer, e.g.=20
> select * from table where id=3D4, it always uses the index.  However, if=
=20
> I select try to select a row using this column with a decimal, e.g.=20
> select * from table where id=3D4.343, it skips the index entirely and=20
> does a sequential scan of the table.
>=20
> I am using v7.4.2 on Freebsd 4.9.

Depending on what you want to do, you probably either want to cast the value
to an int explicitly or combine that with a test (using a stable
function) to make sure the number is actually an integer.

Re: Sequential Scan Index Bug

From
Stephan Szabo
Date:
On Wed, 7 Apr 2004, Gabriel Weinberg wrote:

> Yes, I thought I had done that, but now that I figured out what was going
> on, I did it for all cases.  So it is no longer occurring for me, but it
> still seems like a bug in PostgreSQL.  I would expect it to throw an error
> immediately, instead of scanning the table for a value of a different type.

But what if you said id = 4.0?  Would you want it to find the id=4 row?

Re: Sequential Scan Index Bug

From
Stephan Szabo
Date:
On Wed, 7 Apr 2004, Gabriel Weinberg wrote:

> Presumably, but that is not what I was doing.

I was responding to the part that was: "I would expect it to throw an
error immediately, instead of scanning the table for a value of a
different type." If say intcol = 4.345 is an error, is intcol = 4.0 an
error as well given that 4.345 and 4.0 are presumably the same type?  I'm
not sure what error you would expect.

Re: Sequential Scan Index Bug

From
Stephan Szabo
Date:
On Wed, 7 Apr 2004, Gabriel Weinberg wrote:

> I would expect if I did intcol = 4.35 or intcol = 'abc', it would throw a
> type mismatch error.

Well, in practice, the former is AFAICS required to do something "right"
by the SQL spec because it explicitly states that all all numbers are
mutually comparable, so erroring would technically be against spec.  It's
possible that we could do something more intelligent than the current
behavior for that case but I can't come up with a particularly good
choice that wouldn't have bad effects elsewhere.

Re: Sequential Scan Index Bug

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> It's possible that we could do something more intelligent than the current
> behavior for that case but I can't come up with a particularly good
> choice that wouldn't have bad effects elsewhere.

In theory we could recognize that "integer_column = 4.35" will yield a
constant false.  If the expression were replaced by "false" during
constant folding then the planner would produce a short-circuited plan
that won't actually examine the table.

In practice, though, I don't see any way to do that that wouldn't be a
horrendous kluge.  I don't like putting special-case type-specific
knowledge into the planner; yet here we have knowledge that's not only
type-specific but specific to the combination of two different types.
Yech.  I don't see any hope for a catalog-driven, extensible approach
for such things.

You'd also have to ask questions about whether the planner time spent
testing for such cases would really be a good investment...

            regards, tom lane

Re: Sequential Scan Index Bug

From
"Gabriel Weinberg"
Date:
I would expect if I did intcol =3D 4.35 or intcol =3D 'abc', it would throw=
 a
type mismatch error.

Now whether you want to be clever and make intcol =3D 4.0 not throw an error
and instead convert it to intcol =3D 4 is another thing.  In that case, I
still wouldn't scan the table.  And now that I think of it, 4 is different
than 4.0 in terms of precision.  If you are querying an intcol, maybe that
doesn't matter, but I would probably err on the side of precaution and throw
a type mismatch error as well.=20=20

Gabriel

_________________
Gabriel Weinberg
yegg@alum.mit.edu=20

-----Original Message-----
From: Stephan Szabo [mailto:sszabo@megazone.bigpanda.com]=20
Sent: Wednesday, April 07, 2004 12:58 PM
To: Gabriel Weinberg
Cc: 'Bruno Wolff III'; pgsql-bugs@postgresql.org
Subject: RE: [BUGS] Sequential Scan Index Bug



On Wed, 7 Apr 2004, Gabriel Weinberg wrote:

> Presumably, but that is not what I was doing.

I was responding to the part that was: "I would expect it to throw an error
immediately, instead of scanning the table for a value of a different type."
If say intcol =3D 4.345 is an error, is intcol =3D 4.0 an error as well giv=
en
that 4.345 and 4.0 are presumably the same type?  I'm not sure what error
you would expect.

Re: Sequential Scan Index Bug

From
"Gabriel Weinberg"
Date:
Presumably, but that is not what I was doing.

I was taking a number from a user, which was supposed to be divisible by an
integer.  Sometimes the user left off the last digit when typing in the
number or otherwise typed it in wrong, rendering the input not divisible by
that number.

So it was looking for something like 4.345.

Gabriel

_________________
Gabriel Weinberg
yegg@alum.mit.edu

-----Original Message-----
From: Stephan Szabo [mailto:sszabo@megazone.bigpanda.com]
Sent: Wednesday, April 07, 2004 11:35 AM
To: Gabriel Weinberg
Cc: 'Bruno Wolff III'; pgsql-bugs@postgresql.org
Subject: Re: [BUGS] Sequential Scan Index Bug



On Wed, 7 Apr 2004, Gabriel Weinberg wrote:

> Yes, I thought I had done that, but now that I figured out what was
> going on, I did it for all cases.  So it is no longer occurring for
> me, but it still seems like a bug in PostgreSQL.  I would expect it to
> throw an error immediately, instead of scanning the table for a value
> of a different type.

But what if you said id = 4.0?  Would you want it to find the id=4 row?