Thread: optimizer ignoring primary key and doing sequence scan

optimizer ignoring primary key and doing sequence scan

From
"Chris Hoy"
Date:
Hi

I have a number of tables in my database where the queries appear to
ignoring the primary key and doing a seq scan instead, however other tables
appear to be fine. I can see any difference between them.

Is their any way of determination why the otimizer isn't picking up the
primary key?

Version 8.3.3 windows

An example of a non working table is:

select * from industries where industryid = 1;
"Seq Scan on industries  (cost=0.00..1.02 rows=1 width=116) (actual
time=0.011..0.013 rows=1 loops=1)"

CREATE TABLE industries
(
  industryid serial NOT NULL,
  industryname character varying(45) NOT NULL,
  prn_key integer,
  CONSTRAINT pk_industry PRIMARY KEY (industryid),
  CONSTRAINT un_industry UNIQUE (industryname)
)

A example of woring one

explain select * from currencies where currencyid = 1
"Index Scan using pk_currencyid on currencies  (cost=0.00..8.27 rows=1
width=196)"
CREATE TABLE currencies
(
  currencyid serial NOT NULL,
  currencyname character varying(85) NOT NULL,
  prn_key integer,
  CONSTRAINT pk_currencyid PRIMARY KEY (currencyid)
)


Chris

No virus found in this outgoing message.
Checked by AVG - http://www.avg.com
Version: 8.0.138 / Virus Database: 270.4.10/1551 - Release Date: 14/07/2008
06:49





Re: optimizer ignoring primary key and doing sequence scan

From
"Scott Marlowe"
Date:
On Mon, Jul 14, 2008 at 1:54 PM, Chris Hoy <chris.hoy@hoyc.fsnet.co.uk> wrote:
>
> Hi
>
> I have a number of tables in my database where the queries appear to
> ignoring the primary key and doing a seq scan instead, however other tables
> appear to be fine. I can see any difference between them.
>
> Is their any way of determination why the otimizer isn't picking up the
> primary key?
>
> Version 8.3.3 windows
>
> An example of a non working table is:
>
> select * from industries where industryid = 1;
> "Seq Scan on industries  (cost=0.00..1.02 rows=1 width=116) (actual
> time=0.011..0.013 rows=1 loops=1)"

According to this there's only one row in the table.  why WOULD
postgresql use an index when it can just scan the one row table in a
split second.

Re: optimizer ignoring primary key and doing sequence scan

From
Edoardo Panfili
Date:
Scott Marlowe ha scritto:
> On Mon, Jul 14, 2008 at 1:54 PM, Chris Hoy <chris.hoy@hoyc.fsnet.co.uk> wrote:
>> Hi
>>
>> I have a number of tables in my database where the queries appear to
>> ignoring the primary key and doing a seq scan instead, however other tables
>> appear to be fine. I can see any difference between them.
>>
>> Is their any way of determination why the otimizer isn't picking up the
>> primary key?
>>
>> Version 8.3.3 windows
>>
>> An example of a non working table is:
>>
>> select * from industries where industryid = 1;
>> "Seq Scan on industries  (cost=0.00..1.02 rows=1 width=116) (actual
>> time=0.011..0.013 rows=1 loops=1)"
>
> According to this there's only one row in the table.  why WOULD
> postgresql use an index when it can just scan the one row table in a
> split second.
>
I agree with you that it can depend on the size of the table but where
you can read that the table contains only one row?

I try with my table (39910 rows, no index on column note)
explain analyze select * from table where note='single example';

Seq Scan on table  (cost=0.00..2458.88 rows=13 width=327) (actual
time=10.901..481.896 rows=1 loops=1)

On the postgres manual I can find "Estimated number of rows output by
this plan node (Again, only if executed to completion.)" regarding the
third parameter of the explain

Where is my error?

Edoardo


Re: optimizer ignoring primary key and doing sequence scan

From
Ragnar
Date:
On þri, 2008-07-15 at 08:19 +0200, Edoardo Panfili wrote:
> Scott Marlowe ha scritto:
> > On Mon, Jul 14, 2008 at 1:54 PM, Chris Hoy <chris.hoy@hoyc.fsnet.co.uk> wrote:
> >>
> >> select * from industries where industryid = 1;
> >> "Seq Scan on industries  (cost=0.00..1.02 rows=1 width=116) (actual
> >> time=0.011..0.013 rows=1 loops=1)"
> >
> > According to this there's only one row in the table.  why WOULD
> > postgresql use an index when it can just scan the one row table in a
> > split second.
> >
> I agree with you that it can depend on the size of the table but where
> you can read that the table contains only one row?

it does not really say 1 row, but you can infer from the estimated cost,
that the table is only 1 block (cost=0.00..1.02). that is the smallest
read unit.
using an index would cost 2 random reads.

> I try with my table (39910 rows, no index on column note)
> explain analyze select * from table where note='single example';
>
> Seq Scan on table  (cost=0.00..2458.88 rows=13 width=327) (actual
> time=10.901..481.896 rows=1 loops=1)

surely this is not the same table

gnari



Re: optimizer ignoring primary key and doing sequence scan

From
"Scott Marlowe"
Date:
On Tue, Jul 15, 2008 at 12:19 AM, Edoardo Panfili <edoardo@aspix.it> wrote:
> Scott Marlowe ha scritto:
>>
>> On Mon, Jul 14, 2008 at 1:54 PM, Chris Hoy <chris.hoy@hoyc.fsnet.co.uk>
>> wrote:
>>>
>>> Hi
>>>
>>> I have a number of tables in my database where the queries appear to
>>> ignoring the primary key and doing a seq scan instead, however other
>>> tables
>>> appear to be fine. I can see any difference between them.
>>>
>>> Is their any way of determination why the otimizer isn't picking up the
>>> primary key?
>>>
>>> Version 8.3.3 windows
>>>
>>> An example of a non working table is:
>>>
>>> select * from industries where industryid = 1;
>>> "Seq Scan on industries  (cost=0.00..1.02 rows=1 width=116) (actual
>>> time=0.011..0.013 rows=1 loops=1)"
>>
>> According to this there's only one row in the table.  why WOULD
>> postgresql use an index when it can just scan the one row table in a
>> split second.
>>
> I agree with you that it can depend on the size of the table but where you
> can read that the table contains only one row?

Actually I meant to write one page or block there, not row.  But it's
the same diff really.  1 Row or 100, if they fit in an 8k block
together, pgsql isn't going to use an index to look them up.  It kinda
knows which block they'll be in ahead of time.

> I try with my table (39910 rows, no index on column note)
> explain analyze select * from table where note='single example';
>
> Seq Scan on table  (cost=0.00..2458.88 rows=13 width=327) (actual
> time=10.901..481.896 rows=1 loops=1)

Yeah that'll take 481 blocks scanned to get the answer.

> On the postgres manual I can find "Estimated number of rows output by this
> plan node (Again, only if executed to completion.)" regarding the third
> parameter of the explain
>
> Where is my error?

I'm not seeing an error, just a possible misunderstanding of pgsql
plans queries.  Without an index above on note, it has to do a
sequential scan there's no index to shorten up the work.

The estimated number of rows are how many pgsql thinks it will get
back when it runs the query, based on the statistics it has stored
from the last time analyze was run.

The actual number of rows listed in the explain analyze output is the
actual number of rows, not the estimated number...