Thread: How to avoid "Seq Scans"?

How to avoid "Seq Scans"?

From
Vincenzo Romano
Date:
Hi all.

In PG 8.2.4 I have a 4+M rows table like this:

create table t (
  f1 bigint,
  f2 text,
  f3 text
);

create index i_t_1 on t( f1 );
create index i_t_2 on t( f2 );
create index i_t_2 on t( f3 );

I'd need to write a stored function that should do the
following:

for rec in select * from t order by f2,f2 loop
...
end loop;

This loop is increadibly slow. Infact the friendly explain tells me
that:

test=# explain select * from t order by f2,f3;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Sort  (cost=958786.20..970734.55 rows=4779338 width=28)
   Sort Key: f2,f3
   ->  Seq Scan on t  (cost=0.00..85501.38 rows=4779338 width=28)

I'd like to know a hint about a technicque to avoid the sequential
scan!

Thanks.

--
Vincenzo Romano
--
Maybe Computer will never become as intelligent as Humans.
For sure they won't ever become so stupid. [VR-1988]

Re: How to avoid "Seq Scans"?

From
Richard Huxton
Date:
Vincenzo Romano wrote:
> Hi all.
>
> In PG 8.2.4 I have a 4+M rows table like this:

> I'd need to write a stored function that should do the
> following:
>
> for rec in select * from t order by f2,f2 loop
> ...
> end loop;

>    ->  Seq Scan on t  (cost=0.00..85501.38 rows=4779338 width=28)
>
> I'd like to know a hint about a technicque to avoid the sequential
> scan!

But you're fetching all the rows - what other way would be faster?

--
   Richard Huxton
   Archonet Ltd

Re: How to avoid "Seq Scans"?

From
"A. Kretschmer"
Date:
am  Wed, dem 29.08.2007, um 11:15:21 +0200 mailte Vincenzo Romano folgendes:
> This loop is increadibly slow. Infact the friendly explain tells me
> that:
>
> test=# explain select * from t order by f2,f3;
>                                    QUERY PLAN
> ---------------------------------------------------------------------------------
>  Sort  (cost=958786.20..970734.55 rows=4779338 width=28)
>    Sort Key: f2,f3
>    ->  Seq Scan on t  (cost=0.00..85501.38 rows=4779338 width=28)
>
> I'd like to know a hint about a technicque to avoid the sequential
> scan!

A 'select foo from bar' without a WHERE-condition forces a seq-scan
because _YOU_ want the whole table.


Andreas
--
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net

Re: How to avoid "Seq Scans"?

From
Martijn van Oosterhout
Date:
On Wed, Aug 29, 2007 at 11:15:21AM +0200, Vincenzo Romano wrote:
> This loop is increadibly slow. Infact the friendly explain tells me
> that:

Is it wrong? Have you have run with seq_scan disabled to see if an
index scan is actually faster? If so, then perhaps your
random+pagE_cost needs adjusting.

Also, your query can't use an index anyway, for that you'd need an
index on (f2,f3).

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Attachment

Re: How to avoid "Seq Scans"?

From
Vincenzo Romano
Date:
On Wednesday 29 August 2007 11:20:53 Richard Huxton wrote:
> Vincenzo Romano wrote:
> > Hi all.
> >
> > In PG 8.2.4 I have a 4+M rows table like this:
> >
> > I'd need to write a stored function that should do the
> > following:
> >
> > for rec in select * from t order by f2,f2 loop
> > ...
> > end loop;
> >
> >    ->  Seq Scan on t  (cost=0.00..85501.38 rows=4779338 width=28)
> >
> > I'd like to know a hint about a technicque to avoid the
> > sequential scan!
>
> But you're fetching all the rows - what other way would be faster?

Definitely right.

I'm trying to investigate the strange (to me) bahaviour of a couple of
stored procedure.
The outer one is in PL/PGSQL and has the above mentioned loop.
The inner one, called into the loop, is an "SQL stable strict"
function.
The outer "empty" loop takes less than 16 seconds.
The inner function takes between 10 and 50 msec when called by itself.
The inner+outer function needs more than 45 minutes just to run over
the first 10 thousands lines.

The inner function is actually a select over another table (16+M rows)
and always shows very good timing when execute by itself.
What I argue now is that something wrong happens with the query
planner when the inner function gets called by the outer one.

Is there any confirmation (and possibly workaround) for this
behaviour?

--
Vincenzo Romano
--
Maybe Computer will never become as intelligent as Humans.
For sure they won't ever become so stupid. [VR-1988]