Re: seq scan issue... - Mailing list pgsql-performance

From PFC
Subject Re: seq scan issue...
Date
Msg-id op.t9r4m5nmcigqcu@apollo13.peufeu.com
Whole thread Raw
In response to seq scan issue...  (kevin kempter <kevin@kevinkempterllc.com>)
List pgsql-performance
> - why am I still getting a seq scan ?

You'll seq scan tmp1 obviously, and also the other table since you fetch a
very large part of it in the process.
It's the only way to do this query since there is no WHERE to restrict the
number of rows and the DISTINCT applies on columns from both tables.

You might want to simplify your query. For instance perhaps you can get
pf.segment_id out of the DISTINCT, in which case you can put the distinct
in a subquery on tmp1.

>
> Thanks in advance.
>
>
>
>
>
>
> ============
> Explain PLan
> ============
>
> explain
> select distinct
> tmp1.affiliate_id,
> tmp1.name,
> tmp1.description,
> tmp1.create_dt,
> tmp1.playback_device_id,
> pf.segment_id
> from
> aff_id_tmp1 tmp1,
> playback_fragment pf
> where
> tmp1.playback_device_id = pf.playback_device_id ;
>
>
>   Unique  (cost=2966361.56..3194555.91 rows=10104496 width=97)
>     ->  Sort  (cost=2966361.56..2998960.76 rows=13039677 width=97)
>           Sort Key: tmp1.affiliate_id, tmp1.name, tmp1.description,
> tmp1.create_dt,
> tmp1.playback_device_id, pf.segment_id
>           ->  Hash Join  (cost=23925.45..814071.14 rows=13039677
> width=97)
>                 Hash Cond: (pf.playback_device_id =
> tmp1.playback_device_id)
>                 ->  Seq Scan on playback_fragment pf
> (cost=0.00..464153.77 rows=130
> 39677 width=16)
>                 ->  Hash  (cost=16031.31..16031.31 rows=631531 width=89)
>                       ->  Seq Scan on aff_id_tmp1 tmp1
> (cost=0.00..16031.31 rows=63
> 1531 width=89)
> (1068 rows)
>
>



pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: SQL Function Slowness, 8.3.0
Next
From: Greg Smith
Date:
Subject: Re: Strange behavior: pgbench and new Linux kernels