Re: Performance issues - Mailing list pgsql-performance

From Jim Nasby
Subject Re: Performance issues
Date
Msg-id 55076FFB.9090105@BlueTreble.com
Whole thread Raw
In response to Re: Performance issues  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Performance issues
List pgsql-performance
On 3/16/15 3:59 PM, Tomas Vondra wrote:
> On 16.3.2015 20:43, Jim Nasby wrote:
>> On 3/13/15 7:12 PM, Tomas Vondra wrote:
>>> (4) I suspect many of the relations referenced in the views are not
>>>       actually needed in the query, i.e. the join is performed but
>>>       then it's just discarded because those columns are not used.
>>>       Try to simplify the views as much has possible - remove all the
>>>       tables that are not really necessary to run the query. If two
>>>       queries need different tables, maybe defining two views is
>>>       a better approach.
>>
>> A better alternative with multi-purpose views is to use an outer
>> join instead of an inner join. With an outer join if you ultimately
>> don't refer to any of the columns in a particular table Postgres will
>> remove the table from the query completely.
>
> Really? Because a quick test suggests otherwise:
>
> db=# create table test_a (id int);
> CREATE TABLE
> db=# create table test_b (id int);
> CREATE TABLE
> db=# explain select test_a.* from test_a left join test_b using (id);
>                                QUERY PLAN
> ----------------------------------------------------------------------
>   Merge Left Join  (cost=359.57..860.00 rows=32512 width=4)
>     Merge Cond: (test_a.id = test_b.id)
>     ->  Sort  (cost=179.78..186.16 rows=2550 width=4)
>           Sort Key: test_a.id
>           ->  Seq Scan on test_a  (cost=0.00..35.50 rows=2550 width=4)
>     ->  Sort  (cost=179.78..186.16 rows=2550 width=4)
>           Sort Key: test_b.id
>           ->  Seq Scan on test_b  (cost=0.00..35.50 rows=2550 width=4)
> (8 rows)
>
> Also, how would that work with duplicate rows in the referenced table?

Right, I neglected to mention that the omitted table must also be unique
on the join key:

decibel@decina.attlocal=# create table a(a_id serial primary key);
CREATE TABLE
decibel@decina.attlocal=# create table b(a_id int);
CREATE TABLE
decibel@decina.attlocal=# explain analyze select a.* from a left join b
using(a_id);
                                                 QUERY PLAN

-----------------------------------------------------------------------------------------------------------
  Hash Right Join  (cost=67.38..137.94 rows=2550 width=4) (actual
time=0.035..0.035 rows=0 loops=1)
    Hash Cond: (b.a_id = a.a_id)
    ->  Seq Scan on b  (cost=0.00..35.50 rows=2550 width=4) (never executed)
    ->  Hash  (cost=35.50..35.50 rows=2550 width=4) (actual
time=0.002..0.002 rows=0 loops=1)
          Buckets: 4096  Batches: 1  Memory Usage: 32kB
          ->  Seq Scan on a  (cost=0.00..35.50 rows=2550 width=4)
(actual time=0.001..0.001 rows=0 loops=1)
  Planning time: 0.380 ms
  Execution time: 0.086 ms
(8 rows)

decibel@decina.attlocal=# alter table b add primary key(a_id);
ALTER TABLE
decibel@decina.attlocal=# explain analyze select a.* from a left join b
using(a_id);
                                           QUERY PLAN

-----------------------------------------------------------------------------------------------
  Seq Scan on a  (cost=0.00..35.50 rows=2550 width=4) (actual
time=0.001..0.001 rows=0 loops=1)
  Planning time: 0.247 ms
  Execution time: 0.029 ms
(3 rows)

decibel@decina.attlocal=# alter table a drop constraint a_pkey;
ALTER TABLE
decibel@decina.attlocal=# explain analyze select a.* from a left join b
using(a_id);
                                           QUERY PLAN

-----------------------------------------------------------------------------------------------
  Seq Scan on a  (cost=0.00..35.50 rows=2550 width=4) (actual
time=0.001..0.001 rows=0 loops=1)
  Planning time: 0.098 ms
  Execution time: 0.011 ms
(3 rows)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


pgsql-performance by date:

Previous
From: Gunnlaugur Thor Briem
Date:
Subject: Re: EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT
Next
From: Vivekanand Joshi
Date:
Subject: Re: Performance issues