Re: [GENERAL] Equivalence Classes when using IN - Mailing list pgsql-general

From David Rowley
Subject Re: [GENERAL] Equivalence Classes when using IN
Date
Msg-id CAKJS1f8EhaRwra-mc4bm4Q4ButmTi-7W4kn6wicB5SySb5bvnw@mail.gmail.com
Whole thread Raw
In response to Re: [GENERAL] Equivalence Classes when using IN  (Kim Rose Carlsen <krc@hiper.dk>)
Responses Re: [GENERAL] Equivalence Classes when using IN  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: [GENERAL] Equivalence Classes when using IN  (Kim Rose Carlsen <krc@hiper.dk>)
List pgsql-general
On 9 October 2017 at 22:39, Kim Rose Carlsen <krc@hiper.dk> wrote:
>             EXPLAIN ANALYZE
>              SELECT *
>                FROM customer
>                JOIN view_customer
>                  ON customer.customer_id = view_customer.customer_id
>               WHERE age < 20;
>
>                                                                  QUERY PLAN
>
---------------------------------------------------------------------------------------------------------------------------------------------
>  Nested Loop Left Join  (cost=139.00..10392.96 rows=668 width=16) (actual time=0.528..35.120 rows=200 loops=1)
>    Join Filter: (c.customer_id = product.customer_id)
>    Rows Removed by Join Filter: 199900
>    ->  Nested Loop  (cost=0.28..199.21 rows=334 width=12) (actual time=0.075..1.146 rows=100 loops=1)
>          ->  Seq Scan on customer  (cost=0.00..21.51 rows=334 width=8) (actual time=0.067..0.282 rows=100 loops=1)
>                Filter: (age < 20)
>                Rows Removed by Filter: 901
>          ->  Index Only Scan using customer_pkey on customer c  (cost=0.28..0.53 rows=1 width=4) (actual
time=0.006..0.006rows=1 loops=100)
 
>                Index Cond: (customer_id = customer.customer_id)
>                Heap Fetches: 100
>    ->  Materialize  (cost=138.73..173.75 rows=2001 width=8) (actual time=0.005..0.130 rows=2001 loops=100)
>          ->  Sort  (cost=138.73..143.73 rows=2001 width=8) (actual time=0.448..0.588 rows=2001 loops=1)
>                Sort Key: product.customer_id, product.product_id
>                Sort Method: quicksort  Memory: 142kB
>                ->  Seq Scan on product  (cost=0.00..29.01 rows=2001 width=8) (actual time=0.006..0.215 rows=2001
loops=1)
>  Planning time: 0.214 ms
>  Execution time: 35.284 ms

You would benefit from adding the age column to view_customer, or at
least consider having some view which contains all the columns you'll
ever need from those tables and if you need special views with only a
subset of columns due to some software doing "select * from
viewname;", then you could just create some. Joining to the same table
again seems like a bit of a waste of effort for the planner and
executor.  I'd assume customer_id is the PRIMARY KEY of customer and
is unique.

It's not all that clear what your view is doing here. Confusingly
there's a Sort in the plan, yet nothing in the query asked for that,
so I guess that the view must have an ORDER BY. If you get rid of that
the planner would likely use an index on product (customer_id) to
parameterise the nested loop, at least, it likely would, if you have
one.

It's pretty bad practice to have ORDER BY in views. I kinda wish we
didn't even allow it, but that ship sailed many years ago...

-- David Rowley                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

pgsql-general by date:

Previous
From: Kim Rose Carlsen
Date:
Subject: Re: [GENERAL] Equivalence Classes when using IN
Next
From: Durumdara
Date:
Subject: [GENERAL] Error: "cached plan must not change result type"