Thread: Slow views

Slow views

From
davidn-postgres@rebel.net.au
Date:
I know this has been discussed many times before but I'm not entirely
satisfied with the answer, which I understand is "views are essentially
macros".  Despite that Postgres is producing the correct output, I hope
you'll all agree that the performance issue is serious enough to call
this a "bug."

I'm running Postgres 7.4.3 installed on Fedora Core 1 (i386) from rpms
downloaded from ftp.postgres.org.

I've got a view defined as:

  CREATE VIEW stock_exp_v AS
    SELECT stock_code, barcode, title, supplier.description AS supplier,
           format.description AS format, rating.description AS rating,
           genre_list(stock_code) AS genre, release_date, price, srp
     FROM stock
     LEFT JOIN supplier USING (supplier_code)
     LEFT JOIN format USING (format_code)
     LEFT JOIN rating USING (rating_code);


When I use the view in a join the query takes 52 seconds, for example:

  SELECT trim(stock_code), barcode, title, supplier, format, rating, genre,
         release_date, o.price, o.srp, quantity
  FROM order_lines o
  LEFT JOIN stock_exp_v USING (stock_code);

  btrim  |    barcode    |                title                |        supplier        | format | rating | genre |
release_date| price | srp  | quantity 

---------+---------------+-------------------------------------+------------------------+--------+--------+-------+--------------+-------+------+----------
 B00392S | 9397810039223 | GREATEST WILDLIFE SHOW ON EARTH THE | ROADSHOW ENTERTAINMENT | VHS    |        |       |
2000-06-05  |  1.23 | 1.10 |       15 
 B00392S | 9397810039223 | GREATEST WILDLIFE SHOW ON EARTH THE | ROADSHOW ENTERTAINMENT | VHS    |        |       |
2000-06-05  |  1.23 | 1.10 |       15 
         |               |                                     |                        |        |        |       |
        |  1.23 | 1.10 |        5 
(3 rows)

Time: 52110.369 ms

When I expand the view by hand it takes only 27 milliseconds:

  SELECT trim(stock_code), barcode, title, supplier.description,
         format.description, rating.description, genre_list(o.stock_code),
         release_date, o.price, o.srp, quantity
  FROM order_lines o
  LEFT JOIN stock USING (stock_code)
  LEFT JOIN supplier USING (supplier_code)
  LEFT JOIN format USING (format_code)
  LEFT JOIN rating USING (rating_code);

  btrim  |    barcode    |                title                |      description       | description | description |
genre_list| release_date | price | srp  | quantity 

---------+---------------+-------------------------------------+------------------------+-------------+-------------+------------+--------------+-------+------+----------
 B00392S | 9397810039223 | GREATEST WILDLIFE SHOW ON EARTH THE | ROADSHOW ENTERTAINMENT | VHS         |             |
        | 2000-06-05   |  1.23 | 1.10 |       15 
 B00392S | 9397810039223 | GREATEST WILDLIFE SHOW ON EARTH THE | ROADSHOW ENTERTAINMENT | VHS         |             |
        | 2000-06-05   |  1.23 | 1.10 |       15 
         |               |                                     |                        |             |             |
        |              |  1.23 | 1.10 |        5 
(3 rows)

Time: 26.820 ms


The plans are different; views are clearly not "essentially macros":


videos=> explain select trim(stock_code), barcode, title, supplier, format, rating, genre, release_date, o.price,
o.srp,quantity 
videos-> from order_lines o left join stock_exp_v using (stock_code);
                                             QUERY PLAN
----------------------------------------------------------------------------------------------------
 Merge Left Join  (cost=4344.01..4447.40 rows=202 width=217)
   Merge Cond: ("outer"."?column5?" = "inner".stock_code)
   ->  Sort  (cost=1.05..1.06 rows=3 width=45)
         Sort Key: (o.stock_code)::bpchar
         ->  Seq Scan on order_lines o  (cost=0.00..1.03 rows=3 width=45)
   ->  Sort  (cost=4342.95..4393.38 rows=20171 width=188)
         Sort Key: stock_exp_v.stock_code
         ->  Subquery Scan stock_exp_v  (cost=3.85..1711.67 rows=20171 width=188)
               ->  Hash Left Join  (cost=3.85..1509.96 rows=20171 width=125)
                     Hash Cond: ("outer".rating_code = "inner".rating_code)
                     ->  Hash Left Join  (cost=2.79..1233.63 rows=20171 width=135)
                           Hash Cond: ("outer".format_code = "inner".format_code)
                           ->  Hash Left Join  (cost=1.75..930.03 rows=20171 width=146)
                                 Hash Cond: ("outer".supplier_code = "inner".supplier_code)
                                 ->  Seq Scan on stock  (cost=0.00..625.71 rows=20171 width=133)
                                 ->  Hash  (cost=1.60..1.60 rows=60 width=33)
                                       ->  Seq Scan on supplier  (cost=0.00..1.60 rows=60 width=33)
                           ->  Hash  (cost=1.03..1.03 rows=3 width=23)
                                 ->  Seq Scan on format  (cost=0.00..1.03 rows=3 width=23)
                     ->  Hash  (cost=1.05..1.05 rows=5 width=24)
                           ->  Seq Scan on rating  (cost=0.00..1.05 rows=5 width=24)
(21 rows)

Time: 22.585 ms
videos=> explain select trim(stock_code), barcode, title, supplier.description, format.description, rating.description,
genre_list(o.stock_code),release_date, o.price, o.srp, quantity 
videos-> from order_lines o left join stock using (stock_code) left join supplier using (supplier_code) left join
formatusing (format_code) left join rating using (rating_code); 
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=24.59..25.04 rows=3 width=132)
   Hash Cond: ("outer".rating_code = "inner".rating_code)
   ->  Hash Left Join  (cost=23.53..23.92 rows=3 width=142)
         Hash Cond: ("outer".format_code = "inner".format_code)
         ->  Merge Left Join  (cost=22.49..22.83 rows=3 width=153)
               Merge Cond: ("outer".supplier_code = "inner".supplier_code)
               ->  Sort  (cost=19.12..19.13 rows=3 width=140)
                     Sort Key: stock.supplier_code
                     ->  Nested Loop Left Join  (cost=0.00..19.09 rows=3 width=140)
                           ->  Seq Scan on order_lines o  (cost=0.00..1.03 rows=3 width=45)
                           ->  Index Scan using stock_pkey on stock  (cost=0.00..6.01 rows=1 width=111)
                                 Index Cond: (("outer".stock_code)::bpchar = stock.stock_code)
               ->  Sort  (cost=3.37..3.52 rows=60 width=33)
                     Sort Key: supplier.supplier_code
                     ->  Seq Scan on supplier  (cost=0.00..1.60 rows=60 width=33)
         ->  Hash  (cost=1.03..1.03 rows=3 width=23)
               ->  Seq Scan on format  (cost=0.00..1.03 rows=3 width=23)
   ->  Hash  (cost=1.05..1.05 rows=5 width=24)
         ->  Seq Scan on rating  (cost=0.00..1.05 rows=5 width=24)
(19 rows)

Time: 29.351 ms


There are 20,000 rows in the stock table.  Relevant DDL is:

create table deal(
    deal_code character(16) not null unique primary key,
    tag varchar(6) not null,
    description varchar(92),
    icon character(6),
    first_available date,
    last_available date
);


create table genre(
    genre_code character(13) not null unique primary key,
    description varchar(20) not null
);

create table rating(
    rating_code character(13) not null unique primary key,
    description varchar(6),
    icon character(6)
);

create table format(
    format_code character(13) not null unique primary key,
    description varchar(6),
    icon character(6)
);

create table supplier(
    supplier_code character(6) not null unique primary key,
    description varchar(30)
);

create table stock(
    stock_code character(12) not null unique primary key,
    barcode varchar(15),
    title varchar(92),
    supplier_code character(6) references supplier(supplier_code) on delete set null,
    format_code character(13) references format(format_code) on delete set null,
    rating_code character(13) references rating(rating_code) on delete set null,
    release_date date,
    price numeric(9,2),
    srp numeric(9,2)
);

create table stock_genre(
    stock_code character(12) not null references stock(stock_code) on delete cascade,
    genre_code character(13) not null references genre(genre_code) on delete cascade
);
create index stock_genre_stock on stock_genre(stock_code);
create index stock_genre_genre on stock_genre(genre_code);

create table stock_deal(
    stock_code character(12) not null references stock(stock_code) on delete cascade,
    deal_code character(16) not null references deal(deal_code) on delete cascade,
    sort_order integer,
    price numeric(9,2),
    srp numeric(9,2),
    unique (deal_code, stock_code)
);
create index stock_deal_stock_code on stock_deal(stock_code);


create table order_lines(
    orderid integer not null references orders(orderid) on delete cascade,
    stock_code varchar(15) references stock(stock_code) on delete set null,
    price numeric(9,2),
    srp numeric(9,2),
    quantity numeric(5)
);
create index order_lines_orderid on order_lines(orderid);


-- returns string listing all of the genres for a stock item
CREATE OR REPLACE FUNCTION genre_list(character) RETURNS varchar AS '
DECLARE
    code ALIAS FOR $1;
    result VARCHAR(100) := '''';
    genre RECORD;
BEGIN
    FOR genre IN SELECT * FROM genre JOIN stock_genre USING (genre_code) WHERE stock_code = code LOOP
        IF result != '''' THEN
            result := result || ''/'';
        END IF;
        result := result || genre.description;
    END LOOP;
    RETURN result;
END;
' LANGUAGE plpgsql IMMUTABLE STRICT;


-- view of the stock table with supplier, format, rating and genre tags expanded
create or replace view stock_exp_v as
    select stock_code, barcode, title, supplier.description as supplier, format.description as format,
        rating.description as rating, genre_list(stock_code) as genre, release_date, price, srp
    from stock
    left join supplier using (supplier_code)
    left join format using (format_code)
    left join rating using (rating_code);

Re: Slow views

From
Stephan Szabo
Date:
On Sat, 10 Jul 2004 davidn-postgres@rebel.net.au wrote:

> I know this has been discussed many times before but I'm not entirely
> satisfied with the answer, which I understand is "views are essentially
> macros".  Despite that Postgres is producing the correct output, I hope
> you'll all agree that the performance issue is serious enough to call
> this a "bug."
>
> I'm running Postgres 7.4.3 installed on Fedora Core 1 (i386) from rpms
> downloaded from ftp.postgres.org.
>
> I've got a view defined as:
>
>   CREATE VIEW stock_exp_v AS
>     SELECT stock_code, barcode, title, supplier.description AS supplier,
>            format.description AS format, rating.description AS rating,
>            genre_list(stock_code) AS genre, release_date, price, srp
>      FROM stock
>      LEFT JOIN supplier USING (supplier_code)
>      LEFT JOIN format USING (format_code)
>      LEFT JOIN rating USING (rating_code);
>
>
> When I use the view in a join the query takes 52 seconds, for example:
>
>   SELECT trim(stock_code), barcode, title, supplier, format, rating, genre,
>          release_date, o.price, o.srp, quantity
>   FROM order_lines o
>   LEFT JOIN stock_exp_v USING (stock_code);
>
> Time: 52110.369 ms
>
> When I expand the view by hand it takes only 27 milliseconds:
>
>   SELECT trim(stock_code), barcode, title, supplier.description,
>          format.description, rating.description, genre_list(o.stock_code),
>          release_date, o.price, o.srp, quantity
>   FROM order_lines o
>   LEFT JOIN stock USING (stock_code)
>   LEFT JOIN supplier USING (supplier_code)
>   LEFT JOIN format USING (format_code)
>   LEFT JOIN rating USING (rating_code);

That's not expanding the view.  Expanding the view would be something
like:

select trim(stock_code), barcode, title, supplier, format, rating, genre,
release_date, o.price, o.srp, quantity
FROM order_lines o
LEFT JOIN
 ( select stock_code, barcode, title, supplier.description as supplier,
   format.description AS format, rating.description AS rating,
   genre_list(stock_code) AS genre, release_date, price, srp
   FROM stock LEFT JOIN supplier USING (supplier_code)
   LEFT JOIN format USING (format_code)
   LEFT JOIN rating USING (rating_code) ) stock_exp_v USING (stock_code)

You also did an optimization, removing the subquery which PostgreSQL
isn't.

The problem is that while I believe it was safe for the query above
because it uses USING throughout, in general it is not necessarily safe,
because the queries may have different results if a join condition in the
view could return true for NULL values in stock.

 For example, I think
A LEFT JOIN B USING (blah) LEFT JOIN C ON (B.bval = C.bval or B.bval is
null)
 is different from
A LEFT JOIN (B LEFT JOIN C ON (B.bval = C.bval or B.bval is null)) USING
(blah)

Re: Slow views

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> You also did an optimization, removing the subquery which PostgreSQL
> isn't.

I suspect the real issue is that the implied join order is not the same.
The view-based query is really

    a LEFT JOIN (b LEFT JOIN c LEFT JOIN d LEFT JOIN e)

while the allegedly equivalent hand expansion is

    a LEFT JOIN b LEFT JOIN c LEFT JOIN d LEFT JOIN e

and since JOIN associates left-to-right, this is not the same thing at
all.  I'm not even convinced that it gives the same end result ...
INNER JOIN is associative but LEFT JOIN is not.

            regards, tom lane

Re: Slow views

From
Tom Lane
Date:
David Newall <davidn-postgres@rebel.net.au> writes:
> It's clearly an optimisation issue:

No, it's an outer-join-semantics issue.

> select * from a join b optimises differently to select * from a join
> (select * from b) as b

The above claim is provably false.  Now, if you throw in some more
tables and sprinkle in a LEFT JOIN instead of just JOIN here and there,
you can get different plans --- but the semantics are not necessarily
the same, either.

            regards, tom lane

Re: Slow views

From
Stephan Szabo
Date:
On Sat, 17 Jul 2004, David Newall wrote:

> On Sat, 2004-07-17 at 13:34, Tom Lane wrote:
> > I suspect the real issue is that the implied join order is not the same.
>
> With respect, the real issue is that using the view takes 100 times
> longer than not using it.
>
> > The view-based query is really
> >
> >     a LEFT JOIN (b LEFT JOIN c LEFT JOIN d LEFT JOIN e)
> >
> > while the allegedly equivalent hand expansion is
> >
> >     a LEFT JOIN b LEFT JOIN c LEFT JOIN d LEFT JOIN e
>
> Reversing the two terms, ie view right join table, gives the same (slow)
> result.  Adding a sub-select to the expansion, as Stephan said, gives

I think you're still fundamentally misunderstanding that the two queries
above are not always the same.  The reason you got better results from
your version is that a left join b was faster, however, if the join
conditions in your view were more complicated, your version would give the
wrong results. Unless/until we know for certain under what conditions we
can reorder the joins, we can't do so without causing bugs in other
queries.

Re: Slow views

From
David Newall
Date:
On Sat, 2004-07-17 at 13:34, Tom Lane wrote:
> I suspect the real issue is that the implied join order is not the same.

With respect, the real issue is that using the view takes 100 times
longer than not using it.

> The view-based query is really
>
>     a LEFT JOIN (b LEFT JOIN c LEFT JOIN d LEFT JOIN e)
>
> while the allegedly equivalent hand expansion is
>
>     a LEFT JOIN b LEFT JOIN c LEFT JOIN d LEFT JOIN e

Reversing the two terms, ie view right join table, gives the same (slow)
result.  Adding a sub-select to the expansion, as Stephan said, gives
takes the same time as using a view.  It's clearly an optimisation
issue:

select * from a join b optimises differently to select * from a join
(select * from b) as b

It's a pity because views express the code more clearly, but I can't use
them because they're just too slow.

Re: Slow views

From
David Newall
Date:
On Sat, 2004-07-17 at 14:30, Tom Lane wrote:
> David Newall <davidn-postgres@rebel.net.au> writes:
> > select * from a join b optimises differently to select * from a join
> > (select * from b) as b
>
> The above claim is provably false.  Now, if you throw in some more
> tables and sprinkle in a LEFT JOIN instead of just JOIN here and there,
> you can get different plans --- but the semantics are not necessarily
> the same, either.

You're right.  It's select * from a left join b left join c that
optimises differently to select * from a left join (select * from b left
join c).  I was clumsy with my language but I think my meaning was clear
within the context of this thread.  Anyway, back to my point: I'd like
to use a view but I can't because it's too slow, and that's a pity.