Re: PG 7.2b4 bug? - Mailing list pgsql-hackers

From Tom Lane
Subject Re: PG 7.2b4 bug?
Date
Msg-id 6761.1008621321@sss.pgh.pa.us
Whole thread Raw
In response to PG 7.2b4 bug?  (Don Baccus <dhogaza@pacifier.com>)
List pgsql-hackers
Don Baccus <dhogaza@pacifier.com> writes:
> Apparently there's been a change in the way views are handled within 
> PostreSQL.  The following program works fine in earlier versions.

AFAICT, it was just pure, unadulterated luck that it "works" in prior
versions.

In 7.1 I get:

regression=# select test_seq.nextval from multiple_rows;
NOTICE:  Adding missing FROM-clause entry for table "test_seq"nextval
---------      3      4
(2 rows)

regression=# explain select test_seq.nextval from multiple_rows;
NOTICE:  Adding missing FROM-clause entry for table "test_seq"
NOTICE:  QUERY PLAN:

Nested Loop  (cost=0.00..30.00 rows=1000 width=4) ->  Seq Scan on multiple_rows  (cost=0.00..20.00 rows=1000 width=0)
-> Subquery Scan test_seq  (cost=0.00..0.00 rows=0 width=0)       ->  Result  (cost=0.00..0.00 rows=0 width=0)
 

EXPLAIN

In 7.2 I get:

regression=# select test_seq.nextval from multiple_rows;
NOTICE:  Adding missing FROM-clause entry for table "test_seq"nextval
---------      4      4
(2 rows)

regression=# explain select test_seq.nextval from multiple_rows;
NOTICE:  Adding missing FROM-clause entry for table "test_seq"
NOTICE:  QUERY PLAN:

Nested Loop  (cost=0.00..30.01 rows=1000 width=8) ->  Subquery Scan test_seq  (cost=0.00..0.01 rows=1 width=0)       ->
Result  (cost=0.00..0.01 rows=1 width=0) ->  Seq Scan on multiple_rows  (cost=0.00..20.00 rows=1000 width=0)
 

EXPLAIN

The reason it "works" in 7.1 is that the view is the inside of the
nested loop, and so is re-evaluated for each tuple from the outer query.
(The Result node is where the nextval call is actually being evaluated.)
In 7.2 the view has been placed on the outside of the nested loop, so
it's only evaluated once.  The reason for the change is that the 7.2
planner makes the (much more realistic) assumption that evaluating the
Result node isn't free, and so it considers that evaluating the view
multiple times is more expensive than doing it only once.  This can be
demonstrated to be the cause by setting the Result cost to zero; then
the behavior matches 7.1:

regression=# show cpu_tuple_cost ;
NOTICE:  cpu_tuple_cost is 0.01
SHOW VARIABLE
regression=# set cpu_tuple_cost to 0;
SET VARIABLE
regression=# explain select test_seq.nextval from multiple_rows;
NOTICE:  Adding missing FROM-clause entry for table "test_seq"
NOTICE:  QUERY PLAN:

Nested Loop  (cost=0.00..10.00 rows=1000 width=8) ->  Seq Scan on multiple_rows  (cost=0.00..10.00 rows=1000 width=0)
-> Subquery Scan test_seq  (cost=0.00..0.00 rows=1 width=0)       ->  Result  (cost=0.00..0.00 rows=1 width=0)
 

EXPLAIN
regression=# select test_seq.nextval from multiple_rows;
NOTICE:  Adding missing FROM-clause entry for table "test_seq"nextval
---------      5      6
(2 rows)

However, it's pure luck that you get the nested loop expressed this way
and not the other way when the costs come out the same.  I'm surprised
that you consistently got the behavior you wanted in queries more
complex than this test case.

I'd have to say that I consider the code as given to be broken; it's not
a bug for the planner to rearrange this query in any way it sees fit.

It would be nice to accept the Oracle syntax for nextval, but I'm
afraid this hack doesn't get the job done :-(
        regards, tom lane


pgsql-hackers by date:

Previous
From: Don Baccus
Date:
Subject: recursive SQL functions
Next
From: Stephan Szabo
Date:
Subject: Re: PG 7.2b4 bug?