Re: Query Optimizer Failure / Possible Bug - Mailing list pgsql-performance

From Hannes Dorbath
Subject Re: Query Optimizer Failure / Possible Bug
Date
Msg-id 4260e649$0$24290$8fe63b2a@news.disputo.net
Whole thread Raw
In response to Re: Query Optimizer Failure / Possible Bug  (Hannes Dorbath <light@theendofthetunnel.de>)
List pgsql-performance
Some people on the #postgresql irc channel pointed out that it's a known
issue.

http://www.qaix.com/postgresql-database-development/246-557-select-based-on-function-result-read.shtml

A more simple testcase is below. Adding OFFSET 0 to the inner query does
indeed fix it in my case.


SELECT
   tmp.user_id AS foo,
   tmp.user_id AS bar,
   tmp.user_id AS baz
FROM
   (
     SELECT
       u.user_id
     FROM
       users u
   ) AS tmp;



Seq Scan on users  (cost=0.00..1.53 rows=53 width=4) (actual
time=0.230..0.233 rows=1 loops=1)
Total runtime: 0.272 ms


---------------------------


SELECT
   tmp.user_id AS foo,
   tmp.user_id AS bar,
   tmp.user_id AS baz
FROM
   (
     SELECT
       (SELECT 1) AS user_id
     FROM
       users u
   ) AS tmp;



   Seq Scan on users u  (cost=0.03..1.56 rows=53 width=0) (actual
time=0.216..0.219 rows=1 loops=1)
   InitPlan
     ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual
time=0.004..0.006 rows=1 loops=1)
     ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual
time=0.002..0.004 rows=1 loops=1)
     ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual
time=0.002..0.003 rows=1 loops=1)
Total runtime: 0.270 ms


---------------------------


SELECT
   tmp.user_id AS foo,
   tmp.user_id AS bar,
   tmp.user_id AS baz
FROM
   (
     SELECT
       (SELECT 1) AS user_id
     FROM
       users u
     OFFSET 0
   ) AS tmp;


Subquery Scan tmp  (cost=0.01..1.03 rows=1 width=4) (actual
time=0.032..0.042 rows=1 loops=1)
   ->  Limit  (cost=0.01..1.02 rows=1 width=0) (actual time=0.026..0.033
rows=1 loops=1)
         InitPlan
           ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual
time=0.003..0.004 rows=1 loops=1)
         ->  Seq Scan on users u  (cost=0.00..1.01 rows=1 width=0)
(actual time=0.022..0.027 rows=1 loops=1)
Total runtime: 0.090 ms




On 04.04.2005 17:18, Hannes Dorbath wrote:
> Mhh. I have no clue about the internals of PostgreSQL and query planing,
> but to me as user this should really be a thing the optimizer has to
> work out..
>
>
> On 03.04.2005 10:01, PFC wrote:
>
>>
>>     Noticed this problem,too.
>>     You can always make the calculation you want done once inside a
>> set  returning function so it'll behave like a table, but that's ugly.
>>
>> On Mon, 28 Mar 2005 16:14:44 +0200, Hannes Dorbath
>> <light@theendofthetunnel.de> wrote:
>>
>>> hm, a few days and not a single reply :|
>>>
>>> any more information needed? test data? simplified test case? anything?
>>>
>>>
>>> thanks
>>>
>>>
>>> Hannes Dorbath wrote:
>>>
>>>> The query and the corresponding EXPLAIN is at
>>>>  http://hannes.imos.net/query.txt
>>>>  I'd like to use the column q.replaced_serials for multiple
>>>> calculations
>>>> in the SELECT clause, but every time it is referenced there in some way
>>>> the whole query in the FROM clause returning q is executed again.
>>>>  This doesn't make sense to me at all and eats performance.
>>>>  If this wasn't clear enough, for every
>>>>  q.replaced_serials <insert_random_calculation> AS some_column
>>>>  in the SELECT clause there is new block of
>>>>  ---------------------------------------------------------------
>>>> ->  Aggregate  (cost=884.23..884.23 rows=1 width=0)
>>>>       ->  Nested Loop  (cost=0.00..884.23 rows=1 width=0)
>>>>             ->  Index Scan using ix_rma_ticket_serials_replace on
>>>>                         rma_ticket_serials rts  (cost=0.00..122.35
>>>>                         rows=190 width=4)
>>>>                   Index Cond: ("replace" = false)
>>>>             ->  Index Scan using pk_serials on serials s
>>>>                           (cost=0.00..3.51 rows=1 width=4)
>>>>                   Index Cond: (s.serial_id = "outer".serial_id)
>>>>                   Filter: ((article_no = $0) AND (delivery_id = $1))
>>>> ---------------------------------------------------------------
>>>>  in the EXPLAIN result.
>>>>  For those who wonder why I do this FROM (SELECT...). I was
>>>> searching  for
>>>> a way to use the result of an subselect for multiple calculations in
>>>> the
>>>> SELECT clause and return that calculation results as individual
>>>> columns.
>>>>  I tested a bit further and found out that PG behaves the same in
>>>> case q
>>>> is a view. This makes me wonder how efficient the optimizer can work
>>>> with views - or even worse - nested views.
>>>>  Tested and reproduced on PG 7.4.1 linux and 8.0.0 win32.
>>>>   Thanks in advance,
>>>> Hannes Dorbath
>>>
>>>
>>>
>>
>>
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 5: Have you checked our extensive FAQ?
>>
>>               http://www.postgresql.org/docs/faq
>>

pgsql-performance by date:

Previous
From: Kevin Brown
Date:
Subject: Re: How to improve db performance with $7K?
Next
From: Enrico Weigelt
Date:
Subject: Re: immutable functions vs. join for lookups ?