Re: Optimize common expressions in projection evaluation - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: Optimize common expressions in projection evaluation
Date
Msg-id CAFj8pRD=+Ou5-bzkySG=2tJHC+izy0JQBSS5TtrSn6ubm6y7ig@mail.gmail.com
Whole thread Raw
In response to Re: Optimize common expressions in projection evaluation  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Optimize common expressions in projection evaluation  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Optimize common expressions in projection evaluation  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-hackers


po 5. 12. 2022 v 5:28 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Peifeng Qiu <pgsql@qiupf.dev> writes:
>> the need for this code seems not that great.  But as to the code itself I'm unable to properly judge.

> A simplified version of my use case is like this:
> CREATE FOREIGN TABLE ft(rawdata json);
> INSERT INTO tbl SELECT (convert_func(rawdata)).* FROM ft;

It might be worth noting that the code as we got it from Berkeley
could do this scenario without multiple evaluations of convert_func().
Memory is foggy, but I believe it involved essentially a two-level
targetlist.  Unfortunately, the scheme was impossibly baroque and
buggy, so we eventually ripped it out altogether in favor of the
multiple-evaluation behavior you see today.  I think that commit
62e29fe2e might have been what ripped it out, but I'm not quite
sure.  It's about the right time-frame, anyway.

I mention this because trying to reverse-engineer this situation
in execExpr seems seriously ugly and inefficient, even assuming
you can make it non-buggy.  The right solution has to involve never
expanding foo().* into duplicate function calls in the first place,
which is the way it used to be.  Maybe if you dug around in those
twenty-year-old changes you could get some inspiration.

I tend to agree with David that LATERAL offers a good-enough
solution in most cases ... but it is annoying that we accept
this syntax and then pessimize it.

I agree, so there is a perfect solution like don't use .*, but on second hand any supported syntax should be optimized well or we should raise some warning about negative performance impact.

Today there are a lot of baroque technologies in the stack so it is hard to remember all good practices and it is hard for newbies to take this knowledge. We should reduce possible performance traps when it is possible.

Regards

Pavel


 

                        regards, tom lane


pgsql-hackers by date:

Previous
From: "houzj.fnst@fujitsu.com"
Date:
Subject: RE: Perform streaming logical transactions by background workers and parallel apply
Next
From: David Rowley
Date:
Subject: Re: Bug in row_number() optimization