Re: New Feature Request - Mailing list pgsql-hackers
From | Bert Scalzo |
---|---|
Subject | Re: New Feature Request |
Date | |
Msg-id | CAFernC5F-HEDfgE9c3Z2AWfcQciYYABuWJnBGyQa_vgH2bZXZA@mail.gmail.com Whole thread Raw |
In response to | Re: New Feature Request (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>) |
Responses |
Re: New Feature Request
|
List | pgsql-hackers |
I greatly appreciate all the replies. Thanks. I also fully understand and appreciate all the points made - especially that this idea may not have general value or acceptance as worthwhile. No argument from me. Let me explain why I am looking to do this to see if that changes any opinions. I have written a product called QIKR for MySQL that leverages the MySQL query rewrite feature and places a knowledge expert of SQL rewrite rules as a preprocessor to the MySQL optimizer. I have defined an extensive set of rules based on my 30 years of doing code reviews for app developers who write terrible SQL. Right now QIKR does 100% syntactic analysis (hoping to do semantic analysis in a later version). For MySQL (which has a less mature and less robust optimizer) the performance gains are huge - in excess of 10X. So far QIKR shows about a 2.5X improvement over the PostgreSQL optimizer when fed bad SQL. I am not saying the PotsgrSQL optimizer does a poor job, but rather that QIKR was designed for "garbage in, not garbage out" - so QIKR fixes all the stupid mistakes that people make which can confuse or even cripple an optimizer. Hence why I am looking for this hook - and have come to the experts for help. I have two very large PostgreSQL partner organizations who have asked me to make QIKR work for PostgreSQL as it does for MySQL. Again, I am willing to pay for this hook since it's a special request for a special purpose and not generally worthwhile in many people's opinions - which I cannot argue with.
On Tue, May 26, 2020 at 2:17 AM Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
On 26.05.2020 04:47, Tomas Vondra wrote:
> On Mon, May 25, 2020 at 09:21:26PM -0400, Bruce Momjian wrote:
>> On Mon, May 25, 2020 at 07:53:40PM -0500, Bert Scalzo wrote:
>>> I am reposting this from a few months back (see below). I am not
>>> trying to be a
>>> pest, just very motivated. I really think this feature has merit,
>>> and if not
>>> generally worthwhile, I'd be willing to pay someone to code it for
>>> me as I
>>> don't have strong enough C skills to modify the PostgreSQL code
>>> myself. So
>>> anyone who might have such skills that would be interested, please
>>> contact me:
>>> bertscalzo2@gmail.com.
>>
>> I think your best bet is to try getting someone to write a hook
>> that will do the replacement so that you don't need to modify too much
>> of the Postgres core code. You will need to have the hook updated for
>> new versions of Postgres, which adds to the complexity.
>>
>
> I don't think we have a hook to tweak the incoming SQL, though. We only
> have post_parse_analyze_hook, i.e. post-parse, at which point we can't
> just rewrite the SQL directly. So I guess we'd need new hook.
VOPS extension performs query substitution (replace query to the
original table with query to projection) using post_parse_analysis_hook
and SPI. So I do not understand why some extra hook is needed.
>
> I do however wonder if an earlier hook is a good idea at all - matching
> the SQL directly seems like a rather naive approach that'll break easily
> due to formatting, upper/lower-case, subqueries, and many other things.
> From this standpoint it seems actually better to inspect and tweak the
> parse-analyze result. Not sure how to define the rules easily, though.
>
In some cases we need to know exact parameter value (as in case
SUBSTRING(column,1,3) = 'ABC').
Sometime concrete value of parameter is not important...
Also it is not clear where such pattern-matching transformation should
be used only for the whole query or for any its subtrees?
> As for the complexity, I think hooks are fairly low-maintenance in
> practice, we tend not to modify them very often, and when we do it's
> usually just adding an argument etc.
I am not sure if the proposed approach can really be useful in many cases.
Bad queries are used to be generated by various ORM tools.
But them rarely generate exactly the same query. So defining matching
rules for the whole query tree will rarely work.
It seems to be more useful to have extensible SQL optimizer, which
allows to add user defined rules (may as transformation patterns).
This is how it is done in GCC code optimizer.
Definitely writing such rules is very non-trivial task.
Very few developers will be able to add their own meaningful rules.
But in any case it significantly simplify improvement of optimizer,
although most of problems with choosing optimal plan are
caused by wrong statistic and rue-based optimization can not help here.
pgsql-hackers by date: