Re: Cached/global query plans, autopreparation - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Cached/global query plans, autopreparation
Msg-id 20180302202909.GB27946@momjian.us
In response to Re: Cached/global query plans, autopreparation  (Shay Rojansky <roji@roji.org>)
Responses Re: Cached/global query plans, autopreparation  (Andres Freund <andres@anarazel.de>)
Re: Cached/global query plans, autopreparation  (konstantin knizhnik <k.knizhnik@postgrespro.ru>)
List pgsql-hackers
On Thu, Feb 15, 2018 at 03:00:17PM +0100, Shay Rojansky wrote:
> Just wanted to say that I've seen more than 10% improvement in some real-world
> application when preparation was done properly. Also, I'm assuming that
> implementing this wouldn't involve "rewriting substantial part of Postgres
> core", and that even 10% is quite a big gain, especially if it's a transparent/
> free one as far as the user is concerned (no application changes).

I would like to step back on this issue.  Ideally, every query would get
re-optimized because we can only be sure the plan is optimal when we use
the supplied constants to generate the plan.  But, of course, parsing and
planning take time, so ideally there would be a way to avoid them.  The
question is always how much time will be saved by avoiding
parsing/planning, and what risk there is of suboptimal plans.

Postgres uses a conservative method for reusing plans with previous
constants, as described in the PREPARE manual page:

    https://www.postgresql.org/docs/10/static/sql-prepare.html
    Prepared statements can use generic plans rather than re-planning with
    each set of supplied EXECUTE values. This occurs immediately for prepared
    statements with no parameters; otherwise it occurs only after five or more
    executions produce plans whose estimated cost average (including planning
    overhead) is more expensive than the generic plan cost estimate. Once
    a generic plan is chosen, it is used for the remaining lifetime of the
    prepared statement. Using EXECUTE values which are rare in columns with
    many duplicates can generate custom plans that are so much cheaper than
    the generic plan, even after adding planning overhead, that the generic
    plan might never be used.
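
To make the rule quoted above concrete, here is a minimal sketch of it
as a toy model in Python.  This is not the Postgres source; the class
name, field names, and cost numbers are all hypothetical, and the model
assumes only what the manual page states: no-parameter statements go
generic immediately, the comparison starts once five custom plans exist,
and the choice of a generic plan is permanent.

```python
from dataclasses import dataclass, field
from typing import List

# The documented "five or more" threshold (constant name is an assumption).
CUSTOM_PLAN_TRIES = 5


@dataclass
class PreparedStatementModel:
    """Toy model of one prepared statement's plan-choice state (hypothetical)."""

    has_parameters: bool
    generic_cost: float  # estimated cost of the generic plan
    custom_costs: List[float] = field(default_factory=list)
    use_generic: bool = False

    def choose_plan(self, custom_cost_with_planning: float) -> str:
        """Return "generic" or "custom" for the next EXECUTE.

        custom_cost_with_planning is the estimated cost of a custom plan
        for this execution's values, planning overhead included.
        """
        # No parameters: nothing to specialize on, so go generic at once.
        # Once generic has been chosen, it stays chosen.
        if not self.has_parameters or self.use_generic:
            self.use_generic = True
            return "generic"

        # After five or more custom plans, compare their average cost
        # against the generic plan's estimated cost.
        if len(self.custom_costs) >= CUSTOM_PLAN_TRIES:
            average = sum(self.custom_costs) / len(self.custom_costs)
            if average > self.generic_cost:
                self.use_generic = True
                return "generic"

        # Otherwise keep planning with the supplied values.
        self.custom_costs.append(custom_cost_with_planning)
        return "custom"
```

In this model, a statement whose custom plans are consistently costlier
than the generic plan switches to generic on the sixth execution, while
one whose custom plans are much cheaper never switches, which is the
"might never be used" case in the last sentence of the quoted text.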

While I have heard people complain about how other databases cache
prepared plans, I have heard few complaints about the Postgres approach,
and I haven't even heard of people asking to control the documented "five
or more" behavior.

I also know that other database products have more sophisticated prepare
usage, but they might have higher parse/plan overhead, or they might be
more flexible in handling specialized workloads, which Postgres might
not want to handle, given the costs/complexity/overhead.

So, the directions for improvement are:

1  Improve the existing "five or more" behavior
2  Automatically prepare queries that are not sent as prepared queries
3  Share plans among sessions

While #1 would be nice, #2 increases the number of applications that can
silently benefit from prepared queries, and #3 increases the number of
cases in which query plans can be reused.  The issue with #3 is that the
constants used are no longer local to the session (which is the same
issue with connection poolers reusing prepared plans).  When different
sessions with potentially more varied constants reuse plans, the
probability of suboptimal plans increases.

I think the fact that pgbench shows a 2x improvement for prepared
statements, while real-world reports show a 10% improvement, means we
need to have a better understanding of exactly what workloads can benefit
from this, and a comprehensive analysis of all three areas of
improvement.

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +

