Thread: How to observe plan_cache_mode transition from custom to generic plan?

How to observe plan_cache_mode transition from custom to generic plan?

From
Richard Michael
Date:
TL;DR --

1/ My basic prepared statement doesn't seem to start out with a custom plan
(i.e., no parameter symbols in query text).

2/ EXPLAIN EXECUTE of prepared statement doesn't show query text, as the
documentation seems to imply.  Should it?

3/ How can I observe the effect of plan_cache_mode?
    e.g., Possible plan change from custom to generic

Apologies if this belongs on -novice.


Details --

I am trying to understand `plan_cache_mode`, related to a report of a
performance problem (resolved by changing `auto` to
`force_custom_plan`).  I do not have access to the server or data in the
report, so I am trying to reproduce a similar situation myself.  I am
using PostgreSQL 12.

My first step is to observe the change in the plan, from custom to
generic, after the first five queries.  I have unsuccessfully tried to
do this with both `EXPLAIN` and the `auto_explain` extension.

I would appreciate help trying to understand what's happening and what
I've misunderstood.

(After I see the plan changing, I'll use more complicated data and
queries to investigate different performance scenarios. Eventually, I'm
interested in queries using the extended protocol. I'm unsure if parse/bind/execute
will go through the same server code path as `PREPARE ... ; EXECUTE ..`.)

The `PREPARE` documentation
(https://www.postgresql.org/docs/12/sql-prepare.html) indicates:

 1/ Regarding `plan_cache_mode`, "... the first five executions are
 done with custom plans ..."

 2/ Regarding `EXPLAIN EXECUTE ...`, "To examine the query plan
 PostgreSQL is using for a prepared statement, use EXPLAIN (...).  If a
 generic plan is in use, it will contain parameter symbols $n, while a
 custom plan will have the supplied parameter values substituted into
 it."


Using psql, I tried preparing and explaining a very basic query: `SELECT
$1 AS data`.

  Note: Wireshark revealed psql uses simple queries (`PREPARE ...`,
  `EXPLAIN EXECUTE ...` are `Q` messages), not the extended protocol
  (i.e., parse, bind, execute).  I mention this because previous list
  posts mention some libpq prepared statement functions do not result in
  the planning heuristic used by `plan_cache_mode`.


Based on the documentation, I expected the first planned query text to
be: `SELECT 10 AS data`, since it should be a custom plan with
substituted values.  However, the query text always contains a parameter
symbol: `SELECT $1 AS data`.


My questions:


1/ Slightly related, the `EXPLAIN EXECUTE(...)` output does not show the
query text, as the documentation seems to suggest it should (so one may
look for parameter symbols).  Why not?

(Although, none of the documented EXPLAIN options mentions query text
display.  So perhaps it never does?)

  ```
  ~# PREPARE one_param AS SELECT $1 AS data;
  ~# EXPLAIN EXECUTE one_param(10);
                    QUERY PLAN
    -------------------------------------------
     Result  (cost=0.00..0.01 rows=1 width=32)
    (1 row)
  ```

2/ The query text was logged by `auto_explain`.  However, it contains a
parameter symbol; why?  Also, why is the logged query a `PREPARE` statement
(as if a new prepared statement is being created), instead of only the
`SELECT ..` which was executed?

  ```
  LOG:  statement: EXPLAIN EXECUTE one_param(10);
  LOG:  duration: 0.000 ms  plan:
        Query Text: PREPARE one_param AS SELECT $1 AS data;
        Result  (cost=0.00..0.01 rows=1 width=32)
          Output: '10'::text
  ```


Let me know if I should post my postgresql.conf and `auto_explain` settings.

Thanks for any advice,
Richard

Re: How to observe plan_cache_mode transition from custom to generic plan?

From
Mladen Gogala
Date:
On 9/5/21 9:35 AM, Richard Michael wrote:
> TL;DR --
>
> 1/ My basic prepared statement doesn't seem to start out with a custom 
> plan
> (i.e., no parameter symbols in query text).
>
> 2/ EXPLAIN EXECUTE of prepared statement doesn't show query text, as the
> documentation seems to imply.  Should it?
>
> 3/ How can I observe the effect of plan_cache_mode?
>     e.g., Possible plan change from custom to generic
>
> Apologies if this belongs on -novice.
>
>
> Details --
>
> I am trying to understand `plan_cache_mode`, related to a report of a
> performance problem (resolved by changing `auto` to
> `force_custom_plan`).  I do not have access to the server or data in the
> report, so I am trying to reproduce a similar situation myself. I am
> using PostgreSQL 12.
>
> My first step is to observe the change in the plan, from custom to
> generic, after the first five queries.  I have unsuccessfully tried to
> do this with both `EXPLAIN` and the `auto_explain` extension.
>
> I would appreciate help trying to understand what's happening and what
> I've misunderstood.
>
> (After I see the plan changing, I'll use more complicated data and
> queries to investigate different performance scenarios. Eventually, I'm
> interested in queries using the extended protocol. I'm unsure if 
> parse/bind/execute
> will go through the same server code path as `PREPARE ... ; EXECUTE ..`.)
>
> The `PREPARE` documentation
> (https://www.postgresql.org/docs/12/sql-prepare.html 
> <https://www.postgresql.org/docs/12/sql-prepare.html>) indicates:
>
>  1/ Regarding `plan_cache_mode`, "... the first five executions are
>  done with custom plans ..."
>
>  2/ Regarding `EXPLAIN EXECUTE ...`, "To examine the query plan
>  PostgreSQL is using for a prepared statement, use EXPLAIN (...).  If a
>  generic plan is in use, it will contain parameter symbols $n, while a
>  custom plan will have the supplied parameter values substituted into
>  it."
>
>
> Using psql, I tried preparing and explaining a very basic query: `SELECT
> $1 AS data`.
>
>   Note: Wireshark revealed psql uses simple queries (`PREPARE ...`,
>   `EXPLAIN EXECUTE ...` are `Q` messages), not the extended protocol
>   (i.e., parse, bind, execute).  I mention this because previous list
>   posts mention some libpq prepared statement functions do not result in
>   the planning heuristic used by `plan_cache_mode`.
>
>
> Based on the documentation, I expected the first planned query text to
> be: `SELECT 10 AS data`, since it should be a custom plan with
> substituted values.  However, the query text always contains a parameter
> symbol: `SELECT $1 AS data`.
>
>
> My questions:
>
>
> 1/ Slightly related, the `EXPLAIN EXECUTE(...)` output does not show the
> query text, as the documentation seems to suggest it should (so one may
> look for parameter symbols).  Why not?
>
> (Although, none of the documented EXPLAIN options mentions query text
> display.  So perhaps it never does?)
>
>   ```
>   ~# PREPARE one_param AS SELECT $1 AS data;
>   ~# EXPLAIN EXECUTE one_param(10);
>                     QUERY PLAN
>     -------------------------------------------
>      Result  (cost=0.00..0.01 rows=1 width=32)
>     (1 row)
>   ```
>
> 2/ The query text was logged by `auto_explain`.  However, it contains a
> parameter symbol; why?  Also, why is the logged query a `PREPARE` 
> statement
> (as if a new prepared statement is being created), instead of only the
> `SELECT ..` which was executed?
>
>   ```
>   LOG:  statement: EXPLAIN EXECUTE one_param(10);
>   LOG:  duration: 0.000 ms  plan:
>         Query Text: PREPARE one_param AS SELECT $1 AS data;
>         Result  (cost=0.00..0.01 rows=1 width=32)
>           Output: '10'::text
>   ```
>
>
> Let me know if I should post my postgresql.conf and `auto_explain` 
> settings.
>
> Thanks for any advice,
> Richard


Well, some other databases that shall remain unnamed have a thing called 
"bind variable peeking". Essentially, the database instance uses the 
first set of bind variables to create the execution plan. That is far 
from optimal because the first set of the bind variables encountered by 
the instance may not be representative of  what the vast majority of the 
work is comprised of. That particular database also has tricks like 
dynamic statistics sampling, cardinality feedback and adaptive plans (if 
the database finds a cheaper plan, it can substitute it on the go). All 
of that mess results in high degree of unpredictability. With that 
particular database, plan stability is like El Dorado: everybody is 
looking for it but nobody can find it. There are outlines, baselines and 
SQL patches, none of which is quite satisfactory.

Postgres has a different method: it executes the same statement with 5 
different sets of bind variables and if it finds a plan that is cheaper 
than the generic plan, it caches it and uses it in the future. Caching 
is pertinent for one session only. There is no cursor sharing between 
sessions. If it doesn't find such a plan, it uses a generic plan with 
hard coded values instead of the bind values and table statistics. 
People switching from Oracle, me being one of those, frequently make 
mistake of using bind variables in Postgres. Sometimes, there is no 
choice, for instance if your app uses an application server and an ORM. 
Postgres doesn't have the memory structure called "shared pool" and 
cursors and compiled procedures are not cached. That means that by using 
Postgres, you are deprived of all the fun managing the cursor caching, 
shared pool, database statistics and latches (don't ask, please). There 
is a small consolation that you get to keep a ton of money in your 
pocket. You also can have a stable plan. There are no adaptive plans or 
cursor sharing.

Frank Pachot, an Oracle Ace, has an excellent article on the subject:

https://franckpachot.medium.com/postgresql-bind-variable-peeking-fb4be4942252

-- 
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com




Re: How to observe plan_cache_mode transition from custom to generic plan?

From
"David G. Johnston"
Date:
On Sunday, September 5, 2021, Richard Michael <rmichael@edgeofthenet.org> wrote:

Based on the documentation, I expected the first planned query text to
be: `SELECT 10 AS data`, since it should be a custom plan with
substituted values.  However, the query text always contains a parameter
symbol: `SELECT $1 AS data`.


A query plan is not the same as the query text.  While the executed plan can be generic or custom the query text is constant.

If you want to see the difference between a generic and a custom plan you need to comprise a query that would actually have different custom and generic plans.  Queries that don’t involve tables, indexes, or joins don’t have any choices to make with respect to how they are executed.

David J.

Re: How to observe plan_cache_mode transition from custom to generic plan?

From
Mladen Gogala
Date:
On 9/6/21 2:26 AM, Laurenz Albe wrote:
> "Bind variables" just being an Oraclism for parameters, it is*not*  a
> mistake to use them in PostgreSQL.

Actually, it is a mistake because they don't give you any performance 
benefit and can potentially worsen the performance. There is no cursor 
sharing and generic plans can be much worse than "custom" plans, 
generated with the actual values. The only reason for using bind 
variables/parameters is to protect yourself from SQL injection. Creating 
SQL dynamically from input is the recipe for the "little Bobby Tables" 
situation: https://xkcd.com/327/


-- 
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com




Re: How to observe plan_cache_mode transition from custom to generic plan?

From
Richard Michael
Date:
Hi Laurenz,

Thank you for the quick response and clear demonstration.  I have a few comments, inline below.

On Mon, Sep 6, 2021 at 8:26 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Sun, 2021-09-05 at 14:20 -0400, Mladen Gogala wrote:
> On 9/5/21 9:35 AM, Richard Michael wrote:
> > TL;DR --
> >
> > 1/ My basic prepared statement doesn't seem to start out with a custom
> > plan
> > (i.e., no parameter symbols in query text).

That is notmal for the first five executions.  PostgreSQL will consider a
generic plan only from the sixth execution on.

Yes, I understood from the documentation that I should expect the sixth query to be generic.  My comment was perhaps awkwardly phrased in the negative--  I wondered why I was *not* seeing a custom query as the first planned (indeed, first five), that is: one with substituted literals, not placeholders.

I misunderstood that parameters only appear (either subtituted or as placeholders, $1, etc.) in portions of the query *plan* output (for example, "Filter:" and "Output:" lines).  I overlooked that even my basic query had a plan containing "Output: $1" on the sixth EXPLAIN EXECUTE.

Briefly:

PREPARE basic(int) AS SELECT $1 AS number;
EXPLAIN EXECUTE basic(10);
... Output: 10

-- Repeat 4 more times, then:

EXPLAIN EXECUTE basic(10);
... Output: $1


> > 2/ EXPLAIN EXECUTE of prepared statement doesn't show query text, as the
> > documentation seems to imply.  Should it?

Yes, and it does for me - see below.

I do not see the query text in your EXPLAIN EXECUTE output.  Perhaps I am misusing terminology?

By "query text", I mean-- I do not see: "SELECT * FROM jobs WHERE job_id = 1" (in the case of a custom plan), or "... job_id = $1" (in the case of a generic query).

(I do, of course, see the substituted parameter value (1, 2, 3, etc. for $1 as job_id) on various lines, and thanks again for the quick demo.)


> > 2/ The query text was logged by `auto_explain`.  However, it contains a
> > parameter symbol; why?  Also, why is the logged query a `PREPARE`
> > statement
> > (as if a new prepared statement is being created), instead of only the
> > `SELECT ..` which was executed?

You saw the log entry for the PREPARE statement. Since your statement
contains $1 in its query text, you'll get that logged.

Look for the log entries for the EXECUTEs.

In my log (with auto_explain enabled), even for EXPLAIN EXECUTE, the plan shows the query text as: PREPARE ...

In my initial post, I was trying to describe that I expected "EXPLAIN EXECUTE foo('red')" to log, without "PREPARE ..." prefixed, e.g.,
    Query Text: SELECT * FROM data WHERE colour = $1


Here is my psql session, interspersed with the postgres.log content:

~# PREPARE foo(text) AS SELECT * FROM data WHERE colour = $1 LIMIT 1;
PREPARE

2021-09-09 16:29:51.235 CEST [3680] LOG:  statement: PREPARE foo(text) AS SELECT * FROM data WHERE colour = $1 LIMIT 1;
2021-09-09 16:29:51.236 CEST [3680] LOG:  duration: 0.840 ms

~# EXPLAIN EXECUTE foo('red');
                              QUERY PLAN
----------------------------------------------------------------------
 Limit  (cost=0.00..0.04 rows=1 width=46)
   ->  Seq Scan on data  (cost=0.00..109229.00 rows=2500000 width=46)
         Filter: (colour = 'red'::text)
(3 rows)


2021-09-09 16:31:17.782 CEST [3680] LOG:  statement: EXPLAIN EXECUTE foo('red');
2021-09-09 16:31:17.783 CEST [3680] LOG:  duration: 0.000 ms  plan:
        Query Text: PREPARE foo(text) AS SELECT * FROM data WHERE colour = $1 LIMIT 1;        <====  Why "PREPARE ...", not just "SELECT ..." ?
        Limit  (cost=0.00..0.04 rows=1 width=46)
          Output: id, md5sum, colour
          ->  Seq Scan on public.data  (cost=0.00..109229.00 rows=2490000 width=46)
                Output: id, md5sum, colour
                Filter: (data.colour = 'red'::text)
2021-09-09 16:31:17.783 CEST [3680] LOG:  duration: 1.390 ms


Thanks again Laurenz, now I know what to look for in the output and logs.

Regards,
Richard




By the way, here is one of the shortcomings of using prepared statements
in my opinion: if EXECUTE is slow, you will get the EXECUTE logged, but
not the statement text of the prepared statements.

> Well, some other databases that shall remain unnamed have a thing called
> "bind variable peeking". [more off-topic things skipped]

Most of your explanations about PostgreSQL are correct, but:

> Postgres has a different method: it executes the same statement with 5
> different sets of bind variables and if it finds a plan that is cheaper
> than the generic plan, it caches it and uses it in the future.

That is wrong: If the estimate for the generic plan is more expensive
than the plans for the first five executions, it will keep generating a
custom plan for subsequent executions of the prepared statement.

> People switching from Oracle, me being one of those, frequently make
> mistake of using bind variables in Postgres.

"Bind variables" just being an Oraclism for parameters, it is *not* a
mistake to use them in PostgreSQL.

Yours,
Laurenz Albe



Re: How to observe plan_cache_mode transition from custom to generic plan?

From
Richard Michael
Date:
Hi David,

On Sun, Sep 5, 2021 at 8:32 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Sunday, September 5, 2021, Richard Michael <rmichael@edgeofthenet.org> wrote:

Based on the documentation, I expected the first planned query text to
be: `SELECT 10 AS data`, since it should be a custom plan with
substituted values.  However, the query text always contains a parameter
symbol: `SELECT $1 AS data`.


A query plan is not the same as the query text.  While the executed plan can be generic or custom the query text is constant.

Thanks for drawing my attention to the difference between the query text and query plan.


If you want to see the difference between a generic and a custom plan you need to comprise a query that would actually have different custom and generic plans.  Queries that don’t involve tables, indexes, or joins don’t have any choices to make with respect to how they are executed.

After reading Laurenz's reply, I experimented again with `PREPARE basic(int) AS SELECT $1 AS number;" and the query plan logged in the log file (by auto_explain) does indeed show "Output: $1" on the sixth EXPLAIN EXECUTE!  Sorry I missed this earlier.  (The EXPLAIN EXECUTE output itself does not contain an "Output:" line, so I didn't notice this lack of parameter substitution in psql.)

I appreciate your point about query plans which may not involve choices; thank you for mentioning it.

Regards,
Richard


David J.

Re: How to observe plan_cache_mode transition from custom to generic plan?

From
Tom Lane
Date:
Richard Michael <rmichael@edgeofthenet.org> writes:
> After reading Laurenz's reply, I experimented again with `PREPARE
> basic(int) AS SELECT $1 AS number;" and the query plan logged in the log
> file (by auto_explain) does indeed show "Output: $1" on the sixth EXPLAIN
> EXECUTE!  Sorry I missed this earlier.  (The EXPLAIN EXECUTE output itself
> does not contain an "Output:" line, so I didn't notice this lack of
> parameter substitution in psql.)

I think EXPLAIN VERBOSE would add that bit of detail.

            regards, tom lane



Re: How to observe plan_cache_mode transition from custom to generic plan?

From
Richard Michael
Date:
Hi Tom,

On Thu, Sep 9, 2021 at 6:02 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Michael <rmichael@edgeofthenet.org> writes:
> After reading Laurenz's reply, I experimented again with `PREPARE
> basic(int) AS SELECT $1 AS number;" and the query plan logged in the log
> file (by auto_explain) does indeed show "Output: $1" on the sixth EXPLAIN
> EXECUTE!  Sorry I missed this earlier.  (The EXPLAIN EXECUTE output itself
> does not contain an "Output:" line, so I didn't notice this lack of
> parameter substitution in psql.)

I think EXPLAIN VERBOSE would add that bit of detail.

Indeed it does!  I literally just now experimented with this and was returning to the ML reply to myself to avoid wasting more of anyone's time.

Would a tiny patch to the PREPARE docs be accepted?  I'd like to help clarify investigating `plan_cache_mode` for future readers.

Thanks for replying, regards,
Richard


                        regards, tom lane

Re: How to observe plan_cache_mode transition from custom to generic plan?

From
Tom Lane
Date:
Richard Michael <rmichael@edgeofthenet.org> writes:
> Would a tiny patch to the PREPARE docs be accepted?  I'd like to help
> clarify investigating `plan_cache_mode` for future readers.

What did you have in mind?

(I'm kind of allergic to documenting the plan-caching behavior in too
much detail, because I keep expecting us to change it.  On the other
hand, nobody's put forward any improvement proposals.)

            regards, tom lane



Re: How to observe plan_cache_mode transition from custom to generic plan?

From
Richard Michael
Date:

On Thu, Sep 9, 2021 at 7:39 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Michael <rmichael@edgeofthenet.org> writes:
> Would a tiny patch to the PREPARE docs be accepted?  I'd like to help
> clarify investigating `plan_cache_mode` for future readers.

What did you have in mind?

(I'm kind of allergic to documenting the plan-caching behavior in too
much detail, because I keep expecting us to change it.  On the other
hand, nobody's put forward any improvement proposals.)

For reference, currently in PREPARE docs:

"If a generic plan is in use, it will contain parameter symbols $n, while a custom plan will have the supplied parameter values substituted into it."


Here is a rough stab at a few additional sentences which hopefully will not induce anaphylaxis; my motivation follows.

"If a generic plan is in use, it will contain parameter symbols $n, while a custom plan will have the supplied parameter values substituted into plan nodes which use the parameter, such as filters or conditions.  If the parameters are used in the output, `EXPLAIN VERBOSE` can be helpful, as it will display query output.  Note the query text (displayed by `auto_explain`) is constant, and will not have parameters substituted, even when a generic plan is used."

(Perhaps mentioning auto_explain is overly specific. Although, no other command or tooling prints the query text.  This note could or should go in the auto_explain docs.)

As you noticed up-thread, I was testing with a very basic query: `SELECT $1 AS number`, for which EXPLAIN EXECUTE shows no parameter related values (it occurs only in the output, requiring VERBOSE).

Also, I was testing from several clients: psql, my test C program, and Elixir.  For simplicity (oops), I enabled `auto_explain`.  Auto-explain prints `plan: ... Query Text: SELECT $1 AS number`, and I see `ExplainQueryText()` was added to commands/explain.c years ago for this purpose.

In my situation, only the auto_explain output was displaying parameter symbols ($1), in the query text, and I allowed myself to be misled.  (In auto_explain output, the query text seems to be "part of the plan", and hence I thought it would have values substituted).

Re-reading the original sentence, I concede it is sufficient given what I now know (especially static, non-substituted query text).  It can definitely be argued that: (1) I don't know the planner or EXPLAIN or auto_explain well enough to be competent at this type of investigation, and (2) the PREPARE documentation did not mention `auto_explain`, so I should not have used it, and followed the documentation precisely.  (The static query text matter is a bit subtle, I think.)

However, I would also say the documentation could nudge the reader in a helpful direction, and my general position on documentation is that it's for newcomers not experts (who don't need any :).  I'd like to believe those additional sentences would have spared you all this thread.


Aside, might EXPLAIN VERBOSE be enhanced to also print Query Text, so that `auto_explain` really behaves like "automatic explain"?  I note only `auto_explain` uses the function added to explain.c.

Regards,
richard








                        regards, tom lane

Re: How to observe plan_cache_mode transition from custom to generic plan?

From
Richard Michael
Date:


On Fri, Sep 10, 2021 at 10:38 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Thu, 2021-09-09 at 16:51 +0200, Richard Michael wrote:
> > > 2/ EXPLAIN EXECUTE of prepared statement doesn't show query text, as the
> > > documentation seems to imply.  Should it?
> >
> > Yes, and it does for me - see below.
>
> I do not see the query text in your EXPLAIN EXECUTE output.  Perhaps I am misusing terminology?

My sloppiness.  You are right, it does not show the query text.  But no EXPLAIN
shows the query text, right?

That seems to be correct, yes.

Somewhat related, I noted in my reply to Tom that `auto_explain` does log the query text,
and ExplainQueryText() was added to commands/explain.c for just that purpose.  auto_explain is
the only code which uses that function.

Regards,
richard


Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com