Thread: Monitoring query plan cache
Hi,

We run a Postgres 9.1 system that handles a lot of identical queries
(around 500 per second). The queries originate from a Python WSGI script
running under Apache.

Performance is an issue, and we suspect query planning might be an
underlying cause. I've rewritten the scripts from ad-hoc SQL to Postgres
functions (CREATE FUNCTION) and we saw server load go down quite a bit.

Now we'd like to monitor the Postgres query parser, to confirm that it's
not analyzing the same SQL over and over, and to see if we missed other
things.

a) Is it possible that Postgres caches the query plans for psql
functions, but not for ad-hoc SQL?

b) Is there a way to see if a single query uses a cached execution plan?

c) Can you monitor the query parser as a whole, with stats like
parses/sec or cache hits/sec?

d) Is there a way to monitor how many reads are done against the
statistics table (pg_statistics)?

Thanks for any answers or pointers,

Andomar

P.S. I've asked this question on StackExchange
http://dba.stackexchange.com/questions/86620/monitoring-the-postgres-query-parser/86626#86626
but I'm looking for more specific information.
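For what it's worth, a minimal sketch of the kind of rewrite described
above, using psycopg2 as elsewhere in this thread. The table, column and
function names (table1, col1, get_col1) are hypothetical and the real
function bodies will of course differ:

    import psycopg2

    conn = psycopg2.connect("dbname=mydb")   # hypothetical connection string
    cur = conn.cursor()

    # Ad-hoc version: the SQL text is parsed and planned on every call.
    cur.execute("select col1 from table1 where id = %(id)s", {'id': 123})
    print(cur.fetchone()[0])

    # Function version: the same query wrapped in a plpgsql function.
    # Inside the function the statement's plan can be cached and reused
    # for the rest of the session.
    cur.execute("""
        create or replace function get_col1(p_id int) returns text
        language plpgsql as $$
        begin
            return (select col1 from table1 where id = p_id);
        end;
        $$
    """)
    cur.execute("select get_col1(%(id)s)", {'id': 123})
    print(cur.fetchone()[0])
    conn.commit()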
Andomar <andomar@aule.net> writes:
> We run a Postgres 9.1 system that handles a lot of identical queries
> (around 500 per second). The queries originate from a Python WSGI script
> running under Apache.

> Performance is an issue, and we suspect query planning might be an
> underlying cause. I've rewritten the scripts from ad-hoc SQL to Postgres
> functions (CREATE FUNCTION) and we saw server load go down quite a bit.

> Now we'd like to monitor the Postgres query parser, to confirm that it's
> not analyzing the same SQL over and over, and to see if we missed other
> things.

> a) Is it possible that Postgres caches the query plans for psql
> functions, but not for ad-hoc SQL?

plpgsql functions would cache query plans.  Ad-hoc SQL doesn't, unless
you explicitly make use of prepared queries.

> c) Can you monitor the query parser as a whole, with stats like
> parses/sec or cache hits/sec?

Possibly log_parser_stats/log_planner_stats/log_statement_stats would
help you.  They're pretty old-school though; you'd need to write some
tool that scans the postmaster log and accumulates the stats to get
anything very useful out of those features.  (It could be that
somebody's already done that, though --- take a look at things like
pgFouine.)

> d) Is there a way to monitor how many reads are done against the
> statistics table (pg_statistics)?

Up to a point.  The pg_stats functionality should work just the same
for pg_statistic as for any user table.  However, that will only count
actual reads, and there's a per-backend catalog cache that will buffer
rows fetched from pg_statistic.  So you can't really use pg_statistic
reads as a proxy for how often planning happened.

			regards, tom lane
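As a small illustration of both points (explicit prepared queries, and
the log_*_stats settings), here is a rough sketch assuming a psycopg2
cursor "cur" and the hypothetical table1 from the sketch above:

    # Ad-hoc SQL: parsed and planned again on every execute().
    cur.execute("select col1 from table1 where id = %(id)s", {'id': 123})

    # Explicit server-side prepared statement: on 9.1 it is planned once
    # at PREPARE time, and the plan is reused by each EXECUTE for the
    # life of the session.
    cur.execute("prepare get_col1_stmt(int) as "
                "select col1 from table1 where id = $1")
    cur.execute("execute get_col1_stmt(%(id)s)", {'id': 123})
    print(cur.fetchone()[0])

    # Per-statement planner timings go to the server log, not the client
    # (superuser-only setting):
    cur.execute("set log_planner_stats = on")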
Thanks for your reply, I have a follow-up question:

>> c) Can you monitor the query parser as a whole, with stats like
>> parses/sec or cache hits/sec?
>
> Possibly log_parser_stats/log_planner_stats/log_statement_stats would
> help you.  They're pretty old-school though; you'd need to write some
> tool that scans the postmaster log and accumulates the stats to get
> anything very useful out of those features.  (It could be that
> somebody's already done that, though --- take a look at things like
> pgFouine.)

Below is an example output from log_planner_stats:

LOG: PLANNER STATISTICS
DETAIL: ! system usage stats:
    ! 0.000132 elapsed 0.000000 user 0.000000 system sec
    ! [0.181972 user 0.052991 sys total]
    ! 0/0 [0/248] filesystem blocks in/out
    ! 0/0 [0/2705] page faults/reclaims, 0 [0] swaps
    ! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
    ! 0/0 [1/4249] voluntary/involuntary context switches

How can we tell from this whether the query planner used a cached plan?
Is the example above a hit or a miss?

>> a) Is it possible that Postgres caches the query plans for psql
>> functions, but not for ad-hoc SQL?
>
> plpgsql functions would cache query plans.  Ad-hoc SQL doesn't, unless
> you explicitly make use of prepared queries.

It is not always easy to tell the query type (function, prepared or
ad-hoc). We use Python in mod_wsgi with psycopg2. The code shows ad-hoc
SQL, but who knows what the many layers between Python and the database
do.

As suggested by Erwin Brandstetter on dba.stackexchange.com, I ran a
test with:

    cursor.execute("""
        load 'auto_explain';
        set auto_explain.log_min_duration = 0;
        set auto_explain.log_nested_statements = on;
        select col1 from table1 where id = %(id)s
        """, {'id': 123})
    print(cursor.fetchone()[0])

The logging doesn't look like a cached plan: you can see the 123 value
but not a parameter like $1. This suggests Postgres was previously
compiling around 200 queries a second on our production machine. Is
that even possible?

Cheers,
Andomar
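For comparison, roughly the same test run through an explicit prepared
statement (same hypothetical table1 as above); when the plan belongs to
a prepared statement, the auto_explain output normally shows the $1
parameter rather than the interpolated 123:

    cursor.execute("""
        load 'auto_explain';
        set auto_explain.log_min_duration = 0;
        set auto_explain.log_nested_statements = on;
        prepare get_col1_stmt(int) as select col1 from table1 where id = $1
        """)
    cursor.execute("execute get_col1_stmt(%(id)s)", {'id': 123})
    print(cursor.fetchone()[0])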
On December 21, 2014 04:08:43 PM Andomar wrote:
> It is not always easy to tell the query type (function, prepared or
> ad-hoc). We use Python in mod_wsgi with psycopg2. The code shows ad-hoc
> SQL, but who knows what the many layers between Python and the database
> do.

psycopg2 sends the SQL you feed it straight to the DB. If you don't feed
it a PREPARE statement [1] it'll be an ad-hoc query; the value
placeholders will be interpolated prior to statement submission by
psycopg2.

[1] http://www.postgresql.org/docs/9.2/interactive/sql-prepare.html
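A quick way to confirm this from the client side is psycopg2's
mogrify(), which performs the same interpolation execute() would and
returns the statement as it will be sent:

    sql = "select col1 from table1 where id = %(id)s"

    # The placeholder is already replaced by the literal value, so the
    # server sees plain ad-hoc SQL with no $1-style parameter.
    print(cursor.mogrify(sql, {'id': 123}))
    # -> select col1 from table1 where id = 123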
Andomar <andomar@aule.net> writes:
> Below is an example output from log_planner_stats:

> LOG: PLANNER STATISTICS
> DETAIL: ! system usage stats:
>     ! 0.000132 elapsed 0.000000 user 0.000000 system sec
>     ! [0.181972 user 0.052991 sys total]
>     ! 0/0 [0/248] filesystem blocks in/out
>     ! 0/0 [0/2705] page faults/reclaims, 0 [0] swaps
>     ! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
>     ! 0/0 [1/4249] voluntary/involuntary context switches

> How can we tell from this whether the query planner used a cached plan?

If you're seeing that output then planning happened.  The only way you
get a cached plan for a client-issued SQL statement is if the client
uses PREPARE/EXECUTE or the "extended query" protocol
(parse/bind/execute).  Neither of those cases would go through here.

> The logging doesn't look like a cached plan: you can see the 123 value
> but not a parameter like $1. This suggests Postgres was previously
> compiling around 200 queries a second on our production machine. Is
> that even possible?

Well, at 132 microseconds apiece, it does not seem from this example
that repeat planning is a huge problem for you ... of course, some of
your statements might take longer, but you've not demonstrated here
that you have anything to worry about.

			regards, tom lane
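Back-of-envelope arithmetic on that figure, using the roughly 500
queries per second mentioned in the first mail:

    # 132 microseconds of planning per statement, ~500 statements/second:
    planning_per_stmt = 132e-6    # seconds of planning per statement
    stmts_per_sec = 500
    print(planning_per_stmt * stmts_per_sec)   # 0.066 -> about 7% of one CPU core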
> psycopg2 sends the SQL you feed it straight to the DB. If you don't
> feed it a PREPARE statement [1] it'll be an ad-hoc query; the value
> placeholders will be interpolated prior to statement submission by
> psycopg2.

Thanks, that confirms what I saw during tests.

> If you're seeing that output then planning happened.  The only way you
> get a cached plan for a client-issued SQL statement is if the client
> uses PREPARE/EXECUTE or the "extended query" protocol
> (parse/bind/execute).  Neither of those cases would go through here.

So if there was a cached plan, there is no "LOG: PLANNER STATISTICS".
That is helpful, thanks.

> Well, at 132 microseconds apiece, it does not seem from this example
> that repeat planning is a huge problem for you ... of course, some of
> your statements might take longer, but you've not demonstrated here
> that you have anything to worry about.

Well, that was just an example query; checking a realistic query on a
test machine shows 15 ms spent on query analysis. For 200 queries/sec
that would keep around 3 CPUs busy, which is what we saw. I will try to
verify that using the log_planner_stats option.

Cheers,
Andomar
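In case it helps, a rough sketch of the kind of log-scanning tool Tom
mentioned earlier; it assumes log_planner_stats output in the plain-text
format shown above in this thread and simply totals the elapsed planner
time:

    import re
    import sys

    # Matches the "0.000132 elapsed" figure in a planner statistics detail line.
    elapsed_re = re.compile(r'([0-9.]+) elapsed')

    in_planner = False
    total = 0.0
    count = 0

    for line in open(sys.argv[1]):       # path to the postmaster log
        if 'PLANNER STATISTICS' in line:
            in_planner = True            # the next "elapsed" line belongs to the planner
        elif in_planner and 'elapsed' in line:
            m = elapsed_re.search(line)
            if m:
                total += float(m.group(1))
                count += 1
            in_planner = False

    print("%d plans, %.3f seconds total planning time" % (count, total))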