Plan invalidation vs. unnamed prepared statements - Mailing list pgsql-jdbc

From Tom Lane
Subject Plan invalidation vs. unnamed prepared statements
Date
Msg-id 6080.1173201777@sss.pgh.pa.us
Whole thread Raw
Responses Re: [HACKERS] Plan invalidation vs. unnamed prepared statements
Re: [HACKERS] Plan invalidation vs. unnamed prepared statements
List pgsql-jdbc
[ cc'd to pgsql-jdbc which seems the group most likely to be affected
  by any protocol change ]

So I've been working on a plan cache module per my earlier proposal,
and I've run up against a problem with getting exec_parse_message
to use it.  The problem is that the current rather hackish handling
of unnamed prepared statements doesn't fit in.  Per the documentation,
unnamed statements are supposed to be "optimized for the case of
executing a query only once and then discarding it".  In the current
code this largely just means that we avoid copying the parse/plan trees
into the normal PreparedStatement cache, preferring to let them sit
in the context where they were generated (which means that any detritus
generated by the parser/planner can't be recovered until we discard the
unnamed statement, but that seems a good tradeoff in this situation).

To use the plan cache for unnamed statements, there's going to have to
be more overhead (more tree-copying) in this code path; moreover having
the unnamed statement's plan in the cache will result in distributed
overhead for checking it to see if it's still valid.  This overhead is
largely going to be wasted if the statement is always discarded
immediately after use.

I can think of several options for dealing with this:

A. Just accept the extra overhead, thereby preserving the current
behavior of unnamed statements, and gaining the benefit that plan
invalidation will work correctly in the few cases where an unnamed
statement's plan lasts long enough to need replanning.

B. Don't store the unnamed statement in the plan cache.  To make sure
it's not used anymore when the plan might be stale, forcibly discard
the unnamed statement after execution.  This would get rid of a lot
of overhead but would mean a significant change in the protocol-level
behavior.  It's hard to guess how many clients might be broken by it
--- conceivably not any, but that seems too optimistic :-(

C. Don't store the unnamed statement in the plan cache.  To make sure
it's not used anymore when the plan might be stale, don't analyze or
plan at Parse-message time, but postpone *all* that work until Bind;
and always discard the plan after Execute.  We could still do "raw
parsing" at Parse time, since that's independent of database state,
but all but the most trivial syntactic errors would now occur at Bind
not Parse time, as well as the majority of the time expenditure.  This
still amounts to a change in the protocol semantics, although it's a
lot more subtle than plan B.  Also there's a problem if the client
does Describe Statement before Bind: we still have to run parse analysis
before we can answer, and if we then throw that away, we have no very
good way to guarantee that the statement still has the same description
when it's subsequently executed; plus we end up doing parse analysis
twice.

D. Don't store the unnamed statement in the plan cache, and just ignore
the possibility that its plan might become stale before use.  That's
exactly what happens now, but considering that the whole point of the
plan inval work is to seal off such pitfalls, I can't say that I care
for this alternative.

Comments?  I'm leaning to plan A but wanted to see if anyone would
support plan B or sees a way to fix plan C.

            regards, tom lane

pgsql-jdbc by date:

Previous
From: Kris Jurka
Date:
Subject: Re: [BUGS] BUG #3106: A problem with escaping table name pattern for DatabaseMetaData.getColumns()
Next
From: Gregory Stark
Date:
Subject: Re: [HACKERS] Plan invalidation vs. unnamed prepared statements