Thread: Parameterized prepared statements

Parameterized prepared statements

From
Craig Ringer
Date:
Hi folks

While replying to another query, something struck me as interesting that
I thought I'd raise.

People here often raise issues where preparing a parameterised query and
executing the prepared query results in a plan that's sub-optimal for
the particular values substituted into it. The cause is well understood
- the planner has to pick a plan that looks good without knowing what
values will be substituted in.

Things like pre-parsed prepared statements that're re-planned on every
execution are often proposed as solutions to this. This has me
wondering: rather than expensively re-planning from scratch, would it be
possible to adjust the planning process so that *multiple* alternative
plans would be cached for a query, using placeholders for unknown
rowcounts and costs? At execution, the unknown costs would be filled in,
the plans compared, and the best plan picked for this execution. Is this
crazy talk, or could it significantly reduce the cost of re-planning
parameterized prepared statements to the point where it'd be worth doing
by default?
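
As a rough illustration (the table and column names below are made up),
comparing EXPLAIN output for the prepared form of a query against the
same query with a literal value makes the difference visible:

    -- hypothetical table with a skewed customer_id distribution
    PREPARE get_orders(int) AS
        SELECT * FROM orders WHERE customer_id = $1;

    -- plan chosen for the parameterised statement
    EXPLAIN EXECUTE get_orders(42);

    -- plan when the planner can see the concrete value
    EXPLAIN SELECT * FROM orders WHERE customer_id = 42;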

On an unrelated note, does Pg do any kind of smart searching on `IN'
lists, or just a linear scan? Would it be worth sorting longer IN lists
so each iteration could do a binary search of the list?

--
Craig Ringer

Re: Parameterized prepared statements

From
Martijn van Oosterhout
Date:
On Wed, Aug 31, 2011 at 09:44:09AM +0800, Craig Ringer wrote:
> Things like pre-parsed prepared statements that're re-planned on
> every execution are often proposed as solutions to this. This has me
> wondering: rather than expensively re-planning from scratch, would
> it be possible to adjust the planning process so that *multiple*
> alternative plans would be cached for a query, using placeholders
> for unknown rowcounts and costs? At execution, the unknown costs
> would be filled in, the plans compared, and the best plan picked
> for this execution. Is this crazy talk, or could it significantly
> reduce the cost of re-planning parameterized prepared statements to
> the point where it'd be worth doing by default?

The problem is that the number of alternate plans is enormous
(combinatorial).  You need something to prune the results and that's
where the stats come in.  What you need is some way of reducing the
number of plans while keeping the useful ones.  For example, an
equality on a primary key column is going to return one result, no
matter what the parameter.  But for other things it gets really hard.
It could be done, but I'm not sure if the payoff is worth it.
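
To make it concrete (table and column names here are invented):

    -- pk equality: the estimate is one row no matter what the value is,
    -- so one cached plan serves every execution
    PREPARE by_pk(int) AS
        SELECT * FROM accounts WHERE account_id = $1;
    EXPLAIN EXECUTE by_pk(1);

    -- skewed column: the best plan depends on how common the value is,
    -- which a value-blind plan can't know
    PREPARE by_status(text) AS
        SELECT * FROM accounts WHERE status = $1;
    EXPLAIN EXECUTE by_status('closed');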

> On an unrelated note, does Pg do any kind of smart searching on `IN'
> lists, or just a linear scan? Would it be worth sorting longer IN
> lists so each iteration could do a binary search of the list?

I believe large INs can be turned into hash lookups, but I'm not sure.
Try a query with 10,000 elements in an IN and see what happens.
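
One way to try it (the table name is invented; the full literal list has
to be generated, only the start is shown here):

    EXPLAIN ANALYZE
    SELECT * FROM some_table
    WHERE id IN (1, 2, 3 /* ... continue up to 10000 */);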

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
   -- Arthur Schopenhauer

Re: Parameterized prepared statements

From
Dmitriy Igrishin
Date:
Hey Craig,

> Things like pre-parsed prepared statements that're re-planned on
> every execution are often proposed as solutions to this. This has me
> wondering: rather than expensively re-planning from scratch, would
> it be possible to adjust the planning process so that *multiple*
> alternative plans would be cached for a query, using placeholders
> for unknown rowcounts and costs? At execution, the unknown costs
> would be filled in, the plans compared, and the best plan picked
> for this execution. Is this crazy talk, or could it significantly
> reduce the cost of re-planning parameterized prepared statements to
> the point where it'd be worth doing by default?

It's a good suggestion for some kind of optimization on the server side.
This idea could be extended to an "auto-prepare" mode (like the
auto-commit mode that we have today in Postgres).
But it's not so hard to let the application (or library) decide what to
use in different cases: a prepared statement or a regular statement.
Thus, I think it is not worth it...

--
// Dmitriy.