Re: Performance problem in PLPgSQL - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: Performance problem in PLPgSQL
Msg-id CAFj8pRDq24w_efgPcE_KCs_GtyvAP0F-AhXdXPeRRa1rBcEN1Q@mail.gmail.com
In response to Re: Performance problem in PLPgSQL  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Performance problem in PLPgSQL  (dlight <avinfo79@gmail.com>)
List pgsql-hackers



2013/8/23 Tom Lane <tgl@sss.pgh.pa.us>
Pavel Stehule <pavel.stehule@gmail.com> writes:
> please, can you send a self-explanatory test?
> this issue should be fixed, and we need examples.

We already had a perfectly good example at the beginning of this thread.
What's missing is a decision on how we ought to approximate the cost of
planning (relative to execution costs).

As I mentioned upthread, it doesn't seem unreasonable to me to do
something quick-and-dirty based on the length of the plan's rangetable.
Pretty nearly anything would fix these specific situations where the
estimated execution cost is negligible.  It's possible that there are
more complicated cases where we'll need a more accurate estimate, but
we've not seen an example of that yet.

My previous suggestion was to estimate planning cost as
   10 * (length(plan->rangetable) + 1)
but on reflection it ought to be scaled by one of the cpu cost constants,
so perhaps
   1000 * cpu_operator_cost * (length(plan->rangetable) + 1)
which'd mean a custom plan has to be estimated to save a minimum of
about 5 cost units (more if more than 1 table is used) before it'll
be chosen.  I'm tempted to make the multiplier be 10000 not 1000,
but it seems better to be conservative about changing the behavior
until we see how well this works in practice.

Objections, better ideas?

                        regards, tom lane
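(A quick arithmetic check of where the "about 5 cost units" figure comes from, using the stock default for cpu_operator_cost:)

    SHOW cpu_operator_cost;            -- 0.0025 by default
    SELECT 1000 * 0.0025 * (1 + 1);    -- one-table plan: 5 cost units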

I think this problem is a little more complex, and using only this formula is too much of a simplification - although it is a big step forward.

* The first 5 executions use a custom plan - which means that lots of procedures use custom plans forever (when they run without pooling, because almost all functions containing SQL are not called twice in one connection). And there are really only a few reports related to prepared statement or PL/pgSQL performance - so this suggests that planning in PostgreSQL is a relatively fast process, and we probably don't need to be afraid of using custom plans more widely. Custom plans also have a nice secondary effect: they solve the problem with predicates of the form "field = $1 OR $1 IS NULL" without any special support in the planner (see the sketch below). But then sometimes the 6th execution can be slow, because a generic plan is used.
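A minimal sketch of that effect - the table is hypothetical, and I am reading the predicate as the usual optional-filter idiom, where the parameter (not the field) may be NULL:

    -- hypothetical table, just for illustration
    CREATE TABLE t (id integer PRIMARY KEY, field integer);
    CREATE INDEX t_field_idx ON t (field);

    -- optional filter: when $1 is NULL, return all rows
    PREPARE q(integer) AS
      SELECT * FROM t WHERE field = $1 OR $1 IS NULL;

    -- A custom plan sees the actual value, folds "$1 IS NULL" to false,
    -- and the remaining "field = 42" can use t_field_idx.  A generic
    -- plan has to keep both arms, which forces a seq scan.
    EXECUTE q(42);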

Where can we expect performance problems?

* frequently executed, fast, simple statements:

** INSERT INTO table, and { UPDATE | DELETE | SELECT } ... WHERE pk = const - these queries can use a generic plan directly (see the sketch after this list)

* planner-expensive queries with fast execution - these can use a generic plan too, with some logic like you describe.

In the remaining cases a custom plan probably doesn't cause a performance problem, so we can use one directly.
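For the "pk = const" shape above (reusing the hypothetical table t), the generic and custom plans are identical, so re-planning on every execution buys nothing:

    PREPARE get_row(integer) AS
      SELECT * FROM t WHERE id = $1;   -- id is the primary key

    EXECUTE get_row(1);   -- same index scan on the PK
    EXECUTE get_row(2);   -- regardless of the parameter value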
 
From what I see, prepared statements are used today more for protection against SQL injection than for saving planner time - and a badly chosen generic plan is worse than repeated planning.

P.S. Could the magic constant 5 (the number of executions that use custom plans) be controlled via a GUC? Then we would have good control over special cases where the default mechanism fails (0 = always use a generic plan, -1 = never use a generic plan); a sketch follows.
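Something like this - the GUC name is only an illustration, nothing like it exists today:

    SET custom_plan_tries = 5;    -- current behavior: first 5 executions custom
    SET custom_plan_tries = 0;    -- always use a generic plan
    SET custom_plan_tries = -1;   -- never use a generic plan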

Regards

Pavel
