Re: Performance problem in PLPgSQL - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: Performance problem in PLPgSQL
Msg-id CAFj8pRDq24w_efgPcE_KCs_GtyvAP0F-AhXdXPeRRa1rBcEN1Q@mail.gmail.com
In response to Re: Performance problem in PLPgSQL  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Performance problem in PLPgSQL  (dlight <avinfo79@gmail.com>)
List pgsql-hackers



2013/8/23 Tom Lane <tgl@sss.pgh.pa.us>
Pavel Stehule <pavel.stehule@gmail.com> writes:
> please, can you send a self-explanatory test?
> this issue should be fixed, and we need examples.

We already had a perfectly good example at the beginning of this thread.
What's missing is a decision on how we ought to approximate the cost of
planning (relative to execution costs).

As I mentioned upthread, it doesn't seem unreasonable to me to do
something quick-and-dirty based on the length of the plan's rangetable.
Pretty nearly anything would fix these specific situations where the
estimated execution cost is negligible.  It's possible that there are
more complicated cases where we'll need a more accurate estimate, but
we've not seen an example of that yet.

My previous suggestion was to estimate planning cost as
   10 * (length(plan->rangetable) + 1)
but on reflection it ought to be scaled by one of the cpu cost constants,
so perhaps
   1000 * cpu_operator_cost * (length(plan->rangetable) + 1)
which'd mean a custom plan has to be estimated to save a minimum of
about 5 cost units (more if more than 1 table is used) before it'll
be chosen.  I'm tempted to make the multiplier be 10000 not 1000,
but it seems better to be conservative about changing the behavior
until we see how well this works in practice.

Objections, better ideas?

                        regards, tom lane
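(A quick arithmetic check of where the "about 5 cost units" figure comes from, using the stock default for cpu_operator_cost:)

    SHOW cpu_operator_cost;            -- 0.0025 by default
    SELECT 1000 * 0.0025 * (1 + 1);    -- one-table plan: 5 cost units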

I think this problem is a little more complex, and using only this formula is too much of a simplification - although it is a big step forward.

* The first 5 executions use a custom plan - which means that lots of procedures use custom plans forever (when they run without pooling, because almost all functions containing SQL are not called twice in one connection). And there are really only a few reports related to prepared statement or PL/pgSQL performance - so this suggests that planning in PostgreSQL is a relatively fast process, and we probably don't need to be afraid of using custom plans more widely. Custom plans also have a nice secondary effect: they solve the problem with predicates of the form "field = $1 OR $1 IS NULL" without any special support in the planner (see the sketch below). But then sometimes the 6th execution can be slow, because a generic plan is used.
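A minimal sketch of that effect - the table is hypothetical, and I am reading the predicate as the usual optional-filter idiom, where the parameter (not the field) may be NULL:

    -- hypothetical table, just for illustration
    CREATE TABLE t (id integer PRIMARY KEY, field integer);
    CREATE INDEX t_field_idx ON t (field);

    -- optional filter: when $1 is NULL, return all rows
    PREPARE q(integer) AS
      SELECT * FROM t WHERE field = $1 OR $1 IS NULL;

    -- A custom plan sees the actual value, folds "$1 IS NULL" to false,
    -- and the remaining "field = 42" can use t_field_idx.  A generic
    -- plan has to keep both arms, which forces a seq scan.
    EXECUTE q(42);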

Where can we expect performance problems?

* frequently executed, fast, simple statements:

** INSERT INTO table, and { UPDATE | DELETE | SELECT } ... WHERE pk = const - these queries can use a generic plan directly (see the sketch after this list)

* planner-expensive queries with fast execution - these can use a generic plan too, with some logic like you describe.

In the remaining cases a custom plan probably doesn't cause a performance problem, so we can use one directly.
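For the "pk = const" shape above (reusing the hypothetical table t), the generic and custom plans are identical, so re-planning on every execution buys nothing:

    PREPARE get_row(integer) AS
      SELECT * FROM t WHERE id = $1;   -- id is the primary key

    EXECUTE get_row(1);   -- same index scan on the PK
    EXECUTE get_row(2);   -- regardless of the parameter value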
 
From what I see, prepared statements are used today more for protection against SQL injection than for saving planner time - and a badly chosen generic plan is worse than repeated planning.

P.S. Could the magic constant 5 (the number of executions that use custom plans) be controlled via a GUC? Then we would have good control over special cases where the default mechanism fails (0 = always use a generic plan, -1 = never use a generic plan); a sketch follows.
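Something like this - the GUC name is only an illustration, nothing like it exists today:

    SET custom_plan_tries = 5;    -- current behavior: first 5 executions custom
    SET custom_plan_tries = 0;    -- always use a generic plan
    SET custom_plan_tries = -1;   -- never use a generic plan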

Regards

Pavel
