Re: PostgreSQL performance enhancement when query - Mailing list pgsql-hackers

From Csaba Nagy
Subject Re: PostgreSQL performance enhancement when query
Date
Msg-id 1155027253.21451.168.camel@coppola.muc.ecircle.de
Whole thread Raw
In response to PostgreSQL performance enhancement when query planner fails to guess the right plan  (Constantin Teodorescu <teo@flex.ro>)
Responses Re: PostgreSQL performance enhancement when query planner
List pgsql-hackers
Constantin,

What binding are you using ? We here use Java+JDBC, and we were able to
get stable query plans by forcing server side prepared statements (using
PGStatement#setPrepareThreshold with 1 as the threshold), where the
query is prepared without knowing the parameter values. This can
backfire too, but for our purposes it was the right thing (probably
sacrificing some performance, but getting a stable system). The plans in
this case are made to work with guessed mean values for the estimates,
and that's usually resulting in a stable plan, so once you got it right
it will stay like that.

Cheers,
Csaba.

On Mon, 2006-08-07 at 22:02, Constantin Teodorescu wrote:
> Hello all, hope you are remembering me, some years ago I've designed the 
> PgAccess , the Tcl/Tk visual interface to PostgreSQL.
> 
> Thought you haven't received any news from me, I continued working with 
> PostgreSQL, being involved in very big projects in Romania.
> Right now, the national identification of the cows, sheep, goats and 
> pigs in Romania runs on PostgreSQL on a very big database.
> Once again , I had to thank you all for keeping up maintaining and 
> improving PostgreSQL.
> 
> My message to all of you is related to this big project (a government 
> sustained project) and some performance issues.
> 
> Very few words about the database: approx. 60 tables, 30 of them 
> containing 10 millions to 50 millions records , the whole database is 
> approx 40 Gb size !
> 
> In order to get a good performance, the database is operated on a dual 
> XEON with 6 Gb RAM IBM x235 server, the database with tables and indexes 
> carefully distributed on 6 different SCSI disks, in different 
> tablespaces in such a manner to allow parallelizing reads and HDD head 
> movements on different devices when joining those big tables.
> 
> We have tuned every possible parameter in config file, we have 
> reorganized queries, analyzing explains in order to get the best results 
> for all big queries and we succeeded most of the time.
> But we have encountered some problems. Due to constant updates and 
> inserts into the database, it's size is growing continuously.
> Of course we are doing DAILY the needed maintaince, vacuums, analyzes 
> and backups.
> Due to permanent changes in database size and statistics there are 
> queries that sometimes change their execution plan, badly choosing 
> another plan and executing those queries in 2,3 minutes instead of 10 
> seconds, the usual execution time since the query plan is "switched". We 
> have done any effort in changing subselects and the query sentence in 
> order to "force" using some indexes, continuously watching the explain 
> results.
> 
> We have faced yesterday with such a problem with a query that "switched" 
> the query plan to a very bad one, almost putting the whole system down.
> The only way that we have succeeded to make it work again was by using 
> the "SET ENABLE_MERGE_JOIN to OFF".
> For the moment it works but in our opinion this is NOT the best approach 
> to guide the planner to a better query-plan variant.
> 
> Our suggestion would be : extending the EXPLAIN and SELECT commands like 
> that:
> 
> EXPLAIN VARIANTS SELECT ...... (and so on) that will display the 
> different query plans analyzed by the planner and their "estimated time 
> values" , not just the "best guess" .
> 
> assuming that the EXPLAIN VARIANTS will show 3 or 4 different query 
> plans, the database manager will be able to experiment, to test, and to 
> decide by himself what is "THE BEST PLAN FOR ME", instead of letting 
> postgresql planner to to that. Doing this, we would be able to clearly 
> specify then in the SELECT statement the "version" of the query-plan 
> that would be used in execution like in the following example:
> 
> SELECT .... (very big and complex query) ... USING PLAN 3;
> 
> Specifying the desired plan could be of course, different.
> I realise that it would be probably better that the query-plan will 
> guess the right and optimal plan. I agree that this can be done be 
> tweaking parameters and costs BUT THIS CAN TAKE A LOT OF TIME, much more 
> than a couple of tests on the real database. An experimented database 
> admin can detect much easier the appropriate plan and "force" the 
> executor to select that one that he desires.
> 
> In our opinion, this would be the simplest and the most non-intrusive 
> method of "manual choosing" another query plan rather than indirectly 
> setting ON or OFFS various parameters that could affect badly other 
> queries.
> First of all, it's assumed that the query planner HAS ALREADY evaluated 
> different variants and it decides to use one based upon the statistics 
> informations of the involved tables and "costs" for various types of 
> access.
> Unfortunately, due to a very difficult adjustment of those costs and 
> timings of the HDD performance, IO transfer speeds, PostgreSQL is 
> choosing sometimes a wrong plan.
> If we would have the power of choosing and experimenting different plans 
> with "SELECT .... USING PLAN <that-one>" we can select than the right 
> one in our real world.
> 
> The "... USING PLAN 9" extension to the language I hope that it's the 
> most delicate and innocent  :-) that I hope that it can be accepted and 
> it will give a extremely powerful way of controlling the execution 
> performance.
> 
> Hope that you know the old joke with someone who is receiving an email 
> message : "Hi , I'm the Albanian virus. Due to our poor technologies, I 
> cannot do much so please delete some of your files and pretend to be 
> scared" :-)
> I admit that this approach might be called "the Albanian way of choosing 
> the best query plan" :-) but you must admit also that it does not place 
> a big burden on the developers, it does not change anything in what have 
> been done since now and it allows the developers and database 
> administrators to dive into the query plan ocean and to get out of there 
> the best of the quickest of the fastest query plan. :-)
> 
> Hope that this long message have not disturb you so much ... so I'm 
> waiting for your comments and suggestions.
> 
> Best regards,
> Constantin Teodorescu
> Braila, ROMANIA
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings



pgsql-hackers by date:

Previous
From: "Florian G. Pflug"
Date:
Subject: Re: "Constraint exclusion" is not general enough
Next
From: Csaba Nagy
Date:
Subject: Re: PostgreSQL performance enhancement when query