Thread: Performance issue with libpq prepared queries on 9.3 and 9.4
I have hit a rather odd issue with prepared queries on both pg 9.3 and 9.4 beta. I have this table (copy at http://samsaffron.com/testing.db.gz) with a very odd performance profile: when I run the following prepared query, it runs significantly slower than its raw counterpart:

```sql
select * from topics where archetype = $1 limit 1
```

Full test harness here:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <sys/time.h>
#include <time.h>
#include <libpq-fe.h>

static void
exit_nicely(PGconn *conn)
{
    PQfinish(conn);
    exit(1);
}

static double
get_wall_time()
{
    struct timeval time;

    if (gettimeofday(&time, NULL))
    {
        // Handle error
        return 0;
    }
    return (double) time.tv_sec + (double) time.tv_usec * .000001;
}

int
main(int argc, char **argv)
{
    const char *conninfo;
    PGconn     *conn;
    PGresult   *res;
    PGresult   *stmt;           /* result of PQprepare; not checked here */
    int         i;
    Oid         textOid = 25;   /* OID of the text type */
    char       *paramValues[1];
    int         paramLengths[1];

    paramLengths[0] = 6;
    paramValues[0] = "banner";

    if (argc > 1)
        conninfo = argv[1];
    else
        conninfo = "dbname = testing";

    printf("connecting database\n");

    /* Make a connection to the database */
    conn = PQconnectdb(conninfo);

    /* Check to see that the backend connection was successfully made */
    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "Connection to database failed: %s",
                PQerrorMessage(conn));
        exit_nicely(conn);
    }

    stmt = PQprepare(conn, "test",
                     "select * from topics where archetype = $1 limit 1",
                     1, &textOid);
    printf("prepared statement\n");

    /* Time 2000 executions of the prepared statement */
    double start = get_wall_time();

    for (i = 0; i < 2000; i++)
    {
        res = PQexecPrepared(conn, "test", 1, paramValues, paramLengths,
                             NULL, 0);
        if (PQresultStatus(res) != PGRES_TUPLES_OK)
        {
            fprintf(stderr, "command failed: %s", PQerrorMessage(conn));
            PQclear(res);
            exit_nicely(conn);
        }
        PQclear(res);
    }
    double finish = get_wall_time();

    fprintf(stderr, "Prepared %f \n", (finish - start));

    /* Time 2000 executions of the same query sent as plain SQL */
    start = get_wall_time();
    for (i = 0; i < 2000; i++)
    {
        res = PQexec(conn,
                     "select * from topics where archetype = 'banner' limit 1");
        if (PQresultStatus(res) != PGRES_TUPLES_OK)
        {
            fprintf(stderr, "command failed: %s", PQerrorMessage(conn));
            PQclear(res);
            exit_nicely(conn);
        }
        PQclear(res);
    }
    finish = get_wall_time();
    fprintf(stderr, "raw %f \n", (finish - start));

    /* close the connection to the database and cleanup */
    PQfinish(conn);
    return 0;
}
```

Results:

```text
sam@ubuntu pq_play % cc -o play -I/usr/include/postgresql play.c -lpq -L/usr/include/postgresql/libpq && ./play
connecting database
prepared statement
Prepared 9.936938
Raw 1.369071
```

So my prepared counterpart is running at roughly an eighth of the speed of the raw query. I had a nightmare of a time generating a workload that exhibits the issue via a script, but managed to anonymise the data, which is linked above.

Very strangely, when I run the query in psql it does not exhibit the issue:

```text
sam@ubuntu pq_play % psql testing
psql (9.3.5)
Type "help" for help.

testing=# prepare test as select * from topics where archetype = $1 limit 1;
PREPARE
testing=# explain analyze execute test('banner');
                                                        QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.29..651.49 rows=1 width=520) (actual time=0.983..0.983 rows=0 loops=1)
   ->  Index Scan using idx11 on topics  (cost=0.29..651.49 rows=1 width=520) (actual time=0.980..0.980 rows=0 loops=1)
         Index Cond: ((archetype)::text = 'banner'::text)
 Total runtime: 1.037 ms
(4 rows)

testing=# explain analyze select * from topics where archetype = 'banner' limit 1;
                                                        QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.29..651.49 rows=1 width=520) (actual time=0.642..0.642 rows=0 loops=1)
   ->  Index Scan using idx11 on topics  (cost=0.29..651.49 rows=1 width=520) (actual time=0.641..0.641 rows=0 loops=1)
         Index Cond: ((archetype)::text = 'banner'::text)
 Total runtime: 0.673 ms
(4 rows)
```

Something about running this from libpq is causing it to scan the table as opposed to using the index. Any idea why this is happening?

(Note: Rails 4.2 is moving to a pattern of using prepared statements far more often, which is why I discovered this issue.)

ps. also posted on our meta to keep track of it: https://meta.discourse.org/t/performance-issue-with-prepared-queries/22141
Sam Saffron <sam.saffron@gmail.com> writes:
> I have hit a rather odd issue with prepared queries on both pg 9.3 and 9.4 beta.
> I have this table (copy at http://samsaffron.com/testing.db.gz) with a
> very odd performance profile:

Interesting case. The issue seems to be that your statistics look like this:

```text
select * from pg_stats where tablename = 'topics' and attname = 'archetype';

 schemaname | tablename |  attname  | inherited | null_frac | avg_width | n_distinct |     most_common_vals      |  most_common_freqs  | histogram_bounds | correlation | most_common_elems | most_common_elem_freqs | elem_count_histogram
------------+-----------+-----------+-----------+-----------+-----------+------------+---------------------------+---------------------+------------------+-------------+-------------------+------------------------+----------------------
 public     | topics    | archetype | f         |         0 |        12 |          2 | {private_message,regular} | {0.604957,0.395043} |                  |    0.612985 |                   |                        |
(1 row)
```

That is, archetype consists of 60% 'private_message', 40% 'regular', and absolutely nothing else. So the condition archetype = 'banner' is very selective, and a plan built knowing that that is the parameter value will use the index:

```text
# explain select * from topics where archetype = 'banner' limit 1;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=0.29..651.49 rows=1 width=520)
   ->  Index Scan using idx11 on topics  (cost=0.29..651.49 rows=1 width=520)
         Index Cond: ((archetype)::text = 'banner'::text)
(3 rows)
```

However, that's still a pretty darn expensive indexscan, mainly because archetype is not the leading key ... if you care about the performance of this query, why don't you have an index to match?

```text
# create index on topics(archetype);
CREATE INDEX
# explain select * from topics where archetype = 'banner' limit 1;
                                         QUERY PLAN
---------------------------------------------------------------------------------------------
 Limit  (cost=0.29..6.80 rows=1 width=520)
   ->  Index Scan using topics_archetype_idx on topics  (cost=0.29..6.80 rows=1 width=520)
         Index Cond: ((archetype)::text = 'banner'::text)
(3 rows)
```

However, just fixing the index availability actually makes the performance ratio even worse, because the prepared query still doesn't use the index:

```text
# explain execute foo('banner');
                             QUERY PLAN
---------------------------------------------------------------------
 Limit  (cost=0.00..0.11 rows=1 width=520)
   ->  Seq Scan on topics  (cost=0.00..1158.19 rows=10088 width=520)
         Filter: ((archetype)::text = $1)
(3 rows)
```

(Yes, you can get this result in psql, you just need to repeat the EXECUTE half a dozen times until it shifts to a generic plan.)

The problem here is that without knowledge of the comparison value, the planner assumes that it will probably be one of the two values that make up the table content. (After all, why would you query for something else?) On that basis, a seqscan will probably hit a matching row in no time, and so (with the LIMIT 1) it looks like a better deal than the indexscan.

We've talked about this type of problem before. Part of the issue is that costing of LIMIT doesn't apply any penalty for a bad worst-case scenario, and part of it is that the heuristics for choosing between custom and generic plans don't consider the possibility that the generic plan's estimated cost is way wrong for lack of knowledge of the comparison value. It's not real obvious how to improve either heuristic without probably making some cases worse.
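For reference, a minimal psql sequence that shows the shift to a generic plan might look roughly like this, filling in the PREPARE behind the "explain execute foo" above (the exact repetition count is illustrative; the plancache tries a few custom plans before it considers switching):

```sql
prepare foo (text) as select * from topics where archetype = $1 limit 1;

-- The first several executions are planned with the actual parameter value
-- (custom plans), so the Index Cond shows the literal 'banner':
explain execute foo('banner');
explain execute foo('banner');
explain execute foo('banner');
explain execute foo('banner');
explain execute foo('banner');

-- After enough executions the plancache may settle on a generic plan;
-- the Filter line then shows $1 instead of the literal:
explain execute foo('banner');
```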
One thing that occurs to me is that if the generic plan estimate comes out much cheaper than the custom one, maybe we should assume that the generic's cost estimate is bogus. Right offhand I can't think of a reason for a custom plan to look worse than a generic one, unless there's a statistical quirk like this one.

In the meantime, I assume that your real data contains a small percentage of values other than these two? If so, maybe cranking up the statistics target would help. If the planner knows that there are more than two values in the column, I think it would be less optimistic about assuming that the comparison value is one of the big two.

			regards, tom lane
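If it helps, raising the per-column statistics target and re-analyzing is straightforward; a sketch, assuming the stock topics table and that a higher target is actually wanted:

```sql
-- Raise the statistics target for the archetype column (the default is 100),
-- then refresh the statistics so the planner can see any rare values.
alter table topics alter column archetype set statistics 1000;
analyze topics;

-- Check what the planner now believes about the column.
select n_distinct, most_common_vals, most_common_freqs
  from pg_stats
 where tablename = 'topics' and attname = 'archetype';
```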
David G Johnston <david.g.johnston@gmail.com> writes:
> Tom Lane-2 wrote
>> In the meantime, I assume that your real data contains a small percentage
>> of values other than these two? If so, maybe cranking up the statistics
>> target would help. If the planner knows that there are more than two
>> values in the column, I think it would be less optimistic about assuming
>> that the comparison value is one of the big two.
> Is there any value (or can value be added) in creating a partial index of
> the form:
> archetype IN ('banner','some other rare value')
> such that the planner will see that such a value is possible but infrequent
> and will, in the presence of a plan using a value contained in the partial
> index, refuse to use a generic plan knowing that it will be unable to use
> the very specific index that the user created?
The existence of such an index wouldn't alter the planner's statistics.
In theory we could make it do so, but I seriously doubt the cost-benefit
ratio is attractive, either as to implementation effort or the added
planning cost.
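For concreteness, the kind of partial index being floated would be something like the sketch below (the index name and value list are illustrative; as noted above, merely creating it would not change the statistics the planner consults):

```sql
-- Illustrative only: a partial index covering the rare archetype values.
-- Its existence does not feed back into pg_stats, so the generic-plan
-- selectivity estimate would be unchanged.
create index topics_rare_archetype_idx
    on topics (archetype)
 where archetype in ('banner');
```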
[adding -general back in...]
While "planner hints" comes to mind...on the SQL side can we extend the "PREPARE" command with two additional keywords?
PREPARE
name [ ( data_type [, ...] ) ] [
[NO] GENERIC]
AS statementI was originally thinking this could attach to EXECUTE and maybe it could there as well. If EXECUTE is bare whatever the PREPARE used would be in effect (a bare PREPARE exhibiting the current dynamic behavior). If EXECUTE and PREPARE disagree execute wins and the current call is (re-)prepared as requested.
We have introduced intelligence to PREPARE/EXECUTE that is not always favorable, but we provide little way to override it when the user has superior knowledge. The dual role of prepared statements, preventing SQL injection as well as creating cacheable generic plans, further complicates things. In effect, by supplying NO GENERIC on the PREPARE, the caller is saying they only wish to make use of the injection-prevention aspect of prepared statements. Adding the EXECUTE piece allows the same prepared statement to be used in injection-prevention mode when the caller knows that the user-supplied value does not play well with the generic plan.
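Under this proposal, usage might look like the following (purely illustrative; neither keyword exists today):

```sql
-- Hypothetical syntax from the proposal above; not implemented in PostgreSQL.
PREPARE find_topic (text) NO GENERIC AS
    SELECT * FROM topics WHERE archetype = $1 LIMIT 1;

-- Always planned with the supplied value, never switched to a generic plan.
EXECUTE find_topic('banner');
```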
David J.
David Johnston <david.g.johnston@gmail.com> writes:
> While "planner hints" comes to mind... on the SQL side, can we extend the
> "PREPARE" command with two additional keywords?
>
> PREPARE name [ ( data_type [, ...] ) ] [ [NO] GENERIC ] AS statement

Don't really see the point. The OP's problem is that the prepare is being driven by a client-side library, which would have even less clue than the backend as to whether a generic plan is more appropriate than a custom plan.

The right thing here IMO is to improve the heuristics on the backend side. I'm sure we can do better, it's just going to take some thought.

			regards, tom lane
Thank you so much!

So to recap, the general way to reproduce this issue is:

```sql
create table products(id int primary key, type varchar);
insert into products select generate_series(1,10000), 'aaa';
insert into products select generate_series(10001,20000), 'bbb';
create index idx on products(type);
prepare stmt as select * from products where type = $1 limit 1;
```

Which quickly devolves into:

```text
explain analyze execute stmt ('ccc');
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..0.03 rows=1 width=8) (actual time=1.821..1.821 rows=0 loops=1)
   ->  Seq Scan on products  (cost=0.00..339.00 rows=10000 width=8) (actual time=1.819..1.819 rows=0 loops=1)
         Filter: ((type)::text = $1)
         Rows Removed by Filter: 20000
 Total runtime: 1.843 ms
(5 rows)
```

So if I am searching for 'ccc', eventually the prepared plan "optimises" and uses the better mechanism of just scanning the table to find the first hit, which is what the statistics suggest. However, a fairly common pattern I use is to check for "lack of presence" of a value, for example: if the product type 'ccc' is not in the table, do this. Unfortunately the optimiser deoptimises this class of operation.

I tried the exact example above with an "int" instead of a "varchar" in the "type" column and was not able to reproduce the issue, so I wonder if there is some sort of different handling for strings vs numbers.

Unfortunately my actual table in play has a rather bad schema; the "archetype" column really should be an int. That said, we only have 2 general archetypes at the moment (private message and topic) and the occasional single "banner" outlier, which may or may not be there. So the data modelling is pretty hostile to performance.

I have some ideas on how to solve my particular problem, but I do have some general concerns.

Ruby on Rails is just about to ship a new version that heavily changes the mechanics of query execution. For example, Product.where(name: "foo").first will now result in a prepared query, whereas in the past it would just send the raw query. Overall this approach is better and saner, but my general concern is that our API offers no "escape hatch" for these outlier conditions. You can disable prepared statements globally, but you can not disable them for a single call. I will raise this particular concern to the team.

My second question/concern is that I feel I am totally misunderstanding the changes to 'plancache.c'. I thought that the decision of which plan to use was purely based on the value sent in to the prepared query, yet it seems the planner completely ignores the value in some steps. (So, for example, I was thinking that "aaa" and "ccc" would result in completely different plans.)

Thank you so much for your time, patience and general awesomeness.
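For what it's worth, the integer variant that did not reproduce the problem was presumably something along these lines (an assumed reconstruction, not the exact script that was run):

```sql
-- Same shape as the repro above, but with an int "type" column.
create table products_int(id int primary key, type int);
insert into products_int select generate_series(1,10000), 1;
insert into products_int select generate_series(10001,20000), 2;
create index idx_int on products_int(type);

prepare stmt_int (int) as select * from products_int where type = $1 limit 1;
-- Repeat a handful of times; in the report above this variant did not
-- regress to the seq-scan plan the varchar version did.
explain analyze execute stmt_int(3);
```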
On Thu, Nov 13, 2014 at 7:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> One thing that occurs to me is that if the generic plan estimate comes
> out much cheaper than the custom one, maybe we should assume that the
> generic's cost estimate is bogus. Right offhand I can't think of a reason
> for a custom plan to look worse than a generic one, unless there's a
> statistical quirk like this one.

That's an interesting idea, but what do we do after deciding that it's bogus? The generic plan really can't be cheaper than the custom plan, but it could be the same price, or as close as makes no difference.

I have long thought that maybe the custom vs. generic plan decision should have something to do with whether the parameters are MCVs of the table columns they're being compared against. This case is interesting because it demonstrates that querying for a *non*-MCV can require a switch to a custom plan, which is something I don't think I've seen before.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Nov 13, 2014 at 7:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> One thing that occurs to me is that if the generic plan estimate comes
>> out much cheaper than the custom one, maybe we should assume that the
>> generic's cost estimate is bogus. Right offhand I can't think of a reason
>> for a custom plan to look worse than a generic one, unless there's a
>> statistical quirk like this one.

> That's an interesting idea, but what do we do after deciding that it's
> bogus?

Keep using custom plans. It's possible that the estimate that's in error is the custom one, but that's not the way to bet IMO, since the custom plan estimate is based on better information.

> The generic plan really can't be cheaper than the custom plan,
> but it could be the same price, or as close as makes no difference.

Right, and what we want to do is use the generic plan as long as it's close to the same cost (close enough to not justify replanning effort). The trick here is to not be fooled by estimation errors. Can we assume that generic cost < custom cost is always an estimation error?

Another idea that occurred to me is to run a planning cycle in which the actual parameter values are made available to the planner, but as estimates not hard constants (this facility already exists, it's just not being used by plancache.c). This would yield cost estimates that are more safely comparable to the custom plan. But I'm not sure that we'd want to expend yet another planning cycle to do this, nor am I sure that we'd want to use such a plan as The Generic Plan.

			regards, tom lane
One interesting option would be kicking in an extra, more expensive planning cycle after the Nth run of the query. In general a lot of these prepared queries run thousands of times, so if you add some extra cost at, say, run 100, it may not be prohibitive cost-wise.
From: Alban Hertroys
On 17 November 2014 22:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Another idea that occurred to me is to run a planning cycle in which the
> actual parameter values are made available to the planner, but as
> estimates not hard constants (this facility already exists, it's just not
> being used by plancache.c). This would yield cost estimates that are more
> safely comparable to the custom plan. But I'm not sure that we'd want to
> expend yet another planning cycle to do this, nor am I sure that we'd want
> to use such a plan as The Generic Plan.
>
> regards, tom lane

Perhaps a somewhat naive idea, as I only have the broad picture of how the query planner works, but: what if prepared statements did not store an entirely pinned-down version of the query plan, but instead stored a smashed-down version that still leaves room for choosing different paths based on key decision criteria?

For example, if an input parameter value matches the most common values, choose the sequential scan path (as in the OP's case, IIRC), and if it does not, attempt an index scan.

Of course, one implication of doing this is likely a decrease in planning performance (which matters for simple queries), but if it results in better plan choices for complex queries, that may be a net gain.

I recently followed an introductory class about neural networks, and the decision logic seems to resemble the neuron/perceptron pattern. I'm just throwing this out here in case it's a viable option and nobody else in the world thought of this, however unlikely ;)

--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.
On Mon, Nov 17, 2014 at 4:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Thu, Nov 13, 2014 at 7:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> One thing that occurs to me is that if the generic plan estimate comes
>>> out much cheaper than the custom one, maybe we should assume that the
>>> generic's cost estimate is bogus. Right offhand I can't think of a reason
>>> for a custom plan to look worse than a generic one, unless there's a
>>> statistical quirk like this one.
>
>> That's an interesting idea, but what do we do after deciding that it's
>> bogus?
>
> Keep using custom plans. It's possible that the estimate that's in error
> is the custom one, but that's not the way to bet IMO, since the custom
> plan estimate is based on better information.
>
>> The generic plan really can't be cheaper than the custom plan,
>> but it could be the same price, or as close as makes no difference.
>
> Right, and what we want to do is use the generic plan as long as it's
> close to the same cost (close enough to not justify replanning effort).
> The trick here is to not be fooled by estimation errors. Can we assume
> that generic cost < custom cost is always an estimation error?

Maybe. It seems like kind of a fragile bet to me. There's going to be some qual selectivity below which an index scan on a particular table outperforms a sequential scan, but the selectivity estimated for a generic plan can be either higher or lower than the selectivity we'd estimate for some particular value. And once one of the two plans decides on an index scan while the other one decides on a sequential scan, it can cascade through and change the whole plan - e.g. because it affects whether the tuples emerge with usable pathkeys. I don't feel very confident about assuming that applying < to the result of all that is going to tell us anything useful.

I think what's really going on here is that the generic plan will be optimal for some range of possible qual selectivities. Typically, the qual is of the form col = val, so the plan will be optimal for all values where the estimated frequency is between some values A and B. What would be nice is to notice when we see a value that is outside that range, and switch to a custom plan in that case. I realize that the planner isn't really well set up to deliver the information we'd need to test that for each new supplied value, but that's really what we need to do. The current system wastes CPU cycles by replanning up to 5 times even when there is no benefit to be gained by it, but can also cause big performance problems when it settles into a generic plan and then a value with different characteristics shows up later on.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
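For readers wanting to see what an MCV-based check would key on, the relevant statistics are visible from SQL. A rough sketch (not what the planner does internally), using the topics table from earlier in the thread:

```sql
-- Inspect the most-common-value list that the planner's estimates are based on.
select most_common_vals, most_common_freqs
  from pg_stats
 where schemaname = 'public'
   and tablename  = 'topics'
   and attname    = 'archetype';

-- A lookup value that is absent from most_common_vals (such as 'banner')
-- is exactly the case where a generic plan's selectivity assumption breaks.
```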