Thread: query planner not using the correct index
This query is run on a test system just after a backup of the database
has been restored, and it does exactly what I expect it to do:

EXPLAIN ANALYZE
SELECT foos.*
FROM foos
INNER JOIN bars ON foos.id = bars.foos_id
WHERE ((bars.bars_id = 12345))
ORDER BY attr1 LIMIT 3 OFFSET 0;

                                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=12946.83..12946.83 rows=3 width=1175) (actual time=0.123..0.131 rows=1 loops=1)
   ->  Sort  (cost=12946.83..12950.83 rows=1602 width=1175) (actual time=0.116..0.119 rows=1 loops=1)
         Sort Key: foos.attr1
         ->  Nested Loop  (cost=28.69..12035.56 rows=1602 width=1175) (actual time=0.071..0.086 rows=1 loops=1)
               ->  Bitmap Heap Scan on bars  (cost=28.69..2059.66 rows=1602 width=4) (actual time=0.036..0.039 rows=1 loops=1)
                     Recheck Cond: (bars_id = 12345)
                     ->  Bitmap Index Scan on index_bars_on_bars_id  (cost=0.00..28.29 rows=1602 width=0) (actual time=0.024..0.024 rows=1 loops=1)
                           Index Cond: (bars_id = 12345)
               ->  Index Scan using foos_pkey on foos  (cost=0.00..6.21 rows=1 width=1175) (actual time=0.017..0.021 rows=1 loops=1)
                     Index Cond: (foos.id = bars.foos_id)
 Total runtime: 0.350 ms

The same query run on a production system uses foos_1attr1, which is an
index on attr1 (a string column):

EXPLAIN ANALYZE
SELECT foos.*
FROM foos
INNER JOIN bars ON foos.id = bars.foos_id
WHERE ((bars.bars_id = 12345))
ORDER BY attr1 LIMIT 3 OFFSET 0;

                                                            QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..2847.31 rows=3 width=332) (actual time=6175.515..6414.599 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..287578.30 rows=303 width=332) (actual time=6175.510..6414.591 rows=1 loops=1)
         ->  Index Scan using foos_1attr1 on foos  (cost=0.00..128038.65 rows=1602 width=332) (actual time=0.182..2451.923 rows=2498 loops=1)
         ->  Index Scan using bars_1ix on bars  (cost=0.00..0.37 rows=1 width=4) (actual time=0.006..0.006 rows=0 loops=421939)
               Index Cond: (foos.id = bars.foos_id)
               Filter: (bars_id = 12345)
 Total runtime: 6414.804 ms
Joshua Shanks wrote:
> This query is run on a test system just after a backup of the database
> has been restored, and it does exactly what I expect it to do

[snip]

Obvious questions:

- Have you changed the random page cost on either installation?
- Have both installations had VACUUM ANALYZE run recently?
- Are the stats targets the same on both installations?
- Do both installations have similar shared buffers, total available
  RAM info, etc?

--
Craig Ringer
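A quick way to compare those settings on both boxes is to read the
standard GUCs from any psql session; a minimal sketch:

-- sketch: show the planner/resource settings Craig is asking about
SHOW random_page_cost;
SHOW default_statistics_target;
SHOW shared_buffers;
SHOW effective_cache_size;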
> - Have you changed the random page cost on either installation?

This is whatever the default is for both boxes (commented config file
says 4.0).

> - Have both installations had VACUUM ANALYZE run recently?

This is the first thing I did, and it didn't seem to do anything.

Oddly enough, I just went and did a VACUUM ANALYZE on a newly restored
db on the test server and now get the same query plan as production, so
I am guessing something in the stats from ANALYZE is making postgres
think the string index is the best bet, even though it is clearly
thousands of times slower.

> - Are the stats targets the same on both installations?

If you mean default_statistics_target, that is also the default
(commented config file says 10).

> - Do both installations have similar shared buffers, total available
>   RAM info, etc?

The boxes have different configs, as the test box isn't as big as the
production one, so it doesn't have as many resources available or
allocated to it.

I did run the query on the backup db box (exact same hardware and
configuration as the production box), which gets restored from a backup
periodically (that's how I populated the test db), and got the same
results as the test box.
Joshua Shanks wrote:
>> - Have you changed the random page cost on either installation?
>
> This is whatever the default is for both boxes (commented config file
> says 4.0).
>
>> - Have both installations had VACUUM ANALYZE run recently?
>
> This is the first thing I did, and it didn't seem to do anything.
>
> Oddly enough, I just went and did a VACUUM ANALYZE on a newly restored
> db on the test server and now get the same query plan as production, so
> I am guessing something in the stats from ANALYZE is making postgres
> think the string index is the best bet, even though it is clearly
> thousands of times slower.

OK, that's interesting. There are ways to examine Pg's statistics on
columns, get an idea of which stats might be less than accurate, etc,
but I'm not really familiar enough with it all to give you any useful
advice on the details. I can make one suggestion in the vein of shotgun
troubleshooting, though:

Try altering the statistics targets on the tables of interest, or tweak
the default_statistics_target, then rerun VACUUM ANALYZE and re-test.
Maybe start with a stats target of 100 and see what happens.

--
Craig Ringer
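A minimal sketch of that suggestion, raising the target for the current
session and re-analyzing the two tables from the query above (the
target value of 100 is just Craig's suggested starting point):

-- sketch: raise the sampling target, refresh stats, then re-run EXPLAIN ANALYZE
SET default_statistics_target = 100;
VACUUM ANALYZE foos;
VACUUM ANALYZE bars;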
On Wed, Aug 6, 2008 at 10:24 PM, Craig Ringer
<craig@postnewspapers.com.au> wrote:
> OK, that's interesting. There are ways to examine Pg's statistics on
> columns, get an idea of which stats might be less than accurate, etc,
> but I'm not really familiar enough with it all to give you any useful
> advice on the details. I can make one suggestion in the vein of shotgun
> troubleshooting, though:
>
> Try altering the statistics targets on the tables of interest, or tweak
> the default_statistics_target, then rerun VACUUM ANALYZE and re-test.
> Maybe start with a stats target of 100 and see what happens.
>
> --
> Craig Ringer

I tried 100, 500, and 1000 for default_statistics_target. I think below
is the right query to examine the stats.

None of the levels of default_statistics_target I tried changed the
query planner's behavior. It seems obvious that the stats on attr1 at
the current level are inaccurate, as there are over 100,000 unique
entries in the table, but even tweaking them to be more accurate
doesn't seem to add any benefit.

default_statistics_target = 10

SELECT null_frac, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats WHERE tablename = 'foos' AND attname = 'attr1';

 null_frac | n_distinct | most_common_vals | most_common_freqs
-----------+------------+------------------+-------------------
         0 |       1789 | {""}             | {0.625667}

default_statistics_target = 100

SELECT null_frac, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats WHERE tablename = 'foos' AND attname = 'attr1';

  null_frac  | n_distinct | most_common_vals | most_common_freqs
-------------+------------+------------------+-------------------
 0.000266667 |      17429 | {""}             | {0.6223}

default_statistics_target = 500

SELECT null_frac, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats WHERE tablename = 'foos' AND attname = 'attr1';

  null_frac  | n_distinct | most_common_vals | most_common_freqs
-------------+------------+------------------+-------------------
 0.000293333 |   -0.17954 | {""}             | {0.62158}

default_statistics_target = 1000

SELECT null_frac, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats WHERE tablename = 'foos' AND attname = 'attr1';

  null_frac  | n_distinct | most_common_vals | most_common_freqs
-------------+------------+------------------+-------------------
 0.000293333 |  -0.304907 | {""}             | {0.621043}
"Joshua Shanks" <jjshanks@gmail.com> writes: > It seems obvious that the stats on attr1 at the current level are > inaccurate as there are over 100,000 unique enteries in the table. Well, you haven't told us how big any of these tables are, so it's hard to tell if the n_distinct value is wrong or not ... but in any case I don't think that the stats on attr1 have anything to do with your problem. The reason that the "fast" query is fast is that it benefits from the fact that there's only one bars row with bars_id = 12345. So the question is how many such rows does the planner now think there are (try "explain analyze select * from bars where bars_id = 12345"), and if it's badly wrong, then you need to be looking at the stats on bars.bars_id to find out why. regards, tom lane
On Thu, Aug 7, 2008 at 4:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Well, you haven't told us how big any of these tables are, so it's
> hard to tell if the n_distinct value is wrong or not ... but in any
> case I don't think that the stats on attr1 have anything to do with
> your problem. The reason that the "fast" query is fast is that it
> benefits from the fact that there's only one bars row with
> bars_id = 12345. So the question is how many such rows the planner
> now thinks there are (try "explain analyze select * from bars where
> bars_id = 12345"), and if it's badly wrong, then you need to be
> looking at the stats on bars.bars_id to find out why.
>
>			regards, tom lane

foos is 400,000+ rows
bars is 300,000+ rows

I was just about to write back about this, as with all my tinkering
today I figured that to be the root cause.

SELECT null_frac, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats WHERE tablename = 'bars' AND attname = 'bars_id';

 null_frac | n_distinct |   most_common_vals   |     most_common_freqs
-----------+------------+----------------------+---------------------------
         0 |         14 | {145823,47063,24895} | {0.484667,0.257333,0.242}

Those 3 values, in reality and in the stats, account for 98% of the
rows. The actual number of distinct values is around 350.

That plus the information on
http://www.postgresql.org/docs/8.3/static/indexes-ordering.html makes
it all make sense as to why the query planner is doing what it is
doing.

The only problem is we rarely, if ever, call the query with the WHERE
clause containing those three values. I did some testing, and the
planner works great when we do use those values, but 99.9% of the time
we are calling other values.

It seems like the planner would want to get the result set from the
bars.bars_id condition and, if it is big, use the index on the join to
avoid the separate sort, but if it is small (0-5 rows, which is our
normal case) use the primary key index to join and then just quickly
sort. Is there any reason the planner doesn't do this?

I found a way to run the query as a subselect which is fast for our
normal case but doesn't work for the edge cases, so I might just have
to do a count on the bars_id and then pick a query based on that.
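One hypothetical shape for that kind of workaround (an assumption on my
part, not necessarily the exact subselect used here): wrap the join in
a subquery with OFFSET 0, which PostgreSQL treats as an optimization
fence so the bars_id filter runs first and only the small result gets
sorted; or check the row count up front and pick a query in the
application.

-- sketch: filter by bars_id first, then sort the (normally tiny) result
SELECT *
FROM (
    SELECT foos.*
    FROM foos
    INNER JOIN bars ON foos.id = bars.foos_id
    WHERE bars.bars_id = 12345
    OFFSET 0  -- fence: keeps the planner from flattening the subquery
) AS matched
ORDER BY attr1
LIMIT 3;

-- or: count first and choose between the two query shapes in application code
SELECT count(*) FROM bars WHERE bars_id = 12345;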
"Joshua Shanks" <jjshanks@gmail.com> writes: > Those 3 values in reality and in the stats account for 98% of the > rows. actual distinct values are around 350 Measuring n_distinct from a sample is inherently difficult and unreliable. When 98% of your table falls into those categories it's leaving very few chances for the sample to find many other distinct values. I haven't seen the whole thread, if you haven't tried already you could try raising the statistics target for these columns -- that's usually necessary anyways when you have a very skewed distribution like this. > It seems like the planner would want to get the result set from > bars.bars_id condition and if it is big using the index on the join to > avoid the separate sorting, but if it is small (0-5 rows which is our > normal case) use the primary key index to join and then just quickly > sort. Is there any reason the planner doesn't do this? Yeah, Heikki's suggested having a kind of "branch" plan node that knows how where the break-point is between two plans and can call the appropriate one. We don't have anything like that yet. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
On Thu, Aug 7, 2008 at 5:38 PM, Gregory Stark <stark@enterprisedb.com> wrote:
> Measuring n_distinct from a sample is inherently difficult and
> unreliable. When 98% of your table falls into those categories it's
> leaving very few chances for the sample to find many other distinct
> values.
>
> I haven't seen the whole thread, but if you haven't already, you could
> try raising the statistics target for these columns -- that's usually
> necessary anyway when you have a very skewed distribution like this.

I did some tweaking of default_statistics_target earlier in the thread
with no luck. I just retried it with default_statistics_target set to
500, did the VACUUM ANALYZE on the other table this time, and started
to see better results and more of the behavior I would expect.

Is there a way to set the stats target for just one column? That seems
like what we might need to do.

> Yeah, Heikki has suggested having a kind of "branch" plan node that
> knows where the break-point is between two plans and can call the
> appropriate one. We don't have anything like that yet.

Is this already on a todo list, or is there a bug filed for it?
"Joshua Shanks" <jjshanks@gmail.com> writes: > SELECT null_frac, n_distinct, most_common_vals, most_common_freqs FROM > pg_stats WHERE tablename = 'bars' AND attname='bars_id'; > null_frac | n_distinct | most_common_vals | most_common_freqs > -----------+------------+----------------------+--------------------------- > 0 | 14 | {145823,47063,24895} | {0.484667,0.257333,0.242} > Those 3 values in reality and in the stats account for 98% of the > rows. actual distinct values are around 350 So you need to increase the stats target for this column. With those numbers the planner is going to assume that any value that's not one of the big three appears about (1 - (0.484667+0.257333+0.242)) / 11 of the time, or several hundred times in 300K rows. If n_distinct were up around 350 it would be estimating just a dozen or so occurrences, which should push the join plan into the shape you want. It's likely that it won't bother to include any more entries in most_common_vals no matter how much you raise the target; but a larger sample should definitely give it a better clue about n_distinct. regards, tom lane
Yeah, with default_statistics_target at 500, most_common_vals had 4
values, the fourth having a frequency of 1.5%, and n_distinct was 250+.

How do I increase the stats target for just one column?

On Thu, Aug 7, 2008 at 6:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Joshua Shanks" <jjshanks@gmail.com> writes:
>> SELECT null_frac, n_distinct, most_common_vals, most_common_freqs
>> FROM pg_stats WHERE tablename = 'bars' AND attname = 'bars_id';
>>
>>  null_frac | n_distinct |   most_common_vals   |     most_common_freqs
>> -----------+------------+----------------------+---------------------------
>>          0 |         14 | {145823,47063,24895} | {0.484667,0.257333,0.242}
>
>> Those 3 values, in reality and in the stats, account for 98% of the
>> rows. The actual number of distinct values is around 350.
>
> So you need to increase the stats target for this column. With those
> numbers the planner is going to assume that any value that's not one
> of the big three appears about (1 - (0.484667 + 0.257333 + 0.242)) / 11
> of the time, or several hundred times in 300K rows. If n_distinct were
> up around 350 it would be estimating just a dozen or so occurrences,
> which should push the join plan into the shape you want. It's likely
> that it won't bother to include any more entries in most_common_vals
> no matter how much you raise the target; but a larger sample should
> definitely give it a better clue about n_distinct.
>
>			regards, tom lane
"Joshua Shanks" <jjshanks@gmail.com> writes: > How do I increase the stats target for just one column? Look under ALTER TABLE. regards, tom lane
Just for closure, I ended up doing

ALTER TABLE bars ALTER COLUMN bars_id SET STATISTICS 500;

On Thu, Aug 7, 2008 at 7:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Joshua Shanks" <jjshanks@gmail.com> writes:
>> How do I increase the stats target for just one column?
>
> Look under ALTER TABLE.
>
>			regards, tom lane