Thread: Unexpected query plan results
Hi,
We have one query which has a left join. If we run this query without the left join, it runs slower than with the left join.

- query without the left join:

EXPLAIN ANALYZE
SELECT artifact.id AS id, artifact.priority AS priority, item.title AS title,
       item.name AS name, field_value2.value AS status, field_value3.value AS category,
       sfuser.username AS submittedByUsername, sfuser.full_name AS submittedByFullname,
       sfuser2.username AS assignedToUsername, sfuser2.full_name AS assignedToFullname,
       item.version AS version
FROM sfuser sfuser, relationship relationship, item item, field_value field_value3,
     sfuser sfuser2, project project, field_value field_value2, field_value field_value,
     artifact artifact, folder folder, field_value field_value4
WHERE artifact.id = item.id
  AND item.folder_id = folder.id
  AND folder.project_id = project.id
  AND artifact.group_fv = field_value.id
  AND artifact.status_fv = field_value2.id
  AND artifact.category_fv = field_value3.id
  AND artifact.customer_fv = field_value4.id
  AND item.created_by_id = sfuser.id
  AND relationship.is_deleted = false
  AND relationship.relationship_type_name = 'ArtifactAssignment'
  AND relationship.origin_id = sfuser2.id
  AND artifact.id = relationship.target_id
  AND item.is_deleted = false
  AND project.path = 'projects.gl_coconet_performance_improveme'
  AND item.folder_id = 'tracker3641'
  AND folder.path = 'tracker.perf_test'
  AND (field_value2.value_class = 'Open');

Hash Join  (cost=47645.19..89559.37 rows=1 width=155) (actual time=4411.623..6953.329 rows=71 loops=1)
  Hash Cond: ((folder.project_id)::text = (project.id)::text)
  ->  Nested Loop  (cost=47640.91..89553.64 rows=384 width=167) (actual time=4411.558..6953.136 rows=71 loops=1)
        ->  Index Scan using folder_pk on folder  (cost=0.00..4.35 rows=1 width=26) (actual time=0.029..0.032 rows=1 loops=1)
              Index Cond: ('tracker3641'::text = (id)::text)
              Filter: ((path)::text = 'tracker.perf_test'::text)
        ->  Nested Loop  (cost=47640.91..89545.46 rows=384 width=168) (actual time=4411.525..6953.052 rows=71 loops=1)
              ->  Nested Loop  (cost=47640.91..89434.35 rows=384 width=150) (actual time=4411.508..6952.049 rows=71 loops=1)
                    ->  Nested Loop  (cost=47640.91..89296.15 rows=384 width=149) (actual time=4411.489..6950.823 rows=71 loops=1)
                          ->  Nested Loop  (cost=47640.91..89157.95 rows=384 width=162) (actual time=4411.469..6949.629 rows=71 loops=1)
                                ->  Nested Loop  (cost=47640.91..89019.74 rows=384 width=175) (actual time=4411.443..6948.289 rows=71 loops=1)
                                      ->  Nested Loop  (cost=47640.91..88464.52 rows=1819 width=157) (actual time=4411.418..6947.188 rows=71 loops=1)
                                            ->  Merge Join  (cost=47640.91..83661.94 rows=2796 width=158) (actual time=4411.355..6945.443 rows=71 loops=1)
                                                  Merge Cond: ((item.id)::text = "inner"."?column7?")
                                                  ->  Index Scan using item_pk on item  (cost=0.00..176865.31 rows=97498 width=88) (actual time=117.304..2405.060 rows=71 loops=1)
                                                        Filter: ((NOT is_deleted) AND ((folder_id)::text = 'tracker3641'::text))
                                                  ->  Sort  (cost=47640.91..47808.10 rows=66876 width=70) (actual time=4273.919..4401.387 rows=168715 loops=1)
                                                        Sort Key: (artifact.id)::text
                                                        ->  Hash Join  (cost=9271.96..42281.07 rows=66876 width=70) (actual time=124.119..794.667 rows=184378 loops=1)
                                                              Hash Cond: ((artifact.status_fv)::text = (field_value2.id)::text)
                                                              ->  Seq Scan on artifact  (cost=0.00..25206.14 rows=475614 width=69) (actual time=0.008..214.459 rows=468173 loops=1)
                                                              ->  Hash  (cost=8285.92..8285.92 rows=78883 width=27) (actual time=124.031..124.031 rows=79488 loops=1)
                                                                    ->  Index Scan using field_class_idx on field_value field_value2  (cost=0.00..8285.92 rows=78883 width=27) (actual time=0.049..60.599 rows=79488 loops=1)
                                                                          Index Cond: ((value_class)::text = 'Open'::text)
                                            ->  Index Scan using relation_target on relationship  (cost=0.00..1.71 rows=1 width=25) (actual time=0.022..0.022 rows=1 loops=71)
                                                  Index Cond: ((artifact.id)::text = (relationship.target_id)::text)
                                                  Filter: ((NOT is_deleted) AND ((relationship_type_name)::text = 'ArtifactAssignment'::text))
                                      ->  Index Scan using sfuser_pk on sfuser  (cost=0.00..0.29 rows=1 width=42) (actual time=0.013..0.013 rows=1 loops=71)
                                            Index Cond: ((item.created_by_id)::text = (sfuser.id)::text)
                                ->  Index Scan using field_value_pk on field_value field_value4  (cost=0.00..0.35 rows=1 width=13) (actual time=0.017..0.017 rows=1 loops=71)
                                      Index Cond: ((artifact.customer_fv)::text = (field_value4.id)::text)
                          ->  Index Scan using field_value_pk on field_value  (cost=0.00..0.35 rows=1 width=13) (actual time=0.015..0.015 rows=1 loops=71)
                                Index Cond: ((artifact.group_fv)::text = (field_value.id)::text)
                    ->  Index Scan using field_value_pk on field_value field_value3  (cost=0.00..0.35 rows=1 width=27) (actual time=0.015..0.015 rows=1 loops=71)
                          Index Cond: ((artifact.category_fv)::text = (field_value3.id)::text)
              ->  Index Scan using sfuser_pk on sfuser sfuser2  (cost=0.00..0.28 rows=1 width=42) (actual time=0.012..0.012 rows=1 loops=71)
                    Index Cond: ((relationship.origin_id)::text = (sfuser2.id)::text)
  ->  Hash  (cost=4.27..4.27 rows=1 width=12) (actual time=0.048..0.048 rows=1 loops=1)
        ->  Index Scan using project_path on project  (cost=0.00..4.27 rows=1 width=12) (actual time=0.041..0.042 rows=1 loops=1)
              Index Cond: ((path)::text = 'projects.gl_coconet_performance_improveme'::text)
Total runtime: 6966.099 ms

- same query but with the left join:

EXPLAIN ANALYZE
SELECT artifact.id AS id, artifact.priority AS priority, item.title AS title,
       item.name AS name, field_value2.value AS status, field_value3.value AS category,
       sfuser.username AS submittedByUsername, sfuser.full_name AS submittedByFullname,
       sfuser2.username AS assignedToUsername, sfuser2.full_name AS assignedToFullname,
       item.version AS version, mntr_subscription.user_id AS monitoringUserId
FROM sfuser sfuser, relationship relationship, item item, field_value field_value3,
     sfuser sfuser2, project project, field_value field_value2, field_value field_value,
     artifact artifact
       LEFT JOIN mntr_subscription mntr_subscription
              ON mntr_subscription.object_key = artifact.id
             AND (mntr_subscription.user_id = '17272'),
     folder folder, field_value field_value4
WHERE artifact.id = item.id
  AND item.folder_id = folder.id
  AND folder.project_id = project.id
  AND artifact.group_fv = field_value.id
  AND artifact.status_fv = field_value2.id
  AND artifact.category_fv = field_value3.id
  AND artifact.customer_fv = field_value4.id
  AND item.created_by_id = sfuser.id
  AND relationship.is_deleted = false
  AND relationship.relationship_type_name = 'ArtifactAssignment'
  AND relationship.origin_id = sfuser2.id
  AND artifact.id = relationship.target_id
  AND item.is_deleted = false
  AND project.path = 'projects.gl_coconet_performance_improveme'
  AND item.folder_id = 'tracker3641'
  AND folder.path = 'tracker.perf_test'
  AND (field_value2.value_class = 'Open');

Nested Loop  (cost=117.16..102664.10 rows=1 width=167) (actual time=392.383..3412.452 rows=71 loops=1)
  Join Filter: ((folder.project_id)::text = (project.id)::text)
  ->  Index Scan using project_path on project  (cost=0.00..4.27 rows=1 width=12) (actual time=0.040..0.041 rows=1 loops=1)
        Index Cond: ((path)::text = 'projects.gl_coconet_performance_improveme'::text)
  ->  Nested Loop  (cost=117.16..102655.03 rows=384 width=179) (actual time=392.331..3412.303 rows=71 loops=1)
        ->  Index Scan using folder_pk on folder  (cost=0.00..4.35 rows=1 width=26) (actual time=0.034..0.036 rows=1 loops=1)
              Index Cond: ('tracker3641'::text = (id)::text)
              Filter: ((path)::text = 'tracker.perf_test'::text)
        ->  Nested Loop  (cost=117.16..102646.84 rows=384 width=180) (actual time=392.293..3412.193 rows=71 loops=1)
              ->  Nested Loop  (cost=117.16..102535.74 rows=384 width=162) (actual time=392.276..3411.189 rows=71 loops=1)
                    ->  Nested Loop  (cost=117.16..102397.53 rows=384 width=161) (actual time=392.258..3409.958 rows=71 loops=1)
                          ->  Nested Loop  (cost=117.16..102259.33 rows=384 width=174) (actual time=392.239..3408.734 rows=71 loops=1)
                                ->  Nested Loop  (cost=117.16..102121.13 rows=384 width=187) (actual time=392.220..3407.479 rows=71 loops=1)
                                      ->  Nested Loop  (cost=117.16..101565.91 rows=1819 width=169) (actual time=392.195..3406.360 rows=71 loops=1)
                                            ->  Nested Loop  (cost=117.16..96763.32 rows=2796 width=170) (actual time=392.150..3404.791 rows=71 loops=1)
                                                  ->  Merge Join  (cost=117.16..89555.79 rows=19888 width=169) (actual time=392.092..3403.281 rows=71 loops=1)
                                                        Merge Cond: ((artifact.id)::text = (item.id)::text)
                                                        ->  Merge Left Join  (cost=117.16..52509.18 rows=475614 width=81) (actual time=0.050..715.999 rows=380704 loops=1)
                                                              Merge Cond: ((artifact.id)::text = "inner"."?column3?")
                                                              ->  Index Scan using artifact_pk on artifact  (cost=0.00..51202.63 rows=475614 width=69) (actual time=0.014..424.003 rows=380704 loops=1)
                                                              ->  Sort  (cost=117.16..117.30 rows=58 width=25) (actual time=0.033..0.033 rows=0 loops=1)
                                                                    Sort Key: (mntr_subscription.object_key)::text
                                                                    ->  Index Scan using mntr_subscr_usrevt on mntr_subscription  (cost=0.00..115.46 rows=58 width=25) (actual time=0.018..0.018 rows=0 loops=1)
                                                                          Index Cond: ((user_id)::text = '17272'::text)
                                                        ->  Index Scan using item_pk on item  (cost=0.00..176865.31 rows=97498 width=88) (actual time=116.898..2404.612 rows=71 loops=1)
                                                              Filter: ((NOT is_deleted) AND ((folder_id)::text = 'tracker3641'::text))
                                                  ->  Index Scan using field_value_pk on field_value field_value2  (cost=0.00..0.35 rows=1 width=27) (actual time=0.019..0.019 rows=1 loops=71)
                                                        Index Cond: ((artifact.status_fv)::text = (field_value2.id)::text)
                                                        Filter: ((value_class)::text = 'Open'::text)
                                            ->  Index Scan using relation_target on relationship  (cost=0.00..1.71 rows=1 width=25) (actual time=0.020..0.020 rows=1 loops=71)
                                                  Index Cond: ((artifact.id)::text = (relationship.target_id)::text)
                                                  Filter: ((NOT is_deleted) AND ((relationship_type_name)::text = 'ArtifactAssignment'::text))
                                      ->  Index Scan using sfuser_pk on sfuser  (cost=0.00..0.29 rows=1 width=42) (actual time=0.013..0.014 rows=1 loops=71)
                                            Index Cond: ((item.created_by_id)::text = (sfuser.id)::text)
                                ->  Index Scan using field_value_pk on field_value field_value4  (cost=0.00..0.35 rows=1 width=13) (actual time=0.015..0.016 rows=1 loops=71)
                                      Index Cond: ((artifact.customer_fv)::text = (field_value4.id)::text)
                          ->  Index Scan using field_value_pk on field_value  (cost=0.00..0.35 rows=1 width=13) (actual time=0.015..0.015 rows=1 loops=71)
                                Index Cond: ((artifact.group_fv)::text = (field_value.id)::text)
                    ->  Index Scan using field_value_pk on field_value field_value3  (cost=0.00..0.35 rows=1 width=27) (actual time=0.015..0.015 rows=1 loops=71)
                          Index Cond: ((artifact.category_fv)::text = (field_value3.id)::text)
              ->  Index Scan using sfuser_pk on sfuser sfuser2  (cost=0.00..0.28 rows=1 width=42) (actual time=0.012..0.012 rows=1 loops=71)
                    Index Cond: ((relationship.origin_id)::text = (sfuser2.id)::text)
Total runtime: 3413.006 ms
(43 rows)

I am having a hard time understanding why the query runs faster with the left join. Can you help me understand how that is possible?
Thanks,
Anne
> From: Anne Rosset
> Subject: [PERFORM] Unexpected query plan results
>
> Hi,
> We have one query which has a left join. If we run this query without
> the left join, it runs slower than with the left join.
[snip]
> I am having a hard time to understand why the query runs
> faster with the left join.

It looks like the query plan for the query without the left join is less than optimal. Adding the left join just seemed to shake things up enough that postgres picked a better plan. The slow step in the query without the left join appears to be sorting the result of a hash join so it can be used in a merge join.

  ->  Sort  (cost=47640.91..47808.10 rows=66876 width=70) (actual time=4273.919..4401.387 rows=168715 loops=1)
        Sort Key: (artifact.id)::text
        ->  Hash Join  (cost=9271.96..42281.07 rows=66876 width=70) (actual time=124.119..794.667 rows=184378 loops=1)

The plan might be sped up by removing the sort or making the sort faster. Postgres thinks the Hash Join will only produce 66,876 rows, but it produces 184,378 rows. If it made a better estimate of the result of the hash join, it might not choose this plan. I don't really know if there is a way to improve the estimate on a join when the estimates of the inputs look pretty good.

As a test you might try disabling sorts by setting enable_sort to false, then run the explain analyze again to see what you get.

You might be able to make the sort faster by increasing work_mem. What do you have work_mem set to now, and what version of Postgres are you using?

Dave
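Dave's experiment might look like this in a psql session (the settings shown are illustrative, and the query is an abbreviated stand-in for Anne's full one; wrapping in a transaction keeps the changes session-local):

```sql
-- Session-local test of Dave's suggestions: disable sorts and/or raise
-- work_mem, then re-run the EXPLAIN ANALYZE from the first message.
BEGIN;
SET LOCAL enable_sort = off;   -- steer the planner away from explicit sorts
SET LOCAL work_mem = '64MB';   -- illustrative value; Anne currently uses 20MB
EXPLAIN ANALYZE
SELECT artifact.id, item.title, field_value2.value AS status
FROM artifact, item, field_value field_value2
WHERE artifact.id = item.id
  AND artifact.status_fv = field_value2.id
  AND field_value2.value_class = 'Open';   -- abbreviated form of Anne's query
ROLLBACK;  -- discard the setting changes
```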
Dave Dutcher wrote:
> It looks like the query plan for the query without the left join is less
> than optimal. Adding the left join just seemed to shake things up enough
> that postgres picked a better plan. The slow step in the query without the
> left join appears to be sorting the result of a hash join so it can be used
> in a merge join.
[snip]
> As a test you might try disabling sorts by setting enable_sort to false,
> then run the explain analyze again to see what you get.
>
> You might be able to make the sort faster by increasing work mem. What do
> you have work mem set to now and what version of Postgres are you using?
>
> Dave

Thanks, Dave. We are using postgresql-server-8.2.4-1PGDG and have work_mem set to 20MB.
What value would you advise?
thanks,
Anne
> From: Anne Rosset
> Subject: Re: [PERFORM] Unexpected query plan results
>
> Thanks, Dave. We are using postgresql-server-8.2.4-1PGDG and
> have work_mem set to 20MB.
> What value would you advise?
> thanks,
> Anne

Work_mem is kind of tricky because the right setting depends on how much ram your machine has, whether the machine is dedicated to postgres, and how many simultaneous connections you have. If this is a test server, and not used in production, you could just play around with the setting and see if your query gets any faster.

Here are the docs on work_mem:
http://www.postgresql.org/docs/8.2/interactive/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY
On Fri, May 29, 2009 at 1:30 PM, Dave Dutcher <dave@tridecap.com> wrote:
> > From: Anne Rosset
> > Subject: Re: [PERFORM] Unexpected query plan results
> >
> > Thanks, Dave. We are using postgresql-server-8.2.4-1PGDG and
> > have work_mem set to 20MB.
> > What value would you advise?
> > thanks,
> >
> > Anne
>
> Work_mem is kind of tricky because the right setting depends on how much ram
> your machine has, whether the machine is dedicated to postgres, and how many
> simultaneous connections you have. If this is a test server, and not used
> in production, you could just play around with the setting and see if your
> query gets any faster.
>
> Here are the docs on work_mem:
> http://www.postgresql.org/docs/8.2/interactive/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY

Right, the trick to remember is that you could possibly end up in a scenario where you have max_connections * work_mem being used just for sorting / joins, with the rest of your memory swapped out, so be careful not to push too high. Also, work_mem is not fully allocated at fork time; a backend only uses up to that much as needed.

--Scott
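Scott's worst-case caveat can be made concrete with a quick check in psql (the connection count below is illustrative, not from the thread):

```sql
-- Per Scott's warning: each backend may use up to work_mem for every
-- sort/hash node, so the worst case is roughly
--   max_connections * work_mem (* sort nodes per query).
SHOW max_connections;  -- e.g. 100
SHOW work_mem;         -- Anne's setting: 20MB
-- 100 backends * 20MB = ~2GB of potential sort memory, on top of
-- shared_buffers and the OS cache; pushing work_mem much higher can
-- drive a smaller server into swap.
```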
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Dave Dutcher wrote:
> Work_mem is kind of tricky because the right setting depends on how much ram
> your machine has, is the machine dedicated to postgres, and how many
> simultaneous connections you have. If this is a test server, and not used
> in production, you could just play around with the setting and see if your
> query gets any faster.
>
> Here are the docs on work mem
> http://www.postgresql.org/docs/8.2/interactive/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY

Thanks Dave. The result with enable_sort=false is much better (at least the plan with the left join no longer wins): now I am getting a 4s runtime. (I also got the same performance by setting enable_mergejoin to false.) Do you see anything I could do to make it faster? When the query plan takes a wrong path, is it possible that it is because statistics have not been run or updated?
Thanks,
Anne

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join  (cost=9276.24..100313.55 rows=1 width=155) (actual time=168.148..4144.595 rows=71 loops=1)
  Hash Cond: ((folder.project_id)::text = (project.id)::text)
  ->  Nested Loop  (cost=9271.96..100302.44 rows=1819 width=167) (actual time=168.080..4144.363 rows=71 loops=1)
        ->  Index Scan using folder_pk on folder  (cost=0.00..4.35 rows=1 width=26) (actual time=0.029..0.032 rows=1 loops=1)
              Index Cond: ('tracker3641'::text = (id)::text)
              Filter: ((path)::text = 'tracker.perf_test'::text)
        ->  Nested Loop  (cost=9271.96..100279.90 rows=1819 width=168) (actual time=168.045..4144.249 rows=71 loops=1)
              ->  Nested Loop  (cost=9271.96..99724.69 rows=1819 width=150) (actual time=168.028..4143.126 rows=71 loops=1)
                    ->  Nested Loop  (cost=9271.96..99198.39 rows=1819 width=132) (actual time=168.008..4141.973 rows=71 loops=1)
                          ->  Nested Loop  (cost=9271.96..98543.72 rows=1819 width=131) (actual time=167.989..4140.718 rows=71 loops=1)
                                ->  Nested Loop  (cost=9271.96..97889.05 rows=1819 width=144) (actual time=167.971..4139.482 rows=71 loops=1)
                                      ->  Nested Loop  (cost=9271.96..97234.38 rows=1819 width=157) (actual time=167.943..4137.998 rows=71 loops=1)
                                            ->  Nested Loop  (cost=9271.96..92431.80 rows=2796 width=158) (actual time=167.893..4136.297 rows=71 loops=1)
                                                  ->  Hash Join  (cost=9271.96..42281.07 rows=66876 width=70) (actual time=125.019..782.122 rows=184378 loops=1)
                                                        Hash Cond: ((artifact.status_fv)::text = (field_value2.id)::text)
                                                        ->  Seq Scan on artifact  (cost=0.00..25206.14 rows=475614 width=69) (actual time=0.006..211.907 rows=468173 loops=1)
                                                        ->  Hash  (cost=8285.92..8285.92 rows=78883 width=27) (actual time=124.929..124.929 rows=79488 loops=1)
                                                              ->  Index Scan using field_class_idx on field_value field_value2  (cost=0.00..8285.92 rows=78883 width=27) (actual time=0.040..60.861 rows=79488 loops=1)
                                                                    Index Cond: ((value_class)::text = 'Open'::text)
                                                  ->  Index Scan using item_pk on item  (cost=0.00..0.74 rows=1 width=88) (actual time=0.018..0.018 rows=0 loops=184378)
                                                        Index Cond: ((artifact.id)::text = (item.id)::text)
                                                        Filter: ((NOT is_deleted) AND ((folder_id)::text = 'tracker3641'::text))
                                            ->  Index Scan using relation_target on relationship  (cost=0.00..1.71 rows=1 width=25) (actual time=0.021..0.022 rows=1 loops=71)
                                                  Index Cond: ((artifact.id)::text = (relationship.target_id)::text)
                                                  Filter: ((NOT is_deleted) AND ((relationship_type_name)::text = 'ArtifactAssignment'::text))
                                      ->  Index Scan using field_value_pk on field_value field_value4  (cost=0.00..0.35 rows=1 width=13) (actual time=0.018..0.019 rows=1 loops=71)
                                            Index Cond: ((artifact.customer_fv)::text = (field_value4.id)::text)
                                ->  Index Scan using field_value_pk on field_value  (cost=0.00..0.35 rows=1 width=13) (actual time=0.015..0.015 rows=1 loops=71)
                                      Index Cond: ((artifact.group_fv)::text = (field_value.id)::text)
                          ->  Index Scan using field_value_pk on field_value field_value3  (cost=0.00..0.35 rows=1 width=27) (actual time=0.015..0.015 rows=1 loops=71)
                                Index Cond: ((artifact.category_fv)::text = (field_value3.id)::text)
                    ->  Index Scan using sfuser_pk on sfuser sfuser2  (cost=0.00..0.28 rows=1 width=42) (actual time=0.013..0.014 rows=1 loops=71)
                          Index Cond: ((relationship.origin_id)::text = (sfuser2.id)::text)
              ->  Index Scan using sfuser_pk on sfuser  (cost=0.00..0.29 rows=1 width=42) (actual time=0.013..0.014 rows=1 loops=71)
                    Index Cond: ((item.created_by_id)::text = (sfuser.id)::text)
  ->  Hash  (cost=4.27..4.27 rows=1 width=12) (actual time=0.047..0.047 rows=1 loops=1)
        ->  Index Scan using project_path on project  (cost=0.00..4.27 rows=1 width=12) (actual time=0.041..0.043 rows=1 loops=1)
              Index Cond: ((path)::text = 'projects.gl_coconet_performance_improveme'::text)
Total runtime: 4146.198 ms
(39 rows)
On Thu, May 28, 2009 at 6:46 PM, Anne Rosset <arosset@collab.net> wrote:
> ->  Index Scan using item_pk on item  (cost=0.00..176865.31 rows=97498 width=88)
>       (actual time=117.304..2405.060 rows=71 loops=1)
>       Filter: ((NOT is_deleted) AND ((folder_id)::text = 'tracker3641'::text))

The fact that the estimated row count differs from the actual row count by a factor of more than 1000 is likely the root cause of your problem here. You probably want to figure out why that's happening. How many rows are in that table, and what value are you using for default_statistics_target?

...Robert
> When the query plan takes a wrong path, is it possible that it is because
> statistics have not been run or updated?

Yes. If you are not using autovacuum, you need to ANALYZE regularly, or bad things will happen to you.

...Robert
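Robert's advice in concrete form (table names are from the thread; the SET STATISTICS step is an additional, hypothetical tuning step he does not prescribe here):

```sql
-- Refresh planner statistics. VACUUM ANALYZE also reclaims dead tuples,
-- which matters for a table where most rows are flagged deleted.
VACUUM ANALYZE item;
VACUUM ANALYZE artifact;

-- Optionally raise per-column statistics detail before re-analyzing
-- (hypothetical step, not suggested in the thread):
ALTER TABLE item ALTER COLUMN folder_id SET STATISTICS 1000;
ANALYZE item;
```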
Robert Haas wrote:
> The fact that the estimated row count differs from the actual row
> count by a factor of more than 1000 is likely the root cause of your
> problem here. You probably want to figure out why that's happening.
> How many rows are in that table and what value are you using for
> default_statistics_target?
>
> ...Robert

The table has 468173 rows and the value for default_statistics_target is 750.
Anne
On Fri, May 29, 2009 at 5:57 PM, Anne Rosset <arosset@collab.net> wrote:
> The table has 468173 rows and the value for default_statistics_target is
> 750.
> Anne

OK, that sounds good. If you haven't run ANALYZE or VACUUM ANALYZE recently, you should do that first and see if it fixes anything. Otherwise, maybe there's a hidden correlation between the is_deleted column and the folder_id column. We can assess that like this:

SELECT SUM(1) FROM item WHERE NOT is_deleted;
SELECT SUM(1) FROM item WHERE folder_id = 'tracker3641';
SELECT SUM(1) FROM item WHERE folder_id = 'tracker3641' AND NOT is_deleted;

Can you try that and send along the results?

Thanks,

...Robert
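The three counts Robert asks for can also be gathered in a single pass (an equivalent rewrite, not taken from the thread):

```sql
-- One scan instead of three: break the folder's rows down by is_deleted.
-- A strong skew here is exactly the hidden correlation Robert suspects.
SELECT is_deleted, COUNT(*) AS n
FROM item
WHERE folder_id = 'tracker3641'
GROUP BY is_deleted;
```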
Robert Haas wrote:
> OK, that sounds good. If you haven't run ANALYZE or VACUUM ANALYZE
> recently, you should do that first and see if it fixes anything.
> Otherwise, maybe there's a hidden correlation between the is_deleted
> column and the folder_id column.
[snip]
> Can you try that and send along the results?

Hi Robert,
we did a vacuum analyze and the results are the same.
Here are the results of the queries:

SELECT SUM(1) FROM item WHERE is_deleted = 'f';
   sum
---------
 1824592
(1 row)

SELECT SUM(1) FROM item WHERE folder_id = 'tracker3641';
  sum
--------
 122412
(1 row)

SELECT SUM(1) FROM item WHERE folder_id = 'tracker3641' AND is_deleted = 'f';
 sum
-----
  71
(1 row)

SELECT SUM(1) FROM item WHERE folder_id = 'tracker3641' AND is_deleted = 't';
  sum
--------
 122341
(1 row)

Thanks for your help,
Anne
On Mon, Jun 1, 2009 at 2:14 PM, Anne Rosset <arosset@collab.net> wrote:
>>> The table has 468173 rows and the value for default_statistics_target is
>>> 750.
>>> Anne
> Hi Robert,
> we did a vacuum analyze and the results are the same.
> Here are the results of the queries :
>
> SELECT SUM(1) FROM item WHERE is_deleted = 'f'; sum --------- 1824592 (1 row)
> SELECT SUM(1) FROM item WHERE folder_id = 'tracker3641'; sum -------- 122412 (1 row)
> SELECT SUM(1) FROM item WHERE folder_id = 'tracker3641' AND is_deleted = 'f'; sum ----- 71 (1 row)
> SELECT SUM(1) FROM item WHERE folder_id = 'tracker3641' AND is_deleted = 't'; sum -------- 122341 (1 row)

Something's not right here. If the whole table has only 468173 rows, you can't have 1.8 million rows where is_deleted = false.

...Robert
Robert Haas wrote:
> Something's not right here. If the whole table has only 468173 rows,
> you can't have 1.8 million rows where is_deleted = false.
>
> ...Robert

The item table has 2324829 rows.
The artifact table has 468173 rows.
Thanks,
Anne
> -----Original Message-----
> From: Anne Rosset
> Subject: Re: [PERFORM] Unexpected query plan results
>
> The item table has 2324829 rows
> The artifact table has 468173 rows.
> Thanks,
> Anne

I'd been thinking about the sort, but I hadn't thought yet about whether that index scan on item could be made faster. Could you post the table definition of item, including the indexes on it?

Dave
Dave Dutcher wrote:
> I'd been thinking about the sort, but I hadn't thought yet if that index
> scan on item could be made faster. Could you post the table definition of
> item including the indexes on it?
>
> Dave

Dave:

                         Table "public.item"
       Column        |           Type           | Modifiers
---------------------+--------------------------+-----------
 id                  | character varying(32)    | not null
 name                | character varying(128)   |
 title               | character varying(255)   |
 version             | integer                  | not null
 date_created        | timestamp with time zone | not null
 date_last_modified  | timestamp with time zone | not null
 is_deleted          | boolean                  | not null
 type_id             | character varying(32)    |
 folder_id           | character varying(32)    |
 planning_folder_id  | character varying(32)    |
 created_by_id       | character varying(32)    |
 last_modified_by_id | character varying(32)    |
Indexes:
    "item_pk" primary key, btree (id)
    "item_created_by_id" btree (created_by_id)
    "item_date_created" btree (date_created)
    "item_folder" btree (folder_id)
    "item_name" btree (name)
    "item_planning_folder" btree (planning_folder_id)

Thanks,
Anne
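Not raised in the thread, but given how few live rows the folder turns out to hold, a partial index is a common remedy for this shape of predicate (hypothetical index name; a sketch against the schema above):

```sql
-- Index only the non-deleted rows, so the
--   folder_id = 'tracker3641' AND NOT is_deleted
-- lookup becomes a tiny index scan regardless of how the planner
-- estimates the combined predicate.
CREATE INDEX item_folder_live ON item (folder_id) WHERE NOT is_deleted;
```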
On Mon, Jun 1, 2009 at 4:53 PM, Anne Rosset <arosset@collab.net> wrote:
>>> SELECT SUM(1) FROM item WHERE is_deleted = 'f'; sum --------- 1824592 (1 row)
>>> SELECT SUM(1) FROM item WHERE folder_id = 'tracker3641'; sum -------- 122412 (1 row)
>>> SELECT SUM(1) FROM item WHERE folder_id = 'tracker3641' AND is_deleted = 'f'; sum ----- 71 (1 row)
>>> SELECT SUM(1) FROM item WHERE folder_id = 'tracker3641' AND is_deleted = 't'; sum -------- 122341 (1 row)
> The item table has 2324829 rows

So 1824592/2324829 = 78.4% of the rows have is_deleted = false, and 122412/2324829 = 5.27% of the rows have the relevant folder_id. Therefore the planner assumes that there will be 2324829 * 78.4% * 5.27% =~ 96,000 rows that satisfy both criteria (the original explain had 97,000; there's some variability because ANALYZE only samples a random subset of pages), but the real number is 71, leading it to make a very bad decision. This is a classic "hidden correlation" problem, where two columns are correlated but the planner doesn't notice, and you get a terrible plan.

Unfortunately, I'm not aware of any real good solution to this problem. The two obvious approaches are multi-column statistics and planner hints; PostgreSQL supports neither. There are various possible hacks that aren't very satisfying, such as:

1. Redesign the application to put the deleted records in a separate table from the non-deleted records. But if the deleted records still have child records in other tables, this won't fly due to foreign key problems.

2. Inserting a clause that the optimizer doesn't understand, to fool it into thinking that the scan on the item table is much more selective than is actually the case.
I think adding (item.id + 0) = (item.id + 0) to the WHERE clause will work; the planner will brilliantly estimate the selectivity of that expression as one in 200. The problem with this is that it will likely lead to a better plan in this particular case, but for other folder_ids it may make things worse. There's also no guarantee that a future version of PostgreSQL won't be smart enough to see through this type of sophistry, though I think you're safe as far as the forthcoming 8.4 release is concerned. 3. A hack that makes me gag, but it actually seems to work... CREATE OR REPLACE FUNCTION item_squash(varchar, boolean) RETURNS varchar[] AS $$ SELECT array[$1, CASE WHEN $2 THEN 'true' ELSE 'false' END] $$ LANGUAGE sql IMMUTABLE; CREATE INDEX item_squash_idx ON item (item_squash(folder_id, is_deleted)); ...and then remove "folder_id = XXX AND is_deleted = YYY" from your query and substitute "item_squash(folder_id, is_deleted) = item_squash(XXX, YYY)". The expresson index forces the planner to gather statistics on the distribution of values for that expression, and if you then write a query using that exact same expression the planner can take advantage of it. ...Robert
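[Editor's note: the arithmetic behind the misestimate, and the effect of the 1-in-200 default selectivity exploited by hack #2, can be double-checked with a short script. The fractions below are recomputed from the raw counts quoted in this thread; this is an illustration of the planner's independence assumption, not anything PostgreSQL actually runs.]

```python
# Back-of-the-envelope check of the planner's independence assumption,
# using the row counts reported earlier in this thread.

total_rows  = 2324829   # rows in item
not_deleted = 1824592   # rows with is_deleted = false
in_folder   = 122412    # rows with folder_id = 'tracker3641'
actual      = 71        # rows actually matching both predicates

sel_deleted = not_deleted / total_rows   # ~0.784
sel_folder  = in_folder / total_rows     # ~0.0527

# With no multi-column statistics, the planner multiplies the two
# selectivities as if the columns were independent:
estimate = total_rows * sel_deleted * sel_folder
print(f"planner estimate: {estimate:.0f} rows (actual: {actual})")  # ~96,000

# Hack #2 tacks on a clause the planner can't analyze, which gets the
# default equality selectivity of 1/200, scaling the estimate down:
hacked = estimate / 200
print(f"with (item.id + 0) = (item.id + 0): {hacked:.0f} rows")     # ~480
```

The gap between ~96,000 estimated rows and 71 actual rows is what pushes the planner away from the nested-loop plan that would actually be fast here.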
2009/6/2 Robert Haas <robertmhaas@gmail.com>
On Mon, Jun 1, 2009 at 4:53 PM, Anne Rosset <arosset@collab.net> wrote:
> Unfortunately, I'm not aware of any really good solution to this
> problem. The two obvious approaches are multi-column statistics and
> planner hints; PostgreSQL supports neither.
How about a partial index (create index idx on item(folder_id) where
not is_deleted)? Won't it have the required statistics (even if it is
not used in the plan)?
On Jun 2, 2009, at 6:20 AM, Віталій Тимчишин <tivv00@gmail.com> wrote:
> How about a partial index (create index idx on item(folder_id) where
> not is_deleted)? Won't it have the required statistics (even if it is
> not used in the plan)?
I tried that; doesn't seem to work.
...Robert
Robert Haas wrote:
> Unfortunately, I'm not aware of any really good solution to this
> problem. The two obvious approaches are multi-column statistics and
> planner hints; PostgreSQL supports neither. There are various
> possible hacks that aren't very satisfying [...]

Thanks a lot, Robert. Not sure how we will tackle this, but at least
now we have an explanation. From what I read, results won't improve in
8.4. Is that correct?

Thanks,
Anne
On Tue, Jun 2, 2009 at 11:16 AM, Anne Rosset <arosset@collab.net> wrote:
> Thanks a lot, Robert. Not sure how we will tackle this, but at least
> now we have an explanation. From what I read, results won't improve
> in 8.4. Is that correct?

Yes, that's correct. Good luck...

...Robert