Thread: BUG #13869: Right Join query that never ends
The following bug has been logged on the website: Bug reference: 13869 Logged by: Rumato Email address: zx-master@bigmir.net PostgreSQL version: 9.5.0 Operating system: CentOS 7 Description: Hi. On my old installation with pg 8.4 this query executes about 3 seconds, but on 9.5 it never ends and cpu usage rises up to 100% by postgres process. Problem is here near RIGHT OUTER JOIN Query works with LEFT OUTER JOIN and other inner joins If I will create 2 views from "AS info" (about 2k rows) and "AS all_res" (about 200 rows) tables it will not fix anything. BTW separate "AS info" query executes less than in 3 sec, "AS all_res" less than 1 sec. I have solved this problem by using WITH info, all_res queries. Now it works pretty fast. THE QUERY SELECT all_res.res_id reseller_id, 'keys' as descrip, count(info.company_name) as total, all_res.reseller_nm as reseller_name, all_res.email as reseller_email FROM (SELECT k.id AS key_id, (case when uidtype.keytype_id is null then null else k.uid end) AS key_guid, k.type_id as type_id, k.binding_ip_id AS binding_ip_id, ko_cl.parents AS ko_cl_parents, bc.id AS bc_id, ko_key.name AS key_number, ko_type.name AS keytype, kt.class AS keytype_class, ko_r.name AS reseller_name, ko_r.id AS reseller_id, bc.company_name AS company_name, to_char(ko_key.create_date, 'yyyy MM dd HH24 mi') AS create_date, k.terminated AS key_terminated, ku.login AS login, sld.value AS sld, exp_date.value AS expiration_date, kus.email FROM key k INNER JOIN ka_object ko_key ON (k.id = ko_key.id) INNER JOIN ka_object ko_type ON (k.type_id = ko_type.id) INNER JOIN key_type kt ON (k.type_id = kt.id) LEFT OUTER JOIN keytype_uidtype_reference uidtype ON uidtype.keytype_id=kt.id INNER JOIN ka_object ko_c ON (ko_key.parent_id = ko_c.id) INNER JOIN container keys_c ON (keys_c.id = ko_c.id and store_type='Key') INNER JOIN ka_object ko_cl ON (ko_c.parent_id = ko_cl.id) INNER JOIN client client ON (ko_cl.id = client.id) INNER JOIN ka_object ko_clc ON (ko_cl.parent_id = ko_clc.id) INNER JOIN base_client bc ON (client.id = bc.id) INNER JOIN ka_object ko_r ON (ko_clc.parent_id = ko_r.id) INNER JOIN ka_user ku ON (bc.id = ku.id) INNER JOIN ka_user kus ON (ko_r.id = kus.id) LEFT OUTER JOIN (SELECT kpr.keytype_id, kp.value FROM key_property_reference kpr INNER JOIN key_property kp ON (kpr.property_id = kp.id) WHERE kp.name = 'server_license_details' AND kp.value = 'billing' ) AS sld ON (sld.keytype_id = ko_type.id) LEFT OUTER JOIN key_value exp_date ON (k.id = exp_date.key_id AND exp_date.key_value_key_id=55 ) WHERE ko_key.system='f' AND upper(bc.company_name) not like '%TEST%' AND k.terminated='f' ) AS info RIGHT OUTER JOIN (SELECT resellers.id as res_id, res.email as email, resellers.name as reseller_nm FROM ka_object resellers JOIN ka_user res ON (resellers.id = res.id) WHERE resellers.parent_id=33 AND type = 'com.Reseller' AND system='f' AND upper(resellers.name) NOT LIKE '%TEST%' ) AS all_res ON all_res.res_id = info.reseller_id GROUP BY info.reseller_id, reseller_nm, all_res.email, all_res.res_id
On Fri, Jan 15, 2016 at 8:16 AM, <zx-master@bigmir.net> wrote: > The following bug has been logged on the website: > > Bug reference: 13869 > Logged by: Rumato > Email address: zx-master@bigmir.net > PostgreSQL version: 9.5.0 > Operating system: CentOS 7 > Description: > > Hi. > On my old installation with pg 8.4 this query executes about 3 seconds, b= ut > on 9.5 it never ends and cpu usage rises up to 100% by postgres process. > > Problem is here near RIGHT OUTER JOIN > Query works with LEFT OUTER JOIN and other inner joins > > If I will create 2 views from "AS info" (about 2k rows) and "AS all_res" > (about 200 rows) tables it will not fix anything. BTW separate "AS info" > query executes less than in 3 sec, "AS all_res" less than 1 sec. > > I have solved this problem by using WITH info, all_res queries. Now it > works > pretty fast. > > THE QUERY > SELECT > all_res.res_id reseller_id, > 'keys' as descrip, > count(info.company_name) as total, > all_res.reseller_nm as reseller_name, > all_res.email as reseller_email > FROM (SELECT > k.id AS key_id, > (case when uidtype.keytype_id is null then null else k.uid end) A= S > key_guid, > k.type_id as type_id, > k.binding_ip_id AS binding_ip_id, > ko_cl.parents AS ko_cl_parents, > bc.id AS bc_id, > ko_key.name AS key_number, > ko_type.name AS keytype, > kt.class AS keytype_class, > ko_r.name AS reseller_name, > ko_r.id AS reseller_id, > bc.company_name AS company_name, > to_char(ko_key.create_date, 'yyyy MM dd HH24 mi') AS create_date, > k.terminated AS key_terminated, > ku.login AS login, > sld.value AS sld, > exp_date.value AS expiration_date, > kus.email > FROM key k > INNER JOIN ka_object ko_key ON (k.id =3D ko_key.id) > INNER JOIN ka_object ko_type ON (k.type_id =3D ko_type.id) > INNER JOIN key_type kt ON (k.type_id =3D kt.id) > LEFT OUTER JOIN keytype_uidtype_reference uidtype ON > uidtype.keytype_id=3Dkt.id > INNER JOIN ka_object ko_c ON (ko_key.parent_id =3D ko_c.id) > INNER JOIN container keys_c ON (keys_c.id =3D ko_c.id and > store_type=3D'Key') > INNER JOIN ka_object ko_cl ON (ko_c.parent_id =3D ko_cl.id) > INNER JOIN client client ON (ko_cl.id =3D client.id) > INNER JOIN ka_object ko_clc ON (ko_cl.parent_id =3D ko_clc.id) > INNER JOIN base_client bc ON (client.id =3D bc.id) > INNER JOIN ka_object ko_r ON (ko_clc.parent_id =3D ko_r.id) > INNER JOIN ka_user ku ON (bc.id =3D ku.id) > INNER JOIN ka_user kus ON (ko_r.id =3D kus.id) > LEFT OUTER JOIN (SELECT kpr.keytype_id, kp.value > FROM key_property_reference kpr > INNER JOIN key_property kp ON (kpr.property_id = =3D > kp.id) > WHERE kp.name =3D 'server_license_details' AND > kp.value =3D 'billing' > ) AS sld ON (sld.keytype_id =3D ko_type.id) > LEFT OUTER JOIN key_value exp_date ON (k.id =3D exp_date.key_id A= ND > exp_date.key_value_key_id=3D55 ) > > WHERE > ko_key.system=3D'f' > AND upper(bc.company_name) not like '%TEST%' > AND k.terminated=3D'f' > > ) AS info > > RIGHT OUTER JOIN (SELECT resellers.id as res_id, res.email as email, > resellers.name as reseller_nm > FROM ka_object resellers > JOIN ka_user res ON (resellers.id =3D res.id) > WHERE resellers.parent_id=3D33 > AND type =3D 'com.Reseller' > AND system=3D'f' > AND upper(resellers.name) NOT LIKE '%TEST%' > ) AS all_res ON all_res.res_id =3D info.reseller_id > > GROUP BY info.reseller_id, reseller_nm, all_res.email, all_res.res_id =E2=80=8BSo, typically the first thing to do is run "EXPLAIN" (or EXPLAIN A= NALYZE - though if the query never finishes that won't help). Since you haven't provided a self-contained test case you are going to need to do that for us= . David J. =E2=80=8B
Greetings. I can do more to investigate this behavior only if postgresql will consider my claim as bug that has to be fixed in firstorder. Ideally I should give my DB schema and dump or VM wich reproduces this bug, but I can not give away our internal data. Allother options I can not describe at public mailing list. > So, typically the first thing to do is run "EXPLAIN" Explain Analyze do not return any result as select query. So this is just an explain output HashAggregate (cost=962.11..962.12 rows=1 width=70) Group Key: ko_r.id, resellers.name, res.email, resellers.id -> Nested Loop (cost=23.76..962.09 rows=1 width=70) -> Nested Loop Left Join (cost=23.49..957.79 rows=1 width=49) -> Bitmap Heap Scan on ka_object resellers (cost=3.59..212.91 rows=1 width=25) Recheck Cond: (parent_id = 4) Filter: (((type)::text = 'com.plesk.ka.sys.Reseller'::text) AND (system = 'f'::bpchar) AND (upper((name)::text)!~~ '%TEST%'::text)) -> Bitmap Index Scan on ka_object_parent_id_idx (cost=0.00..3.59 rows=174 width=0) Index Cond: (parent_id = 4) -> Nested Loop Left Join (cost=19.90..744.87 rows=1 width=24) -> Nested Loop Left Join (cost=19.61..744.46 rows=1 width=32) -> Nested Loop (cost=19.32..743.97 rows=1 width=40) -> Nested Loop (cost=19.05..743.67 rows=1 width=72) -> Nested Loop (cost=18.77..743.36 rows=1 width=48) Join Filter: (ko_r.id = ko_clc.parent_id) -> Nested Loop (cost=18.49..742.99 rows=1 width=64) -> Nested Loop (cost=18.21..742.69 rows=1 width=56) -> Nested Loop (cost=0.56..4.61 rows=1 width=16) -> Index Only Scan using ka_object_pkey on ka_object ko_r (cost=0.29..4.30 rows=1 width=8) Index Cond: (id = resellers.id) -> Index Only Scan using ka_user_pkey on ka_user kus (cost=0.28..0.29rows=1 width=8) Index Cond: (id = ko_r.id) -> Nested Loop (cost=17.65..737.81 rows=27 width=40) Join Filter: (ko_key.parent_id = keys_c.id) -> Merge Join (cost=17.37..711.31 rows=85 width=56) Merge Cond: (ko_type.id = k.type_id) Join Filter: (ko_key.id = k.id) -> Nested Loop (cost=0.99..1543.85 rows=768 width=64) -> Nested Loop (cost=0.42..40.77 rows=8 width=16) -> Index Only Scan using key_type_pkeyon key_type kt (cost=0.13..6.25 rows =8 width=8) -> Index Only Scan using ka_object_pkeyon ka_object ko_type (cost=0.29..4. 30 rows=1 width=8) Index Cond: (id = kt.id) -> Materialize (cost=0.57..1493.72 rows=96 width=48) -> Nested Loop (cost=0.57..1493.24 rows=96width=48) -> Nested Loop (cost=0.29..1458.21rows=96 width=32) -> Seq Scan on ka_object ko_key (cost=0.00..1149.97 rows=96 wid th=16) Filter: ((system = 'f'::bpchar)AND ((create_date)::date >= date_trunc('month'::text, (now() + '-1 mons'::interval))) AND ((create_date)::date < date_trunc('month'::text, (now() +'00:00:00'::interval)))) -> Index Scan using ka_object_pkeyon ka_object ko_c (cost=0.29 ..3.20 rows=1 width=16) Index Cond: (id = ko_key.parent_id) -> Index Scan using ka_object_pkeyon ka_object ko_cl (cost=0.29..0.3 5 rows=1 width=16) Index Cond: (id = ko_c.parent_id) -> Index Scan using key_type_id_idx on key k (cost=0.29..602.90rows=17049 width=16) -> Index Scan using container_pkey on container keys_c (cost=0.28..0.30rows=1 width=8) Index Cond: (id = ko_c.id) Filter: ((store_type)::text = 'Key'::text) -> Index Only Scan using client_pkey on client (cost=0.27..0.29 rows=1width=8) Index Cond: (id = ko_c.parent_id) -> Index Scan using ka_object_pkey on ka_object ko_clc (cost=0.29..0.35 rows=1width=16) Index Cond: (id = ko_cl.parent_id) -> Index Scan using base_client_pkey on base_client bc (cost=0.28..0.30 rows=1 width=24) Index Cond: (id = ko_c.parent_id) Filter: (upper((company_name)::text) !~~ '%TEST%'::text) -> Index Only Scan using ka_user_pkey on ka_user ku (cost=0.28..0.29 rows=1 width=8) Index Cond: (id = ko_c.parent_id) -> Nested Loop (cost=0.28..0.48 rows=1 width=8) -> Index Only Scan using key_property_reference_keytype_id_property_id_unique_idx on key_property_referencekpr (cost=0.14. .0.16 rows=1 width=16) Index Cond: (keytype_id = ko_type.id) -> Index Scan using key_property_pkey on key_property kp (cost=0.14..0.31 rows=1 width=8) Index Cond: (id = kpr.property_id) Filter: (((name)::text = 'server_license_details'::text) AND ((value)::text = 'billing'::text)) -> Index Only Scan using key_value_key_id_and_kvk_id_idx on key_value exp_date (cost=0.29..0.39 rows=1width=8) Index Cond: ((key_id = k.id) AND (key_value_key_id = 63)) -> Index Scan using ka_user_pkey on ka_user res (cost=0.28..4.29 rows=1 width=29) Index Cond: (id = resellers.id)
On Sat, Jan 16, 2016 at 4:51 AM, Master ZX <zx-master@bigmir.net> wrote: > I can do more to investigate this behavior only if postgresql will consider my claim as bug that has to be fixed in firstorder. > Ideally I should give my DB schema and dump or VM wich reproduces this bug, but I can not give away our internal data.All other options I can not describe at public mailing list. Why don't you use a tool like Linux's perf to determine where time is spent?: https://wiki.postgresql.org/wiki/Profiling_with_perf A simple "perf top" may be all that it takes to give us a better idea of where time is spent. -- Peter Geoghegan
19.01.2016 02:47, Peter Geoghegan <pg@heroku.com> >On Sat, Jan 16, 2016 at 4:51 AM, Master ZX <zx-master@bigmir.net> wrote: > > I can do more to investigate this behavior only if postgresql will consider my claim as bug that has to be fixed in firstorder. > > Ideally I should give my DB schema and dump or VM wich reproduces this bug, but I can not give away our internal data.All other options I can not describe at public mailing list. > > Why don't you use a tool like Linux's perf to determine where time is spent?: > > https://wiki.postgresql.org/wiki/Profiling_with_perf > > A simple "perf top" may be all that it takes to give us a better idea > of where time is spent. > > -- > Peter Geoghegan This is forwarded message from 21 of Jan (looks like previous message lost) I'm not really experienced in bug reporting and perf usage. I've recorded some data via perf record for about 40 sec. and this is what I was able to get via perf report -g --children - 8.98% 8.93% postgres postgres [.] slot_getattr ▒ slot_getattr ▒ - 5.79% 5.75% postgres postgres [.] ExecMergeJoin ▒ ExecMergeJoin ▒ - 3.73% 3.70% postgres postgres [.] ExecQual ▒ ExecQual ▒ - 3.68% 3.65% postgres postgres [.] ExecStoreMinimalTuple ▒ ExecStoreMinimalTuple ▒ - 2.85% 0.00% postgres [unknown] [.] 0x0000000000000001 ▒ 0x1 ▒ - 2.80% 2.78% postgres postgres [.] tuplesort_gettupleslot ▒ tuplesort_gettupleslot ▒ - 2.78% 0.00% postgres postgres [.] 0xffffffffffdb3602 ▒ 0x1b3602 ▒ - 2.76% 2.76% postgres postgres [.] 0x00000000001b3602 ▒ 0x1b3602 ▒ - 2.64% 2.62% postgres postgres [.] MemoryContextReset ▒ MemoryContextReset ▒ - 2.30% 2.28% postgres postgres [.] ExecProcNode ▒ ExecProcNode ▒ - 2.26% 2.25% postgres postgres [.] pgstat_end_function_usage ▒ pgstat_end_function_usage ▒ - 2.24% 0.00% postgres postgres [.] 0xffffffffffdb35f0 ▒ 0x1b35f0 ▒ - 2.23% 2.23% postgres postgres [.] 0x00000000001b35f0 ▒ 0x1b35f0 ▒ - 2.13% 2.12% postgres postgres [.] ExecSort ▒ ExecSort ▒ - 1.87% 1.86% postgres postgres [.] check_stack_depth ▒ + check_stack_depth ▒ - 1.47% 1.46% postgres postgres [.] pgstat_init_function_usage ▒ pgstat_init_function_usage I have about 100Mb of perf.data and as far as I can understand perf report will be most informative on machine where it wascollected. So in a case if you need some other data from perf.dump just let me know.
In new release 9.5.1 this bug is no longer exists. This issue should be closed. 19.01.2016 02:47, Peter Geoghegan <pg@heroku.com> >On Sat, Jan 16, 2016 at 4:51 AM, Master ZX <zx-master@bigmir.net> wrote: > > I can do more to investigate this behavior only if postgresql will consider my claim as bug that has to be fixed in firstorder. > > Ideally I should give my DB schema and dump or VM wich reproduces this bug, but I can not give away our internal data.All other options I can not describe at public mailing list. > > Why don't you use a tool like Linux's perf to determine where time is spent?: > > https://wiki.postgresql.org/wiki/Profiling_with_perf > > A simple "perf top" may be all that it takes to give us a better idea > of where time is spent. > > -- > Peter Geoghegan