Thread: Performance Issue (Not using Index when joining two tables).
Hi,
Good Morning!
Postgres Version : 11.6 (AWS Native Postgres/AWS Aurora tried on both flavours).
When i'm joining two tables the primary index is not being used. While is use in clause with values then the index is being used. I have reindexed all the tables, run the auto vaccum as well.
pgwfc01q=> select count(*) from chr_simple_val;
count
-------
13158
(1 row)
pgwfc01q=> select count(*) from chr_emp_position;
count
-------
228
(1 row)
The primary key for the table chr_Simple_val contains OID. Still not using the index.
I'm sharing the explain plan over here..
pgwfc01q=> explain analyze select cep.HOME_DEPT_OID,ctc.oid,ctc. category,ctc.code from chr_emp_position cep inner join chr_Simple_Val ctc on ctc.oid=cep.HOME_DEPT_OID;
QUERY P
LAN
------------------------------ ------------------------------ ------------------------------ --------------------------
------------------------------ ------------------------------ ------------------------------ ----------------------
Hash Join (cost=49299.91..51848.83 rows=651 width=42) (actual time=3512.692..3797.583 rows=228 loops=1)
Hash Cond: ((cep.home_dept_oid)::text = (ctc.oid)::text)
-> Seq Scan on chr_emp_position cep (cost=0.00..2437.77 rows=436 width=11) (actual time=44.713..329.435 rows=22
8 loops=1)
Filter: ((("current_user"())::text <> ANY ('{wfnadmin,skipvpd}'::text[]) ) AND f_sel_policy_all(vpd_key, 'CH
R_EMP_POSITION'::character varying) AND f_sel_policy_prod_locale((cep. *)::character varying, prod_locale_code))
Rows Removed by Filter: 3695
-> Hash (cost=49176.40..49176.40 rows=9881 width=31) (actual time=3467.907..3467.908 rows=13158 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 1031kB
-> Seq Scan on chr_simple_val ctc (cost=0.00..49176.40 rows=9881 width=31) (actual time=2.191..3460.929 r
ows=13158 loops=1)
Filter: ((("current_user"())::text <> ANY ('{wfnadmin,skipvpd}'::text[]) ) AND f_sel_policy_ty_static(
vpd_key) AND f_sel_policy_prod_locale((ctc. *)::character varying, prod_locale_code))
Rows Removed by Filter: 75771
Planning Time: 0.297 ms
Execution Time: 3797.768 ms
(12 rows)
QUERY P
LAN
------------------------------
------------------------------
Hash Join (cost=49299.91..51848.83 rows=651 width=42) (actual time=3512.692..3797.583 rows=228 loops=1)
Hash Cond: ((cep.home_dept_oid)::text = (ctc.oid)::text)
-> Seq Scan on chr_emp_position cep (cost=0.00..2437.77 rows=436 width=11) (actual time=44.713..329.435 rows=22
8 loops=1)
Filter: ((("current_user"())::text <> ANY ('{wfnadmin,skipvpd}'::text[])
R_EMP_POSITION'::character varying) AND f_sel_policy_prod_locale((cep.
Rows Removed by Filter: 3695
-> Hash (cost=49176.40..49176.40 rows=9881 width=31) (actual time=3467.907..3467.908 rows=13158 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 1031kB
-> Seq Scan on chr_simple_val ctc (cost=0.00..49176.40 rows=9881 width=31) (actual time=2.191..3460.929 r
ows=13158 loops=1)
Filter: ((("current_user"())::text <> ANY ('{wfnadmin,skipvpd}'::text[])
vpd_key) AND f_sel_policy_prod_locale((ctc.
Rows Removed by Filter: 75771
Planning Time: 0.297 ms
Execution Time: 3797.768 ms
(12 rows)
Thank you..
Regards,
Ramesh G
On Sun, Sep 13, 2020 at 02:58:15PM +0000, Gopisetty, Ramesh wrote: >Hi, > >Good Morning! > >Postgres Version : 11.6 (AWS Native Postgres/AWS Aurora tried on both flavours). > >When i'm joining two tables the primary index is not being used. While is use in clause with values then the index isbeing used. I have reindexed all the tables, run the auto vaccum as well. > > >pgwfc01q=> select count(*) from chr_simple_val; > count >------- > 13158 >(1 row) > >pgwfc01q=> select count(*) from chr_emp_position; > count >------- > 228 >(1 row) > > >The primary key for the table chr_Simple_val contains OID. Still not using the index. > >I'm sharing the explain plan over here.. > >pgwfc01q=> explain analyze select cep.HOME_DEPT_OID,ctc.oid,ctc.category,ctc.code from chr_emp_position cep inner join chr_Simple_Valctc on ctc.oid=cep.HOME_DEPT_OID; > QUERY P >LAN >-------------------------------------------------------------------------------------------------------------------- >---------------------------------------------------------------------------------------------------------------- > Hash Join (cost=49299.91..51848.83 rows=651 width=42) (actual time=3512.692..3797.583 rows=228 loops=1) > Hash Cond: ((cep.home_dept_oid)::text = (ctc.oid)::text) > -> Seq Scan on chr_emp_position cep (cost=0.00..2437.77 rows=436 width=11) (actual time=44.713..329.435 rows=22 >8 loops=1) > Filter: ((("current_user"())::text <> ANY ('{wfnadmin,skipvpd}'::text[])) AND f_sel_policy_all(vpd_key, 'CH >R_EMP_POSITION'::character varying) AND f_sel_policy_prod_locale((cep.*)::character varying, prod_locale_code)) > Rows Removed by Filter: 3695 > -> Hash (cost=49176.40..49176.40 rows=9881 width=31) (actual time=3467.907..3467.908 rows=13158 loops=1) > Buckets: 16384 Batches: 1 Memory Usage: 1031kB > -> Seq Scan on chr_simple_val ctc (cost=0.00..49176.40 rows=9881 width=31) (actual time=2.191..3460.929 r >ows=13158 loops=1) > Filter: ((("current_user"())::text <> ANY ('{wfnadmin,skipvpd}'::text[])) AND f_sel_policy_ty_static( >vpd_key) AND f_sel_policy_prod_locale((ctc.*)::character varying, prod_locale_code)) > Rows Removed by Filter: 75771 > Planning Time: 0.297 ms > Execution Time: 3797.768 ms >(12 rows) > Most of the time (3460ms) is spent in the sequential scan on chr_simple_val, and the seqscan on chr_emp_position is taking ~330ms). Combined that's 3790ms out of 3797ms, so the join is pretty much irrelevant. Either the seqscans are causing a lot of I/O, or maybe the f_sel_* functions in the filter are expensive. Judging by how few rows are in the tables (not sure how large the tables are), I'd guess it's the latter ... Hard to say without knowing what the functions do etc. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > Most of the time (3460ms) is spent in the sequential scan on > chr_simple_val, and the seqscan on chr_emp_position is taking ~330ms). > Combined that's 3790ms out of 3797ms, so the join is pretty much > irrelevant. > Either the seqscans are causing a lot of I/O, or maybe the f_sel_* > functions in the filter are expensive. Judging by how few rows are in > the tables (not sure how large the tables are), I'd guess it's the > latter ... Hard to say without knowing what the functions do etc. I think the OP is wishing that the filter functions for the larger table would be postponed till after the join condition is applied. I'm a little dubious that that's going to save anything meaningful; but maybe increasing the cost attributed to those functions would persuade the planner to try it that way. First though, does forcing a nestloop plan (turn off enable_hashjoin, and enable_mergejoin too if needed) produce the shape of plan you want? And if so, is it actually faster? Only if those things are true is it going to be worth messing with costing parameters. regards, tom lane
Filter: ((("current_user"())::text <> ANY ('{wfnadmin,skipvpd}'::text[])) AND f_sel_policy_all(vpd_key, 'CH
R_EMP_POSITION'::character varying) AND f_sel_policy_prod_locale((cep.*)::character varying, prod_locale_code))
R_EMP_POSITION'::character varying) AND f_sel_policy_prod_locale((cep.*)::character varying, prod_locale_code))
This looks like some stuff for row level security perhaps. My understanding is limited, but perhaps those restrictions are influencing the planners access or reliance on stats.
Also, it would seem like you need the entire table since you don't have an explicit where clause. Why would scanning an index and then also visiting every row in the table be faster than just going directly to the table?
Hi,
Thanks for looking into the problem/issue. Let me give more details about the functions... Yes, we are using row level security.
Actually, we have converted an Oracle VPD database (Virtual Private Databases - In short row level security) into postgresql. We have several functions available to filter or to provide the row level security.
f_sel_policy_ty_static; f_sel_policy_all filters the tables where the vpd_key is provided initially.
f_sel_policy_prod_locale filters the table where the prod_locale_code is provided initially.
Before running any queries in the database, we will set the context settings/row level security based on the function below..
CALLvpd_filter(vpd_key=>'XXXX',mod_user=>'XXXXX',user_locale=>'en_XX',prod_locale=>'XX');
This will set the context variables and provide row level security. All the tables in our database consists of vpd_key which is a filter for to run the queries for a given client.
The tables mentioned below chr_emp_position and chr_simple_val consists of many rows and the functions filter them based on the vpd_key and prod_user_locale_code.
Once after providing the row level security we executed the query joining the tables.. And where the index is not being utlitized/ the query runs slower i.e., greater than 8seconds.
The normal structure of the tables will be like this..
chr_emp_position --- has columns vpd_key,oid, home_Dept_oid, eff_date, start_Date,.....etc., (almost having 200+ columns). -- primary key is vpd_key and oid.
chr_simple_Val --- has columns vpd_key, oid , category, description..et.c, (almost has around 70 columns). (primary key is vpd_key and oid)
The rows mentioned below are after setting the row level security on those tables ..
i.e, after executing the function
CALL vpd_filter(spv_vpd_key=>'XXXX',spv_mod_usr=>'XXXXX',spv_user_locale=>'en_XX',spv_prod_locale=>'XX');
pgwfc01q=> select count(*) from chr_simple_val;
count
-------
13158
(1 row)
pgwfc01q=> select count(*) from chr_emp_position;
count
-------
228
(1 row)
The primary key for the table chr_Simple_val contains OID. Still not using the index.
I'm sharing the explain plan over here..
pgwfc01q=> explain analyze select cep.HOME_DEPT_OID,ctc.oid,ctc.category,ctc.code from chr_emp_position cep inner join chr_Simple_Val ctc on ctc.oid=cep.HOME_DEPT_OID;
QUERY P
LAN
--------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------
Hash Join (cost=49299.91..51848.83 rows=651 width=42) (actual time=3512.692..3797.583 rows=228 loops=1)
Hash Cond: ((cep.home_dept_oid)::text = (ctc.oid)::text)
-> Seq Scan on chr_emp_position cep (cost=0.00..2437.77 rows=436 width=11) (actual time=44.713..329.435 rows=22
8 loops=1)
Filter: ((("current_user"())::text <> ANY ('{wfnadmin,skipvpd}'::text[])) AND f_sel_policy_all(vpd_key, 'CH
R_EMP_POSITION'::character varying) AND f_sel_policy_prod_locale((cep.*)::character varying, prod_locale_code))
Rows Removed by Filter: 3695
-> Hash (cost=49176.40..49176.40 rows=9881 width=31) (actual time=3467.907..3467.908 rows=13158 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 1031kB
-> Seq Scan on chr_simple_val ctc (cost=0.00..49176.40 rows=9881 width=31) (actual time=2.191..3460.929 r
ows=13158 loops=1)
Filter: ((("current_user"())::text <> ANY ('{wfnadmin,skipvpd}'::text[])) AND f_sel_policy_ty_static(
vpd_key) AND f_sel_policy_prod_locale((ctc.*)::character varying, prod_locale_code))
-------
13158
(1 row)
pgwfc01q=> select count(*) from chr_emp_position;
count
-------
228
(1 row)
The primary key for the table chr_Simple_val contains OID. Still not using the index.
I'm sharing the explain plan over here..
pgwfc01q=> explain analyze select cep.HOME_DEPT_OID,ctc.oid,ctc.category,ctc.code from chr_emp_position cep inner join chr_Simple_Val ctc on ctc.oid=cep.HOME_DEPT_OID;
QUERY P
LAN
--------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------
Hash Join (cost=49299.91..51848.83 rows=651 width=42) (actual time=3512.692..3797.583 rows=228 loops=1)
Hash Cond: ((cep.home_dept_oid)::text = (ctc.oid)::text)
-> Seq Scan on chr_emp_position cep (cost=0.00..2437.77 rows=436 width=11) (actual time=44.713..329.435 rows=22
8 loops=1)
Filter: ((("current_user"())::text <> ANY ('{wfnadmin,skipvpd}'::text[])) AND f_sel_policy_all(vpd_key, 'CH
R_EMP_POSITION'::character varying) AND f_sel_policy_prod_locale((cep.*)::character varying, prod_locale_code))
Rows Removed by Filter: 3695
-> Hash (cost=49176.40..49176.40 rows=9881 width=31) (actual time=3467.907..3467.908 rows=13158 loops=1)
Buckets: 16384 Batches: 1 Memory Usage: 1031kB
-> Seq Scan on chr_simple_val ctc (cost=0.00..49176.40 rows=9881 width=31) (actual time=2.191..3460.929 r
ows=13158 loops=1)
Filter: ((("current_user"())::text <> ANY ('{wfnadmin,skipvpd}'::text[])) AND f_sel_policy_ty_static(
vpd_key) AND f_sel_policy_prod_locale((ctc.*)::character varying, prod_locale_code))
Planning Time: 0.297 ms
Execution Time: 3797.768 ms
(12 rows)
Execution Time: 3797.768 ms
(12 rows)
If i don't set the context and run as a root user the explain plan is as below.. And it executes in milliseconds even without the index having the full table scan.
- I'm not sure if my filters are time consuming. Most of the queries works except few. We hadn't seen the problem in Oracle. I'm not comparing between Oracle and Postgres here. I see both are two different flavors. but trying to get my query runs less than 8seconds.
- I'm not sure why the index on chr_simple_val is not being used here vpd_key,oid. I'm confident if it uses index, it will/might be faster as it is looking for 2 or 3 home departments based on oid.
- I'm not sure why even having the full scan it worked for the root user.
- I'm not sure why the bitmap heap scan was not followed after setting the row level security. How to make the bitmap heap scan on chr_emp_position as i observed here.
fyi.,
Running as a root user.
pgwfc01q=> explain analyze select cep.HOME_DEPT_OID,ctc.oid,ctc.category,ctc.code from xxxx.chr_emp_position cep inner join wfnsch001.chr_Simple_Val ctc on ctc.oid=cep.HOME_DEPT_OID and (ctc.vpd_key='COMMON' or ctc.vpd_key=cep.vpd_key) and cep.vpd_key='xxxxxxxxxx';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
-------------------
Hash Join (cost=5503.95..6742.82 rows=453 width=42) (actual time=131.241..154.201 rows=228 loops=1)
Hash Cond: ((cep.home_dept_oid)::text = (ctc.oid)::text)
Join Filter: (((ctc.vpd_key)::text = 'NG_COMMON'::text) OR ((ctc.vpd_key)::text = (cep.vpd_key)::text))
Rows Removed by Join Filter: 19770
-> Bitmap Heap Scan on chr_emp_position cep (cost=10.05..362.25 rows=228 width=28) (actual time=0.056..0.253 ro
ws=228 loops=1)
Recheck Cond: ((vpd_key)::text = 'xxxxxxxxxx'::text)
Heap Blocks: exact=26
-> Bitmap Index Scan on uq1_chr_emp_position (cost=0.00..9.99 rows=228 width=0) (actual time=0.041..0.041
rows=228 loops=1)
Index Cond: ((vpd_key)::text = 'xxxxxxxxxx'::text)
-> Hash (cost=3600.29..3600.29 rows=88929 width=48) (actual time=130.826..130.826 rows=88929 loops=1)
Buckets: 65536 (originally 65536) Batches: 4 (originally 2) Memory Usage: 3585kB
-> Seq Scan on chr_simple_val ctc (cost=0.00..3600.29 rows=88929 width=48) (actual time=0.005..33.356 row
s=88929 loops=1)
Planning Time: 3.977 ms
Execution Time: 154.535 ms
(14 rows)
pgwfc01q=> select count(*) from wfnsch001.chr_emp_position;
count
-------
3923
(1 row)
pgwfc01q=> select count(*) from wfnsch001.chr_Simple_Val;
count
-------
88929
(1 row)
I'm not sure if i'm thinking in the right way or not. (As of safety purpose, i have rebuilded indexes, analyzed, did vaccum on those tables). Sorry for the lengthy email and i'm trying to explain my best on this.
Thank you.
Regards,
Ramesh G
From: Michael Lewis <mlewis@entrata.com>
Sent: Sunday, September 13, 2020 10:51 PM
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Tomas Vondra <tomas.vondra@2ndquadrant.com>; Gopisetty, Ramesh <rameshg2@illinois.edu>; pgsql-performance@lists.postgresql.org <pgsql-performance@lists.postgresql.org>
Subject: Re: Performance Issue (Not using Index when joining two tables).
Sent: Sunday, September 13, 2020 10:51 PM
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Tomas Vondra <tomas.vondra@2ndquadrant.com>; Gopisetty, Ramesh <rameshg2@illinois.edu>; pgsql-performance@lists.postgresql.org <pgsql-performance@lists.postgresql.org>
Subject: Re: Performance Issue (Not using Index when joining two tables).
Filter: ((("current_user"())::text <> ANY ('{wfnadmin,skipvpd}'::text[])) AND f_sel_policy_all(vpd_key, 'CH
R_EMP_POSITION'::character varying) AND f_sel_policy_prod_locale((cep.*)::character varying, prod_locale_code))
R_EMP_POSITION'::character varying) AND f_sel_policy_prod_locale((cep.*)::character varying, prod_locale_code))
This looks like some stuff for row level security perhaps. My understanding is limited, but perhaps those restrictions are influencing the planners access or reliance on stats.
Also, it would seem like you need the entire table since you don't have an explicit where clause. Why would scanning an index and then also visiting every row in the table be faster than just going directly to the table?
"Gopisetty, Ramesh" <rameshg2@illinois.edu> writes: > Thanks for looking into the problem/issue. Let me give more details about the functions... Yes, we are using rowlevel security. Hm. If those expensive filter functions are being injected by RLS on the target tables (rather than by something like an intermediate view), then the planner is constrained to ensure that they execute before any query conditions that it doesn't know to be "leakproof". So unless your join operator is leakproof, the shape of plan that you're hoping for will not be allowed. Since you haven't mentioned anything about data types, it's hard to know whether that's the issue. (The hash condition seems to be texteq, which is leakproof, but there are also casts involved which might not be.) The two queries you provided explain plans for are not the same, so comparing their plans is a fairly pointless activity. *Of course* the query runs faster when you restrict it to fetch fewer rows. The original query has no restriction clause that corresponds to the clauses being used for index conditions in the second query, so it's hardly a surprise that you do not get that plan, RLS or no RLS. regards, tom lane