Re: How to improve the performance of my SQL query? - Mailing list pgsql-general
From | Charly |
---|---|
Subject | Re: How to improve the performance of my SQL query? |
Date | |
Msg-id | CAD-uDDG1ruPdA45X+0Q+53YUt8R+TBLFBthCsqKs-+it=Fb0aA@mail.gmail.com Whole thread Raw |
In response to | Re: How to improve the performance of my SQL query? (gzh <gzhcoder@126.com>) |
Responses |
Re: How to improve the performance of my SQL query?
|
List | pgsql-general |
Hi "gzh",
Based on the info you provided I'm assuming you are trying to use the TBL_SHA primary key to do an index-only scan as in you mentioned above you have:
> TBL_SHA
> ms_cd character(6) NOT NULL -- PRIMARY KEY
> et_cd character(8)
> etrys character(8)
> ms_cd character(6) NOT NULL -- PRIMARY KEY
> et_cd character(8)
> etrys character(8)
Assuming a composed index here by the 3 columns.
> TBL_INF
> ms_cd character(6) NOT NULL -- PRIMARY KEY
> ry_cd character(8) NOT NULL -- PRIMARY KEY
Here it's more clear that there is a composed index based on those 2 columns.
The problem is none of the explains you sent match with the description above. The last one when you forced the optimizer to go with index scan (SET enable_seqscan TO off;) the chosen index seems to be one composed by ms_cd and cd_ate (following your standard of 2 characters column name). There may have a couple of explanations to this:
- One is that the index may not be exactly the same as described above;
- Another one is the order in the index. Because you have a composed index the order of the columns in the index matters, and it seems the order is (ms_cd, et_cd, etrys). I wonder if you could recreate this index with the following order: (ms_cd, etrys, et_cd) and run the same query;
There may be other problems happening there, but those are the ones I see more evident from the description of the problem.
Giving a closer look to your query I really didn't understand the reasoning to have that subselect as it seems only to validate the two relations have common items or to validate a parent relationship, in this case you can probably use a join to get the same result:
SELECT COUNT(et_cd) FROM tbl_sha tbs
JOIN tbi ON tbi.ry_cd = tbs.etrys AND tbi.ms_cd = tbi.ms_cd
WHERE tbi.ms_cd = 'MLD009'
AND tbl_inf.ry_cd = '00000001';
JOIN tbi ON tbi.ry_cd = tbs.etrys AND tbi.ms_cd = tbi.ms_cd
WHERE tbi.ms_cd = 'MLD009'
AND tbl_inf.ry_cd = '00000001';
You can also try to trick the optimizer, for example, what is the result (and explain) of the below query?
WITH tbi (ry_cd) AS (
SELECT tbl_inf.ry_cd
FROM tbl_inf tbi
WHERE tbi.ms_cd = 'MLD009'
AND tbl_inf.ry_cd = '00000001'
) SELECT COUNT(et_cd) FROM tbl_sha tbs
JOIN tbi ON tbi.ry_cd = tbs .etrys
WHERE tbs .ms_cd = 'MLD009';
SELECT tbl_inf.ry_cd
FROM tbl_inf tbi
WHERE tbi.ms_cd = 'MLD009'
AND tbl_inf.ry_cd = '00000001'
) SELECT COUNT(et_cd) FROM tbl_sha tbs
JOIN tbi ON tbi.ry_cd = tbs .etrys
WHERE tbs .ms_cd = 'MLD009';
You can alternatively try the CTE using the both columns in the JOIN clause.
On Thu, 27 Jul 2023 at 05:10, gzh <gzhcoder@126.com> wrote:
Thank you for your suggestion.The database is hosted on AWS RDS, with the instance having vCPU=2 and RAM=4GB.Could the low performance be due to the low configuration of AWS RDS?We are considering trying a higher configuration instance.
I hope it helps.
--
Regards,
Charly Batista
Shanghai, China
Linux user #391083
“If you have an apple and I have an apple and we exchange these apples then you and I will still each have one apple. But if you have an idea and I have an idea and we exchange these ideas, then each of us will have two ideas."
George Bernard Shaw (1856-1950)
pgsql-general by date: