Thread: simple query running for ever

simple query running for ever

From
Nagaraj Raj
Date:
I wrote a simple query, and it is taking too long, not sure what is wrong in it, even its not giving EXPLAIN ANALYZE.

select T0."physical_address_sid", T0."individual_entity_proxy_id", T2."infrrd_hh_rank_nbr"
from "cms_prospects".PROSPECT T0
--inner join "sas_prs_tmp".DEDUPE3583E3F18 T1 on T0."individual_entity_proxy_id" = T1."individual_entity_proxy_id"
left join "cms_prospects".INDIVIDUAL_DEMOGRAPHIC T2 on T0."individual_entity_proxy_id" =
T2."individual_entity_proxy_id"
 


"Merge Left Join  (cost=55.96..18147747.08 rows=213620928 width=20)"
"  Merge Cond: (t0.individual_entity_proxy_id = t2.individual_entity_proxy_id)"
"  ->  Index Scan using pk_prospect on prospect t0  (cost=0.57..10831606.89 rows=213620928 width=16)"
"  ->  Index Only Scan using indxp_individual_demo_infrrd_hh_rank_nbr on individual_demographic t2
(cost=0.57..5013756.93rows=260652064 width=12)"
 



Any suggestions or help would be highly appreciated. 




Best regards,
Rj






Re: simple query running for ever

From
Michael Lewis
Date:
On Tue, Jun 16, 2020 at 2:35 PM Nagaraj Raj <nagaraj.sf@yahoo.com> wrote:
I wrote a simple query, and it is taking too long, not sure what is wrong in it, even its not giving EXPLAIN ANALYZE.

More context is needed. Please review-

Re: simple query running for ever

From
Justin Pryzby
Date:
On Tue, Jun 16, 2020 at 08:35:31PM +0000, Nagaraj Raj wrote:
> I wrote a simple query, and it is taking too long, not sure what is wrong in it, even its not giving EXPLAIN
ANALYZE.

Is this related to last week's question ?
https://www.postgresql.org/message-id/1211705382.726951.1592174752720%40mail.yahoo.com

Was that issue resolved ?

I didn't see answers to a few questions I asked there.

> select T0."physical_address_sid", T0."individual_entity_proxy_id", T2."infrrd_hh_rank_nbr"
> from "cms_prospects".PROSPECT T0
> --inner join "sas_prs_tmp".DEDUPE3583E3F18 T1 on T0."individual_entity_proxy_id" = T1."individual_entity_proxy_id"
> left join "cms_prospects".INDIVIDUAL_DEMOGRAPHIC T2 on T0."individual_entity_proxy_id" =
T2."individual_entity_proxy_id"
 
> 
> "Merge Left Join  (cost=55.96..18147747.08 rows=213620928 width=20)"
> "  Merge Cond: (t0.individual_entity_proxy_id = t2.individual_entity_proxy_id)"
> "  ->  Index Scan using pk_prospect on prospect t0  (cost=0.57..10831606.89 rows=213620928 width=16)"
> "  ->  Index Only Scan using indxp_individual_demo_infrrd_hh_rank_nbr on individual_demographic t2
(cost=0.57..5013756.93rows=260652064 width=12)"
 
> 
> Any suggestions or help would be highly appreciated. 
> 
> Best regards,
> Rj



Re: simple query running for ever

From
Nagaraj Raj
Date:
Hi Justin,

My apologies, I missed that.

Yes, I change work mem to 2GB but didn't see any difference. So, as your suggestion removed the distinct on pk and added a multi-column index so query planner did index-only can that is fixed the issue and query completed in 1Min.

Best regards,
Rj
On Tuesday, June 16, 2020, 01:47:21 PM PDT, Justin Pryzby <pryzby@telsasoft.com> wrote:


On Tue, Jun 16, 2020 at 08:35:31PM +0000, Nagaraj Raj wrote:
> I wrote a simple query, and it is taking too long, not sure what is wrong in it, even its not giving EXPLAIN ANALYZE.

Is this related to last week's question ?
https://www.postgresql.org/message-id/1211705382.726951.1592174752720%40mail.yahoo.com

Was that issue resolved ?

I didn't see answers to a few questions I asked there.


> select T0."physical_address_sid", T0."individual_entity_proxy_id", T2."infrrd_hh_rank_nbr"
> from "cms_prospects".PROSPECT T0
> --inner join "sas_prs_tmp".DEDUPE3583E3F18 T1 on T0."individual_entity_proxy_id" = T1."individual_entity_proxy_id"
> left join "cms_prospects".INDIVIDUAL_DEMOGRAPHIC T2 on T0."individual_entity_proxy_id" = T2."individual_entity_proxy_id"
>
> "Merge Left Join  (cost=55.96..18147747.08 rows=213620928 width=20)"
> "  Merge Cond: (t0.individual_entity_proxy_id = t2.individual_entity_proxy_id)"
> "  ->  Index Scan using pk_prospect on prospect t0  (cost=0.57..10831606.89 rows=213620928 width=16)"
> "  ->  Index Only Scan using indxp_individual_demo_infrrd_hh_rank_nbr on individual_demographic t2  (cost=0.57..5013756.93 rows=260652064 width=12)"
>
> Any suggestions or help would be highly appreciated.
>
> Best regards,
> Rj


Re: simple query running for ever

From
Nagaraj Raj
Date:
Hi Michael,

Sorry, I missed table structure,


explain select T0."physical_address_sid", T0."individual_entity_proxy_id", T2."infrrd_hh_rank_nbr"
from "cms_prospects".PROSPECT T0
inner join public.t1680035748gcccqqdpmrblxp33_bkp T1 on T0."individual_entity_proxy_id" = T1."individual_entity_proxy_id"
left join "cms_prospects".INDIVIDUAL_DEMOGRAPHIC T2 on T0."individual_entity_proxy_id" = T2."individual_entity_proxy_id";



"Hash Join (cost=1417.48..21353422.52 rows=213620928 width=20)"
" Hash Cond: ((t0.individual_entity_proxy_id)::numeric = t1.individual_entity_proxy_id)"
" -> Merge Left Join (cost=55.96..18147747.08 rows=213620928 width=20)"
" Merge Cond: (t0.individual_entity_proxy_id = t2.individual_entity_proxy_id)"
" -> Index Scan using pk_prospect on prospect t0 (cost=0.57..10831606.89 rows=213620928 width=16)"
" -> Index Only Scan using indxp_individual_demo_infrrd_hh_rank_nbr on individual_demographic t2 (cost=0.57..5013756.93 rows=260652064 width=12)"
" -> Hash (cost=741.79..741.79 rows=49579 width=8)"
" -> Seq Scan on t1680035748gcccqqdpmrblxp33_bkp t1 (cost=0.00..741.79 rows=49579 width=8)"

--T0

CREATE TABLE cms_prospects.prospect
(
individual_entity_proxy_id bigint NOT NULL,
household_entity_proxy_id bigint,
individual_personal_link_sid bigint NOT NULL,
city_name character varying(100) COLLATE pg_catalog."default",
state_prov_cd character varying(40) COLLATE pg_catalog."default",
pstl_code character varying(40) COLLATE pg_catalog."default",
npa integer,
nxx integer,
email_domain character varying(400) COLLATE pg_catalog."default",
email_preference character varying(40) COLLATE pg_catalog."default",
direct_mail_preference character varying(40) COLLATE pg_catalog."default",
profane_wrd_ind character(1) COLLATE pg_catalog."default",
tmo_ofnsv_name_ind character(1) COLLATE pg_catalog."default",
census_block_id character varying(40) COLLATE pg_catalog."default",
has_first_name character(1) COLLATE pg_catalog."default",
has_middle_name character(1) COLLATE pg_catalog."default",
has_last_name character(1) COLLATE pg_catalog."default",
has_email_address character(1) COLLATE pg_catalog."default",
has_individual_address character(1) COLLATE pg_catalog."default",
email_address_sid bigint,
person_name_sid bigint,
physical_address_sid bigint,
telephone_number_sid bigint,
last_contacted_email_datetime timestamp without time zone,
last_contacted_dm_datetime timestamp without time zone,
last_contacted_digital_datetime timestamp without time zone,
last_contacted_anychannel_dttm timestamp without time zone,
hard_bounce_ind integer,
closest_store_site_id1 character varying(40) COLLATE pg_catalog."default",
distance_1 numeric(5,2),
load_dttm timestamp without time zone NOT NULL,
updt_dttm timestamp without time zone,
md5_chk_sum character varying(200) COLLATE pg_catalog."default",
deld_from_src_ind character(1) COLLATE pg_catalog."default" NOT NULL,
orphan_ind character(1) COLLATE pg_catalog."default" NOT NULL,
CONSTRAINT pk_prospect PRIMARY KEY (individual_entity_proxy_id)
);

--T1
CREATE TABLE public.t1680035748gcccqqdpmrblxp33_bkp(
individual_entity_proxy_id numeric(20,0));

-- T2

CREATE TABLE cms_prospects.individual_demographic
(
individual_entity_proxy_id bigint NOT NULL,
cstmr_prspct_ind character varying(40) COLLATE pg_catalog."default",
last_appnd_dttm timestamp without time zone,
last_sprsn_dttm timestamp without time zone,
infrrd_gender_code character varying(40) COLLATE pg_catalog."default",
govt_prison_ind character(1) COLLATE pg_catalog."default",
tax_bnkrpt_dcsd_ind character(1) COLLATE pg_catalog."default",
underbank_rank_nbr integer,
hvy_txn_rank_nbr integer,
prominence_nbr integer,
ocptn_code character varying(40) COLLATE pg_catalog."default",
educ_lvl_nbr integer,
gender_code character varying(40) COLLATE pg_catalog."default",
infrrd_hh_rank_nbr integer,
econmc_stable_nbr integer,
directv_sbscrbr_propnsty_code character varying(40) COLLATE pg_catalog."default",
amazon_prm_propnsty_code character varying(40) COLLATE pg_catalog."default",
iphone_user_propnsty_code character varying(40) COLLATE pg_catalog."default",
smrt_hm_devc_propnsty_code character varying(40) COLLATE pg_catalog."default",
dog_ownr_propnsty_code character varying(40) COLLATE pg_catalog."default",
cat_ownr_propnsty_code character varying(40) COLLATE pg_catalog."default",
msc_cncrt_propnsty_code character varying(40) COLLATE pg_catalog."default",
dine_out_propnsty_code character varying(40) COLLATE pg_catalog."default",
taco_bell_diner_propnsty_code character varying(40) COLLATE pg_catalog."default",
auto_insrnc_byr_propnsty_code character varying(40) COLLATE pg_catalog."default",
load_dttm timestamp without time zone NOT NULL,
updt_dttm timestamp without time zone,
md5_chk_sum character varying(200) COLLATE pg_catalog."default",
deld_from_src_ind character(1) COLLATE pg_catalog."default" NOT NULL,
orphan_ind character(1) COLLATE pg_catalog."default" NOT NULL,
CONSTRAINT pk_individual_demographic PRIMARY KEY (individual_entity_proxy_id)
);


Server config:
PostgreSQL v10.11RAM: 380GB
vCore: 32
Shared_buffers: 65G
Bwork_mem:104857kB
maintenance_work_mem:256MB
effective_cache_size: 160GB
On Tuesday, June 16, 2020, 01:44:09 PM PDT, Michael Lewis <mlewis@entrata.com> wrote:


On Tue, Jun 16, 2020 at 2:35 PM Nagaraj Raj <nagaraj.sf@yahoo.com> wrote:
I wrote a simple query, and it is taking too long, not sure what is wrong in it, even its not giving EXPLAIN ANALYZE.


More context is needed. Please review-

Re: simple query running for ever

From
Nagaraj Raj
Date:
And here is the explain analyze:

https://explain.depesz.com/s/uQGA

Thanks!
On Tuesday, June 16, 2020, 02:13:37 PM PDT, Nagaraj Raj <nagaraj.sf@yahoo.com> wrote:


Hi Michael,

Sorry, I missed table structure,


explain select T0."physical_address_sid", T0."individual_entity_proxy_id", T2."infrrd_hh_rank_nbr"
from "cms_prospects".PROSPECT T0
inner join public.t1680035748gcccqqdpmrblxp33_bkp T1 on T0."individual_entity_proxy_id" = T1."individual_entity_proxy_id"
left join "cms_prospects".INDIVIDUAL_DEMOGRAPHIC T2 on T0."individual_entity_proxy_id" = T2."individual_entity_proxy_id";



"Hash Join (cost=1417.48..21353422.52 rows=213620928 width=20)"
" Hash Cond: ((t0.individual_entity_proxy_id)::numeric = t1.individual_entity_proxy_id)"
" -> Merge Left Join (cost=55.96..18147747.08 rows=213620928 width=20)"
" Merge Cond: (t0.individual_entity_proxy_id = t2.individual_entity_proxy_id)"
" -> Index Scan using pk_prospect on prospect t0 (cost=0.57..10831606.89 rows=213620928 width=16)"
" -> Index Only Scan using indxp_individual_demo_infrrd_hh_rank_nbr on individual_demographic t2 (cost=0.57..5013756.93 rows=260652064 width=12)"
" -> Hash (cost=741.79..741.79 rows=49579 width=8)"
" -> Seq Scan on t1680035748gcccqqdpmrblxp33_bkp t1 (cost=0.00..741.79 rows=49579 width=8)"

--T0

CREATE TABLE cms_prospects.prospect
(
individual_entity_proxy_id bigint NOT NULL,
household_entity_proxy_id bigint,
individual_personal_link_sid bigint NOT NULL,
city_name character varying(100) COLLATE pg_catalog."default",
state_prov_cd character varying(40) COLLATE pg_catalog."default",
pstl_code character varying(40) COLLATE pg_catalog."default",
npa integer,
nxx integer,
email_domain character varying(400) COLLATE pg_catalog."default",
email_preference character varying(40) COLLATE pg_catalog."default",
direct_mail_preference character varying(40) COLLATE pg_catalog."default",
profane_wrd_ind character(1) COLLATE pg_catalog."default",
tmo_ofnsv_name_ind character(1) COLLATE pg_catalog."default",
census_block_id character varying(40) COLLATE pg_catalog."default",
has_first_name character(1) COLLATE pg_catalog."default",
has_middle_name character(1) COLLATE pg_catalog."default",
has_last_name character(1) COLLATE pg_catalog."default",
has_email_address character(1) COLLATE pg_catalog."default",
has_individual_address character(1) COLLATE pg_catalog."default",
email_address_sid bigint,
person_name_sid bigint,
physical_address_sid bigint,
telephone_number_sid bigint,
last_contacted_email_datetime timestamp without time zone,
last_contacted_dm_datetime timestamp without time zone,
last_contacted_digital_datetime timestamp without time zone,
last_contacted_anychannel_dttm timestamp without time zone,
hard_bounce_ind integer,
closest_store_site_id1 character varying(40) COLLATE pg_catalog."default",
distance_1 numeric(5,2),
load_dttm timestamp without time zone NOT NULL,
updt_dttm timestamp without time zone,
md5_chk_sum character varying(200) COLLATE pg_catalog."default",
deld_from_src_ind character(1) COLLATE pg_catalog."default" NOT NULL,
orphan_ind character(1) COLLATE pg_catalog."default" NOT NULL,
CONSTRAINT pk_prospect PRIMARY KEY (individual_entity_proxy_id)
);

--T1
CREATE TABLE public.t1680035748gcccqqdpmrblxp33_bkp(
individual_entity_proxy_id numeric(20,0));

-- T2

CREATE TABLE cms_prospects.individual_demographic
(
individual_entity_proxy_id bigint NOT NULL,
cstmr_prspct_ind character varying(40) COLLATE pg_catalog."default",
last_appnd_dttm timestamp without time zone,
last_sprsn_dttm timestamp without time zone,
infrrd_gender_code character varying(40) COLLATE pg_catalog."default",
govt_prison_ind character(1) COLLATE pg_catalog."default",
tax_bnkrpt_dcsd_ind character(1) COLLATE pg_catalog."default",
underbank_rank_nbr integer,
hvy_txn_rank_nbr integer,
prominence_nbr integer,
ocptn_code character varying(40) COLLATE pg_catalog."default",
educ_lvl_nbr integer,
gender_code character varying(40) COLLATE pg_catalog."default",
infrrd_hh_rank_nbr integer,
econmc_stable_nbr integer,
directv_sbscrbr_propnsty_code character varying(40) COLLATE pg_catalog."default",
amazon_prm_propnsty_code character varying(40) COLLATE pg_catalog."default",
iphone_user_propnsty_code character varying(40) COLLATE pg_catalog."default",
smrt_hm_devc_propnsty_code character varying(40) COLLATE pg_catalog."default",
dog_ownr_propnsty_code character varying(40) COLLATE pg_catalog."default",
cat_ownr_propnsty_code character varying(40) COLLATE pg_catalog."default",
msc_cncrt_propnsty_code character varying(40) COLLATE pg_catalog."default",
dine_out_propnsty_code character varying(40) COLLATE pg_catalog."default",
taco_bell_diner_propnsty_code character varying(40) COLLATE pg_catalog."default",
auto_insrnc_byr_propnsty_code character varying(40) COLLATE pg_catalog."default",
load_dttm timestamp without time zone NOT NULL,
updt_dttm timestamp without time zone,
md5_chk_sum character varying(200) COLLATE pg_catalog."default",
deld_from_src_ind character(1) COLLATE pg_catalog."default" NOT NULL,
orphan_ind character(1) COLLATE pg_catalog."default" NOT NULL,
CONSTRAINT pk_individual_demographic PRIMARY KEY (individual_entity_proxy_id)
);


Server config:
PostgreSQL v10.11RAM: 380GB
vCore: 32
Shared_buffers: 65G
Bwork_mem:104857kB
maintenance_work_mem:256MB
effective_cache_size: 160GB
On Tuesday, June 16, 2020, 01:44:09 PM PDT, Michael Lewis <mlewis@entrata.com> wrote:


On Tue, Jun 16, 2020 at 2:35 PM Nagaraj Raj <nagaraj.sf@yahoo.com> wrote:
I wrote a simple query, and it is taking too long, not sure what is wrong in it, even its not giving EXPLAIN ANALYZE.


More context is needed. Please review-

Re: simple query running for ever

From
Justin Pryzby
Date:
On Tue, Jun 16, 2020 at 08:35:31PM +0000, Nagaraj Raj wrote:
> I wrote a simple query, and it is taking too long, not sure what is wrong in it, even its not giving EXPLAIN
ANALYZE.
> 
> select T0."physical_address_sid", T0."individual_entity_proxy_id", T2."infrrd_hh_rank_nbr"
> from "cms_prospects".PROSPECT T0
> --inner join "sas_prs_tmp".DEDUPE3583E3F18 T1 on T0."individual_entity_proxy_id" = T1."individual_entity_proxy_id"
> left join "cms_prospects".INDIVIDUAL_DEMOGRAPHIC T2 on T0."individual_entity_proxy_id" =
T2."individual_entity_proxy_id"
 

Pardon me for saying so, but this query seems silly.

It's self-joining a table on its PK, which I don't think could ever be useful.

You do maybe more than 2x as much work, to get 2x as many columns, which are
all redundant.

Can't you just change 
T2."infrrd_hh_rank_nbr" to T0, and avoid the join ?

-- 
Justin



Re: simple query running for ever

From
Andreas Joseph Krogh
Date:
På onsdag 17. juni 2020 kl. 00:05:26, skrev Justin Pryzby <pryzby@telsasoft.com>:
On Tue, Jun 16, 2020 at 08:35:31PM +0000, Nagaraj Raj wrote:
> I wrote a simple query, and it is taking too long, not sure what is wrong in it, even its not giving EXPLAIN ANALYZE.
>
> select T0."physical_address_sid", T0."individual_entity_proxy_id", T2."infrrd_hh_rank_nbr"
> from "cms_prospects".PROSPECT T0
> --inner join "sas_prs_tmp".DEDUPE3583E3F18 T1 on T0."individual_entity_proxy_id" = T1."individual_entity_proxy_id"
> left join "cms_prospects".INDIVIDUAL_DEMOGRAPHIC T2 on T0."individual_entity_proxy_id" = T2."individual_entity_proxy_id"

Pardon me for saying so, but this query seems silly.

It's self-joining a table on its PK, which I don't think could ever be useful.
 
Where is the self-join?
 
--
Andreas Joseph Krogh

Re: simple query running for ever

From
Justin Pryzby
Date:
On Wed, Jun 17, 2020 at 12:10:37AM +0200, Andreas Joseph Krogh wrote:
> På onsdag 17. juni 2020 kl. 00:05:26, skrev Justin Pryzby <pryzby@telsasoft.com>: 
> On Tue, Jun 16, 2020 at 08:35:31PM +0000, Nagaraj Raj wrote:
>  > I wrote a simple query, and it is taking too long, not sure what is wrong in it, even its not giving EXPLAIN
ANALYZE.
>  >
>  > from "cms_prospects".PROSPECT T0
>  > left join "cms_prospects".INDIVIDUAL_DEMOGRAPHIC T2 on T0."individual_entity_proxy_id" =
T2."individual_entity_proxy_id"
> 
>  Pardon me for saying so, but this query seems silly.
> 
>  It's self-joining a table on its PK, which I don't think could ever be useful.
> 
> Where is the self-join? 

Sorry, I misread.

I see now that "cms_prospects" refers to the database.

-- 
Justin



Re: simple query running for ever

From
Laurenz Albe
Date:
On Tue, 2020-06-16 at 20:35 +0000, Nagaraj Raj wrote:
> I wrote a simple query, and it is taking too long, not sure what is wrong in it, even its not giving EXPLAIN
ANALYZE.
> 
> select T0."physical_address_sid", T0."individual_entity_proxy_id", T2."infrrd_hh_rank_nbr"
> from "cms_prospects".PROSPECT T0
> --inner join "sas_prs_tmp".DEDUPE3583E3F18 T1 on T0."individual_entity_proxy_id" = T1."individual_entity_proxy_id"
> left join "cms_prospects".INDIVIDUAL_DEMOGRAPHIC T2 on T0."individual_entity_proxy_id" =
T2."individual_entity_proxy_id"
 
> 
> 
> "Merge Left Join  (cost=55.96..18147747.08 rows=213620928 width=20)"
> "  Merge Cond: (t0.individual_entity_proxy_id = t2.individual_entity_proxy_id)"
> "  ->  Index Scan using pk_prospect on prospect t0  (cost=0.57..10831606.89 rows=213620928 width=16)"
> "  ->  Index Only Scan using indxp_individual_demo_infrrd_hh_rank_nbr on individual_demographic t2
(cost=0.57..5013756.93rows=260652064 width=12)"
 
> 
> 
> 
> Any suggestions or help would be highly appreciated. 

The only potential improvement I can see is to strive for an
"index only scan" on "prospect".

For that, you'd have to add and INCLUDE clause to "pk_prospect"
so that "physical_address_sid" and "individual_entity_proxy_id"
are included and VACUUM the table.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com