question about efficiency - Mailing list pgsql-general

From Johnson, Shaunn
Subject question about efficiency
Date
Msg-id 73309C2FDD95D11192E60008C7B1D5BB04C741EE@snt452.corp.bcbsm.com
Whole thread Raw
Responses Re: question about efficiency
List pgsql-general

Howdy:

Running PostgreSQL 7.2.1 on RedHat Linux 7.2.

Have a question about a query and it's efficiency:

One of the users created a query that looks like this:

[snip]
"create table dev_lbpclaimsum as
SELECT
  claimsum2001.c_contract_num,
  claimsum2001.c_mbr_num,
  claimsum2001.c_proc_cd,
  claimsum2001.c_proc_mod,
  claimsum2001.d_from_dt,
  claimsum2001.d_thru_dt,
  claimsum2001.i_pd,
  claimsum2001.c_serv_prov
FROM
  claimsum2001 a join dev_pb_proc b on a.c_proc_cd = substr (b.c_proc_cd,1,6)

[/snip]

After looking monitoring the query, it seems that the job took over a day
and still never completed.  I looked at it and replaced the
'claimsum2001' with 'a' and did an explain on both:

[snip]
[with alias 'a' ]
Nested Loop  (cost=0.00..64051744.91 rows=15892944 width=84)
  ->  Seq Scan on dev_pb_proc b  (cost=0.00..20.00 rows=1000 width=10)
  ->  Index Scan using stateclaim01_proc_cd_i on claimsum2001 a  (cost=0.00..63813.33 rows=15893 width=74)

[/with alias]

[original way]

psql:./jans_stuff.sql:14: NOTICE:  Adding missing FROM-clause entry for table "claimsum2001"
psql:./jans_stuff.sql:14: NOTICE:  QUERY PLAN:

Nested Loop  (cost=0.00..19629582277130.89 rows=214697805924711 width=95)
  ->  Nested Loop  (cost=0.00..64051744.91 rows=15892944 width=21)
        ->  Seq Scan on dev_pb_proc b  (cost=0.00..20.00 rows=1000 width=10)
        ->  Index Scan using stateclaim01_proc_cd_i on claimsum2001 a  (cost=0.00..63813.33 rows=15893 width=11)
  ->  Seq Scan on claimsum2001  (cost=0.00..1100019.02 rows=13509002 width=74)

EXPLAIN
[/original way]

[/snip]

I created the table in the modified way, it took maybe an hour or so (which is great!)

But, my question is:

Can someone explain why there's a difference between using
claimsum2001.(whatever) and a.(whatever)?  

I wouldn't have thought that it would have mattered in efficiency, just been more convenient.

Thanks!

-X

pgsql-general by date:

Previous
From: "scott.marlowe"
Date:
Subject: Re: Hardware estimation
Next
From: Stephan Szabo
Date:
Subject: Re: question about efficiency