Re: Join runs for > 10 hours and then fills up >1.3TB of disk space - Mailing list pgsql-performance

From: Richard Huxton
Subject: Re: Join runs for > 10 hours and then fills up >1.3TB of disk space
Date:
Msg-id: 482D4344.2060607@archonet.com
In response to: Join runs for > 10 hours and then fills up >1.3TB of disk space (kevin kempter <kevin@kevinkempterllc.com>)
List: pgsql-performance

kevin kempter wrote:
> Hi List;
>
> I have a table with 9,961,914 rows in it (see the describe of
> bigtab_stats_fact_tmp14 below)
>
> I also have a table with 7,785 rows in it (see the describe of
> xsegment_dim below)

Something else is puzzling me with this - you're joining over four fields.

> from
> bigtab_stats_fact_tmp14 f14,
> xsegment_dim segdim
> where
> f14.customer_id = segdim.customer_srcid
> and f14.show_id = segdim.show_srcid
> and f14.season_id = segdim.season_srcid
> and f14.episode_id = segdim.episode_srcid
> and segdim.segment_srcid is NULL;

> ----------------------------------------------------------------------
> Merge Join  (cost=1757001.74..73569676.49 rows=3191677219 width=118)
>   ->  Sort  (cost=1570.35..1579.46 rows=3643 width=40)
>   ->  Sort  (cost=1755323.26..1780227.95 rows=9961874 width=126)

Here it's still expecting about 320 matches against each row from the
large table. That's ~10% of the small table (or of that fraction of it
that PG expects, 3643 rows), which seems very high for four clauses
ANDed together.
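
For anyone wanting to check those figures, here is a rough sketch of the arithmetic using only the row estimates from the plan quoted above. The variable names are purely illustrative, and it assumes the rows=3643 Sort is the xsegment_dim side and the rows=9961874 Sort is the bigtab_stats_fact_tmp14 side.

```python
# Row estimates copied from the EXPLAIN output quoted above.
# Assumption: the small Sort feeds from xsegment_dim and the large
# Sort from bigtab_stats_fact_tmp14.
join_rows = 3_191_677_219   # Merge Join estimate
big_rows = 9_961_874        # Sort on the large-table side
small_rows = 3_643          # Sort on the small-table side

# Estimated matches in the small table per row of the large table.
matches_per_big_row = join_rows / big_rows

# The same figure as a fraction of the small side's estimated rows.
fraction_of_small = matches_per_big_row / small_rows

print(f"{matches_per_big_row:.1f} matches per large-table row")
print(f"{fraction_of_small:.1%} of the small side's estimated rows")
```

The second figure is also the join selectivity the planner is implying across the four ANDed clauses, which is why it looks so high.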

--
   Richard Huxton
   Archonet Ltd
