Re: hash joins are causing no space left error - Mailing list pgsql-general

From Tom Lane
Subject Re: hash joins are causing no space left error
Date
Msg-id 328867.1597277184@sss.pgh.pa.us
Whole thread Raw
In response to hash joins are causing no space left error  (Ayub M <hiayub@gmail.com>)
Responses Re: hash joins are causing no space left error  (Ayub M <hiayub@gmail.com>)
List pgsql-general
Ayub M <hiayub@gmail.com> writes:
> This is PostgreSQL 11 on AWS, there is a mview query in this OLAP database,
> the tables involved are huge - 50-100m records on average records hundreds
> of columns in most cases. The query runs for a while and then errors out
> saying "No space left on device". I could see it generating around 500gb of
> temp file data. At times it goes thru and at times it fails - probably due
> to other queries running at the same time and causing failure.

Are you sure that these queries are actually producing the answers you
want?  It sounds suspiciously like you are computing underconstrained
joins.

> The joins are happening on around 10 tables and all are joining on the PK
> and FK columns. I see partition pruning happening but the hash joins are
> killing the query.
> Is there any way to avoid hash joins?

TBH, you are asking the wrong question.  A merge join would take about as
much temporary space, and a nestloop join over so much data would probably
not finish in an amount of time you're willing to wait.  Indexes are NOT
a magic solution here.  What you need to be thinking about is how to not
need to process so much data.

If you really need to have this proven to you, you can try "set
enable_hashjoin = off", but I don't think you'll find that better.

            regards, tom lane



pgsql-general by date:

Previous
From: Ayub M
Date:
Subject: hash joins are causing no space left error
Next
From: mohand oubelkacem makhoukhene
Date:
Subject: Implement a new data type