Re: UPDATE using 3 medium-sized tables causes runaway hash table and fills disk - Mailing list pgsql-general

From Thom Brown
Subject Re: UPDATE using 3 medium-sized tables causes runaway hash table and fills disk
Date
Msg-id CAA-aLv5AELEx08L4MEJPDhUkpHs2Ymu=aCgGbcqg3ubSOCRHgg@mail.gmail.com
Whole thread Raw
In response to UPDATE using 3 medium-sized tables causes runaway hash table and fills disk  (Rowan Collins <rowan.collins@gmail.com>)
List pgsql-general
On 26 April 2013 15:39, Rowan Collins <rowan.collins@gmail.com> wrote:
> Hi All,
>
> I've come upon some very strange behaviour with an UPDATE query which causes
> Postgres to consume all the disk space on the server for no apparent reason.
>
> Basically, I'm trying to run an UPDATE involving three medium-sized tables
> (~60k rows each), which generates a query plan with three Hash Joins. But
> when I try to execute it, Postgres appears to go into some kind of loop,
> gradually filling up the disk partition. After a long wait it responds with
> "ERROR: could not write to hash-join temporary file: No space left on
> device; SQL state: 53100"; the partition in question has 9.5GB free at the
> beginning of the query - that's a lot of hash file!
>
> If I run "ANALYZE temp_fares_mappings;" - the table which is being Updated,
> and is the outermost in the query plan - the problem goes away *even though
> the Query Plan hasn't changed*.
>
> I'm not entirely sure how to simplify the query and still reproduce the
> issue, but I've produced an anonymized data sample and SQL query at
> http://rwec.co.uk/x/break_postgres.zip In this case, "Analyze" *does* change
> the query plan (as well as fixing the issue), but I can consistently
> reproduce the disk-filling behaviour using this sample on Postgres 9.0 and
> 9.2.
>
> Note that it doesn't appear to matter if one or all of the tables are
> permanent, as I have been using various combinations for debugging, and
> always get the same behaviour. Trying to write a corresponding SELECT
> statement doesn't seem to generate the same issue, at least with the queries
> I've tried.
>
> - The plan for the real query is here: http://explain.depesz.com/s/WTP
> - Sample data and SQL query: http://rwec.co.uk/x/break_postgres.zip
> - Query plan for sample data, without running Analyze, is here:
> http://explain.depesz.com/s/qsH
> - Postgres version strings: "PostgreSQL 9.0.5 on x86_64-pc-linux-gnu,
> compiled by GCC x86_64-pc-linux-gnu-gcc (Gentoo 4.5.2 p1.1, pie-0.4.5)
> 4.5.2, 64-bit" and "PostgreSQL 9.2.1 on x86_64-pc-linux-gnu, compiled by
> x86_64-pc-linux-gnu-gcc (Gentoo Hardened 4.5.4 p1.0, pie-0.4.7) 4.5.4,
> 64-bit"
>
> Does anyone have any idea what's going on here, and whether it is in fact a
> bug? It certainly feels like one...
> Thanks for your time.

You done gone broke Postgres.  The same occurs when converting your
UPDATE query into a SELECT when I try it:

SELECT
  *
FROM
  test_mappings AS LFM, test_low_fares AS LF
JOIN
  test_cob_offers AS CO
  ON
    CO.num_adults_searched   = LF.number_in_party
  AND
    CO.num_children_searched = 0
  AND
    CO.num_infants_searched  = 0
WHERE
  LFM.low_fares_row_id     = LF.low_fares_row_id
AND
  CO.central_offer_id  = LFM.central_offer_id
AND
  CO.credential_set_id = LFM.credential_set_id
AND
  LFM.cob_offer_id IS NULL;

Well something appears to be tangling up the executor.
--
Thom


pgsql-general by date:

Previous
From: Willy-Bas Loos
Date:
Subject: Re: don't i need a -modules package for londiste3?
Next
From: Shaun Thomas
Date:
Subject: Re: UPDATE using 3 medium-sized tables causes runaway hash table and fills disk