strange parallel query behavior after OOM crashes - Mailing list pgsql-hackers

From Tomas Vondra
Subject strange parallel query behavior after OOM crashes
Date
Msg-id 6dd5675f-ef4c-fb3c-3b0c-c2a759fd631e@2ndquadrant.com
List pgsql-hackers
Hi,

While doing some benchmarking, I've run into a fairly strange issue: 
after an OOM kill and the resulting restart, LaunchParallelWorkers() 
stops launching workers. What I see happening is this:

1) a query is executed, and at the end of LaunchParallelWorkers we get
    nworkers=8 nworkers_launched=8

2) the query does a Hash Aggregate, but ends up eating much more memory 
due to n_distinct underestimate (see [1] from 2015 for details), and 
gets killed by OOM

3) the server restarts, the query is executed again, but this time at 
the end of LaunchParallelWorkers we get
    nworkers=8 nworkers_launched=0

There's nothing else running on the server, and there definitely should 
be free parallel workers.

4) The query gets killed again, and on the next execution we get
    nworkers=8 nworkers_launched=8

again, although not always. I wonder whether the exact impact depends on 
whether the OOM killer hits the leader or a worker, for example.
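
The symptom in steps 1) to 4) can probably also be watched from the client side, without adding debug output to LaunchParallelWorkers. The sketch below is a minimal example that assumes the text output of EXPLAIN (ANALYZE), where each Gather node prints "Workers Planned" and "Workers Launched" lines. Treating those two numbers as stand-ins for nworkers and nworkers_launched is an assumption, and the helper names worker_counts and starved_nodes are made up for illustration.

```python
import re

# Lines printed under a Gather node by EXPLAIN (ANALYZE) in text format.
_PLANNED = re.compile(r"Workers Planned:\s*(\d+)")
_LAUNCHED = re.compile(r"Workers Launched:\s*(\d+)")


def worker_counts(explain_text):
    """Return a list of (planned, launched) pairs, one per Gather node."""
    pairs = []
    planned = None
    for line in explain_text.splitlines():
        m = _PLANNED.search(line)
        if m:
            # Remember the planned count until the matching launched line.
            planned = int(m.group(1))
            continue
        m = _LAUNCHED.search(line)
        if m and planned is not None:
            pairs.append((planned, int(m.group(1))))
            planned = None
    return pairs


def starved_nodes(explain_text):
    """Return only the pairs where fewer workers were launched than planned."""
    return [(p, l) for (p, l) in worker_counts(explain_text) if l < p]
```

Feeding it the plan text captured after each restart should show whether a run falls into the nworkers_launched=0 case (a pair such as (8, 0)) or the normal case (an empty list from starved_nodes).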

regards


[1] 
https://www.postgresql.org/message-id/flat/CAFWGqnsxryEevA5A_CqT3dExmTaT44mBpNTy8TWVsSVDS71QMg%40mail.gmail.com

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


