Re: BUG #15585: infinite DynamicSharedMemoryControlLock waiting inparallel query - Mailing list pgsql-bugs

From Maksim Milyutin
Subject Re: BUG #15585: infinite DynamicSharedMemoryControlLock waiting inparallel query
Date
Msg-id 458c691a-1b2d-7a70-1c04-58361b894011@gmail.com
Whole thread Raw
In response to Re: BUG #15585: infinite DynamicSharedMemoryControlLock waiting inparallel query  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: BUG #15585: infinite DynamicSharedMemoryControlLock waiting inparallel query
List pgsql-bugs


On 1/11/19 8:16 AM, Thomas Munro wrote:
On Fri, Jan 11, 2019 at 10:04 AM Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
I am trying to reproduce it.
I ran squillions of concurrent parallel queries today, making sure
they would allocate and free entire segments a lot (based on the
assumption that something along those lines must be required).  I made
sure to use Linux, GCC, -O2, no asserts (since both reports came from
that environment, and I'd previously failed to reproduce this on my
usual tool chain/platforms), and I used a multi-socket box (in case
some cache coherency effect was not occurring on my development
laptops).  I did learn some interesting things about parallel query
performance that I plan to follow up on, but I had no luck in
reproducing this error.  Rats.

One observation is that the other report involved a fairly simple
Parallel Hash Join query (on 11), and this report involves Parallel
Index Scan and Parallel Bitmap Index Scan (on 10), so that suggests
that it's probably not a bug in the parallel executor code (for
example an access-after-free, whose undefined behaviour could in
theory look like this with unlucky timing, I speculate) but rather
something lower level.


Recently one of our customers encountered the same problem on PG 11.1. The query structure and error messages are similar to one that was described in [1].


OS/PG settings:

Ubuntu 16.04

mem 128 GB
cpu 16 cores
huge_pages is turned off

# sudo cat /proc/sys/kernel/shmmax
18446744073692774399

max_connections = 200
shared_buffers = 33GB
dynamic_shared_memory_type = posix
max_worker_processes = 16
max_parallel_maintenance_workers = 6
max_parallel_workers_per_gather = 6
max_parallel_workers = 12


The query was:

SELECT
  ev.date,
  r.indicator_id,
  r.service_reg_id,
  count(r.value),
  count(CASE WHEN r.value::int > 3 THEN 1 ELSE NULL END),
  avg(r.value::int)
FROM base.hershel_events AS ev
  INNER JOIN base.hershel_rates AS r ON r.event_id = ev.id
WHERE ev.date between '2018-09-15' AND '2018-12-15'
GROUP BY ev.date, r.indicator_id, r.service_reg_id
ORDER BY ev.date ASC

Its plan with record statistics:

 Finalize GroupAggregate (actual rows=81332937 loops=1)
   Group Key: ev.date, r.indicator_id, r.service_reg_id
   ->  Gather Merge (actual rows=81332937 loops=1)
         Workers Planned: 7
         Workers Launched: 7
         ->  Partial GroupAggregate (actual rows=10166617 loops=8)
               Group Key: ev.date, r.indicator_id, r.service_reg_id
               ->  Sort (actual rows=10166617 loops=8)
                     Sort Key: ev.date, r.indicator_id, r.service_reg_id
                     Sort Method: external merge  Disk: 549392kB
                     Worker 0:  Sort Method: external merge  Disk: 516104kB
                     Worker 1:  Sort Method: external merge  Disk: 533200kB
                     Worker 2:  Sort Method: external merge  Disk: 540680kB
                     Worker 3:  Sort Method: external merge  Disk: 540760kB
                     Worker 4:  Sort Method: external merge  Disk: 530512kB
                     Worker 5:  Sort Method: external merge  Disk: 472240kB
                     Worker 6:  Sort Method: external merge  Disk: 539256kB
                     ->  Parallel Hash Join (actual rows=10166617 loops=8)
                           Hash Cond: (r.event_id = ev.id)
                           ->  Parallel Seq Scan on hershel_rates r (actual rows=15000000 loops=8)
                           ->  Parallel Hash (actual rows=9319175 loops=8)
                                 Buckets: 4194304  Batches: 32  Memory Usage: 124064kB
                                 ->  Parallel Seq Scan on hershel_events ev (actual rows=9319175 loops=8)
                                       Filter: ((date >= '2018-09-15'::date) AND (date <= '2018-12-15'::date))
                                       Rows Removed by Filter: 4430825


At some moment one of parallel workers emitted to log the error message: dsa_area could not attach to segment. After terminating all workers (the leader also bumped into the same error) the last one began to emit to log the following type of messages: LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp5027.0.sharedfileset/i55of128.p1.0", size 458752. The summary size of temp files that was mentioned in the log was about *14GB*. And eventually that last worker got the fatal error: cannot unpin a segment that is not pinned. After that the DynamicSharedMemoryControlLock left unreleased and soon new backends hanged on it.

Another time some backend (more likely, it was some utility worker) bumped into the error: could not open shared memory segment "/PostgreSQL.1948558832": No such file or directory. After that other backends stopped on DynamicSharedMemoryControlLock. Before this issue there were parallel aggregate queries.


Unfortunately our customer doesn't want to catch up this error on production system again. He turned off the query parallelism. And this issue doesn't reproduce on test environment. AFAIK this is heisenbug and is reproducible on heavy load. I'll try to run some stress tests to catch up stack trace.


[1] https://www.postgresql.org/message-id/flat/20181231221734.GB25379%40telsasoft.com

-- 
Regards, Maksim Milyutin
Postgres Professional:http://www.postgrespro.ru
The Russian Postgres Company

pgsql-bugs by date:

Previous
From: Amit Langote
Date:
Subject: Re: BUG #15587: Partitions with ALTER TABLE ADD CONSTRAINT
Next
From: PG Bug reporting form
Date:
Subject: BUG #15594: Unstable tests in contrib/test_decoding/output_iso/