Concurrency Issue with pg_ctl causing processes to hang in PostgreSQL 15.12 - Mailing list pgsql-hackers

From Srirama Kucherlapati
Subject Concurrency Issue with pg_ctl causing processes to hang in PostgreSQL 15.12
Date
Msg-id SJ4PPFB8177832614D8066E19550FC5E585DB06A@SJ4PPFB81778326.namprd15.prod.outlook.com
List pgsql-hackers

Hi all,

We are encountering a concurrency issue with pg_ctl in PostgreSQL 15.12. Two separate processes invoke pg_ctl simultaneously on different database directories:

-   One process is attempting to stop a database.

-   Another process is concurrently executing pg_ctl status to check the PG server state.

 

 

Below are the running pg_ctl processes. Note that the two database directories are entirely different.

 

    pgadmin 25559346 26870100  0  Aug 28   - 0:00 -ksh -c pg_ctl status -D /pgmount/PG/DB/

    pgadmin 25690410 27591028  0  Aug 28   - 0:00 -ksh -c pg_ctl stop -D /pgmount/MS/DB/

 

 

 

This concurrent usage results in both processes hanging indefinitely.
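
For anyone trying to reproduce this, the scenario could be exercised with a small script along the following lines. This is only a sketch: `run_pair` is a hypothetical helper name, not part of PostgreSQL, and the scratch directory under `TMPDIR` is an assumption. It starts two command strings concurrently, polls until both finish or a deadline passes, and reports which ones are still running. Commands that are still running are deliberately left alone so that a debugger can be attached to them.

```shell
#!/bin/sh
# Sketch only: run_pair is an illustrative helper, not a PostgreSQL tool.

# run_pair TIMEOUT_SECONDS CMD1 CMD2
# Runs both command strings in the background and prints one line per
# command: either its exit status or a note that it is still running.
run_pair() {
    timeout=$1
    dir="${TMPDIR:-/tmp}/run_pair.$$"      # scratch directory (assumed writable)
    mkdir -p "$dir" || return 1
    rm -f "$dir/1.rc" "$dir/2.rc"

    # Each subshell records the exit status in a file when its command ends.
    # Writing to a temporary name and renaming avoids reading a half-written file.
    ( sh -c "$2"; echo "$?" > "$dir/1.tmp"; mv "$dir/1.tmp" "$dir/1.rc" ) >/dev/null 2>&1 &
    ( sh -c "$3"; echo "$?" > "$dir/2.tmp"; mv "$dir/2.tmp" "$dir/2.rc" ) >/dev/null 2>&1 &

    # Poll once per second until both status files exist or the deadline passes.
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        [ -f "$dir/1.rc" ] && [ -f "$dir/2.rc" ] && break
        sleep 1
        elapsed=$((elapsed + 1))
    done

    for n in 1 2; do
        if [ -f "$dir/$n.rc" ]; then
            echo "command $n: exited with status $(cat "$dir/$n.rc")"
        else
            echo "command $n: still running after ${timeout}s"
        fi
    done
}
```

With the directories from the process listing above, a call might look like `run_pair 60 "pg_ctl status -D /pgmount/PG/DB/" "pg_ctl stop -D /pgmount/MS/DB/"`; the 60-second deadline is an arbitrary choice.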

Below are the stack traces of both processes:

 

    # dbx -a 25559346

    Waiting to attach to process 25559346 ...

    Successfully attached to ksh.

    warning: Directory containing ksh could not be determined.

    Apply 'use' command to initialize source path.

    Type 'help' for help.

    reading symbolic information ...warning: no source compiled with -g

    stopped in pause at 0xd01b7104

    0xd01b7104 (pause+0x204) 80410014      lwz  r2,0x14(r1)

 

    (dbx) t

    pause() at 0xd01b7104

    IPRA.(??, ??, ??, ??, ??) at 0x100387f4

    sh_exec(??, ??, ??) at 0x1003a120

    IPRA.(??) at 0x10009328

    IPRA.(??) at 0x1000a98c

    copyto(??, ??, ??) at 0x1000af0c

    mac_expand(??) at 0x1000b960

    mac_trim(??, ??) at 0x1000b5dc

    env_setlist(??, ??) at 0x10022034

    IPRA.(??, ??, ??, ??, ??) at 0x10038c90

    sh_exec(??, ??, ??) at 0x1003a120

    IPRA.() at 0x10001b48

    main(??, ??) at 0x10000e3c

 

 

    # dbx -a 25690410

    Waiting to attach to process 25690410 ...

    Successfully attached to ksh.

    warning: Directory containing ksh could not be determined.

    Apply 'use' command to initialize source path.

    Type 'help' for help.

    reading symbolic information ...warning: no source compiled with -g

    stopped in pause at 0xd01b7104

    0xd01b7104 (pause+0x204) 80410014      lwz  r2,0x14(r1)

    (dbx) th

    (dbx) t

    pause() at 0xd01b7104

    IPRA.(??, ??, ??, ??, ??) at 0x100387f4

    sh_exec(??, ??, ??) at 0x1003a120

    IPRA.(??) at 0x10009328

    IPRA.(??) at 0x1000a98c

    copyto(??, ??, ??) at 0x1000af0c

    mac_expand(??) at 0x1000b960

    mac_trim(??, ??) at 0x1000b5dc

    env_setlist(??, ??) at 0x10022034

    IPRA.(??, ??, ??, ??, ??) at 0x10038c90

    sh_exec(??, ??, ??) at 0x1003a120

    IPRA.() at 0x10001b48

    main(??, ??) at 0x10000e3c

 

 

Is this a known issue in PostgreSQL 15.12?

Are there recommended workarounds or configuration changes to avoid this behaviour?

Is there any way for pg_ctl to take a lock internally to prevent such hangs?

Do we need to serialize pg_ctl stop and pg_ctl status, even when they run against different databases?
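
To make the serialization question concrete, an external wrapper could look roughly like the sketch below. It is not a confirmed fix, since whether serializing the calls avoids the hang is exactly what is being asked. `with_lock` and the lock directory path are hypothetical names chosen for illustration. The sketch relies on `mkdir` being atomic, so only one caller at a time can create the lock directory. A caller that is killed while holding the lock leaves a stale directory behind, and a command that hangs keeps the lock indefinitely.

```shell
#!/bin/sh
# Sketch only: with_lock is an illustrative helper, not part of pg_ctl.

# with_lock LOCKDIR COMMAND [ARGS...]
# Waits until LOCKDIR can be created, runs COMMAND, removes LOCKDIR,
# and returns COMMAND's exit status.
with_lock() {
    lockdir=$1
    shift

    # mkdir either creates the directory or fails; it never half-succeeds,
    # so exactly one caller holds the lock while the directory exists.
    until mkdir "$lockdir" 2>/dev/null; do
        sleep 1
    done

    rc=0
    "$@" || rc=$?

    rmdir "$lockdir"
    return "$rc"
}
```

Both callers would then go through the same lock, for example `with_lock /tmp/pg_ctl.lock pg_ctl status -D /pgmount/PG/DB/` and `with_lock /tmp/pg_ctl.lock pg_ctl stop -D /pgmount/MS/DB/`, where `/tmp/pg_ctl.lock` is an assumed location.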

 

 

 

Warm regards,

Sriram.
