BUG #18487: libpq: Race condition in PQsetdbLogin/emitHostIdentityInfo/libpq_gettext - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #18487: libpq: Race condition in PQsetdbLogin/emitHostIdentityInfo/libpq_gettext
Msg-id 18487-dc904fcf37a7b0b4@postgresql.org
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      18487
Logged by:          Christian Maurer
Email address:      c.maurer@gmx.at
PostgreSQL version: 16.3
Operating system:   Windows 10 Pro
Description:

Hello all!

When PQsetdbLogin() is called concurrently, the program (see the main.cpp
snippet below, the same as in [1]) sometimes crashes with RC=3 in
emitHostIdentityInfo(). This happens with gettext-0.19.8, which is the
version packaged by EnterpriseDB [2].

Server: PostgreSQL 16.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1
20231218 (Red Hat 11.4.1-3), 64-bit

libpq: postgresql-16.3-1-windows-x64-binaries.zip (coming from [2])

-- Output start --
PQisthreadsafe: 1
Connect 1
Connect 2
Connect 3
Connect 0
Connected 3
Finished 3
Connected 0
Finished 0
Connected 1
Finished 1
-- Output end --
=> 'End' was not reached
=> %ERRORLEVEL% is 3
=> Thread 2 did not connect/finish

When calling PQsetdbLogin() and PQfinish() once before the multi-threading
starts, there is no crash.
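
The workaround above could be sketched roughly as follows. This is only a
minimal sketch: run_after_warm_up() is a hypothetical helper, not part of
libpq, and the idea that the warm-up call lets some one-time gettext setup
finish on a single thread is my assumption, not something I have verified.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (not a libpq API): call `warm_up` once on the calling
// thread, then start `thread_count` workers and wait for all of them.
// In the scenario reported here, `warm_up` would do one PQsetdbLogin()
// followed by PQfinish() before any other thread exists.
inline void run_after_warm_up(const std::function<void()>& warm_up,
                              const std::function<void(std::size_t)>& worker,
                              std::size_t thread_count)
{
    // Runs single-threaded, so nothing can race with it.
    warm_up();

    std::vector<std::thread> threads;
    threads.reserve(thread_count);
    for (std::size_t i = 0; i < thread_count; ++i)
        threads.emplace_back(worker, i);  // each worker receives its index

    for (std::thread& t : threads)
        t.join();
}
```

With the main.cpp snippet, the warm-up callable would wrap one
PQsetdbLogin()/PQfinish() pair using the same connection parameters, and the
worker would be test().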


The backtrace looks like this in Visual Studio:

    msvcrt.dll!00007ff9f57ef26d()
    libintl-9.dll!_nl_find_domain(volatile char * dirname, char * locale, volatile char * domainname, binding * domainbinding) Line 87
    libintl-9.dll!libintl_dcigettext(volatile char * domainname, volatile char * msgid1, volatile char * msgid2, int plural, unsigned int n, int category)
    libintl-9.dll!libintl_dcgettext(volatile char * domainname, volatile char * msgid, int category) Line 48
    libpq.dll!emitHostIdentityInfo(pg_conn * conn, const char * host_addr) Line 2028
    libpq.dll!PQconnectPoll(pg_conn * conn) Line 2920
    libpq.dll!connectDBStart(pg_conn * conn) Line 2383
    libpq.dll!PQsetdbLogin(const char * pghost, const char * pgport, const char * pgoptions, const char * pgtty, const char * dbName, const char * login, const char * pwd) Line 1914

And like this in WinDbg:

Frame Index  Call Site                            Child-SP      Return Address
[0x0]        ntdll!NtTerminateProcess+0x14        0x9c464fde18  0x7ff9f5aeda98
[0x1]        ntdll!RtlExitUserProcess+0xb8        0x9c464fde20  0x7ff9f4fbe3bb
[0x2]        KERNEL32!FatalExit+0xb               0x9c464fde50  0x7ff9f57fa155
[0x3]        msvcrt!exit+0x85                     0x9c464fde80  0x7ff9f57fa7c5
[0x4]        msvcrt!initterm_e+0x245              0x9c464fdeb0  0x7ff9f57ef26d
[0x5]        msvcrt!abort+0x8d                    0x9c464fdf20  0x68281bcd
[0x6]        libintl_9!_nl_find_domain+0x28d      0x9c464fe4d0  0x68284cbb
[0x7]        libintl_9!libintl_dcigettext+0x31b   0x9c464fe5b0  0x6828190c
[0x8]        libintl_9!libintl_dcgettext+0x1c     0x9c464fe6c0  0x7ff99e0691d3
[0x9]        LIBPQ!emitHostIdentityInfo+0x103     0x9c464fe700  0x7ff99e063d7a
[0xa]        LIBPQ!PQconnectPoll+0x8ba            0x9c464feb80  0x7ff99e06697c
[0xb]        LIBPQ!connectDBStart+0x9c            0x9c464ff5e0  0x7ff99e06509b
[0xc]        LIBPQ!PQsetdbLogin+0x19b             0x9c464ff610  0x7ff6bd794f28

It comes down to:
* PostgreSQL calling libpq_gettext("connection to server at \"%s\" (%s), port %s failed: ")
* libintl_9 libintl_dcgettext() aborting in _nl_find_domain()

WinDbg pointed me to gettext-0.19.8, which seems to be the version packaged
by EnterpriseDB.

A further Google search turned up [3], which offers a final explanation of,
and solution to, the discussion in [4]:
* There is a bug in gettext-0.19.8 that causes the problem.
* EnterpriseDB ships gettext-0.19.8 because, according to [5], a newer
gettext version caused performance issues in UDFs.
* Taking libintl-8.dll from gettext-0.21 and renaming it to libintl-9.dll
solves this issue.

My questions:
* Does anyone know the status of these performance issues with newer
gettext() versions?
* Is there a better way to achieve a stable concurrent connect when using
Windows?

Regards,
Christian Maurer

[1] https://www.postgresql.org/message-id/18312-bbbabc8113592b78%40postgresql.org
[2] https://www.enterprisedb.com/download-postgresql-binaries
[3] https://github.com/diesel-rs/diesel/discussions/2947#discussioncomment-2025857
[4] https://www.postgresql.org/message-id/CAE7q7Eit4Eq2%3Dbxce%3DFm8HAStECjaXUE%3DWBQc-sDDcgJQ7s7eg%40mail.gmail.com
[5] https://www.postgresql.org/message-id/9bc5f1df-e847-29a4-a907-7a5d643b7d0d%40dunslane.net


-- main.cpp start --
#include <iostream>
#include <thread>
#include <vector>
#include <pgsql/libpq-fe.h>

void test(int i)
{
    try {
        std::cout << "Connect " << i << std::endl;
        PGconn *pgConn = PQsetdbLogin("myHost", (const char*)NULL,
                                      (const char*)NULL, (const char*)NULL,
                                      "myDatabase", "myUser", "myPassword");
        std::cout << "Connected " << i << std::endl;
        PQfinish(pgConn);
        std::cout << "Finished " << i << std::endl;
    }
    catch (...)
    {
        std::cout << "Exception occurred in " << i << std::endl;
    }
}

int main()
{
    std::cout << "PQisthreadsafe: " << PQisthreadsafe() << std::endl;

    try {

        const std::size_t maxThreads = 4U;
        std::vector<std::thread> myThreads;
        for (std::size_t i = 0; i < maxThreads; ++i)
            myThreads.push_back(std::thread(test, i));

        for (std::thread& myThread : myThreads)
            myThread.join();

        std::cout << "End" << std::endl;
    }
    catch (...)
    {
        std::cout << "Exception occurred in main" << std::endl;
    }

    return 0;
}
-- main.cpp end --

