Re: PL/TCL Patch to prevent postgres from becoming multithreaded - Mailing list pgsql-patches

From Bruce Momjian
Subject Re: PL/TCL Patch to prevent postgres from becoming multithreaded
Date
Msg-id 200709141720.l8EHKwX13792@momjian.us
Whole thread Raw
In response to PL/TCL Patch to prevent postgres from becoming multithreaded  ("Marshall, Steve" <smarshall@wsi.com>)
Responses Re: PL/TCL Patch to prevent postgres from becoming multithreaded  (Gregory Stark <stark@enterprisedb.com>)
List pgsql-patches
This has been saved for the 8.4 release:

    http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Marshall, Steve wrote:
> There is a problem in PL/TCL that can cause the postgres backend to
> become multithreaded.   Postgres is not designed to be multithreaded, so
> this causes downstream errors in signal handling.  We have seen this
> cause a number of "unexpected state" errors associated with notification
> handling; however, unpredictable signal handling would be likely to
> cause other errors as well.
>
> Some sample scripts are attached which will reproduce this problem when
> running against a multithreaded version of TCL, but will work without
> error with single-threaded TCL library.  The scripts are a combination
> of Unix shell, perl DBI, and SQL commands.  The postgres process can be
> seen to have multiple threads using the Linux command "ps -Lwfu
> postgres".  In this command the NLWP columns will be 2 for multithreaded
> backend processes.  The threaded/non-threaded state of the TCL library
> can be ascertained on Linux using ldd to determine if libpthread.so is
> linked to the TCL library (e.g. "ldd /usr/lib/libtcl8.4.so").
>
> The multithreaded behavior occurs the first time PL/TCL is used in a
> postgres backend, but only when postgres is linked against a
> multithread-enabled version of libtcl.  Thus, this problem can be
> side-stepped by linking against the proper TCL library.  However
> multithreaded TCL libraries are becoming the norm in Linux distributions
> and seems ubiquitous in the Windows world.  Therefore a fix to the
> PL/TCL code is warrented.
>
> We determined that postgres became multithreaded during the creation of
> the TCL interpreter in a function called tcl_InitNotifier.  This
> function is part of TCL's Notifier subsystem, which is used to monitor
> for events asynchronously from the TCL event loop.  Although initialized
> when an interpreter is created, the Notifier subsystem is not used until
> a process enters the TCL event loop.  This never happens within a
> postgres process, because postgres implements its own event loop.
> Therefore the initialization of the Notifier subsystem is not necessary
> within the context of PL/TCL.
>
> Our solution was to disable the Notifier subsystem by overriding the
> functions associated with it using the Tcl_SetNotifier function.  This
> allows 8 functions related to the Notifier to overriden.  Even though we
> found only two of the functions were ever called within postgres, we
> overrode 8 functions with no-op versions, just for completeness.  A
> patch file containing the changes to pltcl.c from its 8.2.4 version is
> also attached.
>
> We tested this patch with PostgreSQL 8.2.4 on both RedHat Enterprise 4.0
> usingTCL 8.4 (single threaded) and RHE 5.0 using TCL 8.4.13
> (multithreaded).  We expect this solution to work with Windows as well,
> although we have not tested it.  There may be some problems using this
> solution with old versions of TCL that pre-date the Tcl_SetNotifier
> function.  However this function has been around for quite a while; it
> was added in in the TCL 8.2 release, circa 2000.
>
> We hope this patch will be considered for a future PostgreSQL release.
>
> Steve Marshall
> Paul Bayer
> Doug Knight
> WSI Corporation
>
>

[ application/x-gzip is not supported, skipping... ]

> *** pltcl.c.orig    2007-09-10 12:58:34.000000000 -0400
> --- pltcl.c    2007-09-11 11:37:33.363222114 -0400
> ***************
> *** 163,168 ****
> --- 163,258 ----
>   static void pltcl_build_tuple_argument(HeapTuple tuple, TupleDesc tupdesc,
>                              Tcl_DString *retval);
>
> + /**********************************************************************
> +  *  Declarations for functions overriden using Tcl_SetNotifier.
> +  **********************************************************************/
> + static int fakeThreadKey;   /* To give valid address for ClientData */
> +
> + static ClientData
> + pltcl_InitNotifier _ANSI_ARGS_((void));
> +
> + static void
> + pltcl_FinalizeNotifier _ANSI_ARGS_((ClientData clientData));
> +
> + static void
> + pltcl_SetTimer _ANSI_ARGS_((Tcl_Time *timePtr));
> +
> + static void
> + pltcl_AlertNotifier _ANSI_ARGS_((ClientData clientData));
> +
> + static void
> + pltcl_CreateFileHandler _ANSI_ARGS_((int fd, int mask, Tcl_FileProc *proc, ClientData clientData));
> +
> + static void
> + pltcl_DeleteFileHandler _ANSI_ARGS_((int fd));
> +
> + static void
> + pltcl_ServiceModeHook _ANSI_ARGS_((int mode));
> +
> + static int
> + pltcl_WaitForEvent _ANSI_ARGS_((Tcl_Time *timePtr));
> +
> + /**********************************************************************
> +  *  Definitions for functions overriden using Tcl_SetNotifier.
> +  *  These implementations effectively disable the TCL Notifier subsystem.
> +  *  This is okay because we never enter the TCL event loop from postgres,
> +  *  so the notifier capabilities are initialized, but never used.
> +  *
> +  *  NOTE: Only InitNotifier and DeleteFileHandler ever seem to get called
> +  *        by postgres, but we implement all the functions for completeness.
> +  **********************************************************************/
> +
> + ClientData
> + pltcl_InitNotifier()
> + {
> +     return (ClientData) &(fakeThreadKey);
> + }
> +
> + void
> + pltcl_FinalizeNotifier(clientData)
> +     ClientData clientData;      /* Not used. */
> + {
> + }
> +
> + void
> + pltcl_SetTimer(timePtr)
> +     Tcl_Time *timePtr;
> + {
> + }
> +
> + void
> + pltcl_AlertNotifier(clientData)
> +     ClientData clientData;
> + {
> + }
> +
> + void
> + pltcl_CreateFileHandler(fd, mask, proc, clientData)
> +     int fd;
> +     int mask;
> +     Tcl_FileProc *proc;
> +     ClientData clientData;
> + {
> + }
> +
> + void
> + pltcl_DeleteFileHandler(fd)
> +     int fd;
> + {
> + }
> +
> + void
> + pltcl_ServiceModeHook(mode)
> +     int mode;
> + {
> + }
> +
> + int
> + pltcl_WaitForEvent(timePtr)
> +     Tcl_Time *timePtr;      /* Maximum block time, or NULL. */
> + {
> +     return 0;
> + }
>
>   /*
>    * This routine is a crock, and so is everyplace that calls it.  The problem
> ***************
> *** 189,194 ****
> --- 279,287 ----
>   void
>   _PG_init(void)
>   {
> +     /*  Notifier structure used to override functions in Notifier subsystem*/
> +     Tcl_NotifierProcs notifier;
> +
>       /* Be sure we do initialization only once (should be redundant now) */
>       if (pltcl_pm_init_done)
>           return;
> ***************
> *** 199,204 ****
> --- 292,316 ----
>   #endif
>
>       /************************************************************
> +      * Override the functions in the Notifier subsystem.
> +      *
> +      * We do this to prevent the postgres backend from becoming
> +      * multithreaded, which happens in the default version of
> +      * Tcl_InitNotifier if the TCL library has been compiled with
> +      * multithreading support (i.e. when TCL_THREADS is defined
> +      * under Unix, and in all cases under Windows).
> +      ************************************************************/
> +     notifier.setTimerProc          = pltcl_SetTimer;
> +     notifier.waitForEventProc      = pltcl_WaitForEvent;
> +     notifier.createFileHandlerProc = pltcl_CreateFileHandler;
> +     notifier.deleteFileHandlerProc = pltcl_DeleteFileHandler;
> +     notifier.initNotifierProc      = pltcl_InitNotifier;
> +     notifier.finalizeNotifierProc  = pltcl_FinalizeNotifier;
> +     notifier.alertNotifierProc     = pltcl_AlertNotifier;
> +     notifier.serviceModeHookProc   = pltcl_ServiceModeHook;
> +     Tcl_SetNotifier(¬ifier);
> +
> +     /************************************************************
>        * Create the dummy hold interpreter to prevent close of
>        * stdout and stderr on DeleteInterp
>        ************************************************************/

>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq

--
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

pgsql-patches by date:

Previous
From: Tom Lane
Date:
Subject: Re: XML binary I/O (was Re: tsearch refactorings)
Next
From: Gregory Stark
Date:
Subject: Re: PL/TCL Patch to prevent postgres from becoming multithreaded