Re: Bug in ecpg lib ? - Mailing list pgsql-general

From Albe Laurenz
Subject Re: Bug in ecpg lib ?
Date
Msg-id D960CB61B694CF459DCFB4B0128514C203937E11@exadv11.host.magwien.gv.at
In response to Re: Bug in ecpg lib ?  (leif@crysberg.dk)
List pgsql-general
lj@crysberg.dk wrote:
>     I have been trying to figure this thing out myself too, 
> breakpointing and single stepping my way through some of the 
> ecpg code, but without much clarification. (More that I 
> learned new things about pthread). I have been trying to 
> figure out whether this is a real thing or more a mudflapth 
> "mis-judgement". Also on most (the faster ones) machines 
> mudflap complains either about "invalid pointer in free()" or 
> "double free() or corruption". I haven't been able to verify 
> this yet. Specifically on one (slower) machine, I have only 
> seen this mudflapth complaint once, though I have been both 
> running and debugging it on that many times.
> 
>     Are you sure what you suggest is nonsense ? In the light 
> of the sqlca struct being "local" to each thread ? I tried to 
> put the open and close connection within the thread, but I 
> was still able to get the mudflap complaint. Theoretically, I 
> guess one could use just 1 connection for all db access in 
> all threads just having them enclosed within 
> pthread_mutex_[un]lock()s !? (Not what I do, though.)

The sqlca is local to each thread, but that should not be a problem.
On closer scrutiny of the source, it works like this:

The first time a thread performs an SQL operation, the ECPG function
ECPGget_sqlca() allocates an sqlca with malloc() and stores the pointer
in the thread-specific data area (TSD). Later calls from the same thread
return that same pointer. When the thread exits or is cancelled,
pthread frees the sqlca by calling the ECPG function
ecpg_sqlca_key_destructor(). pthread makes sure that each
destructor function is only called once per thread.

So when several threads use a connection, there will be
several sqlca's around, but that should not matter, as each one is
freed when its own thread exits.
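
To make the mechanism above concrete, here is a minimal stand-alone sketch of the same pattern. It is not the libecpg code: struct demo_sqlca and every demo_* name are made up for illustration, and a counter has been added to the destructor so the "once per thread" behaviour can be checked. It assumes, as is the case with glibc, that a thread's TSD destructors have finished by the time pthread_join() returns.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define DEMO_MAX_THREADS 16

/* Hypothetical stand-in for struct sqlca_t. */
struct demo_sqlca { long sqlcode; };

static pthread_key_t   demo_key;
static pthread_once_t  demo_key_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static int             demo_frees = 0;   /* destructor calls so far */

/* Called by pthread at thread exit when the thread's value is non-NULL. */
static void demo_destructor(void *arg)
{
    pthread_mutex_lock(&demo_lock);
    demo_frees++;
    pthread_mutex_unlock(&demo_lock);
    free(arg);
}

static void demo_key_init(void)
{
    pthread_key_create(&demo_key, demo_destructor);
}

/* Same shape as ECPGget_sqlca(): allocate lazily, once per thread. */
static struct demo_sqlca *demo_get_sqlca(void)
{
    struct demo_sqlca *s;

    pthread_once(&demo_key_once, demo_key_init);
    s = pthread_getspecific(demo_key);
    if (s == NULL)
    {
        s = calloc(1, sizeof(*s));
        if (s != NULL)
            pthread_setspecific(demo_key, s);
    }
    return s;
}

/* Worker: two lookups in one thread must yield the same pointer. */
static void *demo_worker(void *out)
{
    struct demo_sqlca *first = demo_get_sqlca();
    struct demo_sqlca *second = demo_get_sqlca();

    *(int *) out = (first != NULL && first == second);
    return NULL;
}

/* Runs n workers; returns the number of destructor calls, or -1 on error. */
int demo_count_frees(int n)
{
    pthread_t tid[DEMO_MAX_THREADS];
    int       same[DEMO_MAX_THREADS];
    int       i, started = 0, ok = 1, before, after;

    assert(n >= 0 && n <= DEMO_MAX_THREADS);

    pthread_mutex_lock(&demo_lock);
    before = demo_frees;
    pthread_mutex_unlock(&demo_lock);

    for (i = 0; i < n; i++)
    {
        same[i] = 0;
        if (pthread_create(&tid[i], NULL, demo_worker, &same[i]) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
    {
        pthread_join(tid[i], NULL);
        if (!same[i])
            ok = 0;
    }

    pthread_mutex_lock(&demo_lock);
    after = demo_frees;
    pthread_mutex_unlock(&demo_lock);

    return (ok && started == n) ? after - before : -1;
}
```

Built with gcc -pthread and called from a main() of your own, demo_count_frees(3) should report one free per worker thread. The calling thread never touches the key here, so it contributes no allocation of its own.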

After some experiments, I would say that mudflap's complaint
is a mistake.

I've compiled your program against a debug-enabled PostgreSQL 8.4.0 with

$ ecpg crashex

$ gcc -Wall -O0 -g -o crashex crashex.c -I /magwien/postgres-8.4.0/include \
-L/magwien/postgres-8.4.0/lib -lecpg -Wl,-rpath,/magwien/postgres-8.4.0/lib

and run a gdb session:

$ gdb
GNU gdb Red Hat Linux (6.3.0.0-1.138.el3rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu".

   Set the program to be debugged:

(gdb) file crashex
Reading symbols from /home/laurenz/ecpg/crashex...done.
Using host libthread_db library "/lib/tls/libthread_db.so.1".

   This is where the source of libecpg is:

(gdb) dir /home/laurenz/rpmbuild/BUILD/postgresql-8.4.0/src/interfaces/ecpg/ecpglib
Source directories searched: /home/laurenz/rpmbuild/BUILD/postgresql-8.4.0/src/interfaces/ecpg/ecpglib:$cdir:$cwd

   Start the program (main thread):

(gdb) break main
Breakpoint 1 at 0x804892c: file crashex.pgc, line 54.
(gdb) run
Starting program: /home/laurenz/ecpg/crashex 
[Thread debugging using libthread_db enabled]
[New Thread -1218572160 (LWP 29290)]
[Switching to Thread -1218572160 (LWP 29290)]

Breakpoint 1, main (argc=1, argv=0xbfffce44) at crashex.pgc:54
54      PerformTask( 25 );
(gdb) delete
Delete all breakpoints? (y or n) y

   Set breakpoint #2 in the function where sqlca is freed:

(gdb) break ecpg_sqlca_key_destructor
Breakpoint 2 at 0x457a27: file misc.c, line 124.
(gdb) list misc.c:124
119    
120    #ifdef ENABLE_THREAD_SAFETY
121    static void
122    ecpg_sqlca_key_destructor(void *arg)
123    {
124        free(arg);                    /* sqlca structure allocated in ECPGget_sqlca */
125    }
126    
127    static void
128    ecpg_sqlca_key_init(void)

   Set breakpoint #3 where a new sqlca is allocated in ECPGget_sqlca():

(gdb) break misc.c:147
Breakpoint 3 at 0x457ad2: file misc.c, line 147.
(gdb) list misc.c:134,misc.c:149
134    struct sqlca_t *
135    ECPGget_sqlca(void)
136    {
137    #ifdef ENABLE_THREAD_SAFETY
138        struct sqlca_t *sqlca;
139    
140        pthread_once(&sqlca_key_once, ecpg_sqlca_key_init);
141    
142        sqlca = pthread_getspecific(sqlca_key);
143        if (sqlca == NULL)
144        {
145            sqlca = malloc(sizeof(struct sqlca_t));
146            ecpg_init_sqlca(sqlca);
147            pthread_setspecific(sqlca_key, sqlca);
148        }
149        return (sqlca);
(gdb) cont
Continuing.

   Breakpoint #3 is hit when the main thread allocates an sqlca during connect:

Breakpoint 3, ECPGget_sqlca () at misc.c:147
147            pthread_setspecific(sqlca_key, sqlca);
(gdb) where
#0  ECPGget_sqlca () at misc.c:147
#1  0x00456d57 in ECPGconnect (lineno=41, c=0, name=0x9bf2008 "test@localhost:1238", 
    user=0x8048a31 "laureny", passwd=0x0, connection_name=0x8048a14 "dbConn", autocommit=0)
    at connect.c:270
#2  0x080488a3 in PerformTask (TaskId=25) at crashex.pgc:41
#3  0x08048936 in main (argc=1, argv=0xbfffce44) at crashex.pgc:54

   This is the address of the main thread's sqlca:

(gdb) print sqlca
$1 = (struct sqlca_t *) 0x9bf2028
(gdb) cont
Continuing.
[New Thread 27225008 (LWP 29343)]
[Switching to Thread 27225008 (LWP 29343)]

   Breakpoint #3 is hit again when the new thread allocates its sqlca while executing the SELECT statement:

Breakpoint 3, ECPGget_sqlca () at misc.c:147
147            pthread_setspecific(sqlca_key, sqlca);
(gdb) where
#0  ECPGget_sqlca () at misc.c:147
#1  0x004579aa in ecpg_init (con=0x0, connection_name=0x8048a14 "dbConn", lineno=22) at misc.c:107
#2  0x00451a97 in ECPGdo (lineno=22, compat=0, force_indicator=1, 
    connection_name=0x8048a14 "dbConn", questionmarks=0 '\0', st=0, query=0x8048a1b "select 2 + 2")
    at execute.c:1470
#3  0x080487f7 in Work () at crashex.pgc:22
#4  0x00c8cdd8 in start_thread () from /lib/tls/libpthread.so.0
#5  0x003e5fca in clone () from /lib/tls/libc.so.6

   This is the address of the new thread's sqlca:

(gdb) print sqlca
$2 = (struct sqlca_t *) 0x9c16ee8
(gdb) cont
Continuing.
2+2=0.

   Breakpoint #2 is hit when the new thread is cancelled:

Breakpoint 2, ecpg_sqlca_key_destructor (arg=0x9c16ee8) at misc.c:124
124        free(arg);                    /* sqlca structure allocated in ECPGget_sqlca */
(gdb) where
#0  ecpg_sqlca_key_destructor (arg=0x9c16ee8) at misc.c:124
#1  0x00c8d799 in deallocate_tsd () from /lib/tls/libpthread.so.0
#2  0x00c8cde6 in start_thread () from /lib/tls/libpthread.so.0
#3  0x003e5fca in clone () from /lib/tls/libc.so.6

   The freed pointer is the sqlca of the new thread:

(gdb) print arg
$3 = (void *) 0x9c16ee8

   And the program terminates with no problems.

(gdb) cont
Continuing.
[Thread 27225008 (zombie) exited]

Program exited normally.
(gdb) quit


This all looks just like it should, doesn't it?

Yours,
Laurenz Albe
