Thread: DBD::Pg 'lo_read' fails on >= 32768 byte large objects

From
Kevin Houle
Date:
Hello,

There appears to be a bug in the code used by DBD::Pg's 'lo_read'
function. I presume the code with the bug is a part of libpq, and
thus the postgresql-libs RPM binary distribution.

The nature of the bug is that an 'lo_read' operation performed
with DBD::Pg caused a segfault with postgresql-libs-7.3.2 and
"hangs" on files >= 32768 bytes with postgresql-libs-7.3.4. The
hang is actually a read() loop on the socket generating EAGAIN
error on each read().

I've attached a short perl script that demonstrates the bug.
The database server does not seem to matter; I get the same results
using 7.2 as I do with 7.3.x servers. I also get the same results
when I vary the perl-DBD-Pg and perl-DBI module versions. The bug
seems to follow libpq.

All of my testing has been on the i686 platform using linux 2.4.20
kernels.

Test case #1:
 - client = postgresql-libs-7.2.3-5 (redhat-7.3)
            perl-DBI-1.21-1 (redhat-7.3)
            perl-DBD-Pg-1.01-8 (redhat-7.3)
            perl-5.6.1
 - server = postgresql-server-7.3.2-3 (redhat-9)
 - transport = tcpip + ssl

 $ perl test.pl
 - reading testfile '/bin/ls'
 - inserting testfile contents
 - oid = '16265435'
 - reading large object
 success.. removing large object

Test case #2:
 - client = postgresql-libs-7.3.2-3 (redhat-9)
            perl-DBI-1.32-5 (redhat-9)
            perl-DBD-Pg-1.21-2 (redhat-9)
 - server = postgresql-server-7.3.2-3 (redhat-9)
 - transport = tcpip + ssl

 $ perl test.pl
 - reading testfile '/bin/ls'
 - inserting testfile contents
 - oid = '16265436'
 - reading large object
 Segmentation fault

 ... after ~32768 bytes, process loops reading the socket
  read(3, 0x81cc938, 5) = -1 EAGAIN (Resource temporarily unavailable)
 and eventually segfaults

Test case #3:
 - same as #2 except
 - transport = tcpip (localhost)

 $ perl test.pl
 - reading testfile '/bin/ls'
 - inserting testfile contents
 - oid = '16265436'
 - reading large object
 Segmentation fault

Test case #4:
 - same as #2 and #3 except
 - postgresql-libs-7.3.4 PGDG used

 $ perl test.pl
 - reading testfile '/bin/ls'
 - inserting testfile contents
 - oid = '16265437'
 - reading large object

 ... after ~32768 bytes, process loops reading the socket
  read(3, 0x81cc938, 5) = -1 EAGAIN (Resource temporarily unavailable)
 but doesn't appear to segfault (unless I am not patient enough)

This problem is a bastard as it makes getting data out of an
otherwise perfectly good database rather difficult. In other
words, it's a show-stopper when migrating from 7.2 -> 7.3.

Regards,
Kevin

Attachment

Re: DBD::Pg 'lo_read' fails on >= 32768 byte large objects

From
Tom Lane
Date:
Kevin Houle <kjh@cert.org> writes:
> The nature of the bug is that an 'lo_read' operation performed
> with DBD::Pg caused a segfault with postgresql-libs-7.3.2 and
> "hangs" on files >= 32768 bytes with postgresql-libs-7.3.4. The
> hang is actually a read() loop on the socket generating EAGAIN
> error on each read().

Can you get us a gdb stack trace from the segfault cases?  (And from
the loop cases too; a stack trace back from the repeated read() call
would be useful.)

Also, does adding/removing SSL encryption affect your results?

            regards, tom lane

Re: DBD::Pg 'lo_read' fails on >= 32768 byte large objects

From
Tom Lane
Date:
Kevin Houle <kjh@cert.org> writes:
>> Can you get us a gdb stack trace from the segfault cases?

> #36 0x40421712 in pqsecure_read () from /usr/lib/libpq.so.3
> #37 0x40421712 in pqsecure_read () from /usr/lib/libpq.so.3
> #38 0x40421712 in pqsecure_read () from /usr/lib/libpq.so.3
> #39 0x40421712 in pqsecure_read () from /usr/lib/libpq.so.3
> ...
> cut thousands of lines; you get the idea

In 7.3.2 pqsecure_read will recurse to self when SSL_read returns
SSL_ERROR_WANT_READ.  I changed the recursion to a loop in 7.3.4.
Evidently, in 7.3.2 it's possible for the recursion to overflow your
allotted stack space before the process uses up its timeslice :-(.
In 7.3.4 the loop simply doesn't exit.

I don't understand why, though.  What I expected would happen is that
the process would busy-wait until more data becomes available from the
connection.  That's not ideal but the full fix seemed safer to postpone
to 7.4.  Can you set up this situation, then attach with gdb to the
connected backend and see what it thinks it's doing?  A stacktrace from
that side of the connection might shed some light.

            regards, tom lane

Re: DBD::Pg 'lo_read' fails on >= 32768 byte large objects

From
Tom Lane
Date:
Kevin Houle <kjh@cert.org> writes:
> Is it just me, or are both sides reading waiting for the other
> side to send data?

Sure looks like it.  Could it be an OpenSSL bug?

            regards, tom lane

Re: DBD::Pg 'lo_read' fails on >= 32768 byte large objects

From
Kevin Houle
Date:
--On Friday, August 15, 2003 08:43:23 AM -0400 Tom Lane <tgl@sss.pgh.pa.us>
wrote:

> Kevin Houle <kjh@cert.org> writes:
>> The nature of the bug is that an 'lo_read' operation performed
>> with DBD::Pg caused a segfault with postgresql-libs-7.3.2 and
>> "hangs" on files >= 32768 bytes with postgresql-libs-7.3.4. The
>> hang is actually a read() loop on the socket generating EAGAIN
>> error on each read().
>
> Can you get us a gdb stack trace from the segfault cases?  (And from
> the loop cases too; a stack trace back from the repeated read() call
> would be useful.)

The attached gdb-7.3.2.log is from the segfault case using
postgresql-libs-7.3.2 and the gdb-7.3.4.log is the loop case
from postgresql-libs-7.3.4. I had to SIGINT the loop case to
get the stack trace.

> Also, does adding/removing SSL encryption affect your results?

Yes. It works flawlessly without SSL. I thought I had tested that
previously, but in fact I had not done so correctly; on re-testing,
it works without SSL and fails with SSL. The stack traces seem to
point to pqsecure_read().

Let me know if I can provide anything else.

Kevin



Attachment

Re: DBD::Pg 'lo_read' fails on >= 32768 byte large objects

From
Kevin Houle
Date:
--On Friday, August 15, 2003 02:09:32 PM -0400 Tom Lane <tgl@sss.pgh.pa.us>
wrote:

> In 7.3.2 pqsecure_read will recurse to self when SSL_read returns
> SSL_ERROR_WANT_READ.  I changed the recursion to a loop in 7.3.4.
> Evidently, in 7.3.2 it's possible for the recursion to overflow your
> allotted stack space before the process uses up its timeslice :-(.
> In 7.3.4 the loop simply doesn't exit.

I'll consider 7.3.2 a bad choice.

> I don't understand why, though.  What I expected would happen is that
> the process would busy-wait until more data becomes available from the
> connection.  That's not ideal but the full fix seemed safer to postpone
> to 7.4.  Can you set up this situation, then attach with gdb to the
> connected backend and see what it thinks it's doing?  A stacktrace from
> that side of the connection might shed some light.

Okay. Here's stack traces from both sides, to put the information
in the same bucket. The gdb info on the client side...

(gdb) info stack
#0  0xffffe002 in ?? ()
#1  0x404dd242 in BIO_read () from /lib/libcrypto.so.4
#2  0x40455bf0 in ssl3_alert_code () from /lib/libssl.so.4
#3  0x40455d8c in ssl3_alert_code () from /lib/libssl.so.4
#4  0x4045705e in ssl3_read_bytes () from /lib/libssl.so.4
#5  0x40454619 in ssl3_write () from /lib/libssl.so.4
#6  0x404546a7 in ssl3_read () from /lib/libssl.so.4
#7  0x4045cd2b in SSL_read () from /lib/libssl.so.4
#8  0x404216d1 in pqsecure_read () from /usr/lib/libpq.so.3
#9  0x4041d76a in pqReadData () from /usr/lib/libpq.so.3
#10 0x4041c224 in PQfn () from /usr/lib/libpq.so.3
#11 0x4041f773 in lo_read () from /usr/lib/libpq.so.3
#12 0x4014c194 in pg_db_lo_read (dbh=0xfffffff5, fd=5,
    buf=0x5 <Address 0x5 out of bounds>, len=5) at dbdimp.c:545
#13 0x40144881 in XS_DBD__Pg__st_bind_param_inout (my_perl=0x804bcb8,
    cv=0x81b1794) at Pg.xs:282
#14 0x40400247 in XS_DBI_dispatch (my_perl=0x804bcb8, cv=0x8187ee0)
    at DBI.xs:2320
#15 0x400a34e5 in Perl_pp_entersub (my_perl=0x804bcb8) at pp_hot.c:2781
#16 0x40086c9a in Perl_runops_debug (my_perl=0x804bcb8) at dump.c:1414
#17 0x4003a9ab in S_run_body (my_perl=0x804bcb8, oldscope=-11) at perl.c:1705
#18 0x4003a635 in perl_run (my_perl=0x804bcb8) at perl.c:1624
#19 0x080493a3 in main ()
#20 0x420156a4 in __libc_start_main () from /lib/tls/libc.so.6

(gdb) info reg
eax            0xfffffff5       -11
ecx            0x81cc880        136104064
edx            0x5      5
ebx            0x3      3
esp            0xbfffeaa4       0xbfffeaa4
ebp            0xbfffeac8       0xbfffeac8
esi            0x0      0
edi            0x81cc880        136104064
eip            0xffffe002       0xffffe002
eflags         0x246    582
cs             0x23     35
ss             0x2b     43
ds             0x2b     43
es             0x2b     43
fs             0x0      0
gs             0x33     51


And here is the gdb info from the attached backend process.

(gdb) info stack
#0  0xffffe002 in ?? ()
#1  0x400d5242 in BIO_read () from /lib/libcrypto.so.4
#2  0x4004dbf0 in ssl3_alert_code () from /lib/libssl.so.4
#3  0x4004dd8c in ssl3_alert_code () from /lib/libssl.so.4
#4  0x4004f05e in ssl3_read_bytes () from /lib/libssl.so.4
#5  0x4004c619 in ssl3_write () from /lib/libssl.so.4
#6  0x4004c6a7 in ssl3_read () from /lib/libssl.so.4
#7  0x40054d2b in SSL_read () from /lib/libssl.so.4
#8  0x080eff16 in secure_read ()
#9  0x080f43d2 in StreamClose ()
#10 0x080f4495 in pq_getbyte ()
#11 0x0813a194 in HandleFunctionRequest ()
#12 0x0813a21a in HandleFunctionRequest ()
#13 0x0813b8ba in PostgresMain ()
#14 0x0811f37b in ClosePostmasterPorts ()
#15 0x0811ee03 in ClosePostmasterPorts ()
#16 0x0811dc09 in PostmasterMain ()
#17 0x0811d3a6 in PostmasterMain ()
#18 0x080f5113 in main ()
#19 0x420156a4 in __libc_start_main () from /lib/tls/libc.so.6

(gdb) info reg
eax            0xfffffe00       -512
ecx            0x8282740        136849216
edx            0x5      5
ebx            0x8      8
esp            0xbfffe0b4       0xbfffe0b4
ebp            0xbfffe0d8       0xbfffe0d8
esi            0x0      0
edi            0x8282740        136849216
eip            0xffffe002       0xffffe002
eflags         0x246    582
cs             0x23     35
ss             0x2b     43
ds             0x2b     43
es             0x2b     43
fs             0x0      0
gs             0x33     51

Is it just me, or are both sides reading waiting for the other
side to send data?

Kevin

Re: DBD::Pg 'lo_read' fails on >= 32768 byte large objects

From
Kevin Houle
Date:
Tom Lane wrote:
> Kevin Houle <kjh@cert.org> writes:
>
>>Is it just me, or are both sides reading waiting for the other
>>side to send data?
>
> Sure looks like it.  Could it be an OpenSSL bug?

Well, redhat-9 ships with openssl-0.9.7a so I tried it
with openssl-0.9.7b and lo_read still caused a read()
loop after 32768 bytes. Then I used a Debian sid machine
for testing using the same test case. It uses

   postgresql 7.3.3
   openssl 0.9.7b

The lo_read worked flawlessly on the Debian box. So I grabbed
the postgresql 7.3.3 packages from the PGDG website and tried
those on redhat-9. Still experienced the loop after 32768 bytes.

I'm at a bit of a loss here. It would be nice if someone else
could run the test script (see original message in this thread)
on a redhat-9 box to eliminate my environment from the equation.

Kevin

Re: DBD::Pg 'lo_read' fails on >= 32768 byte large objects

From
Kevin Houle
Date:
Tom Lane wrote:

> Kevin Houle <kjh@cert.org> writes:
>
>>Is it just me, or are both sides reading waiting for the other
>>side to send data?
>
> Sure looks like it.  Could it be an OpenSSL bug?

One more data point. The DBD::Pg 'lo_extract' function works
fine across SSL. There is no issue with large objects >= 32K
using 'lo_extract'. So that casts doubt on it being an OpenSSL
issue. Is there a different code path within libpq.so to move
data from the server to the client via SSL for lo_extract than
for lo_read that we can learn from? I'm looking at the code,
but for the first time.

Kevin

Re: DBD::Pg 'lo_read' fails on >= 32768 byte large objects

From
Kevin Houle
Date:
Kevin Houle wrote:

> One more data point. The DBD::Pg 'lo_extract' function works
> fine across SSL. There is no issue with large objects >= 32K
> using 'lo_extract'. So that casts doubt on it being an OpenSSL
> issue. Is there a different code path within libpq.so to move
> data from the server to the client via SSL for lo_extract than
> for lo_read that we can learn from? I'm looking at the code,
> but for the first time.

This looks suspicious.

$ pwd; find . -type f | xargs grep inv_read
/local/kjh/pgRPM/postgresql-7.3.4/src
./backend/libpq/be-fsstubs.c:   status = inv_read(cookies[fd], buf, len);
./backend/libpq/be-fsstubs.c:   while ((nbytes = inv_read(lobj, buf, BUFSIZE)) > 0)
./backend/storage/large_object/inv_api.c:inv_read(LargeObjectDesc *obj_desc, char *buf, int nbytes)
./include/storage/large_object.h:extern int     inv_read(LargeObjectDesc *obj_desc, char *buf, int nbytes);

inv_read(cookies[fd], buf, len) is in lo_read()
inv_read(lobj, buf, BUFSIZE) is in lo_export()

./backend/libpq/be-fsstubs.c:#define BUFSIZE                    8192

The lo_export() function loops calling inv_read() until the
entire object is read. The lo_read() function simply passes
the requested byte count to a single invocation of the
inv_read() function. So if I use a length >= 32768 for
lo_read(), the whole length is handed to inv_read() in one
call rather than in BUFSIZE-sized chunks.

Seems to me lo_read() should loop around inv_read() using
BUFSIZE like lo_export() does.

Kevin

Re: DBD::Pg 'lo_read' fails on >= 32768 byte large objects

From
Kevin Houle
Date:
Kevin Houle wrote:

> Tom Lane wrote:
>
>> Kevin Houle <kjh@cert.org> writes:
>>
>>> Is it just me, or are both sides reading waiting for the other
>>> side to send data?
>>
>>
>> Sure looks like it.  Could it be an OpenSSL bug?
>
> One more data point. The DBD::Pg 'lo_extract' function works
> fine across SSL. There is no issue with large objects >= 32K
> using 'lo_extract'. So that casts doubt on it being an OpenSSL
> issue. Is there a different code path within libpq.so to move
> data from the server to the client via SSL for lo_extract than
> for lo_read that we can learn from? I'm looking at the code,
> but for the first time.

s/lo_extract/lo_export/

Must work less.

Kevin

Re: DBD::Pg 'lo_read' fails on >= 32768 byte large objects

From
Tom Lane
Date:
Kevin Houle <kjh@cert.org> writes:
>> One more data point. The DBD::Pg 'lo_extract' function works
>> fine across SSL. There is no issue with large objects >= 32K
>> using 'lo_extract'. So that casts doubt on it being an OpenSSL
>> issue.

No, it does not, because lo_export writes the data to a file on the
server side --- the data does not pass across the encrypted connection.
It still appears that there's an issue with moving a large chunk of
data across the link.

Whether this is OpenSSL's fault or libpq's is not clear.  I'm prepared
to believe that we are not using SSL_read correctly, but if so, what
is the correct way?  The OpenSSL docs are not very clear on how one
does asynchronous reads.

            regards, tom lane

Re: DBD::Pg 'lo_read' fails on >= 32768 byte large objects

From
Tom Lane
Date:
Kevin Houle <kjh@cert.org> writes:
> The nature of the bug is that an 'lo_read' operation performed
> with DBD::Pg caused a segfault with postgresql-libs-7.3.2 and
> "hangs" on files >= 32768 bytes with postgresql-libs-7.3.4. The
> hang is actually a read() loop on the socket generating EAGAIN
> error on each read().

I finally realized what's going on here.  7.3 branch CVS tip should fix
it.

            regards, tom lane

Re: DBD::Pg 'lo_read' fails on >= 32768 byte large objects

From
Kevin Houle
Date:
Tom Lane wrote:

> Kevin Houle <kjh@cert.org> writes:
>
>>The nature of the bug is that an 'lo_read' operation performed
>>with DBD::Pg caused a segfault with postgresql-libs-7.3.2 and
>>"hangs" on files >= 32768 bytes with postgresql-libs-7.3.4. The
>>hang is actually a read() loop on the socket generating EAGAIN
>>error on each read().
>
> I finally realized what's going on here.  7.3 branch CVS tip should fix
> it.

Good news. I'll test as soon as I'm able and report back
only if I still see an issue. No news is good news.

Thanks for the effort, Tom.

Kevin