Thread: pg_dump / copy bugs with "big lines" ?

pg_dump / copy bugs with "big lines" ?

From
Ronan Dunklau
Date:
Hello hackers,

I've tried my luck on pgsql-bugs before, with no success, so I'm reporting these
problems here.

The documentation mentions the following limits for sizes:

Maximum Field Size      1 GB
Maximum Row Size        1.6 TB

However, it seems like rows bigger than 1GB can't be COPYed out:

ro=# create table test_text (c1 text, c2 text);
CREATE TABLE
ro=# insert into test_text (c1) VALUES (repeat('a', 536870912));
INSERT 0 1
ro=# update test_text set c2 = c1;
UPDATE 1

Then, trying to dump or copy that results in the following error:

ro=# COPY test_text TO '/tmp/test';
ERROR:  out of memory
DÉTAIL : Cannot enlarge string buffer containing 536870913 bytes by 536870912
more bytes.

In fact, the same thing happens when using a simple SELECT:

ro=# select * from test_text ;
ERROR:  out of memory
DÉTAIL : Cannot enlarge string buffer containing 536870922 bytes by 536870912
more bytes.

In the case of COPY, the server uses a StringInfo to output the row. The
problem is that a StringInfo is capped at MaxAllocSize (1GB - 1), but a row
should be able to hold much more than that.
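
For reference, the rejection comes from the buffer-growth check in
enlargeStringInfo(); the following is a simplified paraphrase of that check
in stringinfo.c, not the verbatim source:

    /* simplified paraphrase of enlargeStringInfo(), stringinfo.c -- not verbatim */
    if ((Size) needed >= MaxAllocSize - (Size) str->len)
        ereport(ERROR,
                (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                 errmsg("out of memory"),
                 errdetail("Cannot enlarge string buffer containing %d bytes by %d more bytes.",
                           str->len, needed)));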

So, is this a bug? Or is there a caveat I may have missed in the
documentation?

We also hit a second issue, this time related to bytea encoding.

This test case is a bit more complicated, since I had to use an external
(client) program to insert my data. It involves inserting a string that fits
into 1GB when encoded in escape format, but is larger than that in hex, and
another string which fits into 1GB using the hex format, but is larger than that
in escape:

from psycopg2 import connect
from io import BytesIO

conn = connect(dbname="ro")
cur = conn.cursor()
fullcontent = BytesIO()
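# (assumes the target table already exists, e.g.: create table test_bytea (c1 bytea))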

# Write a binary string that weighs less
# than 1 GB when escape encoded, but more than
# that if hex encoded
for i in range(200):
    fullcontent.write(b"aaa" * 1000000)
fullcontent.seek(0)
cur.copy_from(fullcontent, "test_bytea")

fullcontent.seek(0)
fullcontent.truncate()

# Write another binary string that weighs
# less than 1GB when hex encoded, but more than
# that if escape encoded
cur.execute("SET bytea_output = 'hex'")
fullcontent.write(b"\\\\x")
for i in range(300):
    fullcontent.write(b"00" * 1000000)
fullcontent.seek(0)
cur.copy_from(fullcontent, "test_bytea")

cur.execute("COMMIT;")
cur.close()

I couldn't find an invocation of pg_dump which would allow me to dump both
lines:

ro@ronan_laptop /tmp % PGOPTIONS="-c bytea_output=escape" pg_dump  -Fc  >
/dev/null
pg_dump: Dumping the contents of table "test_bytea" failed: PQgetResult()
failed.
pg_dump: Error message from server: ERROR:  invalid memory alloc request size
1200000001
pg_dump: The command was: COPY public.test_bytea (c1) TO stdout;
ro@ronan_laptop /tmp % PGOPTIONS="-c bytea_output=hex" pg_dump  -Fc  >
/dev/null
pg_dump: Dumping the contents of table "test_bytea" failed: PQgetResult()
failed.
pg_dump: Error message from server: ERROR:  invalid memory alloc request size
1200000003
pg_dump: The command was: COPY public.test_bytea (c1) TO stdout;


Using a COPY with binary format works:

ro=# COPY test_bytea TO '/tmp/test' WITH BINARY;


There seems to be a third issue, with regards to escape encoding: the
backslash character is itself escaped, by adding another backslash. This means
that a field whose escape-encoded size is just under 1GB can no longer be
output once the backslashes are doubled.

For example, let's consider a string consisting of 300000000 '\' characters:


ro=# select length(c1) from test_bytea;
  length
-----------
 300000000
(1 ligne)

ro=# select length(encode(c1, 'escape')) from test_bytea;
  length
-----------
 600000000
(1 ligne)

ro=# set bytea_output to escape;
SET
ro=# copy test_bytea to '/tmp/test.csv' ;

ERROR:  out of memory
DÉTAIL : Cannot enlarge string buffer containing 1073741822 bytes by 1 more
bytes.

I think pg_dump should not error out on any data that was valid upon
insertion. It seems the fix would be non-trivial, since StringInfo structures
rely on a limit of MaxAllocSize. Or am I missing something?

Thank you.

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org

Re: pg_dump / copy bugs with "big lines" ?

From
Jim Nasby
Date:
On 3/30/15 5:46 AM, Ronan Dunklau wrote:
> Hello hackers,
>
> I've tried my luck on pgsql-bugs before, with no success, so I report these
> problem here.
>
> The documentation mentions the following limits for sizes:
>
> Maximum Field Size      1 GB
> Maximum Row Size        1.6 TB
>
> However, it seems like rows bigger than 1GB can't be COPYed out:
>
> ro=# create table test_text (c1 text, c2 text);
> CREATE TABLE
> ro=# insert into test_text (c1) VALUES (repeat('a', 536870912));
> INSERT 0 1
> ro=# update test_text set c2 = c1;
> UPDATE 1
>
> Then, trying to dump or copy that results in the following error:
>
> ro=# COPY test_text TO '/tmp/test';
> ERROR:  out of memory
> DÉTAIL : Cannot enlarge string buffer containing 536870913 bytes by 536870912
> more bytes.
>
> In fact, the same thing happens when using a simple SELECT:
>
> ro=# select * from test_text ;
> ERROR:  out of memory
> DÉTAIL : Cannot enlarge string buffer containing 536870922 bytes by 536870912
> more bytes.
>
> In the case of COPY, the server uses a StringInfo to output the row. The
> problem is, a StringInfo is capped to MAX_ALLOC_SIZE (1GB - 1), but a row
> should be able to hold much more than that.

Yeah, shoving a whole row into one StringInfo is ultimately going to 
limit a row to 1G, which is a far cry from what the docs claim. There's 
also going to be problems with FE/BE communications, because things like 
pq_sendbyte all use StringInfo as a buffer too. So while Postgres can 
store a 1.6TB row, you're going to find a bunch of stuff that doesn't 
work past around 1GB.

> So, is this a bug ? Or is there a caveat I would have missed in the
> documentation ?

I suppose that really depends on your point of view. The real question 
is whether we think it's worth fixing, or a good idea to change the 
behavior of StringInfo.

StringInfo uses ints to store length, so it could possibly be changed, 
but then you'd just error out due to MaxAllocSize.

Now perhaps those could both be relaxed, but certainly not to the extent 
that you can shove an entire 1.6TB row into an output buffer.

The other issue is that there's a LOT of places in code that blindly 
copy detoasted data around, so while we technically support 1GB toasted 
values you're probably going to be quite unhappy with performance. I'm 
actually surprised you haven't already seen this with 500MB objects.

So long story short, I'm not sure how worthwhile it would be to try and 
fix this. We probably should improve the docs though.

Have you looked at using large objects for what you're doing? (Note that 
those have their own set of challenges and limitations.)

> We also hit a second issue, this time related to bytea encoding.

There's probably several other places this type of thing could be a 
problem. I'm thinking of conversions in particular.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: pg_dump / copy bugs with "big lines" ?

From
Ronan Dunklau
Date:
On Monday, March 30, 2015 18:45:41, Jim Nasby wrote:
> On 3/30/15 5:46 AM, Ronan Dunklau wrote:
> > Hello hackers,
> >
> > I've tried my luck on pgsql-bugs before, with no success, so I report
> > these
> > problem here.
> >
> > The documentation mentions the following limits for sizes:
> >
> > Maximum Field Size      1 GB
> > Maximum Row Size        1.6 TB
> >
> > However, it seems like rows bigger than 1GB can't be COPYed out:
> >
> > ro=# create table test_text (c1 text, c2 text);
> > CREATE TABLE
> > ro=# insert into test_text (c1) VALUES (repeat('a', 536870912));
> > INSERT 0 1
> > ro=# update test_text set c2 = c1;
> > UPDATE 1
> >
> > Then, trying to dump or copy that results in the following error:
> >
> > ro=# COPY test_text TO '/tmp/test';
> > ERROR:  out of memory
> > DÉTAIL : Cannot enlarge string buffer containing 536870913 bytes by
> > 536870912 more bytes.
> >
> > In fact, the same thing happens when using a simple SELECT:
> >
> > ro=# select * from test_text ;
> > ERROR:  out of memory
> > DÉTAIL : Cannot enlarge string buffer containing 536870922 bytes by
> > 536870912 more bytes.
> >
> > In the case of COPY, the server uses a StringInfo to output the row. The
> > problem is, a StringInfo is capped to MAX_ALLOC_SIZE (1GB - 1), but a row
> > should be able to hold much more than that.
>
> Yeah, shoving a whole row into one StringInfo is ultimately going to
> limit a row to 1G, which is a far cry from what the docs claim. There's
> also going to be problems with FE/BE communications, because things like
> pq_sendbyte all use StringInfo as a buffer too. So while Postgres can
> store a 1.6TB row, you're going to find a bunch of stuff that doesn't
> work past around 1GB.
>
> > So, is this a bug ? Or is there a caveat I would have missed in the
> > documentation ?
>
> I suppose that really depends on your point of view. The real question
> is whether we think it's worth fixing, or a good idea to change the
> behavior of StringInfo.
>

> StringInfo uses int's to store length, so it could possibly be changed,
> but then you'd just error out due to MaxAllocSize.
>
> Now perhaps those could both be relaxed, but certainly not to the extent
> that you can shove an entire 1.6TB row into an output buffer.

Another way to look at it would be to work in small chunks. For the first test
case (rows bigger than 1GB), maybe the copy command could be rewritten to work
in chunks, flushing the output more often if needed.

For the conversion-related issues, I don't really see any other solution than
extending StringInfo to allow for more than 1GB of data. On the other hand,
those can easily be circumvented by using COPY ... WITH BINARY.

>
> The other issue is that there's a LOT of places in code that blindly
> copy detoasted data around, so while we technically support 1GB toasted
> values you're probably going to be quite unhappy with performance. I'm
> actually surprised you haven't already seen this with 500MB objects.
>
> So long story short, I'm not sure how worthwhile it would be to try and
> fix this. We probably should improve the docs though.
>

I think that having data that can't be output by pg_dump is quite surprising,
and if this is not fixable, I agree that it should clearly be documented.

> Have you looked at using large objects for what you're doing? (Note that
> those have their own set of challenges and limitations.)

Yes, I have. This particular customer of ours did not mind the performance
penalty of using bytea objects as long as it was convenient to use.

>
> > We also hit a second issue, this time related to bytea encoding.
>
> There's probably several other places this type of thing could be a
> problem. I'm thinking of conversions in particular.

Yes, that's what the two other test cases I mentioned are about: any conversion
leading to a size greater than 1GB results in an error, even implicit
conversions like doubling backslashes in the output.

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org

Re: pg_dump / copy bugs with "big lines" ?

From
Jim Nasby
Date:
On 3/31/15 3:46 AM, Ronan Dunklau wrote:
>> >StringInfo uses int's to store length, so it could possibly be changed,
>> >but then you'd just error out due to MaxAllocSize.
>> >
>> >Now perhaps those could both be relaxed, but certainly not to the extent
>> >that you can shove an entire 1.6TB row into an output buffer.
> Another way to look at it would be to work in small chunks. For the first test
> case (rows bigger than 1GB), maybe the copy command could be rewritten to work
> in chunks, flushing the output more often if needed.

Possibly; I'm not sure how well the FE/BE protocol or code would 
actually support that.

>> >The other issue is that there's a LOT of places in code that blindly
>> >copy detoasted data around, so while we technically support 1GB toasted
>> >values you're probably going to be quite unhappy with performance. I'm
>> >actually surprised you haven't already seen this with 500MB objects.
>> >
>> >So long story short, I'm not sure how worthwhile it would be to try and
>> >fix this. We probably should improve the docs though.
>> >
> I think that having data that can't be output by pg_dump is quite surprising,
> and if this is not fixable, I agree that it should clearly be documented.
>
>> >Have you looked at using large objects for what you're doing? (Note that
>> >those have their own set of challenges and limitations.)
> Yes I do. This particular customer of ours did not mind the performance
> penalty of using bytea objects as long as it was convenient to use.

What do they do when they hit 1GB? Presumably if they're this close to 
the limit they're already hitting 1GB, no? Or is this mostly hypothetical?

>> >
>>> > >We also hit a second issue, this time related to bytea encoding.
>> >
>> >There's probably several other places this type of thing could be a
>> >problem. I'm thinking of conversions in particular.
> Yes, thats what the two other test cases I mentioned are about: any conversion
> leadng to a size greater than 1GB results in an error, even implicit
> conversions like doubling antislashes in the output.

I think the big issue with encoding is going to be the risk of changing 
encoding and ending up with something too large to fit back into 
storage. They might need to consider using something like bytea(990MB).

In any case, I don't think it would be terribly difficult to allow a bit 
more than 1GB in a StringInfo. Might need to tweak palloc too; ISTR 
there's some 1GB limits there too.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: pg_dump / copy bugs with "big lines" ?

From
Robert Haas
Date:
On Mon, Apr 6, 2015 at 1:51 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> In any case, I don't think it would be terribly difficult to allow a bit
> more than 1GB in a StringInfo. Might need to tweak palloc too; ISTR there's
> some 1GB limits there too.

The point is, those limits are there on purpose.  Changing things
arbitrarily wouldn't be hard, but doing it in a principled way is
likely to require some thought.  For example, in the COPY OUT case,
presumably what's happening is that we palloc a chunk for each
individual datum, and then palloc a buffer for the whole row.  Now, we
could let the whole-row buffer be bigger, but maybe it would be better
not to copy all of the (possibly very large) values for the individual
columns over into a row buffer before sending it.  Some refactoring
that avoids the need for a potentially massive (1.6TB?) whole-row
buffer would be better than just deciding to allow it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pg_dump / copy bugs with "big lines" ?

From
Michael Paquier
Date:
On Wed, Apr 8, 2015 at 11:53 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Apr 6, 2015 at 1:51 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
>> In any case, I don't think it would be terribly difficult to allow a bit
>> more than 1GB in a StringInfo. Might need to tweak palloc too; ISTR there's
>> some 1GB limits there too.
>
> The point is, those limits are there on purpose.  Changing things
> arbitrarily wouldn't be hard, but doing it in a principled way is
> likely to require some thought.  For example, in the COPY OUT case,
> presumably what's happening is that we palloc a chunk for each
> individual datum, and then palloc a buffer for the whole row.  Now, we
> could let the whole-row buffer be bigger, but maybe it would be better
> not to copy all of the (possibly very large) values for the individual
> columns over into a row buffer before sending it.  Some refactoring
> that avoids the need for a potentially massive (1.6TB?) whole-row
> buffer would be better than just deciding to allow it.

I think that something to be aware of is that this is as well going to
require some rethinking of the existing libpq functions that are here
to fetch a row during COPY with PQgetCopyData, to make them able to
fetch chunks of data from one row.
-- 
Michael



Re: pg_dump / copy bugs with "big lines" ?

From
Jim Nasby
Date:
On 4/7/15 10:29 PM, Michael Paquier wrote:
> On Wed, Apr 8, 2015 at 11:53 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Mon, Apr 6, 2015 at 1:51 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
>>> In any case, I don't think it would be terribly difficult to allow a bit
>>> more than 1GB in a StringInfo. Might need to tweak palloc too; ISTR there's
>>> some 1GB limits there too.
>>
>> The point is, those limits are there on purpose.  Changing things
>> arbitrarily wouldn't be hard, but doing it in a principled way is
>> likely to require some thought.  For example, in the COPY OUT case,
>> presumably what's happening is that we palloc a chunk for each
>> individual datum, and then palloc a buffer for the whole row.  Now, we
>> could let the whole-row buffer be bigger, but maybe it would be better
>> not to copy all of the (possibly very large) values for the individual
>> columns over into a row buffer before sending it.  Some refactoring
>> that avoids the need for a potentially massive (1.6TB?) whole-row
>> buffer would be better than just deciding to allow it.
>
> I think that something to be aware of is that this is as well going to
> require some rethinking of the existing libpq functions that are here
> to fetch a row during COPY with PQgetCopyData, to make them able to
> fetch chunks of data from one row.

The discussion about upping the StringInfo limit was for cases where a 
change in encoding blows up because it's now larger. My impression was 
that these cases don't expand by a lot, so we wouldn't be significantly 
expanding StringInfo.

I agree that buffering 1.6TB of data would be patently absurd. Handling 
the case of COPYing a row that's >1GB clearly needs more work than just 
bumping up some size limits. That's why I was wondering whether this was 
a real scenario or just hypothetical... I'd be surprised if someone 
would be happy with the performance of 1GB tuples, let alone even larger 
than that.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: pg_dump / copy bugs with "big lines" ?

From
Alvaro Herrera
Date:
A customer of ours was hit by this recently and I'd like to get it fixed
for good.

Robert Haas wrote:
> On Mon, Apr 6, 2015 at 1:51 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> > In any case, I don't think it would be terribly difficult to allow a bit
> > more than 1GB in a StringInfo. Might need to tweak palloc too; ISTR there's
> > some 1GB limits there too.
>
> The point is, those limits are there on purpose.  Changing things
> arbitrarily wouldn't be hard, but doing it in a principled way is
> likely to require some thought.  For example, in the COPY OUT case,
> presumably what's happening is that we palloc a chunk for each
> individual datum, and then palloc a buffer for the whole row.

Right.  There's a serious problem here already, and users are being
bitten by it in existing releases.  I think we should provide a
non-invasive fix for back-branches which we could apply as a bug fix.
Here's a proposed patch for the localized fix; it just adds a flag to
StringInfo allowing the string to grow beyond the 1GB limit, but only
for strings which are specifically marked.  That way we avoid having to
audit all the StringInfo-using code.  In this patch, only the COPY path
is allowed to grow beyond 1GB, which is enough to close the current
problem -- or at least my test case for it.

If others can try this patch to ensure it enables pg_dump to work on
their databases, it would be great.

(In this patch there's a known buglet which is that the UINT64_FORMAT
patched in the error message doesn't allow for translatability.)

Like Robert, I don't like this approach for the long term, however.  I
think it would be saner to have CopySendData not keep a single gigantic
string in memory; it would be better to get the bytes out to the client
sooner than end-of-row.  To avoid causing a performance hit, we would
only flush when the size of the output buffer is about to reach the
allocation limit (MaxAllocSize); so for regular-sized tuples, we would
only do it at end-of-row, keeping the current behavior.  I don't have a
patch for this yet; I'm going to try that next.
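
For illustration only, a rough sketch of that idea in CopySendData
(hypothetical code, not part of the attached patch; CopyFlushOutput is an
invented name here):

    /* Hypothetical sketch of the idea -- not actual copy.c code. */
    static void
    CopySendData(CopyState cstate, const void *databuf, int datasize)
    {
        /*
         * If appending would bring fe_msgbuf close to the 1GB allocation
         * limit, push what we have to the client/file first.  For
         * regular-sized tuples this never triggers, so flushing still
         * happens only at end-of-row.
         */
        if ((Size) cstate->fe_msgbuf->len + (Size) datasize >= MaxAllocSize - 1)
            CopyFlushOutput(cstate);    /* invented helper: emit current buffer contents */

        appendBinaryStringInfo(cstate->fe_msgbuf, databuf, datasize);
    }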

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: pg_dump / copy bugs with "big lines" ?

From
"Daniel Verite"
Date:
Alvaro Herrera wrote:

> If others can try this patch to ensure it enables pg_dump to work on
> their databases, it would be great.

It doesn't seem to help if one field exceeds 1GB, for instance when
inflated by a bin->hex translation.

postgres=# create table big as
  select pg_read_binary_file('data') as binarycol;

postgres=# select octet_length(binarycol) from big;
 octet_length
--------------
   1073700000

postgres=# copy big to '/var/tmp/big.copy';
ERROR:    XX000: invalid memory alloc request size 2147400003
LOCATION:  palloc, mcxt.c:903

Same problem with pg_dump.

OTOH, it improves the case where the cumulative size of field contents
for a row exceeds 1GB, but no single field exceeds that size.

If splitting the table into 3 fields, each smaller than 512MB:

postgres=# create table big2 as select
  substring(binarycol from 1 for 300*1024*1024) as b1,
  substring(binarycol from 1+300*1024*1024 for 300*1024*1024) as b2,
  substring(binarycol from 1+600*1024*1024 for 400*1024*1024) as b3
from big;

postgres=# copy big2 to '/var/tmp/big.copy';
COPY 1

then that works, producing a single line of 2097152012 chars
in the output file.

By contrast, it fails with an unpatched 9.5:

postgres=# copy big2 to '/var/tmp/big.copy';
ERROR:    54000: out of memory
DETAIL:  Cannot enlarge string buffer containing 629145605 bytes by 629145602
more bytes.
LOCATION:  enlargeStringInfo, stringinfo.c:260

If setting bytea_output to 'escape', it also fails with the patch applied,
as it tries to allocate 4x the binary field size, and it exceeds 1GB again.

postgres=# set bytea_output =escape;
SET
postgres=# copy big2 to '/var/tmp/big.copy';
ERROR:    invalid memory alloc request size 1258291201
LOCATION:  palloc, mcxt.c:821

1258291201 = 300*1024*1024*4+1

Also, the COPY of both tables work fine if using (FORMAT BINARY),
on both the patched and unpatched server.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: pg_dump / copy bugs with "big lines" ?

From
Tom Lane
Date:
"Daniel Verite" <daniel@manitou-mail.org> writes:
>     Alvaro Herrera wrote:
>> If others can try this patch to ensure it enables pg_dump to work on
>> their databases, it would be great.

> It doesn't seem to help if one field exceeds 1Gb, for instance when
> inflated by a bin->hex translation.

It's not going to be possible to fix that without enormously invasive
changes (affecting individual datatype I/O functions, for example).
And in general, fields approaching that size are going to give you
problems in all kinds of ways, not only COPY.

I think we should be satisfied if we can make COPY deal with the sum
of a line's fields exceeding 1GB.
        regards, tom lane



Re: pg_dump / copy bugs with "big lines" ?

From
"Daniel Verite"
Date:
I wrote:

> If splitting the table into 3 fields, each smaller than 512MB:
>
> postgres=# create table big2 as select
>  substring(binarycol from 1 for 300*1024*1024) as b1,
>  substring(binarycol from 1+300*1024*1024 for 300*1024*1024) as b2 ,
>  substring(binarycol from 1+600*1024*1024 for 400*1024*1024) as b3
> from big;

I've tried adding another large field to see what happens if the whole row
exceeds 2GB, and data goes to the client rather than to a file.

postgres=# alter table big2 add b4 bytea;
postgres=# update big2 set b4=b1;

So big2 has 4 bytea columns with 300+300+400+300 MB on a single row.

Then I'm trying to \copy this from a 9.5.1 backend with patch applied,
configured with --enable-debug, on Debian8 x86-64 with 64GB of RAM
and all defaults in postgresql.conf

My idea was to check if the client side was OK with that much data on
a single COPY row, but it turns out that the server is not OK anyway.

postgres=# \copy big2 to /dev/null
lost synchronization with server: got message type "d", length -1568669676

The backend aborts with the following backtrace:

Program terminated with signal 6, Aborted.
#0  0x00007f82ee68e165 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f82ee6913e0 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f82ee6c839b in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007f82ee6d1be6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#4  0x00007f82ee6d698c in free () from /lib/x86_64-linux-gnu/libc.so.6
#5  0x00000000007b5a89 in AllocSetDelete (context=0xffffffffffffffff) at aset.c:643
#6  0x00000000007b63e3 in MemoryContextDelete (context=0x1fa58c8) at mcxt.c:229
#7  0x000000000055fa25 in CopyTo (cstate=0x1fb1050) at copy.c:1967
#8  DoCopyTo (cstate=cstate@entry=0x1fb1050) at copy.c:1778
#9  0x0000000000562ea9 in DoCopy (stmt=stmt@entry=0x1f796d0,
    queryString=queryString@entry=0x1f78c60 "COPY  big2 TO STDOUT",
    processed=processed@entry=0x7fff22103338) at copy.c:961
#10 0x00000000006b4440 in standard_ProcessUtility (parsetree=0x1f796d0,
    queryString=0x1f78c60 "COPY  big2 TO STDOUT ", context=<optimized out>,
    params=0x0, dest=0x1f79a30, completionTag=0x7fff22103680 "") at utility.c:541
#11 0x00000000006b1937 in PortalRunUtility (portal=0x1f26680,
    utilityStmt=0x1f796d0, isTopLevel=1 '\001', dest=0x1f79a30,
    completionTag=0x7fff22103680 "") at pquery.c:1183
#12 0x00000000006b2535 in PortalRunMulti (portal=portal@entry=0x1f26680,
    isTopLevel=isTopLevel@entry=1 '\001', dest=dest@entry=0x1f79a30,
    altdest=altdest@entry=0x1f79a30,
    completionTag=completionTag@entry=0x7fff22103680 "") at pquery.c:1314
#13 0x00000000006b3022 in PortalRun (portal=portal@entry=0x1f26680,
    count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=1 '\001',
    dest=dest@entry=0x1f79a30, altdest=altdest@entry=0x1f79a30,
    completionTag=completionTag@entry=0x7fff22103680 "") at pquery.c:812
#14 0x00000000006b0393 in exec_simple_query (
    query_string=0x1f78c60 "COPY  big2 TO STDOUT ") at postgres.c:1104
#15 PostgresMain (argc=<optimized out>, argv=argv@entry=0x1f0d240,
    dbname=0x1f0d0f0 "postgres", username=<optimized out>) at postgres.c:4030
#16 0x000000000047072b in BackendRun (port=0x1f2d230) at postmaster.c:4239
#17 BackendStartup (port=0x1f2d230) at postmaster.c:3913
#18 ServerLoop () at postmaster.c:1684
#19 0x000000000065b8cd in PostmasterMain (argc=argc@entry=3,
    argv=argv@entry=0x1f0c330) at postmaster.c:1292
#20 0x0000000000471161 in main (argc=3, argv=0x1f0c330) at main.c:223



The server log has this:

STATEMENT:  COPY  big2 TO STDOUT
*** glibc detected *** postgres: postgres postgres [local] COPY: double free
or corruption (out): 0x00007f8234929010 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x75be6)[0x7f82ee6d1be6]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x6c)[0x7f82ee6d698c]
postgres: postgres postgres [local] COPY[0x7b5a89]
postgres: postgres postgres [local] COPY(MemoryContextDelete+0x43)[0x7b63e3]
postgres: postgres postgres [local] COPY[0x55fa25]
postgres: postgres postgres [local] COPY(DoCopy+0x479)[0x562ea9]
postgres: postgres postgres [local]
COPY(standard_ProcessUtility+0x590)[0x6b4440]
postgres: postgres postgres [local] COPY[0x6b1937]
postgres: postgres postgres [local] COPY[0x6b2535]
postgres: postgres postgres [local] COPY(PortalRun+0x202)[0x6b3022]
postgres: postgres postgres [local] COPY(PostgresMain+0x1493)[0x6b0393]
postgres: postgres postgres [local] COPY[0x47072b]
postgres: postgres postgres [local] COPY(PostmasterMain+0xe7d)[0x65b8cd]
postgres: postgres postgres [local] COPY(main+0x3e1)[0x471161]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd)[0x7f82ee67aead]
postgres: postgres postgres [local] COPY[0x4711c9]


I will try other tests without bytea exported in text format but with
several huge text columns.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: pg_dump / copy bugs with "big lines" ?

From
Tom Lane
Date:
"Daniel Verite" <daniel@manitou-mail.org> writes:
> I've tried adding another large field to see what happens if the whole row
> exceeds 2GB, and data goes to the client rather than to a file.
> My idea was to check if the client side was OK with that much data on
> a single COPY row, but it turns out that the server is not OK anyway.

BTW, is anyone checking the other side of this, ie "COPY IN" with equally
wide rows?  There doesn't seem to be a lot of value in supporting dump
if you can't reload ...
        regards, tom lane



Re: pg_dump / copy bugs with "big lines" ?

From
"Daniel Verite"
Date:
    I wrote:

> postgres=# \copy big2 to /dev/null
> lost synchronization with server: got message type "d", length -1568669676
>
> The backend aborts with the following backtrace:
>
> Program terminated with signal 6, Aborted.
> #0  0x00007f82ee68e165 in raise () from /lib/x86_64-linux-gnu/libc.so.6
> #1  0x00007f82ee6913e0 in abort () from /lib/x86_64-linux-gnu/libc.so.6
> #2  0x00007f82ee6c839b in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #3  0x00007f82ee6d1be6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #4  0x00007f82ee6d698c in free () from /lib/x86_64-linux-gnu/libc.so.6
> #5  0x00000000007b5a89 in AllocSetDelete (context=0xffffffffffffffff)
>    at aset.c:643
> #6  0x00000000007b63e3 in MemoryContextDelete (context=0x1fa58c8) at
> mcxt.c:229
> #7  0x000000000055fa25 in CopyTo (cstate=0x1fb1050) at copy.c:1967

The cause of the crash turns out to be, in enlargeStringInfo():

needed += str->len + 1;     /* total space required now */

needed is an int and str->len is an int64, so it overflows when the
size has to grow beyond 2^31 bytes, fails to enlarge the buffer and
overwrites memory after it.

When fixing it with a local int64 copy of the variable, the backend
no longer crashes and COPY big2 TO 'file' appears to work.
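
A standalone illustration of the overflow and of the widened-variable fix
(this is demo code, not the backend source; the sizes are just illustrative):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        int64_t len    = 1200000000;  /* str->len once widened to int64 by the patch */
        int     needed = 1200000000;  /* extra bytes requested, still a plain int */

        needed += len + 1;            /* same expression as in enlargeStringInfo() */
        printf("int result   : %d\n", needed);   /* typically wraps to a negative value */

        int64_t needed64 = (int64_t) 1200000000 + len + 1;   /* widen first: correct total */
        printf("int64 result : %lld\n", (long long) needed64);
        return 0;
    }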

However, getting it to the client with \copy big2 to 'file'
still produces the error in psql:

  lost synchronization with server: got message type "d"

and leaves an empty file, so there are more problems to solve to
go beyond 2GB text per row.

Or maybe another approach would be to advertise that this is the maximum
for a row in text mode, and limit the backend's string buffer to this size
for the time being? Notwithstanding that there's still the other direction
client->server to test.


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: pg_dump / copy bugs with "big lines" ?

From
Tomas Vondra
Date:
Hi,

On 03/02/2016 03:18 PM, Daniel Verite wrote:
>     I wrote:
>
>> postgres=# \copy big2 to /dev/null
>> lost synchronization with server: got message type "d", length -1568669676
>>
>> The backend aborts with the following backtrace:
>>
>> Program terminated with signal 6, Aborted.
>> #0  0x00007f82ee68e165 in raise () from /lib/x86_64-linux-gnu/libc.so.6
>> #1  0x00007f82ee6913e0 in abort () from /lib/x86_64-linux-gnu/libc.so.6
>> #2  0x00007f82ee6c839b in ?? () from /lib/x86_64-linux-gnu/libc.so.6
>> #3  0x00007f82ee6d1be6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
>> #4  0x00007f82ee6d698c in free () from /lib/x86_64-linux-gnu/libc.so.6
>> #5  0x00000000007b5a89 in AllocSetDelete (context=0xffffffffffffffff)
>>     at aset.c:643
>> #6  0x00000000007b63e3 in MemoryContextDelete (context=0x1fa58c8) at
>> mcxt.c:229
>> #7  0x000000000055fa25 in CopyTo (cstate=0x1fb1050) at copy.c:1967
>
> The cause of the crash turns out to be, in enlargeStringInfo():
>
> needed += str->len + 1;     /* total space required now */
>
> needed is an int and str->len is an int64, so it overflows when the
> size has to grow beyond 2^31 bytes, fails to enlarge the buffer and
> overwrites memory after it.
>
> When fixing it with a local int64 copy of the variable, the backend
> no longer crashes and COPY big2 TO 'file' appears to work.
>
> However, getting it to the client with \copy big2 to 'file'
> still produces the error in psql:
>     lost synchronization with server: got message type "d"
> and leaves an empty file, so there are more problems to solve to
> go beyond 2GB text per row.

My guess is this is a problem at the protocol level - the 'd' message is 
CopyData, and all the messages use int32 to define length. So if there's 
a 2GB row, it's likely to overflow.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_dump / copy bugs with "big lines" ?

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On 03/02/2016 03:18 PM, Daniel Verite wrote:
>> However, getting it to the client with \copy big2 to 'file'
>> still produces the error in psql:
>> lost synchronization with server: got message type "d"
>> and leaves an empty file, so there are more problems to solve to
>> go beyond 2GB text per row.

> My guess is this is a problem at the protocol level - the 'd' message is 
> CopyData, and all the messages use int32 to define length. So if there's 
> a 2GB row, it's likely to overflow.

I'm too lazy to check the exact wording, but I don't think there's a hard
and fast promise in the protocol doc that one CopyData message == one row.
So we could probably subdivide a very wide line into multiple messages.
        regards, tom lane



Re: pg_dump / copy bugs with "big lines" ?

From
Tomas Vondra
Date:

On 03/02/2016 04:23 PM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On 03/02/2016 03:18 PM, Daniel Verite wrote:
>>> However, getting it to the client with \copy big2 to 'file'
>>> still produces the error in psql:
>>> lost synchronization with server: got message type "d"
>>> and leaves an empty file, so there are more problems to solve to
>>> go beyond 2GB text per row.
>
>> My guess is this is a problem at the protocol level - the 'd' message is
>> CopyData, and all the messages use int32 to define length. So if there's
>> a 2GB row, it's likely to overflow.
>
> I'm too lazy to check the exact wording, but I don't think there's a hard
> and fast promise in the protocol doc that one CopyData message == one row.
> So we could probably subdivide a very wide line into multiple messages.

Well, actually we claim this [1]:

    Data that forms part of a COPY data stream. Messages sent from the
    backend will always correspond to single data rows, but messages
    sent by frontends might divide the data stream arbitrarily.

So that suggests 1:1 messages to rows, although I'm not sure how 
difficult it would be to relax this (or how much the clients might rely 
on this).

[1] http://www.postgresql.org/docs/9.5/static/protocol-message-formats.html

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_dump / copy bugs with "big lines" ?

From
Alvaro Herrera
Date:
Daniel Verite wrote:

> The cause of the crash turns out to be, in enlargeStringInfo():
> 
> needed += str->len + 1;     /* total space required now */
> 
> needed is an int and str->len is an int64, so it overflows when the
> size has to grow beyond 2^31 bytes, fails to enlarge the buffer and
> overwrites memory after it.
> 
> When fixing it with a local int64 copy of the variable, the backend
> no longer crashes and COPY big2 TO 'file' appears to work.

Great, thanks for debugging.

> However, getting it to the client with \copy big2 to 'file'
> still produces the error in psql:
>    lost synchronization with server: got message type "d"
> and leaves an empty file, so there are more problems to solve to
> go beyond 2GB text per row.

Well, the CopyData message has an Int32 field for the message length.
I don't know the FE/BE protocol very well but I suppose each row
corresponds to one CopyData message, or perhaps each column corresponds
to one CopyData message.  In either case, it's not possible to go beyond
2GB without changing the protocol ...

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_dump / copy bugs with "big lines" ?

From
"Daniel Verite"
Date:
Tomas Vondra wrote:

> My guess is this is a problem at the protocol level - the 'd' message is
> CopyData, and all the messages use int32 to define length. So if there's
> a 2GB row, it's likely to overflow.

Yes. Besides, the full message includes a negative length:

> postgres=# \copy big2 to /dev/null
> lost synchronization with server: got message type "d", length -1568669676

which happens to be the correct size if interpreted as an unsigned int32

-1568669676 = (int) (1300UL*1024*1024*2 + 3 + 3*4 + 1 + 4)
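
A standalone check of that arithmetic (demo code only):

    #include <stdio.h>
    #include <inttypes.h>

    int main(void)
    {
        uint32_t len = 1300UL * 1024 * 1024 * 2 + 3 + 3 * 4 + 1 + 4;
        printf("unsigned: %" PRIu32 ", as signed int32: %" PRId32 "\n",
               len, (int32_t) len);
        /* prints: unsigned: 2726297620, as signed int32: -1568669676 */
        return 0;
    }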

One interpretation would be that putting an unsigned length in
CopyData message is a protocol violation.

However it's not clear to me that Int32 in the doc necessarily designates
a signed integer.

Int32 is defined as:

  Intn(i)
    An n-bit integer in network byte order (most significant byte
    first). If i is specified it is the exact value that will appear,
    otherwise the value is variable. Eg. Int16, Int32(42).

There's at least one example where we use Int16 as unsigned:
the number of parameters in Bind (F) can be up to 65535.
This maximum is tested explicitly and referred to at several
places in fe-exec.

In some instances, Int32 is clearly signed, because -1 is accepted
to indicate NULLness, such as again in Bind (F) for the length of
the parameter value.

From this it seems to me that Intn is to be interpreted as
signed or unsigned on a case by case basis.

Back to CopyData (F & B), it's documented as:

  Byte1('d')
    Identifies the message as COPY data.
  Int32
    Length of message contents in bytes, including self.
  Byten
    Data that forms part of a COPY data stream. Messages sent from the
    backend will always correspond to single data rows, but messages
    sent by frontends might divide the data stream arbitrarily.

I don't see any hint that this length is signed, nor any reason for having
it signed.

I guess before the patch it didn't matter, for the B case at least,
because the backend never sent more than 1GB.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: pg_dump / copy bugs with "big lines" ?

From
Michael Paquier
Date:
On Thu, Mar 3, 2016 at 12:47 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Well, the CopyData message has an Int32 field for the message length.
> I don't know the FE/BE protocol very well but I suppose each row
> corresponds to one CopyData message, or perhaps each column corresponds
> to one CopyData message.  In either case, it's not possible to go beyond
> 2GB without changing the protocol ...

Based on what I know of this stuff (from the libpq OOM work and other
remnants), one 'd' message means one row. fe-protocol3.c and
CopySendEndOfRow in the backend's copy.c confirm that as well. I am
indeed afraid that having extra logic to get chunks of data will
require extending the protocol with a new message type for this
purpose.
-- 
Michael



Re: pg_dump / copy bugs with "big lines" ?

From
"Daniel Verite"
Date:
    Tom Lane wrote:

> BTW, is anyone checking the other side of this, ie "COPY IN" with equally
> wide rows?  There doesn't seem to be a lot of value in supporting dump
> if you can't reload ...

Indeed, the lines bigger than 1GB can't be reloaded :(

My test: with a stock 9.5.1 plus Alvaro's patch with my additional
int64 fix mentioned upthread, creating this table and filling
all columns with 200MB of text each:

create table bigtext(t1 text, t2 text, t3 text, t4 text,
                     t5 text, t6 text, t7 text);

# \copy bigtext to '/var/tmp/bigtext.sql'

That part does work as expected, producing this huge single line:
$ wc /var/tmp/bigtext.sql
         1          7 1468006407 /var/tmp/bigtext.sql

But reloading it fails:

# create table bigtext2 (like bigtext);
CREATE TABLE

# copy bigtext2 from '/var/tmp/bigtext.sql';
ERROR:    54000: out of memory
DETAIL:  Cannot enlarge string buffer containing 1073676288 bytes by 65536
more bytes.
CONTEXT:  COPY bigtext2, line 1
LOCATION:  enlargeStringInfo, stringinfo.c:278


# \copy bigtext2 from '/var/tmp/bigtext.sql'
ERROR:    54000: out of memory
DETAIL:  Cannot enlarge string buffer containing 1073741808 bytes by 8191
more bytes.
CONTEXT:  COPY bigtext2, line 1
LOCATION:  enlargeStringInfo, stringinfo.c:278


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: pg_dump / copy bugs with "big lines" ?

From
"Daniel Verite"
Date:
    Daniel Verite wrote:

> # \copy bigtext2 from '/var/tmp/bigtext.sql'
> ERROR:  54000: out of memory
> DETAIL:  Cannot enlarge string buffer containing 1073741808 bytes by 8191
> more bytes.
> CONTEXT:  COPY bigtext2, line 1
> LOCATION:  enlargeStringInfo, stringinfo.c:278

To go past that problem, I've tried tweaking the StringInfoData
used for COPY FROM, like the original patch does in CopyOneRowTo.

It turns out that it fails a bit later when trying to make a tuple
from the big line, in heap_form_tuple():
 tuple = (HeapTuple) palloc0(HEAPTUPLESIZE + len);

which fails because (HEAPTUPLESIZE + len) is again considered
an invalid size, the  size being 1468006476 in my test.

At this point it feels like a dead end, at least for the idea that extending
StringInfoData might suffice to enable COPYing such large rows.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: pg_dump / copy bugs with "big lines" ?

From
Alvaro Herrera
Date:
Daniel Verite wrote:

> To go past that problem, I've tried tweaking the StringInfoData
> used for COPY FROM, like the original patch does in CopyOneRowTo. 
> 
> It turns out that it fails a bit later when trying to make a tuple
> from the big line, in heap_form_tuple():
> 
>   tuple = (HeapTuple) palloc0(HEAPTUPLESIZE + len);
> 
> which fails because (HEAPTUPLESIZE + len) is again considered
> an invalid size, the  size being 1468006476 in my test.

Um, it seems reasonable to make this one be a huge-zero-alloc:

  MemoryContextAllocExtended(CurrentMemoryContext, HEAPTUPLESIZE + len,
                             MCXT_ALLOC_HUGE | MCXT_ALLOC_ZERO)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_dump / copy bugs with "big lines" ?

From
"Daniel Verite"
Date:
    Alvaro Herrera wrote:

> >   tuple = (HeapTuple) palloc0(HEAPTUPLESIZE + len);
> >
> > which fails because (HEAPTUPLESIZE + len) is again considered
> > an invalid size, the  size being 1468006476 in my test.
>
> Um, it seems reasonable to make this one be a huge-zero-alloc:
>
>     MemoryContextAllocExtended(CurrentMemoryContext,
>                  HEAPTUPLESIZE + len,
>        MCXT_ALLOC_HUGE | MCXT_ALLOC_ZERO)

Good, this allows the tests to go to completion! The tests in question
are dump/reload of a row with several fields totalling 1.4GB (deflated),
with COPY TO/FROM file and psql's \copy in both directions, as well as
pg_dump followed by pg_restore|psql.

The modified patch is attached.

It provides a useful mitigation to dump/reload databases having
rows in the 1GB-2GB range, but it works under these limitations:

- no single field has a text representation exceeding 1GB.
- no row as text exceeds 2GB (\copy from fails beyond that. AFAICS we
  could push this to 4GB with limited changes to libpq, by
  interpreting the Int32 field in the CopyData message as unsigned).

It's also possible to go beyond 4GB per row with this patch, but
when not using the protocol. I've managed to get a 5.6GB single-row
file with COPY TO file. That doesn't help with pg_dump, but that might
be useful in other situations.

In StringInfo, I've changed int64 to Size, because on 32-bit platforms
the downcast from int64 to Size is problematic, and as the rest of the
allocation routines seem to favor Size, it seems more consistent
anyway.

I couldn't test on 32 bits though, as I never seem to have enough
free contiguous memory available on a 32-bit VM to handle
that kind of data.


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachment

Re: pg_dump / copy bugs with "big lines" ?

From
Bruce Momjian
Date:
On Thu, Mar  3, 2016 at 10:31:26AM +0900, Michael Paquier wrote:
> On Thu, Mar 3, 2016 at 12:47 AM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
> > Well, the CopyData message has an Int32 field for the message length.
> > I don't know the FE/BE protocol very well but I suppose each row
> > corresponds to one CopyData message, or perhaps each column corresponds
> > to one CopyData message.  In either case, it's not possible to go beyond
> > 2GB without changing the protocol ...
> 
> Based on what I know from this stuff (OOM libpq and other stuff
> remnants), one 'd' message means one row. fe-protocol3.c and
> CopySendEndOfRow in backend's copy.c are confirming that as well. I am
> indeed afraid that having extra logic to get chunks of data will
> require extending the protocol with a new message type for this
> purpose.

Is there any documentation that needs updating based on this research?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +



Re: pg_dump / copy bugs with "big lines" ?

From
Michael Paquier
Date:
On Sat, Apr 23, 2016 at 9:58 AM, Bruce Momjian <bruce@momjian.us> wrote:
> On Thu, Mar  3, 2016 at 10:31:26AM +0900, Michael Paquier wrote:
>> On Thu, Mar 3, 2016 at 12:47 AM, Alvaro Herrera
>> <alvherre@2ndquadrant.com> wrote:
>> > Well, the CopyData message has an Int32 field for the message length.
>> > I don't know the FE/BE protocol very well but I suppose each row
>> > corresponds to one CopyData message, or perhaps each column corresponds
>> > to one CopyData message.  In either case, it's not possible to go beyond
>> > 2GB without changing the protocol ...
>>
>> Based on what I know from this stuff (OOM libpq and other stuff
>> remnants), one 'd' message means one row. fe-protocol3.c and
>> CopySendEndOfRow in backend's copy.c are confirming that as well. I am
>> indeed afraid that having extra logic to get chunks of data will
>> require extending the protocol with a new message type for this
>> purpose.
>
> Is there any documentation that needs updating based on this research?

Perhaps. In the docs, the two sections referring to the CopyData
messages are in protocol.sgml:
- this portion for the 'd' message itself:
    <term>
    CopyData (F & B)
    </term>
- and a more precise description here:
    <sect2 id="protocol-copy">
     <title>COPY Operations</title>
We could state in one of them that the maximum size of a CopyData
message can be up to 1GB. Thoughts?
-- 
Michael



Re: pg_dump / copy bugs with "big lines" ?

From
Craig Ringer
Date:
On 24 March 2016 at 01:14, Daniel Verite <daniel@manitou-mail.org> wrote:
 

It provides a useful mitigation to dump/reload databases having
rows in the 1GB-2GB range, but it works under these limitations:

- no single field has a text representation exceeding 1GB.
- no row as text exceeds 2GB (\copy from fails beyond that. AFAICS we
  could push this to 4GB with limited changes to libpq, by
  interpreting the Int32 field in the CopyData message as unsigned).

This seems like worthwhile mitigation for an issue multiple people have hit in the wild, and more will.

Giving Pg more generally graceful handling for big individual datums is going to be a bit of work, though. Support for wide-row, big-Datum COPY in and out. Efficient lazy fetching of large TOASTed data by follow-up client requests. Range fetching of large compressed TOASTed values (possibly at the price of worse compression) without having to decompress the whole thing up to the start of the desired range. Lots of fun.

At least we have lob / pg_largeobject.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: pg_dump / copy bugs with "big lines" ?

From
Tomas Vondra
Date:
Hi,

I've subscribed to review this patch in the current CF, so let me start 
with a brief summary of the thread as it started more than a year ago.

In general the thread is about hitting the 1GB allocation size limit 
(and 32-bit length in FE/BE protocol) in various ways:


1) attributes are smaller than 1GB, but row exceeds the limit

This causes issues in COPY/pg_dump, as we allocate a single StringInfo 
for the whole row, and send it as a single message (which includes int32 
length, just like most FE/BE messages).


2) attribute values that get over the 1GB limit due to encoding

For example it's possible to construct values that are OK in hex 
encoding but >1GB in escape (and vice versa). Those values get stored 
just fine, but there's no way to dump / COPY them. And if you happen to 
have both, you can't do pg_dump :-/


I think it's OK not to be able to store extremely large values, but only 
if we make it predictable (ideally by rejecting the value before storing 
it in the database). The cases discussed in the thread are particularly 
annoying exactly because we do the opposite - we allow the value to be 
stored, but then fail when retrieving the value or trying to backup the 
database (which is particularly nasty, IMNSHO).

So although we don't have that many reports about this, it'd be nice to 
improve the behavior a bit.


The trouble is, this rabbit hole is fairly deep - wherever we palloc or 
detoast a value, we're likely to hit those issues with wide values/rows.

Honestly, I don't think it's feasible to make all the code paths work 
with such wide values/rows - as TL points out, particularly with the 
wide values it would be enormously invasive.

So the patch aims to fix just the simplest case, i.e. when the values 
are smaller than 1GB but the total row is larger. It does so by allowing 
StringInfo to exceed the MaxAllocSize in special cases (currently only 
COPY FROM/TO), and using MCXT_ALLOC_HUGE when forming the heap tuple (to 
make it possible to load the data).

It seems to work and I do think it's a reasonable first step to make 
things work.

A few minor comments regarding the patch:

1) CopyStartSend seems pretty pointless - it only has one function call 
in it, and is called in exactly one place (and all other places simply 
call allowLongStringInfo directly). I'd get rid of this function and 
replace the call in CopyOneRowTo() with allowLongStringInfo().

2) allowlong seems awkward, allowLong or allow_long would be better

3) Why does resetStringInfo reset the allowLong flag? Are there any 
cases when we want/need to forget the flag value? I don't think so, so 
let's simply not reset it and get rid of the allowLongStringInfo() 
calls. Maybe it'd be better to invent a new makeLongStringInfo() method 
instead, which would make it clear that the flag value is permanent.

4) HandleParallelMessage needs a tweak, as it uses msg->len in a log 
message, but with '%d' and not '%ld' (as needed after changing the type 
to Size).


I however wonder whether it wouldn't be better to try Robert's proposal, 
i.e. not building the huge StringInfo for the whole row, but sending the 
attribute data directly. I don't mean to send messages for each 
attribute - the current FE/BE protocol does not allow that (it says that 
each 'd' message is a whole row), but just sending the network message 
in smaller chunks.

That would make the StringInfo changes unnecessary, reduce the memory 
consumption considerably (right now we need 2x the memory, and we know 
we're dealing with gigabytes of data).

Of course, it'd require significant changes to how copy works internally 
(instead of accumulating data for the whole row, we'd have to send it 
right away in smaller chunks). That would allow us to get rid of the 
StringInfo changes, but we'd not be able to backpatch this.


Regarding interpreting the Int32 length field in FE/BE protocol as 
unsigned, I'm a bit worried it might qualify as breaking the protocol. 
It's true we don't really say whether it's signed or unsigned, and we 
handle it differently depending on the message type, but I wonder how 
many libraries simply use int32. OTOH those clients are unlikely to 
handle even the 2GB we might send them without breaking the protocol, so 
I guess this is fine. And 4GB seems like the best we can do.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_dump / copy bugs with "big lines" ?

From
Tomas Vondra
Date:
On 09/03/2016 02:21 AM, Tomas Vondra wrote:
>
> A few minor comments regarding the patch:
>
> 1) CopyStartSend seems pretty pointless - It only has one function call
> in it, and is called on exactly one place (and all other places simply
> call allowLongStringInfo directly). I'd get rid of this function and
> replace the call in CopyOneRowTo(() with allowLongStringInfo().
>
> 2) allowlong seems awkward, allowLong or allow_long would be better
>
> 3) Why does resetStringInfo reset the allowLong flag? Are there any
> cases when we want/need to forget the flag value? I don't think so, so
> let's simply not reset it and get rid of the allowLongStringInfo()
> calls. Maybe it'd be better to invent a new makeLongStringInfo() method
> instead, which would make it clear that the flag value is permanent.
>
> 4) HandleParallelMessage needs a tweak, as it uses msg->len in a log
> message, but with '%d' and not '%ld' (as needed after changing the type
> to Size).
>

5) The comment at allowLongStringInfo talks about allocLongStringInfo 
(i.e. wrong function name).


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_dump / copy bugs with "big lines" ?

From
"Daniel Verite"
Date:
    Tomas Vondra wrote:

> 4) HandleParallelMessage needs a tweak, as it uses msg->len in a log
> message, but with '%d' and not '%ld' (as needed after changing the type
> to Size).

This could be %zu for the Size type. It's supported by elog().

However, it occurs to me that if we don't claim the 2GB..4GB range for
the CopyData message, because its Int32 length is not explicitly
unsigned as mentioned upthread, then it should follow that we don't
need to change StringInfo.{len,maxlen} from int type to Size type.

We should just set the upper limit for StringInfo.maxlen to
(0x7fffffff-1) instead of MaxAllocHugeSize for stringinfos with the
allow_long flag set, and leave the len and maxlen fields to their
original, int type.

Does that make sense?


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: pg_dump / copy bugs with "big lines" ?

From
"Daniel Verite"
Date:
    Tomas Vondra wrote:

> A few minor comments regarding the patch:
>
> 1) CopyStartSend seems pretty pointless - It only has one function call
> in it, and is called on exactly one place (and all other places simply
> call allowLongStringInfo directly). I'd get rid of this function and
> replace the call in CopyOneRowTo(() with allowLongStringInfo().
>
> 2) allowlong seems awkward, allowLong or allow_long would be better
>
> 3) Why does resetStringInfo reset the allowLong flag? Are there any
> cases when we want/need to forget the flag value? I don't think so, so
> let's simply not reset it and get rid of the allowLongStringInfo()
> calls. Maybe it'd be better to invent a new makeLongStringInfo() method
> instead, which would make it clear that the flag value is permanent.
>
> 4) HandleParallelMessage needs a tweak, as it uses msg->len in a log
> message, but with '%d' and not '%ld' (as needed after changing the type
> to Size).
>
> 5) The comment at allowLongStringInfo talks about allocLongStringInfo
> (i.e. wrong function name).

Here's an updated patch. Compared to the previous version:

- removed CopyStartSend (per comment #1 in review)

- renamed flag to allow_long (comment #2)

- resetStringInfo no longer resets the flag (comment #3).

- allowLongStringInfo() is removed (comment #3 and #5),
makeLongStringInfo() and initLongStringInfo() are provided
instead, as alternatives to makeStringInfo() and initStringInfo().

- StringInfoData.len is back to int type, 2GB max.
(addresses comment #4 incidentally).
This is safer because many routines peeking
into StringInfoData use int for offsets into the buffer,
for instance most of the stuff in backend/libpq/pqformat.c.
Although these routines are not supposed to be called on long
buffers, that assumption was not enforced in the code, and
with a 64-bit length effectively over 2GB, they would misbehave
pretty badly.
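
For reference, a minimal sketch of what the two new constructors could
amount to (the allow_long field and this particular implementation are
assumptions based on the discussion, not the attached patch itself):

    /* Sketch only: same as the existing constructors, but mark the buffer
     * as allowed to grow past MaxAllocSize. */
    StringInfo
    makeLongStringInfo(void)
    {
        StringInfo  res = makeStringInfo();

        res->allow_long = true;
        return res;
    }

    void
    initLongStringInfo(StringInfo str)
    {
        initStringInfo(str);
        str->allow_long = true;
    }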


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org

Attachment

Re: pg_dump / copy bugs with "big lines" ?

From
Michael Paquier
Date:
On Fri, Sep 30, 2016 at 10:12 PM, Daniel Verite <daniel@manitou-mail.org> wrote:
>         Tomas Vondra wrote:
>
>> A few minor comments regarding the patch:
>>
>> 1) CopyStartSend seems pretty pointless - It only has one function call
>> in it, and is called on exactly one place (and all other places simply
>> call allowLongStringInfo directly). I'd get rid of this function and
>> replace the call in CopyOneRowTo(() with allowLongStringInfo().
>>
>> 2) allowlong seems awkward, allowLong or allow_long would be better
>>
>> 3) Why does resetStringInfo reset the allowLong flag? Are there any
>> cases when we want/need to forget the flag value? I don't think so, so
>> let's simply not reset it and get rid of the allowLongStringInfo()
>> calls. Maybe it'd be better to invent a new makeLongStringInfo() method
>> instead, which would make it clear that the flag value is permanent.
>>
>> 4) HandleParallelMessage needs a tweak, as it uses msg->len in a log
>> message, but with '%d' and not '%ld' (as needed after changing the type
>> to Size).
>>
>> 5) The comment at allowLongStringInfo talks about allocLongStringInfo
>> (i.e. wrong function name).
>
> Here's an updated patch. Compared to the previous version:
>
> - removed CopyStartSend (per comment #1 in review)
>
> - renamed flag to allow_long (comment #2)
>
> - resetStringInfo no longer resets the flag (comment #3).
>
> - allowLongStringInfo() is removed (comment #3 and #5),
> makeLongStringInfo() and initLongStringInfo() are provided
> instead, as alternatives to makeStringInfo() and initStringInfo().
>
> - StringInfoData.len is back to int type, 2GB max.
> (addresses comment #4 incidentally).
> This is safer because  many routines peeking
> into StringInfoData use int for offsets into the buffer,
> for instance most of the stuff in backend/libpq/pqformat.c
>> Although these routines are not supposed to be called on long
> buffers, this assertion was not enforced in the code, and
> with a 64-bit length effectively over 2GB, they would misbehave
> pretty badly.

The patch status is "Waiting on Author" in the CF app, but a new patch
was sent 2 days ago, so this entry has been moved to the next CF with
"Needs Review".
-- 
Michael



Re: pg_dump / copy bugs with "big lines" ?

From
Haribabu Kommi
Date:
Hi Tomas and Gerdan,

This is a gentle reminder.

You are both assigned as reviewers for this patch in the 11-2016 commitfest,
but you haven't shared any reviews yet.  Could you please share your views
on the patch?  This will help the commitfest run more smoothly.

Please ignore this if you have already shared your review.

Regards,
Hari Babu
Fujitsu Australia

Re: pg_dump / copy bugs with "big lines" ?

From
Alvaro Herrera
Date:
Daniel Verite wrote:

> Here's an updated patch. Compared to the previous version:
> 
> - removed CopyStartSend (per comment #1 in review)
> 
> - renamed flag to allow_long (comment #2)
> 
> - resetStringInfo no longer resets the flag (comment #3).
> 
> - allowLongStringInfo() is removed (comment #3 and #5),
> makeLongStringInfo() and initLongStringInfo() are provided
> instead, as alternatives to makeStringInfo() and initStringInfo().
> 
> - StringInfoData.len is back to int type, 2GB max.
> (addresses comment #4 incidentally).
> This is safer because  many routines peeking
> into StringInfoData use int for offsets into the buffer,
> for instance most of the stuff in backend/libpq/pqformat.c
> Although these routines are not supposed to be called on long
> buffers, this assertion was not enforced in the code, and
> with a 64-bit length effectively over 2GB, they would misbehave
> pretty badly.

Hmm, this looks a lot better I think.  One change we overlooked and I
just noticed is that we need to change appendStringInfoVA() to return
size_t rather than int.  This comment at the bottom of it gave that away:
    /*
     * Return pvsnprintf's estimate of the space needed.  (Although this is
     * given as a size_t, we know it will fit in int because it's not more
     * than MaxAllocSize.)
     */
    return (int) nprinted;

The parenthetical comment is no longer true.

This means we need to update all its callers to match.  I suppose it
won't be a problem for most of them since they are not using long
stringinfos anyway, but it seems better to keep everything consistent.
I hope that callers that do not adjust the type of the declared variable
would get a compiler warning.
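
Concretely, the declaration in stringinfo.h would change along these
lines (a sketch of the proposal; the printf-format attribute is omitted
here):

    /* before: the needed space is reported as int */
    extern int appendStringInfoVA(StringInfo str, const char *fmt, va_list args);

    /* after: pass pvsnprintf's size_t estimate through unchanged, so
     * callers and enlargeStringInfo() must cope with values > INT_MAX */
    extern size_t appendStringInfoVA(StringInfo str, const char *fmt, va_list args);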

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_dump / copy bugs with "big lines" ?

From
Alvaro Herrera
Date:
I propose to rename allow_long to huge_ok.  "Huge" is the terminology
used by palloc anyway.  I'd keep makeLongStringInfo() and
initLongStringInfo() as the interface, though, because using Huge
instead of Long there looks strange.  Not wedded to that, though
(particularly as it's a bit inconsistent).


I'm not terribly sure about enlargeStringInfo().  I think I would
propose that it takes Size instead of int.  That has rather more fallout
than I'd like, but if we don't do that, then I'm not sure that
appendStringInfo continues to make sense -- considering that it uses
the return value from appendStringInfoVA (which I'm redefining as
returning Size) to pass to enlargeStringInfo.

I'm not really sure how to ensure that the values passed fit both in an
int and Size ... which they would, given that all the callers use
not-huge stringinfos anyway.  But it'd be better if the compiler knew.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_dump / copy bugs with "big lines" ?

From
"Daniel Verite"
Date:
Alvaro Herrera wrote:

> I propose to rename allow_long to huge_ok.  "Huge" is the terminology
> used by palloc anyway.  I'd keep makeLongStringInfo() and
> initLongStringInfo() though as interface, because using Huge instead of
> Long there looks strange.  Not wedded to that, though (particularly as
> it's a bit inconsistent).

"Long" makes sense to me as qualifying a limit greater than
MaxAllocSize but lower (or equal) than MaxAllocHugeSize.

In memutils.h we have these definitions:

#define MaxAllocSize   ((Size) 0x3fffffff)   /* 1 gigabyte - 1 */
#define MaxAllocHugeSize   ((Size) -1 >> 1) /* SIZE_MAX / 2 */

And in enlargeStringInfo() the patch adds this:

    /*
     * Maximum size for strings with allow_long=true.
     * It must not exceed INT_MAX, as the StringInfo routines
     * expect offsets into the buffer to fit into an int.
     */
    const int max_alloc_long = 0x7fffffff;

On a 32-bit arch, we can expect max_alloc_long == MaxAllocHugeSize,
but on a 64-bit arch it will be much smaller than MaxAllocHugeSize,
which is (2^63)-1 there.

IOW, the patch only doubles the maximum size of StringInfo, whereas we
could say that it should multiply it by 2^32 to deserve the "huge"
qualifier.


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: pg_dump / copy bugs with "big lines" ?

From
Alvaro Herrera
Date:
Daniel Verite wrote:

> And in enlargeStringInfo() the patch adds this:
>     /*
>      * Maximum size for strings with allow_long=true.
>      * It must not exceed INT_MAX, as the StringInfo routines
>      * expect offsets into the buffer to fit into an int.
>      */
>     const int max_alloc_long = 0x7fffffff;
>
> On a 32-bit arch, we can expect max_alloc_long == MaxAllocHugeSize,
> but on a 64-bit arch, it will be much smaller with MaxAllocHugeSize
> at (2^63)-1.

Yeah, you're right.  Here's a v4 of this patch, in which I've done some
further very small adjustments to your v3:

* I replaced the 0x7fffffff hardcoded value with some arithmetic which
should have the same result on most architectures (the arithmetic is
spelled out after this list of changes):

    /*
     * Determine the upper size limit.  The fact that the StringInfo API uses
     * "int" everywhere limits us to int's width even for "long_ok" strings.
     */
    limit = str->long_ok ?
        (((Size) 1) << (Min(sizeof(int), sizeof(Size)) * 8 - 1)) - 1 :
        MaxAllocSize;

Initially I just had "sizeof(int)" instead of the Min(), and a
StaticAssert for sizeof(int) <= sizeof(Size), on the grounds that no
actual platform is likely to break that (though I think the C standard
does allow it).  But I realized that doing it this way is simple enough;
and furthermore, on any platform where int is 8 bytes (ILP64), this
would automatically allow for 63-bit-sized stringinfos.  I don't think
such a platform exists today, but Wikipedia[1] lists two obsolete ones, [2] and [3].

[1] https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models
[2] https://en.wikipedia.org/wiki/HAL_SPARC64
[3] https://en.wikipedia.org/wiki/UNICOS

* I reverted the enlargement looping logic in enlargeStringInfo() to
pretty much the original code (minus the cast).

* I tweaked a few comments.
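
Spelling out the arithmetic in the limit expression above (illustration
only):

    /*
     * LP64 (4-byte int, 8-byte Size):
     *   Min(sizeof(int), sizeof(Size)) * 8 - 1  =  4*8 - 1  =  31
     *   (((Size) 1) << 31) - 1                  =  0x7fffffff   (2^31 - 1)
     *
     * Hypothetical ILP64 (8-byte int, 8-byte Size):
     *   8*8 - 1  =  63, so the limit becomes 2^63 - 1.
     *
     * In other words, long_ok buffers are capped at INT_MAX for whatever
     * width the platform's int has.
     */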

The big advantage of your v3 patch is that it can be backpatched without
fear of breaking ABI, so I've struggled to maintain that property in my
changes.  We can further tweak in HEAD-only; for example change the API
to use Size instead of int.  I think that would be desirable, but let's
not do it until we have backpatched this one.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: pg_dump / copy bugs with "big lines" ?

From
Alvaro Herrera
Date:
I just wrote:

> The big advantage of your v3 patch is that it can be backpatched without
> fear of breaking ABI, so I've struggled to maintain that property in my
> changes.  We can further tweak in HEAD-only; for example change the API
> to use Size instead of int.  I think that would be desirable, but let's
> not do it until we have backpatched this one.

One thing I just noticed while trying to backpatch this is that we can
do so only to 9.5, because older branches do not have
MemoryContextAllocExtended().  They do have MemoryContextAllocHuge(),
but the caller in heaptuple.c wants zeroed memory too, so we'd need to
memset; I think that could get us back to 9.4.
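For 9.4, the heap_form_tuple() caller could presumably fall back to
something like this (a sketch only, not tested; the variable names are
taken from heap_form_tuple, and MemoryContextAllocExtended with
MCXT_ALLOC_HUGE | MCXT_ALLOC_ZERO only exists from 9.5 on):

    /* 9.4 has the huge allocator but no "extended" variant, so zero the
     * block explicitly after allocating it. */
    tuple = (HeapTuple) MemoryContextAllocHuge(CurrentMemoryContext,
                                               HEAPTUPLESIZE + len);
    memset(tuple, 0, HEAPTUPLESIZE + len);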

9.3 and older is not possible because we don't have "huge alloc" there.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_dump / copy bugs with "big lines" ?

From
"Daniel Verite"
Date:
    Alvaro Herrera wrote:

> But I realized that doing it this way is simple enough;
> and furthermore, in any platforms where int is 8 bytes (ILP64), this
> would automatically allow for 63-bit-sized stringinfos

On such platforms there's the next problem that we can't
send COPY rows through the wire protocol when they're larger
than 2GB.

Based on the tests with previous iterations of the patch that used
int64 for the StringInfo length, the COPY backend code does not
gracefully fail in that case.

What happened when trying this (on Linux x86_64) with a 2GB-4GB row
is that the size seems to overflow and is sent as a 32-bit integer
with the MSB set, which is confusing for the client side (at least
libpq cannot deal with it).
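
For reference, the framing involved (per the v3 frontend/backend
protocol; whether the length may be treated as unsigned is exactly the
ambiguity discussed upthread):

    /* CopyData message, one per COPY row in text/csv format:
     *   Byte1   'd'       message type
     *   Int32   length    length of the message, including these 4 bytes
     *   Byte(n) data      n = length - 4, i.e. the row itself
     * Read as a signed value, the length cannot describe a row of 2GB or
     * more, no matter what the backend's StringInfo can hold. */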

Consider what would happen with the latest patch on a platform
with sizeof(int)=8 and a \copy invocation like this:

\copy (select big,big,big,big,big,big,big,big,...... FROM   (select lpad('', 1024*1024*200) as big) s) TO /dev/null

with enough copies of "big" in the select-list to grow the row over 2GB,
and then over 4GB.

On a platform with sizeof(int)=4 this should normally fail over 2GB with
"Cannot enlarge string buffer containing $X bytes by $Y more bytes"

I don't have an ILP64 environment myself to test, but I suspect
it would malfunction instead of cleanly erroring out on such
platforms.

One advantage of hardcoding the StringInfo limit to 2GB independently
of sizeof(int) is that it basically avoids the problem.

Also, without this limit, we can "COPY FROM/TO file" really huge rows, 4GB
and beyond, like I tried successfully during the tests mentioned upthread
(again with len as int64 on x86_64).
So such COPYs would succeed or fail depending on whether they deal with
a file or a network connection.
Do we want this difference in behavior?
(keeping in mind that they will both fail anyway if any individual field
in the row is larger than 1GB)


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: pg_dump / copy bugs with "big lines" ?

From
Alvaro Herrera
Date:
Daniel Verite wrote:

> If we consider what would happen with the latest patch on a platform
> with sizeof(int)=8 and a \copy invocation like this:
> 
> \copy (select big,big,big,big,big,big,big,big,...... FROM
>     (select lpad('', 1024*1024*200) as big) s) TO /dev/null
> 
> if we put enough copies of "big" in the select-list to grow over 2GB,
> and then over 4GB.

Oh, right, I was forgetting that.

> On a platform with sizeof(int)=4 this should normally fail over 2GB with
> "Cannot enlarge string buffer containing $X bytes by $Y more bytes"
> 
> I don't have an ILP64 environment myself to test, but I suspect
> it would malfunction instead of cleanly erroring out on such
> platforms.

I suspect nobody has such platforms, as they are mostly defunct.  But I
see an easy way to fix it, which is to use sizeof(int32).

> Also, without this limit, we can "COPY FROM/TO file" really huge rows, 4GB
> and beyond, like I tried successfully during the tests mentioned upthread
> (again with len as int64 on x86_64).
> So such COPYs would succeed or fail depending on whether they deal with
> a file or a network connection.
> Do we want this difference in behavior?

Such a patch would be for master only.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_dump / copy bugs with "big lines" ?

From
Alvaro Herrera
Date:
I have now pushed this to 9.5, 9.6 and master.  It could be backpatched
to 9.4 with ease (just a small change in heap_form_tuple); anything
further back would require much more effort.

I used a 32-bit limit using sizeof(int32).  I tested and all the
mentioned cases seem to work sanely; if you can spare some more time to
test what was committed, I'd appreciate it.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_dump / copy bugs with "big lines" ?

From
"Daniel Verite"
Date:
    Alvaro Herrera wrote:

> I have now pushed this to 9.5, 9.6 and master.  It could be backpatched
> to 9.4 with ease (just a small change in heap_form_tuple); anything
> further back would require much more effort.
>
> I used a 32-bit limit using sizeof(int32).  I tested and all the
> mentioned cases seem to work sanely; if you can spare some more time to
> test what was committed, I'd appreciate it.

My tests are OK too but I see an issue with the code in
enlargeStringInfo(), regarding integer overflow.
The bit of comment that says:

  Note we are assuming here that limit <= INT_MAX/2, else the above
  loop could overflow.

is obsolete, it's now INT_MAX instead of INT_MAX/2.

There's a related problem here:
    newlen = 2 * str->maxlen;
    while (needed > newlen)
        newlen = 2 * newlen;
str->maxlen is an int going up to INT_MAX so [2 * str->maxlen] now
*will* overflow when [str->maxlen > INT_MAX/2].
Eventually it somehow works because of this:
    if (newlen > limit)
        newlen = limit;
but newlen is wonky (when resulting from int overflow)
before being brought back to limit.
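
To spell out the overflow (an illustration, not the attached fix):

    /* Assumes 32-bit int: the product is evaluated in int arithmetic
     * before any assignment, so it overflows (undefined behavior) as soon
     * as str->maxlen exceeds INT_MAX/2, which the new 2GB-1 limit allows. */
    int     maxlen  = 0x7fffffff;           /* INT_MAX: now a reachable maxlen */
    Size    wrong   = 2 * maxlen;           /* 2 * maxlen wraps in int first   */
    Size    widened = 2 * (Size) maxlen;    /* widening before doubling is OK  */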

PFA a minimal fix.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

Attachment

Re: pg_dump / copy bugs with "big lines" ?

From
Alvaro Herrera
Date:
Daniel Verite wrote:

> My tests are OK too but I see an issue with the code in
> enlargeStringInfo(), regarding integer overflow.

Rats.  I'll have a look later.  You're probably right.

In the meantime I added this problem to the Open Items wiki page for
pg10, which I just created:  https://wiki.postgresql.org/wiki/Open_Items

I noticed that there's one open item in the 9.6 page that wasn't closed,
so I carried it forward.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:
> 
> I have now pushed this to 9.5, 9.6 and master.  It could be backpatched
> to 9.4 with ease (just a small change in heap_form_tuple); anything
> further back would require much more effort.

I had to revert this on 9.5 and 9.6 -- it is obvious (in hindsight) that
changing StringInfoData is an ABI break, so we can't do it in back
branches; see
https://www.postgresql.org/message-id/27737.1480993846@sss.pgh.pa.us
The patch remains in master, with the bugs you pointed out.  I suppose
if somebody is desperate to get data out of a table with large tuples,
they'd need to use pg10's pg_dump for it.

We could use the highest-order bit in StringInfoData->maxlen to
represent the boolean flag instead, if we really cared.  But I'm not
going to sweat over it ...
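
That trick would look roughly like this (a sketch only, never
implemented; the macro names are made up):

    /* Steal the otherwise-unused sign bit of maxlen as the "huge ok" flag,
     * so sizeof(StringInfoData) and the field offsets stay unchanged. */
    #define STRINFO_HUGE_BIT        0x80000000U
    #define StringInfoHugeOk(str)   \
        (((unsigned int) (str)->maxlen & STRINFO_HUGE_BIT) != 0)
    #define StringInfoMaxlen(str)   \
        ((int) ((unsigned int) (str)->maxlen & ~STRINFO_HUGE_BIT))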

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

From
Alvaro Herrera
Date:
Daniel Verite wrote:

> My tests are OK too but I see an issue with the code in
> enlargeStringInfo(), regarding integer overflow.
> The bit of comment that says:
> 
>   Note we are assuming here that limit <= INT_MAX/2, else the above
>   loop could overflow.
> 
> is obsolete, it's now INT_MAX instead of INT_MAX/2.

Hmm, I think what it really needs to say there is UINT_MAX/2, which is
what we really care about.  I may be all wet, but what I see is that the
expression

    (((Size) 1) << (sizeof(int32) * 8 - 1)) - 1

which is what we use as the limit is exactly half the unsigned 32-bit
integer range.  So I would just update the constant in that comment
instead of removing it completely.  (We're still relying on the loop not
overflowing on 32-bit machines, surely.)

> There's a related problem here:
>     newlen = 2 * str->maxlen;
>     while (needed > newlen)
>         newlen = 2 * newlen;
> str->maxlen is an int going up to INT_MAX so [2 * str->maxlen] now
> *will* overflow when [str->maxlen > INT_MAX/2].
> Eventually it somehow works because of this:
>     if (newlen > limit)
>         newlen = limit;
> but newlen is wonky (when resulting from int overflow)
> before being brought back to limit.

Yeah, this is bogus and your patch looks correct to me.

I propose this:

diff --git a/src/backend/lib/stringinfo.c b/src/backend/lib/stringinfo.c
index b618b37..a1d786d 100644
--- a/src/backend/lib/stringinfo.c
+++ b/src/backend/lib/stringinfo.c
@@ -313,13 +313,13 @@ enlargeStringInfo(StringInfo str, int needed)
      * for efficiency, double the buffer size each time it overflows.
      * Actually, we might need to more than double it if 'needed' is big...
      */
-    newlen = 2 * str->maxlen;
+    newlen = 2 * (Size) str->maxlen;
     while (needed > newlen)
         newlen = 2 * newlen;
 
     /*
      * Clamp to the limit in case we went past it.  Note we are assuming here
-     * that limit <= INT_MAX/2, else the above loop could overflow.  We will
+     * that limit <= UINT_MAX/2, else the above loop could overflow.  We will
      * still have newlen >= needed.
      */
     if (newlen > limit)
 

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

From
Alvaro Herrera
Date:
Daniel Verite wrote:

> My tests are OK too but I see an issue with the code in
> enlargeStringInfo(), regarding integer overflow.
> The bit of comment that says:
> 
>   Note we are assuming here that limit <= INT_MAX/2, else the above
>   loop could overflow.
> 
> is obsolete, it's now INT_MAX instead of INT_MAX/2.

I would keep this comment but use UINT_MAX/2 instead.

> There's a related problem here:
>     newlen = 2 * str->maxlen;
>     while (needed > newlen)
>         newlen = 2 * newlen;
> str->maxlen is an int going up to INT_MAX so [2 * str->maxlen] now
> *will* overflow when [str->maxlen > INT_MAX/2].
> Eventually it somehow works because of this:
>     if (newlen > limit)
>         newlen = limit;
> but newlen is wonky (when resulting from int overflow)
> before being brought back to limit.

Yeah, you're right.  We also need to cast "needed" to Size in the while
test; and the repalloc_huge() call no longer needs a cast.

I propose the attached.

Not sure if we also need to cast the assignment to str->maxlen in the
last line.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:
> Daniel Verite wrote:
> 
> > My tests are OK too but I see an issue with the code in
> > enlargeStringInfo(), regarding integer overflow.
> > The bit of comment that says:
> > 
> >   Note we are assuming here that limit <= INT_MAX/2, else the above
> >   loop could overflow.
> > 
> > is obsolete, it's now INT_MAX instead of INT_MAX/2.
> 
> I would keep this comment but use UINT_MAX/2 instead.

I rephrased the comment instead.  As you say, the current code no longer
depends on that, but we will depend on something similar when we enlarge
the other variables.

With that, pushed and I hope this is closed for good.

If you or anyone else wants to revisit things so that pg10 can load & dump
even larger tuples, be my guest.  There are quite a few places that will
need to be touched, though (in particular, as I recall, the FE/BE
protocol.)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

From
"Daniel Verite"
Date:
    Alvaro Herrera wrote:

> With that, pushed and I hope this is closed for good.

Great!
I understand the fix couldn't be backpatched because
we are not allowed to change the StringInfo struct in
existing releases. But it occurs to me that the new "long_ok"
flag might not be necessary after all now that it's settled that
the length remains an int32.
The flag is used to enforce a rule that the string can't normally grow
past 1GB, and can reach 2GB only as an exception that we choose to
exercise for COPY starting with v10.
But that 1GB rule is never mentioned in stringinfo.[ch] AFAICS.
I know that 1GB is the varlena size limit and a limit for various other
things, but I don't know to what extent StringInfo is bound by it.

Is it the case that users of StringInfo in existing releases
and extensions are counting on its allocator to fail and abort
if the buffer grows over the varlena range?

If it's true, then we're stuck with the current situation
for existing releases.
OTOH, if that seems like an unwarranted expectation, couldn't we say
that the size limit is 2^31 for everyone, and drop the flag introduced
by the fix?


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: [HACKERS] pg_dump / copy bugs with "big lines" ?

From
Alvaro Herrera
Date:
FWIW I ended up reverting the whole thing, even from master.  A more
complete solution would have to be researched.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services