Thread: bytea columns and large values
My application uses a bytea column to store some fairly large binary values
(hundreds of megabytes). Recently I've run into a problem as my values start
to approach the 1GB limit on field size: when I write a 955MB byte array from
Java into my table over JDBC, the write succeeds and the numbers look about
right:

testdb=# select count(*) from problem_table;
 count
-------
     1
(1 row)

testdb=# select pg_size_pretty(pg_total_relation_size('problem_table'));
 pg_size_pretty
----------------
 991 MB
(1 row)

However, any attempt to read this row back fails:

testdb=# select * from problem_table;
ERROR: invalid memory alloc request size 2003676411

The same error occurs when reading from JDBC (even using getBinaryStream).

Is there some reason why my data can be stored in <1GB but triggers the
allocation of 2GB of memory when I try to read it back? Is there any setting
I can change or any alternate method of reading I can use to get around this?

Thanks,

--
David North, Software Developer, CoreFiling Limited
http://www.corefiling.com
Phone: +44-1865-203192
On 09/28/2011 01:01 AM, David North wrote:
> testdb=# select * from problem_table;
> ERROR: invalid memory alloc request size 2003676411

What Pg version are you using?

On which platform?

32-bit or 64-bit OS? If 64-bit, with a 32-bit or 64-bit build of PostgreSQL?

--
Craig Ringer
On 28/09/11 01:50, Craig Ringer wrote:
> On 09/28/2011 01:01 AM, David North wrote:
>
>> testdb=# select * from problem_table;
>> ERROR: invalid memory alloc request size 2003676411
>
> What Pg version are you using?
>
> On which platform?
>
> 32-bit or 64-bit OS? If 64-bit, with a 32-bit or 64-bit build of
> PostgreSQL?

psql (PostgreSQL) 9.0.4

64-bit Fedora:

Fedora release 14 (Laughlin)
Linux mybox 2.6.35.14-95.fc14.x86_64 #1 SMP Tue Aug 16 21:01:58 UTC 2011
x86_64 x86_64 x86_64 GNU/Linux

I presume my build of PostgreSQL is x64 - ldd `which psql` spits out lots
of references to lib64.

David

--
David North, Software Developer, CoreFiling Limited
http://www.corefiling.com
Phone: +44-1865-203192
On Wed, Sep 28, 2011 at 3:28 AM, David North <dtn@corefiling.co.uk> wrote:
> On 28/09/11 01:50, Craig Ringer wrote:
>> On 09/28/2011 01:01 AM, David North wrote:
>>
>>> testdb=# select * from problem_table;
>>> ERROR: invalid memory alloc request size 2003676411
>>
>> What Pg version are you using?
>>
>> On which platform?
>>
>> 32-bit or 64-bit OS? If 64-bit, with a 32-bit or 64-bit build of
>> PostgreSQL?
>
> psql (PostgreSQL) 9.0.4
>
> 64-bit Fedora:
>
> Fedora release 14 (Laughlin)
> Linux mybox 2.6.35.14-95.fc14.x86_64 #1 SMP Tue Aug 16 21:01:58 UTC 2011
> x86_64 x86_64 x86_64 GNU/Linux
>
> I presume my build of PostgreSQL is x64 - ldd `which psql` spits out lots
> of references to lib64.

sending ~1GB bytea values is borderline crazy, and is completely crazy
if you are not absolutely sure the transmission is not 100% binary. I
don't know if the JDBC sends/receives bytea as binary, but it may not.
If not, you might have better luck with the large object interface.

merlin
On Tue, Sep 27, 2011 at 20:01, David North <dtn@corefiling.co.uk> wrote:
> testdb=# select * from problem_table;
> ERROR: invalid memory alloc request size 2003676411
>
> Is there some reason why my data can be stored in <1GB but triggers the
> allocation of 2GB of memory when I try to read it back? Is there any
> setting I can change or any alternate method of reading I can use to get
> around this?

I guess that it's converting the whole value to the hex-escaped bytea
format, which doubles its size. The JDBC driver probably doesn't support
transferring bytea values in binary.

I've heard that some people are using substr() to read bytea values in
small chunks. Theoretically TOAST can support this in constant time
(independent of total value size or offset), but I don't know about the
implementation. In any case, it's worth a try.

It *might* help to ALTER column SET STORAGE EXTERNAL, to disable TOAST
compression, but it could also make things worse. More details here:
http://www.postgresql.org/docs/9.0/static/storage-toast.html

Regards,
Marti
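[Editor's note: Marti's guess lines up with the numbers in the original
report. In the hex output format every stored byte becomes two text
characters, so serializing the datum for output needs roughly twice its
binary size. A quick sanity check, as a sketch that deliberately ignores
the small "\x" prefix and header overhead:]

```java
public class ByteaHexMath {
    // In hex escape format every stored byte becomes two output characters,
    // so the text form of a bytea datum is roughly twice its binary size
    // (ignoring the small "\x" prefix and varlena header).
    static long hexTextSize(long storedBytes) {
        return 2 * storedBytes;
    }

    public static void main(String[] args) {
        long allocRequest = 2003676411L;        // size from the error message
        long impliedDatum = allocRequest / 2;   // approx. bytes actually stored
        // prints a value just over 955 MiB -- matching David's byte array
        System.out.printf("implied stored size: %.1f MiB%n",
                impliedDatum / (1024.0 * 1024.0));
    }
}
```

[Working backwards, the 2003676411-byte allocation implies a stored value of
about 955 MiB - exactly the byte array David wrote. The substr() workaround
reads slices small enough that even the doubled text form stays modest,
e.g. substr(col, 1, 10485760) for the first 10 MiB (col is a stand-in
column name; bytea substr offsets are 1-based).]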
On Wed, 28 Sep 2011 10:18:27 -0500, Merlin Moncure wrote:
> sending ~1GB bytea values is borderline crazy, and is completely crazy
> if you are not absolutely sure the transmission is not 100% binary. I
> don't know if the JDBC sends/receives bytea as binary, but it may not.
> If not, you might have better luck with the large object interface.

"Crazy"? I thought the official stance was to keep such crazy values as
TOAST, and that the LOB interface isn't something worth using. You have
exposed such an interface, and you affirm it. Is this craziness measured
by a fixed amount above 500MB, or is it fuzzily measured with a standard
deviation near 1GB? If I use bytea to store such values, it looks like
I'm crazy too. Sorry for thinking that the documentation is trustworthy.

Thanks for the mobilization - I reserved the moderngres domains. I think
it's about 1-2 weeks to get some infrastructure up for this.

Regards
Radosław Smogura
http://softperience.eu
http://moderngres.eu - in near future
On 29 September 2011 13:12, Radosław Smogura <rsmogura@softperience.eu> wrote:
>> sending ~1GB bytea values is borderline crazy, and is completely crazy
>> if you are not absolutely sure the transmission is not 100% binary. I
>> don't know if the JDBC sends/receives bytea as binary, but it may not.
>> If not, you might have better luck with the large object interface.
>
> "Crazy"? I thought the official stance was to keep such crazy values as
> TOAST, and that the LOB interface isn't something worth using.

Both are possible means to handle data-objects that large. The difference
between the two is that with BYTEA the value comes back embedded in the
result set, whereas with LOBs you get a reference that you can subsequently
read out at your leisure.

As a consequence, with objects of 1GB in size, BYTEA requires the server to
allocate over 1GB of memory for each record in the result set until it can
send such records to the client, while the memory footprint with LOBs is
MUCH smaller on the server side; just a file-handle.

The interface for LOBs is a little more complicated, due to getting a file
handle instead of directly receiving the large object, but you're saving
your server a mountain of memory-load.

> You have exposed such an interface, and you affirm it. Is this craziness
> measured by a fixed amount above 500MB, or is it fuzzily measured with a
> standard deviation near 1GB? If I use bytea to store such values, it
> looks like I'm crazy too. Sorry for thinking that the documentation is
> trustworthy.

It gives you a choice. That doesn't automatically make it the best choice
for your situation. That's something only you can decide.

--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.
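[Editor's note: Alban's memory argument is really about streaming - the
client holds one bounded buffer at a time instead of the whole value. The
same buffered-copy loop applies whether the InputStream comes from the JDBC
driver's large-object support or anywhere else; this minimal sketch is fed
from an in-memory stream purely for illustration, with no database involved:]

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamCopy {
    // Copy a stream in fixed-size chunks so memory use stays bounded by
    // the buffer size, not by the total payload size.
    static long copy(InputStream in, OutputStream out, int bufSize)
            throws IOException {
        byte[] buf = new byte[bufSize];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[1 << 20];     // 1 MiB stand-in for a LOB
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(payload), sink, 8192);
        System.out.println("copied " + copied + " bytes");
    }
}
```

[With a real large object you would obtain the stream inside a transaction
and read it the same way; memory use stays bounded by bufSize no matter how
large the object is.]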
On Thu, Sep 29, 2011 at 7:30 AM, Alban Hertroys <haramrae@gmail.com> wrote:
> On 29 September 2011 13:12, Radosław Smogura <rsmogura@softperience.eu> wrote:
>>> sending ~1GB bytea values is borderline crazy, and is completely crazy
>>> if you are not absolutely sure the transmission is not 100% binary. I
>>> don't know if the JDBC sends/receives bytea as binary, but it may not.
>>> If not, you might have better luck with the large object interface.
>
>> "Crazy"? I thought the official stance was to keep such crazy values as
>> TOAST, and that the LOB interface isn't something worth using.

I'm not saying that placing such large values in a table (or LO) is a good
idea, but - if I had managed to put data *in* to a table that I couldn't
get back out, I'd be a bit cranky, especially if my attempt to do so kills
the backend I am using (which triggers a shutdown of all other backends,
no?).

--
Jon
On Thu, Sep 29, 2011 at 6:12 AM, Radosław Smogura
<rsmogura@softperience.eu> wrote:
> On Wed, 28 Sep 2011 10:18:27 -0500, Merlin Moncure wrote:
>> sending ~1GB bytea values is borderline crazy, and is completely crazy
>> if you are not absolutely sure the transmission is not 100% binary. I
>> don't know if the JDBC sends/receives bytea as binary, but it may not.
>> If not, you might have better luck with the large object interface.

hm, I had one extra 'not' in there - of course I meant to say you should
be sure data is transferred in binary (I think everyone knew that though).

> You have exposed such an interface, and you affirm it. Is this craziness
> measured by a fixed amount above 500MB, or is it fuzzily measured with a
> standard deviation near 1GB? If I use bytea to store such values, it
> looks like I'm crazy too. Sorry for thinking that the documentation is
> trustworthy.
>
> Thanks for the mobilization - I reserved the moderngres domains. I think
> it's about 1-2 weeks to get some infrastructure up for this.

The interface (well, libpq) and the protocol are in fact part of the
problem. To truly support large bytea means streaming features, new
extensions to libpq, maybe some protocol modifications. There have been a
couple of semi-serious attempts at dealing with this problem (see
archives), but none so far have gained serious traction.

The lo interface sucks but it's slightly better on resources for really
huge bytea and tends to be more consistently implemented in database
drivers. If I was doing this, I would of course be crafting a carefully
generated client in C, using libpqtypes, which is the gold standard for
sending bytea against which all others should be judged.

merlin
On Thu, Sep 29, 2011 at 15:48, Jon Nelson <jnelson+pgsql@jamponi.net> wrote:
> especially if my attempt
> to do so kills the backend I am using (which triggers a shutdown of
> all other backends, no?).

No, this is just an ereport(ERROR) that's handled gracefully by rolling
back the transaction.

Regards,
Marti
On 29/09/11 14:55, Merlin Moncure wrote:
> The interface (well, libpq) and the protocol are in fact part of the
> problem. To truly support large bytea means streaming features, new
> extensions to libpq, maybe some protocol modifications. There have
> been a couple of semi-serious attempts at dealing with this problem
> (see archives), but none so far have gained serious traction.
>
> The lo interface sucks but it's slightly better on resources for
> really huge bytea and tends to be more consistently implemented in
> database drivers. If I was doing this, I would of course be crafting
> a carefully generated client in C, using libpqtypes, which is the gold
> standard for sending bytea against which all others should be judged.

Thanks everyone for your replies.

For my use-case, I'm going to break the data down into smaller lumps and
store one per row (it's actually a bunch of serialized Java objects, so
putting them all in one field of one row was rather lazy in the first
place). I'll also apply GZip to keep the amount of data in any given
bytea/lob as small as possible.

David

--
David North, Software Developer, CoreFiling Limited
http://www.corefiling.com
Phone: +44-1865-203192
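[Editor's note: David's plan - gzip the serialized objects and store them in
smaller lumps, one per row - can be sketched as a compress-then-split step on
the client. The chunk size, and the idea of inserting each array as its own
bytea row, are illustrative choices, not code from the thread:]

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ChunkedGzip {
    // Compress, then split into fixed-size chunks: each chunk would be
    // inserted as its own bytea row, keeping every value far below the
    // 1GB field limit.
    static List<byte[]> gzipAndChunk(byte[] data, int chunkSize)
            throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(data);
        }
        byte[] compressed = bos.toByteArray();
        List<byte[]> chunks = new ArrayList<>();
        for (int off = 0; off < compressed.length; off += chunkSize) {
            int len = Math.min(chunkSize, compressed.length - off);
            byte[] chunk = new byte[len];
            System.arraycopy(compressed, off, chunk, 0, len);
            chunks.add(chunk);
        }
        return chunks;
    }

    // Reassemble the chunks (read back in row order) and decompress.
    static byte[] reassemble(List<byte[]> chunks) throws IOException {
        ByteArrayOutputStream joined = new ByteArrayOutputStream();
        for (byte[] c : chunks) joined.write(c);
        try (GZIPInputStream gz = new GZIPInputStream(
                new ByteArrayInputStream(joined.toByteArray()))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = gz.read(buf)) != -1) out.write(buf, 0, n);
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[5 * 1024 * 1024]; // stand-in for the objects
        List<byte[]> rows = gzipAndChunk(payload, 1 << 20);
        System.out.println("rows to insert: " + rows.size());
    }
}
```

[Reading back is the reverse: select the chunks in order, concatenate, and
gunzip; reassemble() above shows that round trip.]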
On Thu, Sep 29, 2011 at 10:51 AM, David North <dtn@corefiling.co.uk> wrote:
> I'll also apply GZip to keep the amount of data in any given bytea/lob as
> small as possible.

Aren't bytea fields compressed by PostgreSQL anyway (when EXTENDED or MAIN
storage is used, which is the default)?

http://www.postgresql.org/docs/8.4/static/storage-toast.html

--
Jon
On Thu, Sep 29, 2011 at 10:54 AM, Jon Nelson <jnelson+pgsql@jamponi.net> wrote:
> On Thu, Sep 29, 2011 at 10:51 AM, David North <dtn@corefiling.co.uk> wrote:
>
>> I'll also apply GZip to keep the amount of data in any given bytea/lob as
>> small as possible.
>
> Aren't bytea fields compressed by PostgreSQL anyway (when EXTENDED or MAIN
> storage is used, which is the default)?

They are, assuming you didn't turn compression off manually and the objects
were big enough to TOAST. If you truly don't need access to the data in the
backend, compressing on the client and disabling compression on the server
is probably a good idea.

merlin
Hey Merlin,

> The lo interface sucks but it's slightly better on resources for
> really huge bytea and tends to be more consistently implemented in
> database drivers. If I was doing this, I would of course be crafting
> a carefully generated client in C, using libpqtypes, which is the gold
> standard for sending bytea against which all others should be judged.

I am sorry, but why does the Large Objects interface suck?

--
// Dmitriy.
On Sat, Oct 1, 2011 at 4:27 AM, Dmitriy Igrishin <dmitigr@gmail.com> wrote:
> Hey Merlin,
>
>> The lo interface sucks but it's slightly better on resources for
>> really huge bytea and tends to be more consistently implemented in
>> database drivers. If I was doing this, I would of course be crafting
>> a carefully generated client in C, using libpqtypes, which is the gold
>> standard for sending bytea against which all others should be judged.
>
> I am sorry, but why does the Large Objects interface suck?

hm, good question. well, it's not sql and is outside the type system.
imo, bytea is really the way to go for sending binary. also, if it turns
out that we need to add features to send large pieces of data, it should
not be specific to bytea.

merlin
2011/10/2 Merlin Moncure <mmoncure@gmail.com>
> hm, good question. well, it's not sql and is outside the type system.
> imo, bytea is really the way to go for sending binary. also, if it turns
> out that we need to add features to send large pieces of data, it should
> not be specific to bytea.

But at the moment, the only way to put/get the data piecewise
is to use LOBs.

--
// Dmitriy.