Re: varlena beyond 1GB and matrix - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: varlena beyond 1GB and matrix
Msg-id CAMsr+YEe0T8MMbn=1NwRMRAVgQ_K1dghpEGHWswx7tcP3VC9rA@mail.gmail.com
In response to Re: varlena beyond 1GB and matrix  (Kohei KaiGai <kaigai@kaigai.gr.jp>)
Responses Re: varlena beyond 1GB and matrix
List pgsql-hackers
On 8 December 2016 at 12:01, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:

>> At a higher level, I don't understand exactly where such giant
>> ExpandedObjects would come from.  (As you point out, there's certainly
>> no easy way for a client to ship over the data for one.)  So this feels
>> like a very small part of a useful solution, if indeed it's part of a
>> useful solution at all, which is not obvious.
>>
> I expect an aggregate function that consumes millions of rows as source
> of a large matrix larger than 1GB. Once it is formed to a variable, it is
> easy to deliver as an argument of PL functions.

You might be interested in how Java has historically dealt with similar issues.

For a long time the JVM had quite low limits on the maximum amount of
RAM it could manage, in the single gigabytes, even for the 64-bit JVM.
Once those limitations were lifted, the garbage collector algorithm
still placed a low practical limit on how much RAM it could cope with
effectively.

If you were doing scientific computing with Java, lots of big
image/video work, GPGPU work, large-scale caching, etc., this
rapidly became a major pain point. So people introduced external
memory mappings to Java, where objects could reference and manage
memory outside the main JVM heap. The most well known is probably
BigMemory (https://www.terracotta.org/products/bigmemory), but there
are many others. They exposed this via small opaque handle objects
that you used to interact with the external memory store via library
functions.

It might make a lot of sense to apply the same principle to
PostgreSQL, since it's much less intrusive than true 64-bit VARLENA.
Rather than extending all of PostgreSQL to handle special-case
split-up VARLENA extended objects, have your interim representation be
a simple opaque value that points to externally mapped memory. Your
operators for the type, etc., know how to work with it. You probably
don't need a full suite of normal operators, since you'll be
interacting with the data in a limited set of ways.
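
To make the opaque-handle idea concrete, here is a minimal sketch in plain C, outside PostgreSQL. Every name in it (BigMatrixHandle, bigmatrix_create, bigmatrix_at, bigmatrix_destroy) is hypothetical, and using an anonymous POSIX mmap() as the external store is an assumption for illustration only. The point is that the value being passed around is a few dozen bytes, while the payload it refers to can be far larger than 1GB.

```c
#define _DEFAULT_SOURCE         /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical handle: small and fixed-size, whatever the matrix size. */
typedef struct BigMatrixHandle
{
    void     *base;             /* start of the external mapping */
    size_t    nbytes;           /* mapping size; may exceed 1GB */
    uint32_t  rows;
    uint32_t  cols;
} BigMatrixHandle;

/* Map zero-filled memory for a rows x cols matrix of doubles. */
static int
bigmatrix_create(BigMatrixHandle *h, uint32_t rows, uint32_t cols)
{
    size_t  nbytes;
    void   *p;

    if (rows == 0 || cols == 0)
        return -1;
    /* Reject sizes whose byte count would overflow size_t. */
    if ((size_t) rows > SIZE_MAX / sizeof(double) / (size_t) cols)
        return -1;
    nbytes = (size_t) rows * cols * sizeof(double);

    p = mmap(NULL, nbytes, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    h->base = p;
    h->nbytes = nbytes;
    h->rows = rows;
    h->cols = cols;
    return 0;
}

/* Type-specific accessor: only code that knows the type looks inside. */
static double *
bigmatrix_at(const BigMatrixHandle *h, uint32_t r, uint32_t c)
{
    assert(h->base != NULL && r < h->rows && c < h->cols);
    return (double *) h->base + (size_t) r * h->cols + c;
}

/* Unmap the external memory and invalidate the handle. */
static void
bigmatrix_destroy(BigMatrixHandle *h)
{
    if (h->base != NULL)
        munmap(h->base, h->nbytes);
    h->base = NULL;
    h->nbytes = 0;
    h->rows = 0;
    h->cols = 0;
}
```

Everything other than these accessors would treat the handle as an opaque blob and simply pass it along.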

The main issue would presumably be one of resource management, since
we currently assume we can just copy a Datum around without telling
anybody about it or doing any special management. You'd need to know
when to clobber your external segment, when to copy(!) it if
necessary, etc. This probably makes sense for working with GPGPUs
anyway, since they like dealing with big contiguous chunks of memory
(or used to, may have improved?).
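
One possible shape for that bookkeeping is explicit reference counting with a separate deep-copy operation, sketched below in plain C. The names (ExtSegment, seg_create, seg_share, seg_copy, seg_release) are hypothetical, and malloc stands in for whatever external mapping is really used. The sketch assumes every holder of a handle calls seg_share and seg_release explicitly, which is exactly the guarantee PostgreSQL does not give today when a Datum is copied, so real code would need some other hook to drive the release (a memory context reset callback is one candidate).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical external segment shared by any number of handles. */
typedef struct ExtSegment
{
    size_t          refcount;   /* number of live handles */
    size_t          nbytes;     /* payload size */
    unsigned char  *data;       /* payload (malloc here, mmap in practice) */
} ExtSegment;

/* Allocate a zero-filled segment owned by one handle. */
static ExtSegment *
seg_create(size_t nbytes)
{
    ExtSegment *s = malloc(sizeof(ExtSegment));

    if (s == NULL)
        return NULL;
    s->data = calloc(nbytes ? nbytes : 1, 1);
    if (s->data == NULL)
    {
        free(s);
        return NULL;
    }
    s->refcount = 1;
    s->nbytes = nbytes;
    return s;
}

/* Cheap "copy": another handle to the same memory. */
static ExtSegment *
seg_share(ExtSegment *s)
{
    s->refcount++;
    return s;
}

/* Expensive copy: a private duplicate, needed before modifying shared data. */
static ExtSegment *
seg_copy(const ExtSegment *s)
{
    ExtSegment *c = seg_create(s->nbytes);

    if (c != NULL)
        memcpy(c->data, s->data, s->nbytes);
    return c;
}

/* Drop one handle; returns 1 if the segment was actually freed. */
static int
seg_release(ExtSegment *s)
{
    assert(s->refcount > 0);
    if (--s->refcount > 0)
        return 0;
    free(s->data);
    free(s);
    return 1;
}
```

The split between seg_share and seg_copy mirrors the "when to copy(!) it" question: sharing is nearly free, while a real copy of a multi-gigabyte segment is something you want to happen only on purpose.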

It sounds like only code specifically intended to work with the
oversized type should be doing much with it except passing it around
as an opaque handle, right?

Do you need to serialize this type to/from disk at all? Or just
exchange it in chunks with a client? If you do need to, can you
possibly do TOAST-like or pg_largeobject-like storage where you split
it up for on disk storage then reassemble for use?
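
The split-and-reassemble arithmetic is simple. The sketch below shows it in plain C with hypothetical helper names (chunk_count, chunk_len, chunk_store, chunk_load); the 2048-byte chunk size used in the usage note is an arbitrary illustrative value, and the actual storage layer (TOAST, pg_largeobject, or anything else) is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of chunks needed to hold total bytes. */
static size_t
chunk_count(size_t total, size_t chunk_size)
{
    assert(chunk_size > 0);
    return total / chunk_size + (total % chunk_size != 0);
}

/* Length of chunk number idx; the last chunk may be short. */
static size_t
chunk_len(size_t total, size_t chunk_size, size_t idx)
{
    size_t off;
    size_t rem;

    assert(chunk_size > 0);
    if (idx >= chunk_count(total, chunk_size))
        return 0;
    off = idx * chunk_size;
    rem = total - off;
    return rem < chunk_size ? rem : chunk_size;
}

/* Copy chunk idx out of the big object into out (at least chunk_size bytes). */
static size_t
chunk_store(const unsigned char *src, size_t total, size_t chunk_size,
            size_t idx, unsigned char *out)
{
    size_t len = chunk_len(total, chunk_size, idx);

    if (len > 0)
        memcpy(out, src + idx * chunk_size, len);
    return len;
}

/* Copy a stored chunk back into its place in the reassembled object. */
static void
chunk_load(unsigned char *dst, size_t chunk_size, size_t idx,
           const unsigned char *chunk, size_t len)
{
    memcpy(dst + idx * chunk_size, chunk, len);
}
```

With a 5000-byte object and 2048-byte chunks this yields three chunks of 2048, 2048 and 904 bytes; loading them back by index reproduces the original bytes regardless of the order in which they arrive.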



-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


