Thread: varlena beyond 1GB and matrix
This is a design proposal for a matrix data type which can be larger than 1GB. It is not only a new data type; it also needs a platform enhancement, because the existing varlena format has a hard limit (1GB). We discussed this topic at the developer unconference in Tokyo/Akihabara, the day before PGconf.ASIA. The design proposal below builds on the overall consensus of that discussion.

============ BACKGROUND ============

The varlena format has either a short (1-byte) or long (4-byte) header. We use the long header for the in-memory structure referenced by VARSIZE()/VARDATA(), or for the on-disk structure which is larger than 126 bytes but less than the TOAST threshold. Otherwise, the short format is used if the varlena is less than 126 bytes or is stored externally in the toast relation. No varlena representation supports a data size larger than 1GB.

On the other hand, some use cases which handle (relatively) big data in the database are interested in variable-length datums larger than 1GB. One example is what I presented at PGconf.SV. A PL/CUDA function takes two arguments (a 2D array of int4 instead of a matrix), then returns the top-N combinations of chemical compounds according to their similarity, like this:

    SELECT knn_gpu_similarity(5, Q.matrix, D.matrix)
      FROM (SELECT array_matrix(bitmap) matrix
              FROM finger_print_query
             WHERE tag LIKE '%abc%') Q,
           (SELECT array_matrix(bitmap) matrix
              FROM finger_print_10m) D;

array_matrix() is an aggregate function which generates a 2D array containing the entire input relation stream. It works fine as long as the data size is less than 1GB. Once it exceeds that boundary, the user has to split the 2D array manually, even though it is not uncommon for recent GPU models to have more than 10GB of RAM. It is a real problem if we cannot split the mathematical problem into appropriate portions, and it is uncomfortable for users who cannot use the full capability of the GPU device.
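For readers wondering where exactly the 1GB ceiling comes from: the long varlena header is 4 bytes, of which 2 bits are flag bits, leaving 30 bits for the total length. A standalone sketch of that arithmetic (illustration only, not PostgreSQL source):

```c
#include <stdint.h>

/* The long varlena header is 32 bits, of which 2 are flag bits,
 * so only 30 bits remain to encode the total datum length. */
#define VARLENA_FLAG_BITS   2

/* Largest length representable in the remaining bits: 2^30 - 1,
 * i.e. just under 1GB. */
uint64_t
max_varlena_size(void)
{
    unsigned length_bits = 32 - VARLENA_FLAG_BITS;   /* 30 */
    return (((uint64_t) 1) << length_bits) - 1;      /* 0x3FFFFFFF */
}
```

So any single varlena datum, however it is toasted, can never describe more than 1073741823 bytes of payload.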
========= PROBLEM =========

Our problem is that the varlena format does not permit a variable-length datum larger than 1GB, even though our "matrix" type wants to move a bunch of data larger than 1GB. So we need to solve this restriction of the varlena format prior to the matrix type implementation.

In the developer unconference, people discussed three ideas towards the problem. The overall consensus was the idea of a special data type which can contain multiple indirect references to other data chunks. Both the main part and the referenced data chunks are each less than 1GB, but the total amount of data we can represent is more than 1GB. For example, even if we have a large matrix around 8GB, its sub-matrices separated into 9 portions (3x3) are individually less than 1GB. It becomes problematic when we try to save a matrix which contains indirect references to its sub-matrices, because toast_save_datum() writes out the flat portion just after the varlena header onto a tuple or a toast relation as-is. If the main portion of the matrix contains pointers (indirect references), that is obviously broken. We need an infrastructure to serialize the indirect references prior to saving.

BTW, the other, less acknowledged ideas were a 64bit varlena header and utilization of large objects. The former breaks the current data format, and will thus have unexpected side effects on the existing code of the PostgreSQL core and extensions. The latter requires users to construct a large object beforehand; that makes it impossible to use the interim result of a sub-query, and leads to unnecessary I/O for preparation.

========== SOLUTION ==========

I'd like to propose a new optional type handler 'typserialize' to serialize an in-memory varlena structure (that can have indirect references) into an on-disk format. If present, it shall be invoked at the head of toast_insert_or_update(), so that indirect references are transformed into something else which is safe to save.
(My expectation is that the 'typserialize' handler first saves the indirect chunks to the toast relation, then puts toast pointers in their place.)

On the other hand, it is uncertain whether we need a 'typdeserialize' handler symmetrically. Because all the functions/operators which support the special data type should know its internal format, it is possible to load the indirectly referenced data chunks on demand. This will be beneficial from a performance perspective if functions/operators touch only a part of the large structure, because the rest of the portions need not be loaded into memory.

One thing I'm not certain about is whether we may update the datum supplied as an argument in functions/operators. Let me assume the data structure below:

    struct {
        int32   varlena_head;
        Oid     element_id;
        int     matrix_type;
        int     blocksz_x;    /* horizontal size of each block */
        int     blocksz_y;    /* vertical size of each block */
        int     gridsz_x;     /* horizontal # of blocks */
        int     gridsz_y;     /* vertical # of blocks */
        struct {
            Oid     va_valueid;
            Oid     va_toastrelid;
            void   *ptr_block;
        } blocks[FLEXIBLE_ARRAY_MEMBER];
    };

If and when this structure is fetched from the tuple, its @ptr_block fields are initialized to NULL. Once it is supplied to a function which references a part of the blocks, type-specific code can load the sub-matrix from the toast relation, then update the @ptr_block so that the sub-matrix is not loaded from toast multiple times. I'm not certain whether that is acceptable behavior. If it is OK, it seems to me the direction to support matrices larger than 1GB is all green.

Your comments are welcome.

==================== FOR YOUR REFERENCE ====================

* Beyond the 1GB limitation of varlena
  http://kaigai.hatenablog.com/entry/2016/12/04/223826
* PGconf.SV 2016 and PL/CUDA
  http://kaigai.hatenablog.com/entry/2016/11/17/070708
* PL/CUDA slides at PGconf.ASIA (English)
  http://www.slideshare.net/kaigai/pgconfasia2016-plcuda-en

Best regards,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>
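The serialization step proposed above could look roughly like the following standalone sketch. It only simulates the idea: save_to_toast() is a stand-in for the real toast machinery, and the structure is trimmed down to the block array.

```c
#include <stddef.h>

/* Hypothetical sketch of what a 'typserialize' handler could do for the
 * matrix structure above: walk the block array, push any in-memory block
 * out to toast storage, and leave only the toast identifier behind.
 * All names here are illustrative, not PostgreSQL API. */
typedef unsigned int Oid;

typedef struct
{
    Oid     va_valueid;     /* toast value id once serialized */
    Oid     va_toastrelid;  /* toast relation id */
    void   *ptr_block;      /* in-memory sub-matrix, or NULL */
} MatrixBlock;

static Oid next_valueid = 1;

/* stand-in for toast_save_datum(): pretend to write the chunk out
 * and hand back its newly assigned value id */
static Oid
save_to_toast(void *chunk)
{
    (void) chunk;
    return next_valueid++;
}

/* after this walk, no block carries a raw pointer, so the flat
 * image is safe to write to a tuple or toast relation as-is */
void
matrix_serialize(MatrixBlock *blocks, int nblocks)
{
    for (int i = 0; i < nblocks; i++)
    {
        if (blocks[i].ptr_block != NULL)
        {
            blocks[i].va_valueid = save_to_toast(blocks[i].ptr_block);
            blocks[i].ptr_block = NULL;
        }
    }
}
```

The symmetric 'typdeserialize' direction would simply leave va_valueid in place and let type-specific code fill ptr_block lazily, as the paragraph above describes.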
On 12/7/16 5:50 AM, Kohei KaiGai wrote:
> If and when this structure is fetched from the tuple, its @ptr_block
> is initialized to NULL. Once it is supplied to a function which
> references a part of blocks, type specific code can load sub-matrix
> from the toast relation, then update the @ptr_block not to load the
> sub-matrix from the toast multiple times.
> I'm not certain whether it is acceptable behavior/manner.

I'm glad you're looking into this. The 1G limit is becoming a larger problem every day.

Have you considered using ExpandedObjects to accomplish this? I don't think the API would work as-is, but I suspect there's other places where we'd like to be able to have this capability (arrays and JSONB come to mind).
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
On Wed, Dec 7, 2016 at 8:50 AM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
> I'd like to propose a new optional type handler 'typserialize' to
> serialize an in-memory varlena structure (that can have indirect
> references) to on-disk format.
> If any, it shall be invoked at the head of toast_insert_or_update(),
> then indirect references are transformed to something other which
> is safe to save. (My expectation is, the 'typserialize' handler
> first saves the indirect chunks to the toast relation, then
> puts toast pointers instead.)

This might not work. The reason is that we have important bits of code that expect that they can figure out how to do some operation on a datum (find its length, copy it, serialize it) based only on typlen and typbyval. See src/backend/utils/adt/datum.c for assorted examples. Note also the lengthy comment near the top of the file, which explains that typlen > 0 indicates a fixed-length type, typlen == -1 indicates a varlena, and typlen == -2 indicates a cstring. I think there's probably room for other typlen values; for example, we could have typlen == -3 indicate some new kind of thing -- a super-varlena that has a higher length limit, or some other kind of thing altogether.

Now, you can imagine trying to treat what you're talking about as a new type of TOAST pointer, but I think that's not going to help, because at some point the TOAST pointer gets de-toasted into a varlena ... which is still limited to 1GB. So that's not really going to work. And it brings out another point, which is that if you define a new typlen code, like -3, for super-big things, they won't be varlenas, which means they won't work with the existing TOAST interfaces. Still, you might be able to fix that. You would probably have to do some significant surgery on the wire protocol, per the commit message for fa2fa995528023b2e6ba1108f2f47558c6b66dcd.

I think it's probably a mistake to conflate objects with substructure with objects > 1GB.
Those are two somewhat orthogonal needs. As Jim also points out, expanded objects serve the first need. Of course, those assume that we're dealing with a varlena, so if we made a super-varlena, we'd still need to create an equivalent. But perhaps it would be a fairly simple adaptation of what we've already got. Handling objects >1GB at all seems like the harder part of the problem.
-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
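To make the typlen dispatch concrete, here is a hypothetical standalone sketch of the convention described above, with -3 as the speculative super-varlena code whose first 8 bytes would hold a 64-bit length. The layout is invented for illustration; none of this is existing PostgreSQL code.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical dispatch on typlen, mirroring the conventions described
 * above: > 0 means fixed length, -1 means varlena, -2 means cstring,
 * and a speculative -3 for a "super-varlena" whose first 8 bytes hold
 * a 64-bit length word. */
uint64_t
datum_get_size(const void *datum, int16_t typlen)
{
    if (typlen > 0)                 /* fixed-length type */
        return (uint64_t) typlen;

    if (typlen == -1)               /* classic varlena: 30-bit length */
    {
        uint32_t header;

        memcpy(&header, datum, sizeof(header));
        return header & 0x3FFFFFFF; /* mask off the 2 flag bits */
    }

    if (typlen == -2)               /* cstring: length includes the NUL */
        return strlen((const char *) datum) + 1;

    if (typlen == -3)               /* speculative super-varlena */
    {
        uint64_t len64;

        memcpy(&len64, datum, sizeof(len64));
        return len64;               /* may exceed 1GB */
    }

    return 0;                       /* unknown typlen code */
}
```

The point of the sketch is that every generic code path like this one (datum.c, heap_fill_tuple, the TOAST entry points) would need to learn the new branch, which is why a -3 code is a platform change rather than a type-local one.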
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Dec 7, 2016 at 8:50 AM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
>> I'd like to propose a new optional type handler 'typserialize' to
>> serialize an in-memory varlena structure (that can have indirect
>> references) to on-disk format.

> I think it's probably a mistake to conflate objects with substructure
> with objects > 1GB. Those are two somewhat orthogonal needs.

Maybe. I think where KaiGai-san is trying to go with this is being able to turn an ExpandedObject (which could contain very large amounts of data) directly into a toast pointer or vice versa. There's nothing really preventing a TOAST OID from having more than 1GB of data attached, and if you had a side channel like this you could transfer the data without ever having to form a larger-than-1GB tuple.

The hole in that approach, to my mind, is that there are too many places that assume that they can serialize an ExpandedObject into part of an in-memory tuple, which might well never be written to disk, or at least not written to disk in a table. (It might be intended to go into a sort or hash join, for instance.) This design can't really work for that case, and unfortunately I think it will be somewhere between hard and impossible to remove all the places where that assumption is made.

At a higher level, I don't understand exactly where such giant ExpandedObjects would come from. (As you point out, there's certainly no easy way for a client to ship over the data for one.) So this feels like a very small part of a useful solution, if indeed it's part of a useful solution at all, which is not obvious.

FWIW, ExpandedObjects themselves are far from a fully fleshed out concept, one of the main problems being that they don't have very long lifespans except in the case that they're the value of a plpgsql variable.
I think we would need to move things along quite a bit in that area before it would get to be useful to think in terms of ExpandedObjects containing multiple GB of data. Otherwise, the inevitable flattenings and re-expansions are just going to kill you.

Likewise, the need for clients to be able to transfer data in chunks gets pressing well before you get to 1GB. So there's a lot here that really should be worked on before we try to surmount that barrier.

regards, tom lane
I wrote:
> Maybe. I think where KaiGai-san is trying to go with this is being
> able to turn an ExpandedObject (which could contain very large amounts
> of data) directly into a toast pointer or vice versa. There's nothing
> really preventing a TOAST OID from having more than 1GB of data
> attached, and if you had a side channel like this you could transfer
> the data without ever having to form a larger-than-1GB tuple.

BTW, you could certainly imagine attaching such infrastructure for direct-to-TOAST-table I/O to ExpandedObjects today, independently of any ambitions about larger-than-1GB values. I'm not entirely sure how often it would get exercised, which is the key subtext of what I wrote before, but it's clearly a possible optimization of what we do now.

regards, tom lane
2016-12-08 8:04 GMT+09:00 Robert Haas <robertmhaas@gmail.com>:
> On Wed, Dec 7, 2016 at 8:50 AM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
>> I'd like to propose a new optional type handler 'typserialize' to
>> serialize an in-memory varlena structure (that can have indirect
>> references) to on-disk format.
>> If any, it shall be invoked at the head of toast_insert_or_update(),
>> then indirect references are transformed to something other which
>> is safe to save. (My expectation is, the 'typserialize' handler
>> first saves the indirect chunks to the toast relation, then
>> puts toast pointers instead.)
>
> This might not work. The reason is that we have important bits of
> code that expect that they can figure out how to do some operation on
> a datum (find its length, copy it, serialize it) based only on typlen
> and typbyval. See src/backend/utils/adt/datum.c for assorted
> examples. Note also the lengthy comment near the top of the file,
> which explains that typlen > 0 indicates a fixed-length type, typlen
> == -1 indicates a varlena, and typlen == -2 indicates a cstring. I
> think there's probably room for other typlen values; for example, we
> could have typlen == -3 indicate some new kind of thing -- a
> super-varlena that has a higher length limit, or some other kind of
> thing altogether.
>
> Now, you can imagine trying to treat what you're talking about as a
> new type of TOAST pointer, but I think that's not going to help,
> because at some point the TOAST pointer gets de-toasted into a varlena
> ... which is still limited to 1GB. So that's not really going to
> work. And it brings out another point, which is that if you define a
> new typlen code, like -3, for super-big things, they won't be
> varlenas, which means they won't work with the existing TOAST
> interfaces. Still, you might be able to fix that.
> You would probably
> have to do some significant surgery on the wire protocol, per the
> commit message for fa2fa995528023b2e6ba1108f2f47558c6b66dcd.

Hmm... The reason why I didn't introduce the idea of a 64bit varlena format is that the approach seemed too invasive for the existing PostgreSQL core and extensions, because I assumed this "long" variable-length datum would utilize/enhance the existing varlena infrastructure. However, once we have an infrastructure completely independent from the existing varlena, we no longer carry the risk of code mixture. That seems to me an advantage.

My concern about ExpandedObject is that its flat format still needs a 32bit varlena header, which restricts the total length to 1GB, so it is not possible to represent a large data chunk. That is why I didn't think the current ExpandedObject was a solution for us.

Regarding the protocol stuff, I want to consider support for large records after the internal data format, because my expected primary usage is internal use for in-database analytics, where the user gets the results of a complicated logic described in a PL function.

> I think it's probably a mistake to conflate objects with substructure
> with objects > 1GB. Those are two somewhat orthogonal needs. As Jim
> also points out, expanded objects serve the first need. Of course,
> those assume that we're dealing with a varlena, so if we made a
> super-varlena, we'd still need to create an equivalent. But perhaps
> it would be a fairly simple adaptation of what we've already got.
> Handling objects >1GB at all seems like the harder part of the
> problem.

I could mostly get your point. Does the last line above mention the amount of data in an object >1GB, even if the "super-varlena" format allows a 64bit length?

Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>
2016-12-08 8:36 GMT+09:00 Tom Lane <tgl@sss.pgh.pa.us>:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Wed, Dec 7, 2016 at 8:50 AM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
>>> I'd like to propose a new optional type handler 'typserialize' to
>>> serialize an in-memory varlena structure (that can have indirect
>>> references) to on-disk format.
>
>> I think it's probably a mistake to conflate objects with substructure
>> with objects > 1GB. Those are two somewhat orthogonal needs.
>
> Maybe. I think where KaiGai-san is trying to go with this is being
> able to turn an ExpandedObject (which could contain very large amounts
> of data) directly into a toast pointer or vice versa. There's nothing
> really preventing a TOAST OID from having more than 1GB of data
> attached, and if you had a side channel like this you could transfer
> the data without ever having to form a larger-than-1GB tuple.
>
> The hole in that approach, to my mind, is that there are too many places
> that assume that they can serialize an ExpandedObject into part of an
> in-memory tuple, which might well never be written to disk, or at least
> not written to disk in a table. (It might be intended to go into a sort
> or hash join, for instance.) This design can't really work for that case,
> and unfortunately I think it will be somewhere between hard and impossible
> to remove all the places where that assumption is made.
>
Regardless of the ExpandedObject, does the flattened format need to contain fully flattened data chunks? If a data type internally contains multiple toast pointers, like an array of them, its flattened image is likely small enough to store using the existing varlena mechanism. One problem is that VARSIZE() would never tell us the exact total length of the data when it references multiple GB-scale chunks.

> At a higher level, I don't understand exactly where such giant
> ExpandedObjects would come from.
> (As you point out, there's certainly
> no easy way for a client to ship over the data for one.) So this feels
> like a very small part of a useful solution, if indeed it's part of a
> useful solution at all, which is not obvious.
>
I expect an aggregate function that consumes millions of rows as the source of a large matrix over 1GB. Once it is formed into a variable, it is easy to deliver as an argument of PL functions.

> FWIW, ExpandedObjects themselves are far from a fully fleshed out
> concept, one of the main problems being that they don't have very long
> lifespans except in the case that they're the value of a plpgsql
> variable. I think we would need to move things along quite a bit in
> that area before it would get to be useful to think in terms of
> ExpandedObjects containing multiple GB of data. Otherwise, the
> inevitable flattenings and re-expansions are just going to kill you.
>
> Likewise, the need for clients to be able to transfer data in chunks
> gets pressing well before you get to 1GB. So there's a lot here that
> really should be worked on before we try to surmount that barrier.
>
You are pointing out the problem around the client<->server protocol, aren't you? We will likely need this enhancement eventually; I agree.

Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>
On 8 December 2016 at 07:36, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Likewise, the need for clients to be able to transfer data in chunks
> gets pressing well before you get to 1GB. So there's a lot here that
> really should be worked on before we try to surmount that barrier.

Yeah. I tend to agree with Tom here. Allowing >1GB varlena-like objects, when we can barely cope with our existing ones in dump/restore, in clients, etc, doesn't strike me as quite the right direction to go in.

I understand it solves a specific, niche case you're dealing with when exchanging big blobs of data with a GPGPU. But since the client doesn't actually see that large blob (it's split up into objects that will work on the current protocol and interfaces), why is it necessary to have instances of a single data type with >1GB values, rather than take a TOAST-like / pg_largeobject-like approach and split it up for storage?

I'm concerned that this adds a special-case format that will create maintenance burden and pain down the track, and it won't help with the pain points users face, like errors dumping/restoring rows with big varlena objects, problems efficiently exchanging them on the wire protocol, etc.
-- 
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 8 December 2016 at 12:01, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
>> At a higher level, I don't understand exactly where such giant
>> ExpandedObjects would come from. (As you point out, there's certainly
>> no easy way for a client to ship over the data for one.) So this feels
>> like a very small part of a useful solution, if indeed it's part of a
>> useful solution at all, which is not obvious.
>>
> I expect an aggregate function that consumes millions of rows as source
> of a large matrix larger than 1GB. Once it is formed to a variable, it is
> easy to deliver as an argument of PL functions.

You might be interested in how Java has historically dealt with similar issues.

For a long time the JVM had quite low limits on the maximum amount of RAM it could manage, in the single gigabytes for a long time. Even for the 64-bit JVM. Once those limitations were lifted, the garbage collector algorithm placed a low practical limit on how much RAM it could cope with effectively.

If you were doing scientific computing with Java, lots of big image/video work, using GPGPUs, doing large scale caching, etc, this rapidly became a major pain point. So people introduced external memory mappings to Java, where objects could reference and manage memory outside the main JVM heap. The most well known is probably BigMemory (https://www.terracotta.org/products/bigmemory), but there are many others. They exposed this via small opaque handle objects that you used to interact with the external memory store via library functions.

It might make a lot of sense to apply the same principle to PostgreSQL, since it's much less intrusive than true 64-bit VARLENA. Rather than extending all of PostgreSQL to handle special-case split-up VARLENA extended objects, have your interim representation be a simple opaque value that points to externally mapped memory.
Your operators for the type, etc, know how to work with it. You probably don't need a full suite of normal operators; you'll be interacting with the data in a limited set of ways.

The main issue would presumably be one of resource management, since we currently assume we can just copy a Datum around without telling anybody about it or doing any special management. You'd need to know when to clobber your external segment, when to copy(!) it if necessary, etc. This probably makes sense for working with GPGPUs anyway, since they like dealing with big contiguous chunks of memory (or used to, may have improved?).

It sounds like only code specifically intended to work with the oversized type should be doing much with it except passing it around as an opaque handle, right?

Do you need to serialize this type to/from disk at all? Or just exchange it in chunks with a client? If you do need to, can you possibly do TOAST-like or pg_largeobject-like storage where you split it up for on-disk storage, then reassemble for use?
-- 
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
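A standalone sketch of the opaque-handle scheme Craig describes might look like this. All names (ExtHandle, ext_register, ext_lookup) are invented for illustration, and real code would additionally need locking, cross-process-safe storage, and release on memory context cleanup.

```c
#include <stddef.h>
#include <stdlib.h>

/* A small fixed-size handle is all that travels inside tuples; the
 * actual multi-GB buffer lives outside and is found via a registry. */
typedef struct
{
    unsigned int segment_id;    /* key into the external registry */
} ExtHandle;

typedef struct
{
    void    *base;              /* externally allocated memory */
    size_t   size;              /* may exceed 1GB */
    int      refcount;          /* naive resource management */
} ExtSegment;

#define MAX_SEGMENTS 16
static ExtSegment registry[MAX_SEGMENTS];

/* register an external buffer, returning the handle to embed in a datum */
ExtHandle
ext_register(void *base, size_t size)
{
    for (unsigned int i = 0; i < MAX_SEGMENTS; i++)
    {
        if (registry[i].base == NULL)
        {
            registry[i].base = base;
            registry[i].size = size;
            registry[i].refcount = 1;
            return (ExtHandle) { i };
        }
    }
    abort();                    /* registry full; real code would grow it */
}

/* resolve a handle back to its segment */
ExtSegment *
ext_lookup(ExtHandle h)
{
    return &registry[h.segment_id];
}
```

Because the handle itself is a small fixed-size value, copying a Datum stays cheap; the hard questions KaiGai raises below (when to drop the segment, what happens in another process) live entirely in the registry's lifecycle, which this sketch deliberately leaves naive.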
2016-12-08 16:11 GMT+09:00 Craig Ringer <craig@2ndquadrant.com>:
> On 8 December 2016 at 12:01, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
>
>>> At a higher level, I don't understand exactly where such giant
>>> ExpandedObjects would come from. (As you point out, there's certainly
>>> no easy way for a client to ship over the data for one.) So this feels
>>> like a very small part of a useful solution, if indeed it's part of a
>>> useful solution at all, which is not obvious.
>>>
>> I expect an aggregate function that consumes millions of rows as source
>> of a large matrix larger than 1GB. Once it is formed to a variable, it is
>> easy to deliver as an argument of PL functions.
>
> You might be interested in how Java has historically dealt with similar issues.
>
> For a long time the JVM had quite low limits on the maximum amount of
> RAM it could manage, in the single gigabytes for a long time. Even for
> the 64-bit JVM. Once those limitations were lifted, the garbage
> collector algorithm placed a low practical limit on how much RAM it
> could cope with effectively.
>
> If you were doing scientific computing with Java, lots of big
> image/video work, using GPGPUs, doing large scale caching, etc, this
> rapidly became a major pain point. So people introduced external
> memory mappings to Java, where objects could reference and manage
> memory outside the main JVM heap. The most well known is probably
> BigMemory (https://www.terracotta.org/products/bigmemory), but there
> are many others. They exposed this via small opaque handle objects
> that you used to interact with the external memory store via library
> functions.
>
> It might make a lot of sense to apply the same principle to
> PostgreSQL, since it's much less intrusive than true 64-bit VARLENA.
> Rather than extending all of PostgreSQL to handle special-case
> split-up VARLENA extended objects, have your interim representation be
> a simple opaque value that points to externally mapped memory.
> Your
> operators for the type, etc, know how to work with it. You probably
> don't need a full suite of normal operators, you'll be interacting
> with the data in a limited set of ways.
>
> The main issue would presumably be one of resource management, since
> we currently assume we can just copy a Datum around without telling
> anybody about it or doing any special management. You'd need to know
> when to clobber your external segment, when to copy(!) it if
> necessary, etc. This probably makes sense for working with GPGPUs
> anyway, since they like dealing with big contiguous chunks of memory
> (or used to, may have improved?).
>
> It sounds like only code specifically intended to work with the
> oversized type should be doing much with it except passing it around
> as an opaque handle, right?
>
Thanks for your suggestion. Its characteristics look like a large object, but volatile (because there is no backing storage). As you mentioned, resource management is the core of the issues. We have no reference count mechanism, so it is uncertain when we should release an orphaned memory chunk; otherwise, we have to accept the memory consumption until clean-up of the relevant memory context. Moreover, we have to pay attention to the scenario where this opaque identifier is delivered to a background worker, because it then needs to be valid in the other process's context. (Or should we prohibit exchanging it at the planner stage?)

> Do you need to serialize this type to/from disk at all? Or just
> exchange it in chunks with a client? If you do need to, can you
> possibly do TOAST-like or pg_largeobject-like storage where you split
> it up for on disk storage then reassemble for use?
>
Even though I don't expect this big chunk to be entirely exported to/imported from the client at once, it makes sense to save/load it to/from disk, because constructing a big in-memory structure consumes far more CPU cycles than a simple memory copy from buffers.
So, in other words, it does not need valid typinput/typoutput handlers, but serialization/deserialization for disk I/O would be helpful.

Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>
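The save/load argument above can be illustrated by a trivial standalone sketch: a serialized image is written and read back as one bulk transfer, rather than being reconstructed element by element. The file name and helper names are invented for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical illustration of why a raw serialized image is attractive:
 * restoring a pre-built block is a single bulk read, instead of rebuilding
 * the structure row by row from the source relation. */
int
chunk_save(const char *path, const void *buf, size_t len)
{
    FILE   *fp = fopen(path, "wb");
    size_t  n;

    if (fp == NULL)
        return -1;
    n = fwrite(buf, 1, len, fp);
    fclose(fp);
    return (n == len) ? 0 : -1;
}

int
chunk_load(const char *path, void *buf, size_t len)
{
    FILE   *fp = fopen(path, "rb");
    size_t  n;

    if (fp == NULL)
        return -1;
    n = fread(buf, 1, len, fp);
    fclose(fp);
    return (n == len) ? 0 : -1;
}
```

In the real system the "file" would presumably be toast or pg_largeobject-like storage, but the cost asymmetry (bulk copy vs. reconstruction) is the same.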
On Wed, Dec 7, 2016 at 10:44 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
>> Handling objects >1GB at all seems like the harder part of the
>> problem.
>>
> I could get your point almost. Does the last line above mention about
> amount of the data object >1GB? even if the "super-varlena" format
> allows 64bit length?

Sorry, I can't understand your question about what I wrote.
-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Dec 7, 2016 at 11:01 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
> Regardless of the ExpandedObject, does the flatten format need to
> contain fully flatten data chunks?

I suspect it does, and I think that's why this isn't going to get very far without a super-varlena format.
-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
2016-12-23 8:23 GMT+09:00 Robert Haas <robertmhaas@gmail.com>:
> On Wed, Dec 7, 2016 at 10:44 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
>>> Handling objects >1GB at all seems like the harder part of the
>>> problem.
>>>
>> I could get your point almost. Does the last line above mention about
>> amount of the data object >1GB? even if the "super-varlena" format
>> allows 64bit length?
>
> Sorry, I can't understand your question about what I wrote.
>
I thought you were just pointing out that handling a large amount of data is always the harder part, even if the data format allows >1GB. Right?
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>
2016-12-23 8:24 GMT+09:00 Robert Haas <robertmhaas@gmail.com>:
> On Wed, Dec 7, 2016 at 11:01 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
>> Regardless of the ExpandedObject, does the flatten format need to
>> contain fully flatten data chunks?
>
> I suspect it does, and I think that's why this isn't going to get very
> far without a super-varlena format.
>
Yep, I'm now investigating how to implement the typlen == -3 approach. It will likely be the most straightforward infrastructure for other potential use cases beyond matrix/vector.

Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>
On Thu, Dec 22, 2016 at 8:44 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
> 2016-12-23 8:23 GMT+09:00 Robert Haas <robertmhaas@gmail.com>:
>> On Wed, Dec 7, 2016 at 10:44 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
>>>> Handling objects >1GB at all seems like the harder part of the
>>>> problem.
>>>>
>>> I could get your point almost. Does the last line above mention about
>>> amount of the data object >1GB? even if the "super-varlena" format
>>> allows 64bit length?
>>
>> Sorry, I can't understand your question about what I wrote.
>>
> I thought you just pointed out it is always harder part to treat large
> amount of data even if data format allows >1GB or more. Right?

I *think* we are agreeing. But I'm still not 100% sure.
-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company