Re: jsonb format is pessimal for toast compression - Mailing list pgsql-hackers

From Tom Lane
Subject Re: jsonb format is pessimal for toast compression
Msg-id 26064.1408144740@sss.pgh.pa.us
In response to Re: jsonb format is pessimal for toast compression  (Arthur Silva <arthurprs@gmail.com>)
Responses Re: jsonb format is pessimal for toast compression
Re: jsonb format is pessimal for toast compression
List pgsql-hackers
Arthur Silva <arthurprs@gmail.com> writes:
> We should add some sort of versioning to the jsonb format. This can be
> explored in the future in many ways.

If we end up making an incompatible change to the jsonb format, I would
support taking the opportunity to stick a version ID in there.  But
I don't want to force a dump/reload cycle *only* to do that.

> As for the current problem, we should explore the directory at the end
> option. It should improve compression and keep good access performance.

Meh.  Pushing the directory to the end is just a band-aid, and since it
would still force a dump/reload, it's not a very enticing band-aid.
The only thing it'd really fix is the first_success_by issue, which
we could fix *without* a dump/reload by using different compression
parameters for jsonb.  Moving the directory to the end, by itself,
does nothing to fix the problem that the directory contents aren't
compressible --- and we now have pretty clear evidence that that is a
significant issue.  (See for instance Josh's results that increasing
first_success_by did very little for the size of his dataset.)

I think the realistic alternatives at this point are either to
switch to all-lengths as in my test patch, or to use the hybrid approach
of Heikki's test patch.  IMO the major attraction of Heikki's patch
is that it'd be upward compatible with existing beta installations,
i.e., no initdb required (but thus, no opportunity to squeeze in a version
identifier either).  It's not showing up terribly well in the performance
tests I've been doing --- it's about halfway between HEAD and my patch on
that extract-a-key-from-a-PLAIN-stored-column test.  But, just as with my
patch, there are things that could be done to micro-optimize it by
touching a bit more code.

I did some quick stats comparing compressed sizes for the delicio.us
data, printing quartiles as per Josh's lead:

all-lengths       {440,569,609,655,1257}
Heikki's patch    {456,582,624,671,1274}
HEAD              {493,636,684,744,1485}

(As before, this is pg_column_size of the jsonb within a table whose rows
are wide enough to force tuptoaster.c to try to compress the jsonb;
otherwise many of these values wouldn't get compressed.)  These documents
don't have enough keys to trigger the first_success_by issue, so that
HEAD doesn't look too awful, but still there's about an 11% gain from
switching from offsets to lengths.  Heikki's method captures much of
that but not all.

Personally I'd prefer to go to the all-lengths approach, but a large
part of that comes from a subjective assessment that the hybrid approach
is too messy.  Others might well disagree.

In case anyone else wants to do measurements on some more data sets,
attached is a copy of Heikki's patch updated to apply against git tip.

            regards, tom lane

diff --git a/src/backend/utils/adt/jsonb_util.c b/src/backend/utils/adt/jsonb_util.c
index 04f35bf..47b2998 100644
*** a/src/backend/utils/adt/jsonb_util.c
--- b/src/backend/utils/adt/jsonb_util.c
*************** convertJsonbArray(StringInfo buffer, JEn
*** 1378,1385 ****
                       errmsg("total size of jsonb array elements exceeds the maximum of %u bytes",
                              JENTRY_POSMASK)));

!         if (i > 0)
              meta = (meta & ~JENTRY_POSMASK) | totallen;
          copyToBuffer(buffer, metaoffset, (char *) &meta, sizeof(JEntry));
          metaoffset += sizeof(JEntry);
      }
--- 1378,1387 ----
                       errmsg("total size of jsonb array elements exceeds the maximum of %u bytes",
                              JENTRY_POSMASK)));

!         if (i % JBE_STORE_LEN_STRIDE == 0)
              meta = (meta & ~JENTRY_POSMASK) | totallen;
+         else
+             meta |= JENTRY_HAS_LEN;
          copyToBuffer(buffer, metaoffset, (char *) &meta, sizeof(JEntry));
          metaoffset += sizeof(JEntry);
      }
*************** convertJsonbObject(StringInfo buffer, JE
*** 1430,1440 ****
                       errmsg("total size of jsonb array elements exceeds the maximum of %u bytes",
                              JENTRY_POSMASK)));

!         if (i > 0)
              meta = (meta & ~JENTRY_POSMASK) | totallen;
          copyToBuffer(buffer, metaoffset, (char *) &meta, sizeof(JEntry));
          metaoffset += sizeof(JEntry);

          convertJsonbValue(buffer, &meta, &pair->value, level);
          len = meta & JENTRY_POSMASK;
          totallen += len;
--- 1432,1445 ----
                       errmsg("total size of jsonb array elements exceeds the maximum of %u bytes",
                              JENTRY_POSMASK)));

!         if (i % JBE_STORE_LEN_STRIDE == 0)
              meta = (meta & ~JENTRY_POSMASK) | totallen;
+         else
+             meta |= JENTRY_HAS_LEN;
          copyToBuffer(buffer, metaoffset, (char *) &meta, sizeof(JEntry));
          metaoffset += sizeof(JEntry);

+         /* put value */
          convertJsonbValue(buffer, &meta, &pair->value, level);
          len = meta & JENTRY_POSMASK;
          totallen += len;
*************** convertJsonbObject(StringInfo buffer, JE
*** 1445,1451 ****
                       errmsg("total size of jsonb array elements exceeds the maximum of %u bytes",
                              JENTRY_POSMASK)));

!         meta = (meta & ~JENTRY_POSMASK) | totallen;
          copyToBuffer(buffer, metaoffset, (char *) &meta, sizeof(JEntry));
          metaoffset += sizeof(JEntry);
      }
--- 1450,1456 ----
                       errmsg("total size of jsonb array elements exceeds the maximum of %u bytes",
                              JENTRY_POSMASK)));

!         meta |= JENTRY_HAS_LEN;
          copyToBuffer(buffer, metaoffset, (char *) &meta, sizeof(JEntry));
          metaoffset += sizeof(JEntry);
      }
*************** uniqueifyJsonbObject(JsonbValue *object)
*** 1592,1594 ****
--- 1597,1635 ----
          object->val.object.nPairs = res + 1 - object->val.object.pairs;
      }
  }
+
+ uint32
+ jsonb_get_offset(const JEntry *ja, int index)
+ {
+     uint32        off = 0;
+     int            i;
+
+     /*
+      * Each absolute entry contains the *end* offset. Start offset of this
+      * entry is equal to the end offset of the previous entry.
+      */
+     for (i = index - 1; i >= 0; i--)
+     {
+         off += JBE_POSFLD(ja[i]);
+         if (!JBE_HAS_LEN(ja[i]))
+             break;
+     }
+     return off;
+ }
+
+ uint32
+ jsonb_get_length(const JEntry *ja, int index)
+ {
+     uint32        off;
+     uint32        len;
+
+     if (JBE_HAS_LEN(ja[index]))
+         len = JBE_POSFLD(ja[index]);
+     else
+     {
+         off = jsonb_get_offset(ja, index);
+         len = JBE_POSFLD(ja[index]) - off;
+     }
+
+     return len;
+ }
diff --git a/src/include/utils/jsonb.h b/src/include/utils/jsonb.h
index 91e3e14..10a07bb 100644
*** a/src/include/utils/jsonb.h
--- b/src/include/utils/jsonb.h
*************** typedef struct JsonbValue JsonbValue;
*** 102,112 ****
   * to JB_FSCALAR | JB_FARRAY.
   *
   * To encode the length and offset of the variable-length portion of each
!  * node in a compact way, the JEntry stores only the end offset within the
!  * variable-length portion of the container node. For the first JEntry in the
!  * container's JEntry array, that equals to the length of the node data.  The
!  * begin offset and length of the rest of the entries can be calculated using
!  * the end offset of the previous JEntry in the array.
   *
   * Overall, the Jsonb struct requires 4-bytes alignment. Within the struct,
   * the variable-length portion of some node types is aligned to a 4-byte
--- 102,113 ----
   * to JB_FSCALAR | JB_FARRAY.
   *
   * To encode the length and offset of the variable-length portion of each
!  * node in a compact way, the JEntry stores either the length of the element,
!  * or its end offset within the variable-length portion of the container node.
!  * Entries that store a length are marked with the JENTRY_HAS_LEN flag, other
!  * entries store an end offset. The begin offset and length of each entry
!  * can be calculated by scanning backwards to the previous entry storing an
!  * end offset, and adding up the lengths of the elements in between.
   *
   * Overall, the Jsonb struct requires 4-bytes alignment. Within the struct,
   * the variable-length portion of some node types is aligned to a 4-byte
*************** typedef struct JsonbValue JsonbValue;
*** 120,134 ****
  /*
   * Jentry format.
   *
!  * The least significant 28 bits store the end offset of the entry (see
!  * JBE_ENDPOS, JBE_OFF, JBE_LEN macros below). The next three bits
!  * are used to store the type of the entry. The most significant bit
!  * is unused, and should be set to zero.
   */
  typedef uint32 JEntry;

  #define JENTRY_POSMASK            0x0FFFFFFF
  #define JENTRY_TYPEMASK            0x70000000

  /* values stored in the type bits */
  #define JENTRY_ISSTRING            0x00000000
--- 121,136 ----
  /*
   * Jentry format.
   *
!  * The least significant 28 bits store the end offset or the length of the
!  * entry, depending on whether the JENTRY_HAS_LEN flag is set (see
!  * JBE_POSFLD, JBE_OFF, JBE_LEN macros below). The next three bits store
!  * the type of the entry; the most significant bit is JENTRY_HAS_LEN.
   */
  typedef uint32 JEntry;

  #define JENTRY_POSMASK            0x0FFFFFFF
  #define JENTRY_TYPEMASK            0x70000000
+ #define JENTRY_HAS_LEN            0x80000000

  /* values stored in the type bits */
  #define JENTRY_ISSTRING            0x00000000
*************** typedef uint32 JEntry;
*** 146,160 ****
  #define JBE_ISBOOL_TRUE(je_)    (((je_) & JENTRY_TYPEMASK) == JENTRY_ISBOOL_TRUE)
  #define JBE_ISBOOL_FALSE(je_)    (((je_) & JENTRY_TYPEMASK) == JENTRY_ISBOOL_FALSE)
  #define JBE_ISBOOL(je_)            (JBE_ISBOOL_TRUE(je_) || JBE_ISBOOL_FALSE(je_))

  /*
!  * Macros for getting the offset and length of an element. Note multiple
!  * evaluations and access to prior array element.
   */
! #define JBE_ENDPOS(je_)            ((je_) & JENTRY_POSMASK)
! #define JBE_OFF(ja, i)            ((i) == 0 ? 0 : JBE_ENDPOS((ja)[i - 1]))
! #define JBE_LEN(ja, i)            ((i) == 0 ? JBE_ENDPOS((ja)[i]) \
!                                  : JBE_ENDPOS((ja)[i]) - JBE_ENDPOS((ja)[i - 1]))

  /*
   * A jsonb array or object node, within a Jsonb Datum.
--- 148,170 ----
  #define JBE_ISBOOL_TRUE(je_)    (((je_) & JENTRY_TYPEMASK) == JENTRY_ISBOOL_TRUE)
  #define JBE_ISBOOL_FALSE(je_)    (((je_) & JENTRY_TYPEMASK) == JENTRY_ISBOOL_FALSE)
  #define JBE_ISBOOL(je_)            (JBE_ISBOOL_TRUE(je_) || JBE_ISBOOL_FALSE(je_))
+ #define JBE_HAS_LEN(je_)        (((je_) & JENTRY_HAS_LEN) != 0)

  /*
!  * Macros for getting the offset and length of an element.
   */
! #define JBE_POSFLD(je_)            ((je_) & JENTRY_POSMASK)
! #define JBE_OFF(ja, i)            jsonb_get_offset(ja, i)
! #define JBE_LEN(ja, i)            jsonb_get_length(ja, i)
!
! /*
!  * Store an absolute end offset every JBE_STORE_LEN_STRIDE elements (for an
!  * array) or key/value pairs (for an object). Others are stored as lengths.
!  */
! #define JBE_STORE_LEN_STRIDE    8
!
! extern uint32 jsonb_get_offset(const JEntry *ja, int index);
! extern uint32 jsonb_get_length(const JEntry *ja, int index);

  /*
   * A jsonb array or object node, within a Jsonb Datum.
