Re: VACUUM FULL versus TOAST - Mailing list pgsql-hackers

From Tom Lane
Subject Re: VACUUM FULL versus TOAST
Date
Msg-id 6954.1313364157@sss.pgh.pa.us
Whole thread Raw
In response to VACUUM FULL versus TOAST  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I wrote:
> I am thinking that the most reasonable solution is instead to fix VACUUM
> FULL/CLUSTER so that they don't change existing toast item OIDs when
> vacuuming a system catalog.  They already do some pretty ugly things to
> avoid changing the toast table's OID in this case, and locking down the
> item OIDs too doesn't seem that much harder.  (Though I've not actually
> looked at the code yet...)

Attached is a proposed patch for this.

> The main potential drawback here is that if any varlena items that had
> not previously been toasted got toasted, they would require additional
> OIDs to be assigned, possibly leading to a duplicate-OID failure.  This
> should not happen unless somebody decides to play with the attstorage
> properties of a system catalog, and I don't feel too bad about a small
> possibility of VAC FULL failing after that.  (Note it should eventually
> succeed if you keep trying, since the generated OIDs would keep
> changing.)

I realized that there is an easy fix for that: since tuptoaster.c
already knows what the old toast table OID is, it can just look into
that table to see if each proposed new OID is already in use, and
iterate till it gets a non-conflicting OID.  This may seem kind of
inefficient, but since it's such a corner case, I don't think the code
path will get hit often enough to matter.

Comments?

            regards, tom lane

diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c
index 4f4dd69291fd50008f8e313176a02cd5bc955e08..785c679879012508b4925b4f8ce93e1c712f7ec4 100644
*** a/src/backend/access/heap/tuptoaster.c
--- b/src/backend/access/heap/tuptoaster.c
*************** do { \
*** 74,80 ****


  static void toast_delete_datum(Relation rel, Datum value);
! static Datum toast_save_datum(Relation rel, Datum value, int options);
  static struct varlena *toast_fetch_datum(struct varlena * attr);
  static struct varlena *toast_fetch_datum_slice(struct varlena * attr,
                          int32 sliceoffset, int32 length);
--- 74,82 ----


  static void toast_delete_datum(Relation rel, Datum value);
! static Datum toast_save_datum(Relation rel, Datum value,
!                  struct varlena *oldexternal, int options);
! static bool toast_valueid_exists(Oid toastrelid, Oid valueid);
  static struct varlena *toast_fetch_datum(struct varlena * attr);
  static struct varlena *toast_fetch_datum_slice(struct varlena * attr,
                          int32 sliceoffset, int32 length);
*************** toast_insert_or_update(Relation rel, Hea
*** 431,436 ****
--- 433,439 ----
      bool        toast_oldisnull[MaxHeapAttributeNumber];
      Datum        toast_values[MaxHeapAttributeNumber];
      Datum        toast_oldvalues[MaxHeapAttributeNumber];
+     struct varlena *toast_oldexternal[MaxHeapAttributeNumber];
      int32        toast_sizes[MaxHeapAttributeNumber];
      bool        toast_free[MaxHeapAttributeNumber];
      bool        toast_delold[MaxHeapAttributeNumber];
*************** toast_insert_or_update(Relation rel, Hea
*** 466,471 ****
--- 469,475 ----
       * ----------
       */
      memset(toast_action, ' ', numAttrs * sizeof(char));
+     memset(toast_oldexternal, 0, numAttrs * sizeof(struct varlena *));
      memset(toast_free, 0, numAttrs * sizeof(bool));
      memset(toast_delold, 0, numAttrs * sizeof(bool));

*************** toast_insert_or_update(Relation rel, Hea
*** 550,555 ****
--- 554,560 ----
               */
              if (VARATT_IS_EXTERNAL(new_value))
              {
+                 toast_oldexternal[i] = new_value;
                  if (att[i]->attstorage == 'p')
                      new_value = heap_tuple_untoast_attr(new_value);
                  else
*************** toast_insert_or_update(Relation rel, Hea
*** 676,682 ****
          {
              old_value = toast_values[i];
              toast_action[i] = 'p';
!             toast_values[i] = toast_save_datum(rel, toast_values[i], options);
              if (toast_free[i])
                  pfree(DatumGetPointer(old_value));
              toast_free[i] = true;
--- 681,688 ----
          {
              old_value = toast_values[i];
              toast_action[i] = 'p';
!             toast_values[i] = toast_save_datum(rel, toast_values[i],
!                                                toast_oldexternal[i], options);
              if (toast_free[i])
                  pfree(DatumGetPointer(old_value));
              toast_free[i] = true;
*************** toast_insert_or_update(Relation rel, Hea
*** 726,732 ****
          i = biggest_attno;
          old_value = toast_values[i];
          toast_action[i] = 'p';
!         toast_values[i] = toast_save_datum(rel, toast_values[i], options);
          if (toast_free[i])
              pfree(DatumGetPointer(old_value));
          toast_free[i] = true;
--- 732,739 ----
          i = biggest_attno;
          old_value = toast_values[i];
          toast_action[i] = 'p';
!         toast_values[i] = toast_save_datum(rel, toast_values[i],
!                                            toast_oldexternal[i], options);
          if (toast_free[i])
              pfree(DatumGetPointer(old_value));
          toast_free[i] = true;
*************** toast_insert_or_update(Relation rel, Hea
*** 839,845 ****
          i = biggest_attno;
          old_value = toast_values[i];
          toast_action[i] = 'p';
!         toast_values[i] = toast_save_datum(rel, toast_values[i], options);
          if (toast_free[i])
              pfree(DatumGetPointer(old_value));
          toast_free[i] = true;
--- 846,853 ----
          i = biggest_attno;
          old_value = toast_values[i];
          toast_action[i] = 'p';
!         toast_values[i] = toast_save_datum(rel, toast_values[i],
!                                            toast_oldexternal[i], options);
          if (toast_free[i])
              pfree(DatumGetPointer(old_value));
          toast_free[i] = true;
*************** toast_compress_datum(Datum value)
*** 1117,1126 ****
   *
   *    Save one single datum into the secondary relation and return
   *    a Datum reference for it.
   * ----------
   */
  static Datum
! toast_save_datum(Relation rel, Datum value, int options)
  {
      Relation    toastrel;
      Relation    toastidx;
--- 1125,1140 ----
   *
   *    Save one single datum into the secondary relation and return
   *    a Datum reference for it.
+  *
+  * rel: the main relation we're working with (not the toast rel!)
+  * value: datum to be pushed to toast storage
+  * oldexternal: if not NULL, toast pointer previously representing the datum
+  * options: options to be passed to heap_insert() for toast rows
   * ----------
   */
  static Datum
! toast_save_datum(Relation rel, Datum value,
!                  struct varlena *oldexternal, int options)
  {
      Relation    toastrel;
      Relation    toastidx;
*************** toast_save_datum(Relation rel, Datum val
*** 1199,1209 ****
          toast_pointer.va_toastrelid = RelationGetRelid(toastrel);

      /*
!      * Choose an unused OID within the toast table for this toast value.
       */
!     toast_pointer.va_valueid = GetNewOidWithIndex(toastrel,
!                                                   RelationGetRelid(toastidx),
!                                                   (AttrNumber) 1);

      /*
       * Initialize constant parts of the tuple data
--- 1213,1267 ----
          toast_pointer.va_toastrelid = RelationGetRelid(toastrel);

      /*
!      * Choose an OID to use as the value ID for this toast value.
!      *
!      * Normally we just choose an unused OID within the toast table.  But
!      * during table-rewriting operations where we are preserving an existing
!      * toast table OID, we want to preserve toast value OIDs too.  So, if
!      * rd_toastoid is set and we had a prior external value from that same
!      * toast table, re-use its value ID.  If we didn't have a prior external
!      * value (which is a corner case, but possible if the table's attstorage
!      * options have been changed), we have to pick a value ID that doesn't
!      * conflict with either new or existing toast value OIDs.
       */
!     if (!OidIsValid(rel->rd_toastoid))
!     {
!         /* normal case: just choose an unused OID */
!         toast_pointer.va_valueid =
!             GetNewOidWithIndex(toastrel,
!                                RelationGetRelid(toastidx),
!                                (AttrNumber) 1);
!     }
!     else
!     {
!         /* rewrite case: check to see if value was in old toast table */
!         toast_pointer.va_valueid = InvalidOid;
!         if (oldexternal != NULL)
!         {
!             struct varatt_external old_toast_pointer;
!
!             Assert(VARATT_IS_EXTERNAL(oldexternal));
!             /* Must copy to access aligned fields */
!             VARATT_EXTERNAL_GET_POINTER(old_toast_pointer, oldexternal);
!             if (old_toast_pointer.va_toastrelid == rel->rd_toastoid)
!                 toast_pointer.va_valueid = old_toast_pointer.va_valueid;
!         }
!         if (toast_pointer.va_valueid == InvalidOid)
!         {
!            /*
!             * new value; must choose an OID that doesn't conflict in either
!             * old or new toast table
!             */
!             do
!             {
!                 toast_pointer.va_valueid =
!                     GetNewOidWithIndex(toastrel,
!                                        RelationGetRelid(toastidx),
!                                        (AttrNumber) 1);
!             } while (toast_valueid_exists(rel->rd_toastoid,
!                                           toast_pointer.va_valueid));
!         }
!     }

      /*
       * Initialize constant parts of the tuple data
*************** toast_delete_datum(Relation rel, Datum v
*** 1339,1344 ****
--- 1397,1448 ----


  /* ----------
+  * toast_valueid_exists -
+  *
+  *    Test whether a toast value with the given ID exists in the toast relation
+  * ----------
+  */
+ static bool
+ toast_valueid_exists(Oid toastrelid, Oid valueid)
+ {
+     bool        result = false;
+     Relation    toastrel;
+     ScanKeyData toastkey;
+     SysScanDesc toastscan;
+
+     /*
+      * Open the toast relation
+      */
+     toastrel = heap_open(toastrelid, AccessShareLock);
+
+     /*
+      * Setup a scan key to find chunks with matching va_valueid
+      */
+     ScanKeyInit(&toastkey,
+                 (AttrNumber) 1,
+                 BTEqualStrategyNumber, F_OIDEQ,
+                 ObjectIdGetDatum(valueid));
+
+     /*
+      * Is there any such chunk?
+      */
+     toastscan = systable_beginscan(toastrel, toastrel->rd_rel->reltoastidxid,
+                                    true, SnapshotToast, 1, &toastkey);
+
+     if (systable_getnext(toastscan) != NULL)
+         result = true;
+
+     /*
+      * End scan and close relations
+      */
+     systable_endscan(toastscan);
+     heap_close(toastrel, AccessShareLock);
+
+     return result;
+ }
+
+
+ /* ----------
   * toast_fetch_datum -
   *
   *    Reconstruct an in memory Datum from the chunks saved
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 9a7649bb4f95581e0169f0a341a9d8d0b1287db3..670d29ea83192a82297100a16b59e7a48303f1fd 100644
*** a/src/backend/commands/cluster.c
--- b/src/backend/commands/cluster.c
*************** copy_heap_data(Oid OIDNewHeap, Oid OIDOl
*** 797,802 ****
--- 797,806 ----
           * When doing swap by content, any toast pointers written into NewHeap
           * must use the old toast table's OID, because that's where the toast
           * data will eventually be found.  Set this up by setting rd_toastoid.
+          * This also tells tuptoaster.c to preserve the toast value OIDs,
+          * which we want so as not to invalidate toast pointers in system
+          * catalog caches.
+          *
           * Note that we must hold NewHeap open until we are done writing data,
           * since the relcache will not guarantee to remember this setting once
           * the relation is closed.    Also, this technique depends on the fact

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: VACUUM FULL versus unsafe order-of-operations in DDL commands
Next
From: Greg Stark
Date:
Subject: Re: VACUUM FULL versus TOAST