Thread: Document and/or remove unreachable code in tuptoaster.c from varvarlena patch

Testers here were having a hard time constructing test cases to reach some
lines touched by the varvarlena patch. Upon further investigation I'm
convinced they're unreachable.

Some were added when I did packed varlena -- I've removed those. These lines
were actually necessary earlier but when we made attstorage='p' exempt columns
from SHORT treatment that made these lines unreachable. I think they impede
the readability so it's best to just remove them.

The other lines I think are useful if only to make the code self-documenting.
But they're unreachable due to the way the loop works and the way it tracks
toast_action so I documented that fact for the next person using gcov.

I also couldn't find a way to trigger the pfree() lines but I'm still looking
at that. Even when we're retoasting previously detoasted datums from an
updated row we still don't call pfree so something's a bit weird here.

Lastly, we currently never compress anything below 256 bytes so we never
compress SHORT varlenas. But trying to compress a datum only to find it's
uncompressible is a fairly expensive operation. Aside from doing a palloc we
also end up having to do a whole iteration of this loop including
recalculating the size of the tuple and looking for the next largest datum.

I would suggest we should decrease the minimum size to 32 instead of the
current 256 which will mean we could compress SHORT data. But as long as we
don't do that we should have a check like below which will avoid trying to.
(We might still have a check against VARSIZE in toast_compress_datum to avoid
the palloc.)


Index: tuptoaster.c
===================================================================
RCS file: /home/stark/src/REPOSITORY/pgsql/src/backend/access/heap/tuptoaster.c,v
retrieving revision 1.74
diff -c -r1.74 tuptoaster.c
*** tuptoaster.c    6 Apr 2007 04:21:41 -0000    1.74
--- tuptoaster.c    27 Jul 2007 14:38:14 -0000
***************
*** 535,541 ****                 need_change = true;                 need_free = true;             }
!              /*              * Remember the size of this attribute              */
--- 535,548 ----                 need_change = true;                 need_free = true;             }
!             else if (VARATT_IS_SHORT(new_value) || VARSIZE(new_value) < 256)
!             {
!                 /* The default pg_lz_compress strategy doesn't compress things
!                  * under 256 bytes so skip iterations through the loop trying
!                  * to compress them */
!                 toast_action[i] = 'x';
!             }
!                          /*              * Remember the size of this attribute              */
***************
*** 590,596 ****             if (toast_action[i] != ' ')                 continue;             if
(VARATT_IS_EXTERNAL(toast_values[i]))
!                 continue;             if (VARATT_IS_COMPRESSED(toast_values[i]))                 continue;
if (att[i]->attstorage != 'x')
 
--- 597,604 ----             if (toast_action[i] != ' ')                 continue;             if
(VARATT_IS_EXTERNAL(toast_values[i]))
!                 /* Dead code due to toast_action */
!                 continue;              if (VARATT_IS_COMPRESSED(toast_values[i]))                 continue;
 if (att[i]->attstorage != 'x')
 
***************
*** 654,659 ****
--- 662,668 ----             if (toast_action[i] == 'p')                 continue;             if
(VARATT_IS_EXTERNAL(toast_values[i]))
+                 /* Dead code due to toast_action */                 continue;             if (att[i]->attstorage !=
'x'&& att[i]->attstorage != 'e')                 continue;
 
***************
*** 703,712 ****
--- 712,723 ----             if (toast_action[i] != ' ')                 continue;             if
(VARATT_IS_EXTERNAL(toast_values[i]))
+                 /* Dead code due to toast_action */                 continue;             if
(VARATT_IS_COMPRESSED(toast_values[i]))                continue;             if (att[i]->attstorage != 'm')
 
+                 /* Dead code (what else could attstorage be at this point?) */                 continue;
if(toast_sizes[i] > biggest_size)             {
 
***************
*** 766,771 ****
--- 777,783 ----             if (toast_action[i] == 'p')                 continue;             if
(VARATT_IS_EXTERNAL(toast_values[i]))
+                 /* Dead code due to toast_action */                 continue;             if (att[i]->attstorage !=
'm')                continue;
 
***************
*** 1331,1345 ****             chunksize = VARSIZE(chunk) - VARHDRSZ;             chunkdata = VARDATA(chunk);
}
-         else if (VARATT_IS_SHORT(chunk))
-         {
-             /* could happen due to heap_form_tuple doing its thing */
-             chunksize = VARSIZE_SHORT(chunk) - VARHDRSZ_SHORT;
-             chunkdata = VARDATA_SHORT(chunk);
-         }         else         {
!             /* should never happen */             elog(ERROR, "found toasted toast chunk");             chunksize =
0;               /* keep compiler quiet */             chunkdata = NULL;
 
--- 1343,1351 ----             chunksize = VARSIZE(chunk) - VARHDRSZ;             chunkdata = VARDATA(chunk);         }
       else         {
 
!             /* should never happen due to attstorage='p' for chunk_data in toast tables */             elog(ERROR,
"foundtoasted toast chunk");             chunksize = 0;                /* keep compiler quiet */             chunkdata
=NULL;
 
***************
*** 1533,1547 ****             chunksize = VARSIZE(chunk) - VARHDRSZ;             chunkdata = VARDATA(chunk);
}
-         else if (VARATT_IS_SHORT(chunk))
-         {
-             /* could happen due to heap_form_tuple doing its thing */
-             chunksize = VARSIZE_SHORT(chunk) - VARHDRSZ_SHORT;
-             chunkdata = VARDATA_SHORT(chunk);
-         }         else         {
!             /* should never happen */             elog(ERROR, "found toasted toast chunk");             chunksize =
0;               /* keep compiler quiet */             chunkdata = NULL;
 
--- 1539,1547 ----             chunksize = VARSIZE(chunk) - VARHDRSZ;             chunkdata = VARDATA(chunk);         }
       else         {
 
!             /* should never happen due to attstorage='p' for chunk_data in toast tables */             elog(ERROR,
"foundtoasted toast chunk");             chunksize = 0;                /* keep compiler quiet */             chunkdata
=NULL;
 


--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com



Sorry, meant to send the previous message to pgsql-patches.

Here's a version cut using cvs diff so it's usable with -p0

I added one more fixup. There was a silly test in toast_fetch_datum_slice()
which handled compressed datums. Returning a slice of a compressed datum is
nonsensical with toast since the resulting datum would be useless. I also
added an assertion in this function that the datum is external before we treat
it as a toast_pointer.

(Incidentally, I did eventually manage to construct a case to reach all the pfrees.)




--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

Attachment
Gregory Stark <stark@enterprisedb.com> writes:
> Testers here were having a hard time constructing test cases to reach some
> lines touched by the varvarlena patch. Upon further investigation I'm
> convinced they're unreachable.

I'm not really happy with any of this patch.  ISTM that the stuff you
say is unreachable is only so because of non-essential behavioral
choices made in other parts of the code.  If we were to change those
other parts later, this code (after patching) would break.  I'd rather
leave complete coverage here and not make fragile assumptions; especially
so since these are presumably not performance-critical paths.
        regards, tom lane


"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Gregory Stark <stark@enterprisedb.com> writes:
>> Testers here were having a hard time constructing test cases to reach some
>> lines touched by the varvarlena patch. Upon further investigation I'm
>> convinced they're unreachable.
>
> I'm not really happy with any of this patch. ISTM that the stuff you say is
> unreachable is only so because of non-essential behavioral choices made in
> other parts of the code. If we were to change those other parts later, this
> code (after patching) would break. I'd rather leave complete coverage here
> and not make fragile assumptions; especially so since these are presumably
> not performance-critical paths.

Ok, that's true for the branches handling packed toast chunks. I thought it
was better to throw an error rather than have silently accept something which
indicates something unexpected has happened. And it was just extra useless
code for readers to slog through. 

The others I just put comments on In case the next person was as confused as I
was trying to figure out the logic. They're not especially performance
critical even though they're in the loop because they only get hit when we're
considering fields the first time.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com