Re: Getting the length of varlength data using PG_DETOAST_DATUM_SLICE - Mailing list pgsql-hackers

From Mark Dilger
Subject Re: Getting the length of varlength data using PG_DETOAST_DATUM_SLICE
Msg-id 43EE7DB8.2000304@markdilger.com
In response to Re: Getting the length of varlength data using  (Jeremy Drake <pgsql@jdrake.com>)
List pgsql-hackers
Jeremy Drake wrote:
> It looks like pg_column_size gives you the actual size on disk, ie after
> compression.
> 
> What looks interesting for you would be byteaoctetlen or the function it
> wraps, toast_raw_datum_size.  See src/backend/access/heap/tuptoaster.c.
> pg_column_size calls toast_datum_size, while byteaoctetlen/textoctetlen
> calls toast_raw_datum_size.
> 
> 
> 
> On Sat, 11 Feb 2006, Bruce Momjian wrote:
> 
> 
>>Have you looked at the 8.1.X built-in function pg_column_size()?
>>
>>---------------------------------------------------------------------------
>>
>>Mark Dilger wrote:
>>
>>>Hello, could anyone tell me, for a user contributed variable length data type,
>>>how can you access the length of the data without pulling the entire thing from
>>>disk?  Is there a function or macro for this?
>>>
>>>As a first cut, I tried using the PG_DETOAST_DATUM_SLICE macro, but to no avail.
>>>Grepping through the release source for version 8.1.2, I find very little
>>>usage of the PG_GETARG_*_SLICE and PG_DETOAST_DATUM_SLICE macros (and hence
>>>little clue how they are intended to be used.)  The only files where I find them
>>>referenced are:
>>>
>>>    doc/src/sgml/xfunc.sgml
>>>    src/backend/utils/adt/varlena.c
>>>    src/include/fmgr.h
>>>
>>>
>>>I am writing a variable length data type and trying to optimize the disk usage
>>>in certain functions.  There are cases where the return value of the function
>>>can be determined from the length of the data and a prefix of the data without
>>>fetching the whole data from disk.  (The prefix alone is insufficient -- I need
>>>to also know the length for the optimization to work.)
>>>
>>>The first field of the data type is the length, as follows:
>>>
>>>    typedef struct datatype_foo {
>>>        int32 length;
>>>        char data[];
>>>    } datatype_foo;
>>>
>>>But when I fetch the function arguments using
>>>
>>>    datatype_foo * a = (datatype_foo *)
>>>        PG_DETOAST_DATUM_SLICE(PG_GETARG_DATUM(0),0,BLCKSZ);
>>>
>>>the length field is set to the length of the fetched slice, not the length of
>>>the data as it exists on disk. Is there some other function that gets the length
>>>without pulling more than the first block?
>>>
>>>Thanks for any insight,
>>>
>>>--Mark

Ok, for anyone following the thread, this code works for me:

    int true_size_arg_zero = toast_raw_datum_size(PG_GETARG_DATUM(0));
    int true_size_arg_one  = toast_raw_datum_size(PG_GETARG_DATUM(1));

Be sure to #include "access/tuptoaster.h"
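
For anyone wanting to see the shape of the optimization from my original question (decide from the length plus a prefix, and only read the rest when forced to), here is a minimal stand-alone sketch in plain C. It is a model, not backend code: StoredValue, stored_raw_size, stored_fetch_slice, stored_equal and PREFIX_LEN are all hypothetical names invented for illustration. stored_raw_size plays the role of toast_raw_datum_size, and stored_fetch_slice plays the role of PG_DETOAST_DATUM_SLICE. The bytes_fetched counter exists only to show how much payload was read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a value stored out of line. */
typedef struct StoredValue {
    const char *bytes;         /* full payload as it exists "on disk" */
    size_t      raw_size;      /* total payload length, known without reading bytes */
    size_t      bytes_fetched; /* running count of payload bytes read so far */
} StoredValue;

/* Models toast_raw_datum_size(): reports the length, reads no payload. */
static size_t stored_raw_size(const StoredValue *v)
{
    return v->raw_size;
}

/* Models a slice fetch: copies at most count bytes starting at offset. */
static size_t stored_fetch_slice(StoredValue *v, size_t offset,
                                 size_t count, char *out)
{
    assert(offset <= v->raw_size);
    size_t avail = v->raw_size - offset;
    size_t n = count < avail ? count : avail;

    memcpy(out, v->bytes + offset, n);
    v->bytes_fetched += n;
    return n;
}

#define PREFIX_LEN 8 /* hypothetical prefix size; a real type would tune this */

/* Equality test that tries the length, then a prefix, then the remainder. */
static bool stored_equal(StoredValue *a, StoredValue *b)
{
    size_t len = stored_raw_size(a);

    /* Different lengths: unequal, and no payload was read at all. */
    if (len != stored_raw_size(b))
        return false;

    char pa[PREFIX_LEN];
    char pb[PREFIX_LEN];
    size_t na = stored_fetch_slice(a, 0, PREFIX_LEN, pa);
    size_t nb = stored_fetch_slice(b, 0, PREFIX_LEN, pb);

    assert(na == nb); /* guaranteed because the lengths match */

    /* Prefixes differ: unequal after reading only the prefix. */
    if (memcmp(pa, pb, na) != 0)
        return false;

    /* The prefix was the whole value, so the values are equal. */
    if (na == len)
        return true;

    /* Otherwise fall back to comparing the remaining bytes. */
    a->bytes_fetched += len - na;
    b->bytes_fetched += len - na;
    return memcmp(a->bytes + na, b->bytes + na, len - na) == 0;
}
```

In a real function the length would come from toast_raw_datum_size(PG_GETARG_DATUM(n)) and the prefix from PG_DETOAST_DATUM_SLICE. As far as I can tell from tuptoaster.c, the size it returns counts the varlena length word as well as the data, so check that before comparing it against a payload length.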

Thanks Jeremy!



