Thread: Tagged types module and varlena changes

Tagged types module and varlena changes

From
Alban Hertroys
Date:
Hello all,

I'm trying to get Marcel's tagged types to work, but I ran into some
problems due to changes to the varlena type since he wrote that code.
I tried contacting him personally earlier, but he's having some
hardware issues apparently and can't help me; hence me asking here ;)

The problem is that the tagged types module uses this bit of code:

> struct varlena* tv = (struct varlena*)tt_palloc( VARSIZE( datum ) );
>
> tv->vl_len = VARSIZE( datum ) - sizeof(Oid);
> memcpy( tv->vl_dat,
>     &((struct taggedtypev*)DatumGetPointer( datum ))->val,
>     VARSIZE(datum) - sizeof(Oid) - VARHDRSZ );
> return PointerGetDatum( tv ) ;


This doesn't compile anymore as the vl_len member of struct varlena no
longer exists and we're supposed to use the SET_VARSIZE macro instead
now. I tried that, but then the memcpy bails out due to the size
calculation being wrong. I don't know enough about varlena usage to
figure out what the correct way of calculating the size for memcpy is,
or whether that approach is at all feasable with the current varlena
implementation. What should the above read?

I'm also slightly worried whether DatumGetPointer might return a null-
pointer in some cases. Is that possible?

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:737,4a95273611861044619247!



Re: Tagged types module and varlena changes

From
Greg Stark
Date:
On Wed, Aug 26, 2009 at 1:14 PM, Alban
Hertroys<dalroi@solfertje.student.utwente.nl> wrote:
>> struct varlena* tv = (struct varlena*)tt_palloc( VARSIZE( datum ) );
>>
>> tv->vl_len = VARSIZE( datum ) - sizeof(Oid);
>> memcpy( tv->vl_dat,
>>        &((struct taggedtypev*)DatumGetPointer( datum ))->val,
>>        VARSIZE(datum) - sizeof(Oid) - VARHDRSZ );
>> return PointerGetDatum( tv ) ;
>
>
> This doesn't compile anymore as the vl_len member of struct varlena no
> longer exists and we're supposed to use the SET_VARSIZE macro instead now. I
> tried that, but then the memcpy bails out due to the size calculation being
> wrong. I don't know enough about varlena usage to figure out what the
> correct way of calculating the size for memcpy is, or whether that approach
> is at all feasable with the current varlena implementation. What should the
> above read?
>

With the SET_VARSIZE the above should work *as long as datum is not
toasted (or packed)*. If it's been detoasted then that's good, or if
it was freshly generated and not stored in a tuple then it should be
good too.

If it's not guaranteed to be detoasted then you probably should be
adding detoast calls since the Oid will have alignment requirements.
Otherwise you could just use VARSIZE_ANY_EXHDR()

It's generally a good thing to rename the vl_len field to something
like _vl_len to catch anyplace else that's referring directly to it.
That will never work any more, neither for setting it nor for reading
it. They all need to be converted to use VARSIZE and SET_VARSIZE

I haven't looked at the rest of the code to see if the general
approach is still reasonable. The varlena changes shouldn't be fatal
though.
--
greg
http://mit.edu/~gsstark/resume.pdf

Re: Tagged types module and varlena changes

From
Alban Hertroys
Date:
On 26 Aug 2009, at 15:33, Greg Stark wrote:

> On Wed, Aug 26, 2009 at 1:14 PM, Alban
> Hertroys<dalroi@solfertje.student.utwente.nl> wrote:
>>> struct varlena* tv = (struct varlena*)tt_palloc( VARSIZE( datum ) );
>>>
>>> tv->vl_len = VARSIZE( datum ) - sizeof(Oid);
>>> memcpy( tv->vl_dat,
>>>        &((struct taggedtypev*)DatumGetPointer( datum ))->val,
>>>        VARSIZE(datum) - sizeof(Oid) - VARHDRSZ );
>>> return PointerGetDatum( tv ) ;
>>
>>
>> This doesn't compile anymore as the vl_len member of struct varlena
>> no
>> longer exists and we're supposed to use the SET_VARSIZE macro
>> instead now.. I
>> tried that, but then the memcpy bails out due to the size
>> calculation being
>> wrong. I don't know enough about varlena usage to figure out what the
>> correct way of calculating the size for memcpy is, or whether that
>> approach
>> is at all feasable with the current varlena implementation. What
>> should the
>> above read?
>>
>
> With the SET_VARSIZE the above should work *as long as datum is not
> toasted (or packed)*. If it's been detoasted then that's good, or if
> it was freshly generated and not stored in a tuple then it should be
> good too.

I changed it to:
>         struct varlena* tv = (struct
> varlena*)tt_palloc( VARSIZE( datum ) );
>         struct taggedtypev *typev =
>         (struct taggedtypev*) DatumGetPointer( datum );
>         int a = VARSIZE(datum) - sizeof(Oid),
>             b = VARSIZE_ANY_EXHDR(datum) - sizeof(Oid);
>
>         SET_VARSIZE(tv->vl_len_, a);
>         memcpy( tv->vl_dat, &typev->val, b );
>
>         return PointerGetDatum( tv ) ;

But still I get a segfault on the memcpy line. The backtrace shows the
following (line 0 is the memcpy itself, nothing useful to see there):

#1  0x29806f74 in ExtractTaggedTypeDatum (tti=0x2980c560,
datum=726659176)
     at taggedtypes.c:249
249                     memcpy( tv->vl_dat, &typev->val, b );
(gdb) print *tv
$1 = {vl_len_ = "\000\000\000", vl_dat = ""}
(gdb) print a
$2 = 0
(gdb) print b
$3 = -4
(gdb) print *typev
$4 = {len = "\020\000\000", tag = 68899, val = "!\000\000"}

Obviously passing a negative value as the size to copy is what's
causing the segfault, but how come it's negative? Could it be that my
table doesn't have Oid's and that subtracting sizeof(Oid) is what
makes the length become negative?

I did some reading up on TOASTing and how to use those macro's, but
the manual wasn't very detailed in this particular area... I doubt my
values are TOASTed though, they're rather short values; not quite 2k
anyway.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:737,4a954cda11869014116556!



Re: Tagged types module and varlena changes

From
Alban Hertroys
Date:
On 26 Aug 2009, at 16:55, Alban Hertroys wrote:

>> With the SET_VARSIZE the above should work *as long as datum is not
>> toasted (or packed)*. If it's been detoasted then that's good, or if
>> it was freshly generated and not stored in a tuple then it should be
>> good too.
>
> I changed it to:
>>        struct varlena* tv = (struct
>> varlena*)tt_palloc( VARSIZE( datum ) );
>>        struct taggedtypev *typev =
>>         (struct taggedtypev*) DatumGetPointer( datum );
>>        int a = VARSIZE(datum) - sizeof(Oid),
>>            b = VARSIZE_ANY_EXHDR(datum) - sizeof(Oid);
>>
>>        SET_VARSIZE(tv->vl_len_, a);
>>        memcpy( tv->vl_dat, &typev->val, b );
>>
>>        return PointerGetDatum( tv ) ;
>
> But still I get a segfault on the memcpy line. The backtrace shows
> the following (line 0 is the memcpy itself, nothing useful to see
> there):
>
> #1  0x29806f74 in ExtractTaggedTypeDatum (tti=0x2980c560,
> datum=726659176)
>    at taggedtypes.c:249
> 249                     memcpy( tv->vl_dat, &typev->val, b );
> (gdb) print *tv
> $1 = {vl_len_ = "\000\000\000", vl_dat = ""}
> (gdb) print a
> $2 = 0
> (gdb) print b
> $3 = -4
> (gdb) print *typev
> $4 = {len = "\020\000\000", tag = 68899, val = "!\000\000"}
>
> Obviously passing a negative value as the size to copy is what's
> causing the segfault, but how come it's negative? Could it be that
> my table doesn't have Oid's and that subtracting sizeof(Oid) is what
> makes the length become negative?


To follow up on this:

One of the failing queries is:
select * from taggedtypes.currency_test ;

development=# \d+ taggedtypes.currency_test
               Table "taggedtypes.currency_test"
  Column |           Type            | Modifiers | Description
--------+---------------------------+-----------+-------------
  c1     | taggedtypes.currency      |           |
  c2     | taggedtypes.currencyint   |           |
  c3     | taggedtypes.currencyfloat |           |
Has OIDs: no

I changed the test script to create the table WITH OIDS, and now the
code works!

Is there some way to check whether a Datum is from a table with OIDs?
I think the code could do with a check for that and error out if the
table doesn't have those...

Alban Hertroys

--
Screwing up is the correct approach to attaching something to the
ceiling.


!DSPAM:737,4a95510b11861909511901!



Re: Tagged types module and varlena changes

From
Tom Lane
Date:
Alban Hertroys <dalroi@solfertje.student.utwente.nl> writes:
> I changed it to:
>> SET_VARSIZE(tv->vl_len_, a);

This is just wrong; use SET_VARSIZE(tv, ...).

>> memcpy( tv->vl_dat, &typev->val, b );

And I wouldn't recommend referencing vl_dat directly either.
Use VARDATA().

In general you're supposed to apply VARSIZE() and VARDATA() and
friends to pointers not Datums.  Although it would usually work
to be sloppy about this, I can't recommend it.  I would extract
the typev pointer first and then apply the VARSIZE macro to it.

> (gdb) print *tv
> $1 = {vl_len_ = "\000\000\000", vl_dat = ""}
> (gdb) print a
> $2 = 0
> (gdb) print b
> $3 = -4
> (gdb) print *typev
> $4 = {len = "\020\000\000", tag = 68899, val = "!\000\000"}

Where did the input come from?  On a little-endian machine that len
value means 4 bytes, so it's wrong right off the bat if it's
supposed to include the tag.

            regards, tom lane