Thread: Tagged types module and varlena changes
Hello all, I'm trying to get Marcel's tagged types to work, but I ran into some problems due to changes to the varlena type since he wrote that code. I tried contacting him personally earlier, but he's having some hardware issues apparently and can't help me; hence me asking here ;) The problem is that the tagged types module uses this bit of code: > struct varlena* tv = (struct varlena*)tt_palloc( VARSIZE( datum ) ); > > tv->vl_len = VARSIZE( datum ) - sizeof(Oid); > memcpy( tv->vl_dat, > &((struct taggedtypev*)DatumGetPointer( datum ))->val, > VARSIZE(datum) - sizeof(Oid) - VARHDRSZ ); > return PointerGetDatum( tv ) ; This doesn't compile anymore as the vl_len member of struct varlena no longer exists and we're supposed to use the SET_VARSIZE macro instead now. I tried that, but then the memcpy bails out due to the size calculation being wrong. I don't know enough about varlena usage to figure out what the correct way of calculating the size for memcpy is, or whether that approach is at all feasable with the current varlena implementation. What should the above read? I'm also slightly worried whether DatumGetPointer might return a null- pointer in some cases. Is that possible? Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll see there is no forest. !DSPAM:737,4a95273611861044619247!
On Wed, Aug 26, 2009 at 1:14 PM, Alban Hertroys<dalroi@solfertje.student.utwente.nl> wrote: >> struct varlena* tv = (struct varlena*)tt_palloc( VARSIZE( datum ) ); >> >> tv->vl_len = VARSIZE( datum ) - sizeof(Oid); >> memcpy( tv->vl_dat, >> &((struct taggedtypev*)DatumGetPointer( datum ))->val, >> VARSIZE(datum) - sizeof(Oid) - VARHDRSZ ); >> return PointerGetDatum( tv ) ; > > > This doesn't compile anymore as the vl_len member of struct varlena no > longer exists and we're supposed to use the SET_VARSIZE macro instead now. I > tried that, but then the memcpy bails out due to the size calculation being > wrong. I don't know enough about varlena usage to figure out what the > correct way of calculating the size for memcpy is, or whether that approach > is at all feasable with the current varlena implementation. What should the > above read? > With the SET_VARSIZE the above should work *as long as datum is not toasted (or packed)*. If it's been detoasted then that's good, or if it was freshly generated and not stored in a tuple then it should be good too. If it's not guaranteed to be detoasted then you probably should be adding detoast calls since the Oid will have alignment requirements. Otherwise you could just use VARSIZE_ANY_EXHDR() It's generally a good thing to rename the vl_len field to something like _vl_len to catch anyplace else that's referring directly to it. That will never work any more, neither for setting it nor for reading it. They all need to be converted to use VARSIZE and SET_VARSIZE I haven't looked at the rest of the code to see if the general approach is still reasonable. The varlena changes shouldn't be fatal though. -- greg http://mit.edu/~gsstark/resume.pdf
On 26 Aug 2009, at 15:33, Greg Stark wrote: > On Wed, Aug 26, 2009 at 1:14 PM, Alban > Hertroys<dalroi@solfertje.student.utwente.nl> wrote: >>> struct varlena* tv = (struct varlena*)tt_palloc( VARSIZE( datum ) ); >>> >>> tv->vl_len = VARSIZE( datum ) - sizeof(Oid); >>> memcpy( tv->vl_dat, >>> &((struct taggedtypev*)DatumGetPointer( datum ))->val, >>> VARSIZE(datum) - sizeof(Oid) - VARHDRSZ ); >>> return PointerGetDatum( tv ) ; >> >> >> This doesn't compile anymore as the vl_len member of struct varlena >> no >> longer exists and we're supposed to use the SET_VARSIZE macro >> instead now.. I >> tried that, but then the memcpy bails out due to the size >> calculation being >> wrong. I don't know enough about varlena usage to figure out what the >> correct way of calculating the size for memcpy is, or whether that >> approach >> is at all feasable with the current varlena implementation. What >> should the >> above read? >> > > With the SET_VARSIZE the above should work *as long as datum is not > toasted (or packed)*. If it's been detoasted then that's good, or if > it was freshly generated and not stored in a tuple then it should be > good too. I changed it to: > struct varlena* tv = (struct > varlena*)tt_palloc( VARSIZE( datum ) ); > struct taggedtypev *typev = > (struct taggedtypev*) DatumGetPointer( datum ); > int a = VARSIZE(datum) - sizeof(Oid), > b = VARSIZE_ANY_EXHDR(datum) - sizeof(Oid); > > SET_VARSIZE(tv->vl_len_, a); > memcpy( tv->vl_dat, &typev->val, b ); > > return PointerGetDatum( tv ) ; But still I get a segfault on the memcpy line. The backtrace shows the following (line 0 is the memcpy itself, nothing useful to see there): #1 0x29806f74 in ExtractTaggedTypeDatum (tti=0x2980c560, datum=726659176) at taggedtypes.c:249 249 memcpy( tv->vl_dat, &typev->val, b ); (gdb) print *tv $1 = {vl_len_ = "\000\000\000", vl_dat = ""} (gdb) print a $2 = 0 (gdb) print b $3 = -4 (gdb) print *typev $4 = {len = "\020\000\000", tag = 68899, val = "!\000\000"} Obviously passing a negative value as the size to copy is what's causing the segfault, but how come it's negative? Could it be that my table doesn't have Oid's and that subtracting sizeof(Oid) is what makes the length become negative? I did some reading up on TOASTing and how to use those macro's, but the manual wasn't very detailed in this particular area... I doubt my values are TOASTed though, they're rather short values; not quite 2k anyway. Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll see there is no forest. !DSPAM:737,4a954cda11869014116556!
On 26 Aug 2009, at 16:55, Alban Hertroys wrote: >> With the SET_VARSIZE the above should work *as long as datum is not >> toasted (or packed)*. If it's been detoasted then that's good, or if >> it was freshly generated and not stored in a tuple then it should be >> good too. > > I changed it to: >> struct varlena* tv = (struct >> varlena*)tt_palloc( VARSIZE( datum ) ); >> struct taggedtypev *typev = >> (struct taggedtypev*) DatumGetPointer( datum ); >> int a = VARSIZE(datum) - sizeof(Oid), >> b = VARSIZE_ANY_EXHDR(datum) - sizeof(Oid); >> >> SET_VARSIZE(tv->vl_len_, a); >> memcpy( tv->vl_dat, &typev->val, b ); >> >> return PointerGetDatum( tv ) ; > > But still I get a segfault on the memcpy line. The backtrace shows > the following (line 0 is the memcpy itself, nothing useful to see > there): > > #1 0x29806f74 in ExtractTaggedTypeDatum (tti=0x2980c560, > datum=726659176) > at taggedtypes.c:249 > 249 memcpy( tv->vl_dat, &typev->val, b ); > (gdb) print *tv > $1 = {vl_len_ = "\000\000\000", vl_dat = ""} > (gdb) print a > $2 = 0 > (gdb) print b > $3 = -4 > (gdb) print *typev > $4 = {len = "\020\000\000", tag = 68899, val = "!\000\000"} > > Obviously passing a negative value as the size to copy is what's > causing the segfault, but how come it's negative? Could it be that > my table doesn't have Oid's and that subtracting sizeof(Oid) is what > makes the length become negative? To follow up on this: One of the failing queries is: select * from taggedtypes.currency_test ; development=# \d+ taggedtypes.currency_test Table "taggedtypes.currency_test" Column | Type | Modifiers | Description --------+---------------------------+-----------+------------- c1 | taggedtypes.currency | | c2 | taggedtypes.currencyint | | c3 | taggedtypes.currencyfloat | | Has OIDs: no I changed the test script to create the table WITH OIDS, and now the code works! Is there some way to check whether a Datum is from a table with OIDs? I think the code could do with a check for that and error out if the table doesn't have those... Alban Hertroys -- Screwing up is the correct approach to attaching something to the ceiling. !DSPAM:737,4a95510b11861909511901!
Alban Hertroys <dalroi@solfertje.student.utwente.nl> writes: > I changed it to: >> SET_VARSIZE(tv->vl_len_, a); This is just wrong; use SET_VARSIZE(tv, ...). >> memcpy( tv->vl_dat, &typev->val, b ); And I wouldn't recommend referencing vl_dat directly either. Use VARDATA(). In general you're supposed to apply VARSIZE() and VARDATA() and friends to pointers not Datums. Although it would usually work to be sloppy about this, I can't recommend it. I would extract the typev pointer first and then apply the VARSIZE macro to it. > (gdb) print *tv > $1 = {vl_len_ = "\000\000\000", vl_dat = ""} > (gdb) print a > $2 = 0 > (gdb) print b > $3 = -4 > (gdb) print *typev > $4 = {len = "\020\000\000", tag = 68899, val = "!\000\000"} Where did the input come from? On a little-endian machine that len value means 4 bytes, so it's wrong right off the bat if it's supposed to include the tag. regards, tom lane