Re: Packed short varlenas, what next? - Mailing list pgsql-hackers

From Gregory Stark
Subject Re: Packed short varlenas, what next?
Date
Msg-id 877iu2plwe.fsf@stark.xeocode.com
Whole thread Raw
In response to Re: Packed short varlenas, what next?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Packed short varlenas, what next?
List pgsql-hackers
"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> I've committed this, but in testing with a hack that does ntohl() in the
> VARSIZE macro and vice-versa in SET_VARSIZE, I find that core passes
> regression but several contrib modules do not.  It looks like the
> contrib modules were depending on various random structs being
> compatible with varlena, while not exposing that dependence in ways that
> either of us caught :-(.

I just noticed that last night myself. In particular the GIST modules seems to
be a major problem. they define dozens of new objects, many of which are just
passing around C data structures internally but some of which are objects
which get stored in the database. I have no idea which are which and which
ones are varlenas.

Worse, it uses PG_GETARG_POINTER() and explicitly calls PG_DETOAST_DATUM() in
the few places it assumes finding toasted data is possible. That's even harder
to track down.

I can send up a patch for the data types I fixed last night.

> I'll work on cleaning up the remaining mess tomorrow, but I think that
> we may need to think twice about whether it's OK to expect that every
> datatype with typlen = -1 will be compatible with the New Rules.  I'm
> back to wondering if maybe only types with typalign 'c' should get
> caught up in the changes.


I don't think we can key off typalign='c'. That would entail changing varlenas
to typalign 'c' which would throw off other consumers of the typalign which
expect it to be the alignment of the detoasted datum. Moreover I still align
them when they have the full 4-byte header by using the typalign.

I think we would want to introduce a new column, or maybe a new attlen value,
or a new typalign value.

I was thinking about that though and it's not so simple. It's easy enough not
to convert to short varlena for data types that don't assert that they support
the packed format. That's not a problem. That takes care of data types which
don't call pg_detoast_datum().

But not storing the varlena header in network byte order sometimes would be
quite tricky. There are a great many places that call VARSIZE that don't look
at the attalign or even have it handy.

If we made it a new attlen value we could have two different macros, but that
will be another quite large patch. It would mean hitting all those datatypes
all over again to change every instance of VARSIZE into NEWVARSIZE or
something like that. Plus all the sites in the core that call VARSIZE would
need to check attlen and call the right one.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: Gregory Stark
Date:
Subject: Re: COMMIT NOWAIT Performance Option
Next
From: "Simon Riggs"
Date:
Subject: Re: Resumable vacuum proposal and design overview