Re: mcvstats serialization code is still shy of a load - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: mcvstats serialization code is still shy of a load |
Date | |
Msg-id | 20190626222918.72fprjvs6fvmjkup@development Whole thread Raw |
In response to | Re: mcvstats serialization code is still shy of a load (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: mcvstats serialization code is still shy of a load
|
List | pgsql-hackers |
On Wed, Jun 26, 2019 at 12:31:13PM -0400, Tom Lane wrote: >Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: >> On Wed, Jun 26, 2019 at 11:26:21AM -0400, Tom Lane wrote: >>> #define SizeOfMCVList(ndims,nitems) \ >>> is both woefully underdocumented and completely at variance with >>> reality. It doesn't seem to be accounting for the actual data values. > >> I agree about the macro being underdocumented, but AFAICS it's used >> correctly to check the expected length. It can't include the data values >> directly, because that's variable amount of data - and it's encoded in not >> yet verified part of the data. > >Well, it should have some other name then. Or *at least* a comment. >It's unbelievably misleading as it stands. > True. >> That being said, maybe this is unnecessarily defensive and we should just >> trust the values not being corrupted. > >No, I'm on board with checking the lengths. I just don't like how >hard it is to discern what's being checked. > Understood. >>> I think that part of the problem here is that the way this code is >>> written, "maxaligned" is no such thing. What you're actually maxaligning >>> seems to be the offset from the start of the data area of a varlena value, > >> I don't think so. The pointers should be maxaligned with respect to the >> whole varlena value, which is what 'raw' points to. > >[ squint ... ] OK, I think I misread this: > >statext_mcv_deserialize(bytea *data) >{ >... > /* pointer to the data part (skip the varlena header) */ > ptr = VARDATA_ANY(data); > raw = (char *) data; > >I think this is confusing in itself --- I read it as "raw = (char *) ptr" >and I think most other people would assume that too based on the order >of operations. It'd read better as > > /* remember start of datum for maxalign reference */ > raw = (char *) data; > > /* pointer to the data part (skip the varlena header) */ > ptr = VARDATA_ANY(data); > Yeah, that'd have been better. >Another problem with this code is that it flat doesn't work for >non-4-byte-header varlenas: it'd do the alignment differently than the >serialization side did. That's okay given that the two extant call sites >are guaranteed to pass detoasted datums. But using VARDATA_ANY gives a >completely misleading appearance of being ready to deal with short-header >varlenas, and heaven forbid there should be any comment to discourage >future coders from trying. So really what I'd like to see here is > > /* remember start of datum for maxalign reference */ > raw = (char *) data; > > /* alignment logic assumes full-size datum header */ > Assert(VARATT_IS_4B(data)); > > /* pointer to the data part (skip the varlena header) */ > ptr = VARDATA_ANY(data); > >Or, of course, this could all go away if we got rid of the >bogus maxaligning... > OK. Attached is a patch ditching the alignment in serialized data. I've ditched the macros to access parts of serialized data, and everything gets copied. The main complication is with varlena values, which may or may not have 4B headers (for now there's the PG_DETOAST_DATUM call, but as you mentioned we may want to remove it in the future). So I've stored the length as uint32 separately, followed by the full varlena value (thanks to that the deserialization is simpler). Not sure if that's the best solution, though, because this way we store the length twice. I've kept the alignment in the deserialization code, because there it allows us to allocate the whole value as a single chunk, which I think is useful (I admit I don't have any measurements to demonstrate that). But if we decide to rework this later, we can - it's just in-memory representation, not on-disk. Is this roughly what you had in mind? FWIW I'm sure some of the comments are stale and/or need clarification, but it's a bit too late over here, so I'll look into that tomorrow. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
pgsql-hackers by date: