Thread: Strange coding in mvdistinct.c
In the wake of the discussion at [1] I went looking for structs that should be using FLEXIBLE_ARRAY_MEMBER and are not, by dint of grepping for size calculations of the form "offsetof(struct,fld) + n * sizeof(...)" and then seeing how "fld" is declared. I haven't yet found anything like that that I want to change, but I did come across this bit in mvdistinct.c's statext_ndistinct_serialize(): len = VARHDRSZ + SizeOfMVNDistinct + ndistinct->nitems * (offsetof(MVNDistinctItem, attrs) + sizeof(int)); Given the way that the subsequent code looks, I would argue that offsetof(MVNDistinctItem, attrs) has got basically nothing to do with this calculation, and that the right way to phrase it is just len = VARHDRSZ + SizeOfMVNDistinct + ndistinct->nitems * (sizeof(double) + sizeof(int)); Consider if there happened to be alignment padding in MVNDistinctItem: as the code stands it'd overestimate the space needed. (There won't be padding on any machine we support, I believe, so this isn't a live bug --- but it's overcomplicated code, and could become buggy if any less-than-double-width fields get added to MVNDistinctItem.) For largely the same reason, I do not think that SizeOfMVNDistinct is a helpful way to compute the space needed for those fields --- any alignment padding that might be included is irrelevant for this purpose. In short I'd be inclined to phrase this just as len = VARHDRSZ + 3 * sizeof(uint32) + ndistinct->nitems * (sizeof(double) + sizeof(int)); It looks to me actually like all the uses of both SizeOfMVNDistinctItem and SizeOfMVNDistinct are wrong, because the code using those symbols is really thinking about the size of this serialized representation, which is guaranteed not to have any inter-field padding, unlike the structs. Thoughts? regards, tom lane [1] https://postgr.es/m/a620f85a-42ab-e0f3-3337-b04b97e2e2f5@redhat.com
Oh, and as I continue to grep, I found this in dependencies.c: dependencies = (MVDependencies *) repalloc(dependencies, offsetof(MVDependencies, deps) + dependencies->ndeps * sizeof(MVDependency)); I'm pretty sure this is an actual bug: the calculation should be offsetof(MVDependencies, deps) + dependencies->ndeps * sizeof(MVDependency *)); because deps is an array of MVDependency* not MVDependency. This would lead to an overallocation not underallocation, and it's probably pretty harmless because ndeps can't get too large (I hope; if it could, this would have O(N^2) performance problems). Still, you oughta fix it. (There's a similar calculation later in the file that gets it right.) regards, tom lane
On Mon, Apr 15, 2019 at 06:12:24PM -0400, Tom Lane wrote: >Oh, and as I continue to grep, I found this in dependencies.c: > > dependencies = (MVDependencies *) repalloc(dependencies, > offsetof(MVDependencies, deps) > + dependencies->ndeps * sizeof(MVDependency)); > >I'm pretty sure this is an actual bug: the calculation should be > > offsetof(MVDependencies, deps) > + dependencies->ndeps * sizeof(MVDependency *)); > >because deps is an array of MVDependency* not MVDependency. > >This would lead to an overallocation not underallocation, and it's >probably pretty harmless because ndeps can't get too large (I hope; >if it could, this would have O(N^2) performance problems). Still, >you oughta fix it. > >(There's a similar calculation later in the file that gets it right.) > Thanks. I noticed some of the bugs while investigating the recent MCV serialization, and I plan to fix them soon. This week, hopefully. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Apr 15, 2019 at 06:12:24PM -0400, Tom Lane wrote: >Oh, and as I continue to grep, I found this in dependencies.c: > > dependencies = (MVDependencies *) repalloc(dependencies, > offsetof(MVDependencies, deps) > + dependencies->ndeps * sizeof(MVDependency)); > >I'm pretty sure this is an actual bug: the calculation should be > > offsetof(MVDependencies, deps) > + dependencies->ndeps * sizeof(MVDependency *)); > >because deps is an array of MVDependency* not MVDependency. > >This would lead to an overallocation not underallocation, and it's >probably pretty harmless because ndeps can't get too large (I hope; >if it could, this would have O(N^2) performance problems). Still, >you oughta fix it. > >(There's a similar calculation later in the file that gets it right.) > I've pushed a fix correcting those issues - both for mvndistinct and functional dependencies. I've reworked the macros used to compute the serialized sizes not to use offsetof(), which however made them pretty useless for other purposes. So I've made them private by moving them to the .c files where they are moved them to the .c files. I don't think we need to backpatch this - we're not going to add new fields in backbranches, so there's little danger of introducing padding. And I'm not sure we want to remove the macros, although it's unlikely anyone else uses them. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services