Thread: btree_gist valgrind warnings about uninitialized memory
Hi, I am not sure whether these are new for 9.4 but they're worth looking at anyway: Valgrind was run with:--trace-children=yes --quiet \--track-origins=yes --leak-check=no \--read-var-info=yes \--suppressions=/home/andres/src/postgresql/src/tools/valgrind.supp\<postgres with options> All seem to be due btree_bit's fault and they seem to have a common origin. Didn't look at code. ==27401== Conditional jump or move depends on uninitialised value(s) ==27401== at 0x4C2C812: bcmp (mc_replace_strmem.c:935) ==27401== by 0x8A8533: byteacmp (varlena.c:2735) ==27401== by 0x8D70A2: DirectFunctionCall2Coll (fmgr.c:1050) ==27401== by 0x180822C6: gbt_bitcmp (btree_bit.c:68) ==27401== by 0x18079C05: gbt_vsrt_cmp (btree_utils_var.c:430) ==27401== by 0x919F66: qsort_arg (qsort_arg.c:121) ==27401== by 0x91A531: qsort_arg (qsort_arg.c:186) ==27401== by 0x91A531: qsort_arg (qsort_arg.c:186) ==27401== by 0x18079E68: gbt_var_picksplit (btree_utils_var.c:484) ==27401== by 0x180825AC: gbt_bit_picksplit (btree_bit.c:180) ==27401== by 0x8D7AD3: FunctionCall2Coll (fmgr.c:1324) ==27401== by 0x497701: gistUserPicksplit (gistsplit.c:433) ==27401== Uninitialised value was created by a heap allocation ==27401== at 0x4C274A0: malloc (vg_replace_malloc.c:291) ==27401== by 0x8F8EE1: AllocSetAlloc (aset.c:853) ==27401== by 0x8FAA71: palloc (mcxt.c:677) ==27401== by 0x18078E37: gbt_var_decompress (btree_utils_var.c:42) ==27401== by 0x8D79E5: FunctionCall1Coll (fmgr.c:1298) ==27401== by 0x48E4A6: gistdentryinit (gistutil.c:549) ==27401== by 0x498217: gistSplitByKey (gistsplit.c:644) ==27401== by 0x48BEEC: gistSplit (gist.c:1270) ==27401== by 0x48899D: gistplacetopage (gist.c:242) ==27401== by 0x48BAEE: gistinserttuples (gist.c:1134) ==27401== by 0x48BA73: gistinserttuple (gist.c:1093) ==27401== by 0x48A71F: gistdoinsert (gist.c:750) ==27401== ==27401== Conditional jump or move depends on uninitialised value(s) ==27401== at 0x8A853B: byteacmp (varlena.c:2736) ==27401== by 0x8D70A2: DirectFunctionCall2Coll (fmgr.c:1050) ==27401== by 0x180822C6: gbt_bitcmp (btree_bit.c:68) ==27401== by 0x18079C05: gbt_vsrt_cmp (btree_utils_var.c:430) ==27401== by 0x919F66: qsort_arg (qsort_arg.c:121) ==27401== by 0x91A531: qsort_arg (qsort_arg.c:186) ==27401== by 0x91A531: qsort_arg (qsort_arg.c:186) ==27401== by 0x18079E68: gbt_var_picksplit (btree_utils_var.c:484) ==27401== by 0x180825AC: gbt_bit_picksplit (btree_bit.c:180) ==27401== by 0x8D7AD3: FunctionCall2Coll (fmgr.c:1324) ==27401== by 0x497701: gistUserPicksplit (gistsplit.c:433) ==27401== by 0x498491: gistSplitByKey (gistsplit.c:697) ==27401== Uninitialised value was created by a heap allocation ==27401== at 0x4C274A0: malloc (vg_replace_malloc.c:291) ==27401== by 0x8F8EE1: AllocSetAlloc (aset.c:853) ==27401== by 0x8FAA71: palloc (mcxt.c:677) ==27401== by 0x18078E37: gbt_var_decompress (btree_utils_var.c:42) ==27401== by 0x8D79E5: FunctionCall1Coll (fmgr.c:1298) ==27401== by 0x48E4A6: gistdentryinit (gistutil.c:549) ==27401== by 0x498217: gistSplitByKey (gistsplit.c:644) ==27401== by 0x48BEEC: gistSplit (gist.c:1270) ==27401== by 0x48899D: gistplacetopage (gist.c:242) ==27401== by 0x48BAEE: gistinserttuples (gist.c:1134) ==27401== by 0x48BA73: gistinserttuple (gist.c:1093) ==27401== by 0x48A71F: gistdoinsert (gist.c:750) ==27401== ==27401== Conditional jump or move depends on uninitialised value(s) ==27401== at 0x18079C0D: gbt_vsrt_cmp (btree_utils_var.c:431) ==27401== by 0x919F66: qsort_arg (qsort_arg.c:121) ==27401== by 0x91A531: qsort_arg (qsort_arg.c:186) ==27401== by 0x91A531: qsort_arg (qsort_arg.c:186) ==27401== by 0x18079E68: gbt_var_picksplit (btree_utils_var.c:484) ==27401== by 0x180825AC: gbt_bit_picksplit (btree_bit.c:180) ==27401== by 0x8D7AD3: FunctionCall2Coll (fmgr.c:1324) ==27401== by 0x497701: gistUserPicksplit (gistsplit.c:433) ==27401== by 0x498491: gistSplitByKey (gistsplit.c:697) ==27401== by 0x48BEEC: gistSplit (gist.c:1270) ==27401== by 0x48899D: gistplacetopage (gist.c:242) ==27401== by 0x48BAEE: gistinserttuples (gist.c:1134) ==27401== Uninitialised value was created by a heap allocation ==27401== at 0x4C274A0: malloc (vg_replace_malloc.c:291) ==27401== by 0x8F8EE1: AllocSetAlloc (aset.c:853) ==27401== by 0x8FAA71: palloc (mcxt.c:677) ==27401== by 0x18078E37: gbt_var_decompress (btree_utils_var.c:42) ==27401== by 0x8D79E5: FunctionCall1Coll (fmgr.c:1298) ==27401== by 0x48E4A6: gistdentryinit (gistutil.c:549) ==27401== by 0x498217: gistSplitByKey (gistsplit.c:644) ==27401== by 0x48BEEC: gistSplit (gist.c:1270) ==27401== by 0x48899D: gistplacetopage (gist.c:242) ==27401== by 0x48BAEE: gistinserttuples (gist.c:1134) ==27401== by 0x48BA73: gistinserttuple (gist.c:1093) ==27401== by 0x48A71F: gistdoinsert (gist.c:750) ==27401== ==27401== Conditional jump or move depends on uninitialised value(s) ==27401== at 0x919F69: qsort_arg (qsort_arg.c:121) ==27401== by 0x91A531: qsort_arg (qsort_arg.c:186) ==27401== by 0x91A531: qsort_arg (qsort_arg.c:186) ==27401== by 0x18079E68: gbt_var_picksplit (btree_utils_var.c:484) ==27401== by 0x180825AC: gbt_bit_picksplit (btree_bit.c:180) ==27401== by 0x8D7AD3: FunctionCall2Coll (fmgr.c:1324) ==27401== by 0x497701: gistUserPicksplit (gistsplit.c:433) ==27401== by 0x498491: gistSplitByKey (gistsplit.c:697) ==27401== by 0x48BEEC: gistSplit (gist.c:1270) ==27401== by 0x48899D: gistplacetopage (gist.c:242) ==27401== by 0x48BAEE: gistinserttuples (gist.c:1134) ==27401== by 0x48BA73: gistinserttuple (gist.c:1093) ==27401== Uninitialised value was created by a heap allocation ==27401== at 0x4C274A0: malloc (vg_replace_malloc.c:291) ==27401== by 0x8F8EE1: AllocSetAlloc (aset.c:853) ==27401== by 0x8FAA71: palloc (mcxt.c:677) ==27401== by 0x18078E37: gbt_var_decompress (btree_utils_var.c:42) ==27401== by 0x8D79E5: FunctionCall1Coll (fmgr.c:1298) ==27401== by 0x48E4A6: gistdentryinit (gistutil.c:549) ==27401== by 0x498217: gistSplitByKey (gistsplit.c:644) ==27401== by 0x48BEEC: gistSplit (gist.c:1270) ==27401== by 0x48899D: gistplacetopage (gist.c:242) ==27401== by 0x48BAEE: gistinserttuples (gist.c:1134) ==27401== by 0x48BA73: gistinserttuple (gist.c:1093) ==27401== by 0x48A71F: gistdoinsert (gist.c:750) ==27401== ==27401== Conditional jump or move depends on uninitialised value(s) ==27401== at 0x8A853B: byteacmp (varlena.c:2736) ==27401== by 0x8D70A2: DirectFunctionCall2Coll (fmgr.c:1050) ==27401== by 0x180822C6: gbt_bitcmp (btree_bit.c:68) ==27401== by 0x180794CA: gbt_var_bin_union (btree_utils_var.c:249) ==27401== by 0x18079EC2: gbt_var_picksplit (btree_utils_var.c:496) ==27401== by 0x180825AC: gbt_bit_picksplit (btree_bit.c:180) ==27401== by 0x8D7AD3: FunctionCall2Coll (fmgr.c:1324) ==27401== by 0x497701: gistUserPicksplit (gistsplit.c:433) ==27401== by 0x498491: gistSplitByKey (gistsplit.c:697) ==27401== by 0x48BEEC: gistSplit (gist.c:1270) ==27401== by 0x48899D: gistplacetopage (gist.c:242) ==27401== by 0x48BAEE: gistinserttuples (gist.c:1134) ==27401== Uninitialised value was created by a heap allocation ==27401== at 0x4C274A0: malloc (vg_replace_malloc.c:291) ==27401== by 0x8F8EE1: AllocSetAlloc (aset.c:853) ==27401== by 0x8FAA71: palloc (mcxt.c:677) ==27401== by 0x18078E37: gbt_var_decompress (btree_utils_var.c:42) ==27401== by 0x8D79E5: FunctionCall1Coll (fmgr.c:1298) ==27401== by 0x48E4A6: gistdentryinit (gistutil.c:549) ==27401== by 0x498217: gistSplitByKey (gistsplit.c:644) ==27401== by 0x48BEEC: gistSplit (gist.c:1270) ==27401== by 0x48899D: gistplacetopage (gist.c:242) ==27401== by 0x48BAEE: gistinserttuples (gist.c:1134) ==27401== by 0x48BA73: gistinserttuple (gist.c:1093) ==27401== by 0x48A71F: gistdoinsert (gist.c:750) ==27401== ==27401== Conditional jump or move depends on uninitialised value(s) ==27401== at 0x180794CD: gbt_var_bin_union (btree_utils_var.c:249) ==27401== by 0x18079EC2: gbt_var_picksplit (btree_utils_var.c:496) ==27401== by 0x180825AC: gbt_bit_picksplit (btree_bit.c:180) ==27401== by 0x8D7AD3: FunctionCall2Coll (fmgr.c:1324) ==27401== by 0x497701: gistUserPicksplit (gistsplit.c:433) ==27401== by 0x498491: gistSplitByKey (gistsplit.c:697) ==27401== by 0x48BEEC: gistSplit (gist.c:1270) ==27401== by 0x48899D: gistplacetopage (gist.c:242) ==27401== by 0x48BAEE: gistinserttuples (gist.c:1134) ==27401== by 0x48BA73: gistinserttuple (gist.c:1093) ==27401== by 0x48A71F: gistdoinsert (gist.c:750) ==27401== by 0x499324: gistBuildCallback (gistbuild.c:490) ==27401== Uninitialised value was created by a heap allocation ==27401== at 0x4C274A0: malloc (vg_replace_malloc.c:291) ==27401== by 0x8F8EE1: AllocSetAlloc (aset.c:853) ==27401== by 0x8FAA71: palloc (mcxt.c:677) ==27401== by 0x18078E37: gbt_var_decompress (btree_utils_var.c:42) ==27401== by 0x8D79E5: FunctionCall1Coll (fmgr.c:1298) ==27401== by 0x48E4A6: gistdentryinit (gistutil.c:549) ==27401== by 0x498217: gistSplitByKey (gistsplit.c:644) ==27401== by 0x48BEEC: gistSplit (gist.c:1270) ==27401== by 0x48899D: gistplacetopage (gist.c:242) ==27401== by 0x48BAEE: gistinserttuples (gist.c:1134) ==27401== by 0x48BA73: gistinserttuple (gist.c:1093) ==27401== by 0x48A71F: gistdoinsert (gist.c:750) ==27401== Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 05/06/2014 10:03 PM, Andres Freund wrote: > Hi, > > I am not sure whether these are new for 9.4 but they're worth looking at > anyway: > Valgrind was run with: > --trace-children=yes --quiet \ > --track-origins=yes --leak-check=no \ > --read-var-info=yes \ > --suppressions=/home/andres/src/postgresql/src/tools/valgrind.supp \ > <postgres with options> > All seem to be due btree_bit's fault and they seem to have a common > origin. Didn't look at code. > > ==27401== Conditional jump or move depends on uninitialised value(s) > ==27401== at 0x4C2C812: bcmp (mc_replace_strmem.c:935) > ==27401== by 0x8A8533: byteacmp (varlena.c:2735) > ==27401== by 0x8D70A2: DirectFunctionCall2Coll (fmgr.c:1050) > ==27401== by 0x180822C6: gbt_bitcmp (btree_bit.c:68) > ==27401== by 0x18079C05: gbt_vsrt_cmp (btree_utils_var.c:430) > ==27401== by 0x919F66: qsort_arg (qsort_arg.c:121) > ==27401== by 0x91A531: qsort_arg (qsort_arg.c:186) > ==27401== by 0x91A531: qsort_arg (qsort_arg.c:186) > ==27401== by 0x18079E68: gbt_var_picksplit (btree_utils_var.c:484) > ==27401== by 0x180825AC: gbt_bit_picksplit (btree_bit.c:180) > ==27401== by 0x8D7AD3: FunctionCall2Coll (fmgr.c:1324) > ==27401== by 0x497701: gistUserPicksplit (gistsplit.c:433) In a btree_gist bit/varbit index, an internal index item consists of a lower and upper key. The full Varbit struct is not stored as the lower/upper key, but only the "bits" array. The lower key is expanded to next INTALIGNed size, so that when the lower and upper keys are put together into one Datum, the length-field of the upper key is properly aligned. The bug is in gbt_bit_xfrm(), which expands a VarBit struct to an INTALIGNed bytea, containing just the bits array. It doesn't initialize the padding bytes. That can confuse comparisons against lower/upper keys, as the comparison routine doesn't know which bytes are padding, and will merrily compare them as well. I was able to get wrong query results after modifying gbt_bit_xfrm() to knowingly initialize the padding bytes to random(): create table varbittest (b varbit); insert into varbittest select '001'::varbit from generate_series(1, 100) union all select '01'::varbit from generate_series(1, 100) union all select '1'::varbit from generate_series(1, 100); create index varbittest_idx on varbittest using gist(b); set enable_seqscan=off; -- should return 200 postgres=# select count(*) from varbittest where b >= '01'; count ------- 169 (1 row) I committed a fix to gbt_bit_xfrm() to zero the padding bytes. However, if you have an existing btree_gist index on bit/varbit, you will have to reindex as there can be non-zero padding bytes on disk already. That should be mentioned in the release notes. Thanks for the Valgrinding! - Heikki
On 2014-05-13 15:33:08 +0300, Heikki Linnakangas wrote: > On 05/06/2014 10:03 PM, Andres Freund wrote: > >Hi, > > > >I am not sure whether these are new for 9.4 but they're worth looking at > >anyway: > >Valgrind was run with: > > --trace-children=yes --quiet \ > > --track-origins=yes --leak-check=no \ > > --read-var-info=yes \ > > --suppressions=/home/andres/src/postgresql/src/tools/valgrind.supp \ > > <postgres with options> > >All seem to be due btree_bit's fault and they seem to have a common > >origin. Didn't look at code. > I committed a fix to gbt_bit_xfrm() to zero the padding bytes. Thanks! > Thanks for the Valgrinding! That leaves the spgist thing and some infrastructure till we can setup a valgrind animal... From what we've caught with it so far that seems rather worthwile. What's your plans with your spgist fix? Commit it once 9.5 is branched? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 05/13/2014 05:13 PM, Andres Freund wrote: > That leaves the spgist thing and some infrastructure till we can setup a > valgrind animal... From what we've caught with it so far that seems > rather worthwile. > What's your plans with your spgist fix? Commit it once 9.5 is branched? Good question. I don't know. I would still like to commit it to 9.4. It doesn't require catalog changes, but it's an incompatible change in the WAL record format. If we commit it to 9.4, it means that you cannot replicate between 9.4beta1 and 9.4beta2. I think that's OK, but how do others feel about that? - Heikki
On 2014-05-14 14:55:38 +0300, Heikki Linnakangas wrote: > On 05/13/2014 05:13 PM, Andres Freund wrote: > >That leaves the spgist thing and some infrastructure till we can setup a > >valgrind animal... From what we've caught with it so far that seems > >rather worthwile. > >What's your plans with your spgist fix? Commit it once 9.5 is branched? > > Good question. I don't know. I would still like to commit it to 9.4. It > doesn't require catalog changes, but it's an incompatible change in the WAL > record format. If we commit it to 9.4, it means that you cannot replicate > between 9.4beta1 and 9.4beta2. I think that's OK, but how do others feel > about that? I personally think that's not too bad. Alternatively we could come up with a second way that's not breaking the WAL format. I'd feel better if we had a back patchable fix... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Heikki Linnakangas <hlinnakangas@vmware.com> writes: > On 05/13/2014 05:13 PM, Andres Freund wrote: >> What's your plans with your spgist fix? Commit it once 9.5 is branched? > Good question. I don't know. I would still like to commit it to 9.4. It > doesn't require catalog changes, but it's an incompatible change in the > WAL record format. If we commit it to 9.4, it means that you cannot > replicate between 9.4beta1 and 9.4beta2. I think that's OK, but how do > others feel about that? I think that's an OK restriction as long as we warn people about it (you could update a replication pair as long as you shut them both down cleanly at the same time, right?). Can the WAL replay routine be made to detect incompatible records? What worries me more is that post-beta1 fixes will, by definition, get noticeably less beta testing than anything that went out in beta1. So how confident are you in this fix? Is it something we'd consider back-patching in the absence of a WAL-format issue? regards, tom lane
On 2014-05-14 10:07:18 -0400, Tom Lane wrote: > Heikki Linnakangas <hlinnakangas@vmware.com> writes: > > On 05/13/2014 05:13 PM, Andres Freund wrote: > >> What's your plans with your spgist fix? Commit it once 9.5 is branched? > > > Good question. I don't know. I would still like to commit it to 9.4. It > > doesn't require catalog changes, but it's an incompatible change in the > > WAL record format. If we commit it to 9.4, it means that you cannot > > replicate between 9.4beta1 and 9.4beta2. I think that's OK, but how do > > others feel about that? > > I think that's an OK restriction as long as we warn people about it > (you could update a replication pair as long as you shut them both > down cleanly at the same time, right?). Can the WAL replay routine > be made to detect incompatible records? We could just bump the wal version. Somewhat surprisingly that works if both nodes are shutdown cleanly (primary first)... But the errors about it are really ugly (will moan about unusable checkpoints), so it's probably not a good idea. Especially as it'll make it an issue for all users, not just the ones creating spgist indexes. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-05-14 10:07:18 -0400, Tom Lane wrote: >> I think that's an OK restriction as long as we warn people about it >> (you could update a replication pair as long as you shut them both >> down cleanly at the same time, right?). Can the WAL replay routine >> be made to detect incompatible records? > We could just bump the wal version. Somewhat surprisingly that works if > both nodes are shutdown cleanly (primary first)... But the errors about > it are really ugly (will moan about unusable checkpoints), so it's > probably not a good idea. Especially as it'll make it an issue for all > users, not just the ones creating spgist indexes. Yeah, I don't think we want to bump the WAL version code post-beta1. Probably better to assign the modified spgist record a new xl_info ID number, so that a beta1 slave would throw an error for it. regards, tom lane
On 2014-05-14 12:20:55 -0400, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-05-14 10:07:18 -0400, Tom Lane wrote: > >> I think that's an OK restriction as long as we warn people about it > >> (you could update a replication pair as long as you shut them both > >> down cleanly at the same time, right?). Can the WAL replay routine > >> be made to detect incompatible records? > > > We could just bump the wal version. Somewhat surprisingly that works if > > both nodes are shutdown cleanly (primary first)... But the errors about > > it are really ugly (will moan about unusable checkpoints), so it's > > probably not a good idea. Especially as it'll make it an issue for all > > users, not just the ones creating spgist indexes. > > Yeah, I don't think we want to bump the WAL version code post-beta1. > > Probably better to assign the modified spgist record a new xl_info ID > number, so that a beta1 slave would throw an error for it. Since that ship has now sailed...? It's imo bad form to release a new version that overwrites the stack and heap, even if we can't see a concrete danger. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-05-14 12:20:55 -0400, Tom Lane wrote: >> Yeah, I don't think we want to bump the WAL version code post-beta1. >> >> Probably better to assign the modified spgist record a new xl_info ID >> number, so that a beta1 slave would throw an error for it. > Since that ship has now sailed...? It's imo bad form to release a new > version that overwrites the stack and heap, even if we can't see a > concrete danger. Yeah, no longer much reason to avoid changing the WAL version code. regards, tom lane