Re: valgrind error - Mailing list pgsql-hackers

From Noah Misch
Subject Re: valgrind error
Msg-id 20200605074856.GA2473603@rfd.leadboat.com
In response to Re: valgrind error  (Andrew Dunstan <andrew.dunstan@2ndquadrant.com>)
Responses Re: valgrind error
List pgsql-hackers
On Sun, May 10, 2020 at 09:29:05AM -0400, Andrew Dunstan wrote:
> On 4/18/20 9:15 AM, Andrew Dunstan wrote:
> > I was just trying to revive lousyjack, my valgrind buildfarm animal
> > which has been offline for 12 days, after having upgraded the machine
> > (fedora 31, gcc 9.3.1, valgrind 3.15) and noticed lots of errors like this:

> > {
> >    <insert_a_suppression_name_here>
> >    Memcheck:Value8
> >    fun:pg_comp_crc32c_sb8
> >    fun:XLogRecordAssemble
> >    fun:XLogInsert
> >    fun:LogCurrentRunningXacts
> >    fun:LogStandbySnapshot
> >    fun:CreateCheckPoint
> >    fun:CheckpointerMain
> >    fun:AuxiliaryProcessMain
> >    fun:StartChildProcess
> >    fun:reaper
> >    obj:/usr/lib64/libpthread-2.30.so
> >    fun:select
> >    fun:ServerLoop
> >    fun:PostmasterMain
> >    fun:main
> > }

> After many hours of testing I have a culprit for this. The error appears
> with valgrind 3.15.0 with everything else held constant. 3.14.0 does
> not produce the problem.

I suspect 3.15.0 is just better at tracking the uninitialized data.  A
more-remote possibility is that valgrind-3.14.0 emulates sse42.  That would
make pg_crc32c_sse42_available() return true, so pg_comp_crc32c_sb8() would
never be called.
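
To make the second possibility concrete, here is a minimal sketch of runtime
CRC dispatch.  It is not the PostgreSQL source: every name in it
(crc32c_software, crc32c_hw_standin, probe_sse42, choose_crc32c) is
hypothetical, the software routine is a plain bit-at-a-time CRC-32C rather than
slicing-by-8, and the "hardware" routine is only a stand-in so the file builds
without -msse4.2.  The point is that whatever CPU the probe sees (real or
emulated) decides which function ends up in the stack trace.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define HAVE_CPUID_PROBE 1
#endif

typedef uint32_t (*crc32c_fn) (uint32_t crc, const void *data, size_t len);

/*
 * Software path: bit-at-a-time CRC-32C (reflected polynomial 0x82F63B78).
 * Stands in for a table-driven routine such as slicing-by-8.
 */
uint32_t
crc32c_software(uint32_t crc, const void *data, size_t len)
{
    const unsigned char *p = data;

    while (len-- > 0)
    {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/*
 * Hypothetical hardware path.  A real one would use the SSE 4.2 CRC32
 * instruction; this stand-in just delegates to the software routine.
 */
uint32_t
crc32c_hw_standin(uint32_t crc, const void *data, size_t len)
{
    return crc32c_software(crc, data, len);
}

/* Ask the CPU (or whatever is emulating it) whether SSE 4.2 is present. */
bool
probe_sse42(void)
{
#ifdef HAVE_CPUID_PROBE
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;
    return (ecx & (1u << 20)) != 0;     /* CPUID leaf 1, ECX bit 20 */
#else
    return false;                       /* assumption: no probe available */
#endif
}

/* Pick the implementation once, based on the probe result. */
crc32c_fn
choose_crc32c(bool have_sse42)
{
    return have_sse42 ? crc32c_hw_standin : crc32c_software;
}
```

A caller would do something like "crc32c_fn f = choose_crc32c(probe_sse42());"
and then compute "f(0xFFFFFFFF, buf, len) ^ 0xFFFFFFFF".  For the nine bytes
"123456789" that yields 0xE3069283, the standard CRC-32C check value.  If the
probe reports SSE 4.2, the software function is never entered, so a report
whose top frame is the software routine could not appear.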

> andrew@freddo:bf (master)*$ lscpu
...
> Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
> pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext
> fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl
> nonstop_tsc cpuid extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy
> svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs
> skinit wdt hw_pstate vmmcall npt lbrv svm_lock nrip_save
> 
> 
> I did not manage to reproduce this anywhere else, tried on various
> physical, Virtualbox and Docker instances.

I can reproduce this on a 2017-vintage CPU with ./configure
... USE_SLICING_BY_8_CRC32C=1 and then running "make installcheck-parallel"
under valgrind-3.15.0 (as packaged by RHEL 7.8).  valgrind.supp has a
suppression for CRC calculations, but it didn't get the memo when commit
4f700bc renamed the function.  The attached patch fixes the suppression.
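
For illustration only, a suppression of roughly the following shape would
match the report quoted above.  This is a sketch, not the contents of the
attached patch: the entry name is made up, and the trailing wildcard is an
assumption about how one might cover every pg_comp_crc32c variant at once.
Valgrind allows "*" and "?" wildcards in fun: lines, and frames below the ones
listed do not need to be spelled out.

```
{
   example_crc32c_in_XLogRecordAssemble
   Memcheck:Value8
   fun:pg_comp_crc32c*
   fun:XLogRecordAssemble
}
```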

