Thread: [HACKERS] Re: Crash report for some ICU-52 (debian8) COLLATE and work_mem values

Adding -hackers.

On Sat, Aug 05, 2017 at 03:55:13PM -0700, Noah Misch wrote:
> On Thu, Aug 03, 2017 at 11:42:25AM -0700, Peter Geoghegan wrote:
> > On Thu, Aug 3, 2017 at 8:49 AM, Daniel Verite <daniel@manitou-mail.org> wrote:
> > > With query #2 it ends up crashing after ~5 hours and produces
> > > the log in log-valgrind-2.txt.gz with some other entries than
> > > case #1, but AFAICS still all about reading uninitialised values
> > > in space allocated by datumCopy().
> > 
> > Right. This part is really interesting to me:
> > 
> > ==48827==  Uninitialised value was created by a heap allocation
> > ==48827==    at 0x4C28C20: malloc (vg_replace_malloc.c:296)
> > ==48827==    by 0x80B597: AllocSetAlloc (aset.c:771)
> > ==48827==    by 0x810ADC: palloc (mcxt.c:862)
> > ==48827==    by 0x72BFEF: datumCopy (datum.c:171)
> > ==48827==    by 0x81A74C: tuplesort_putdatum (tuplesort.c:1515)
> > ==48827==    by 0x5E91EB: advance_aggregates (nodeAgg.c:1023)
> > 
> > If you actually go to datum.c:171, you see that that's a codepath for
> > pass-by-reference datatypes that lack a varlena header. Text is a
> > datatype that has a varlena header, though, so that's clearly wrong. I
> > don't know how this actually happened, but working back through the
> > relevant tuplesort_begin_datum() caller, initialize_aggregate(), does
> > suggest some things. (tuplesort_begin_datum() is where datumTypeLen is
> > determined for the entire datum tuplesort.)
> > 
> > I am once again only guessing, but I have to wonder if this is a
> > problem in commit b8d7f053. It seems likely that the problem begins
> > before tuplesort_begin_datum() is even called, which is the basis of
> > this suspicion. If the problem is within tuplesort, then that could
> > only be because get_typlenbyval() gives wrong answers, which seems
> > very unlikely.
> 
> [Action required within three days.  This is a generic notification.]
> 
> The above-described topic is currently a PostgreSQL 10 open item.  Peter
> (Eisentraut), since you committed the patch believed to have created it, you
> own this open item.  If some other commit is more relevant or if this does not
> belong as a v10 open item, please let us know.  Otherwise, please observe the
> policy on open item ownership[1] and send a status update within three
> calendar days of this message.  Include a date for your subsequent status
> update.  Testers may discover new open items at any time, and I want to plan
> to get them all fixed well in advance of shipping v10.  Consequently, I will
> appreciate your efforts toward speedy resolution.  Thanks.
> 
> [1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com
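
The branch Peter Geoghegan describes in the quoted analysis can be modelled
roughly as follows. This is a simplified standalone sketch, not the PostgreSQL
source: CopyPath and copy_path_for are hypothetical names, and the only thing
assumed is the documented pg_type.typlen convention (a positive value for
fixed-length types, -1 for varlena types such as text, -2 for C strings).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three ways a datum can be copied. */
typedef enum CopyPath
{
    COPY_BY_VALUE,          /* the Datum itself holds the value */
    COPY_VARLENA,           /* size is read from the varlena header */
    COPY_BYREF_NO_HEADER    /* size comes from typlen, or strlen for cstring */
} CopyPath;

/*
 * Pick the copy path from the type's pass-by-value flag and length,
 * following the pg_type.typlen convention: > 0 is a fixed length,
 * -1 is varlena, -2 is a null-terminated C string.
 */
static CopyPath
copy_path_for(bool typbyval, int typlen)
{
    /* Reject lengths that the convention does not define. */
    assert(typlen > 0 || typlen == -1 || typlen == -2);

    if (typbyval)
        return COPY_BY_VALUE;
    if (typlen == -1)
        return COPY_VARLENA;
    return COPY_BYREF_NO_HEADER;
}
```

For text, which is pass-by-reference with typlen -1, copy_path_for(false, -1)
selects COPY_VARLENA. Landing in the no-header path while sorting text would
therefore suggest that the sort was set up with some other typlen, which is
consistent with the suspicion that the problem starts before
tuplesort_begin_datum() is called.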



Re: [HACKERS] Re: Crash report for some ICU-52 (debian8) COLLATE and work_mem values

From: Peter Eisentraut
On 8/5/17 18:56, Noah Misch wrote:
>> [Action required within three days.  This is a generic notification.]

I'm awaiting further testing and discussion.  Probably nothing happening
for beta3.  Will report on Thursday.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services