On 2013-01-16 00:26:01 +0100, Andres Freund wrote:
> On 2013-01-15 17:56:40 -0500, Tom Lane wrote:
> > Andres Freund <andres@2ndquadrant.com> writes:
> > > I played around a bit (thanks Sergey!) and it seems to be some rather
> > > strange optimization issue around the fsync request queue.
> >
> > > Namely changing
> > > request->rnode = rnode;
> > > into
> > > request->rnode.spcNode = rnode.spcNode;
> > > request->rnode.dbNode = rnode.dbNode;
> > > request->rnode.relNode = rnode.relNode;
> > > makes it pass reliably.
> >
> > Jeez. That's my candidate for weird compiler bug of the month.
> >
> > > How the hell that's correlated with the elog changes I don't yet know.
> >
> > There is an elog(ERROR) further up in the same function, but it's sure
> > not clear how that could cause the compiler to misimplement a struct
> > assignment.
>
> Indeed, replacing the elog() there with a plain abort() or the old-style
> elog definition makes it work. Just using a do-while with the old
> definition inside makes it fail.
>
> My IA64 knowledge is pretty basic, but I would guess this is stack or
> code alignment related; I seem to remember some quite strange
> requirements there.

FWIW, it's also triggerable if two other function calls are placed inside
the above if() (I tried fprintf(stderr, "argh") and kill(0, 0)).
So it seems the elog change just made an existing issue visible.

No idea what to do about it.
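
For reference, a minimal standalone sketch of the two forms being compared
above. The type and function names here (RelFileNodeSketch, RequestSketch,
copy_whole, copy_fields, copies_agree) and the segno field are made up for
illustration and are not the actual PostgreSQL definitions; only the three
field names come from the snippet quoted above. With a correct compiler both
variants should leave identical contents in request->rnode:

```c
#include <string.h>

/* Hypothetical stand-ins, not the real PostgreSQL headers. */
typedef unsigned int Oid;

typedef struct RelFileNodeSketch
{
    Oid spcNode;
    Oid dbNode;
    Oid relNode;
} RelFileNodeSketch;

typedef struct RequestSketch
{
    RelFileNodeSketch rnode;
    int segno;              /* hypothetical extra field */
} RequestSketch;

/* Variant 1: whole-struct assignment (the form that fails in the report). */
static void
copy_whole(RequestSketch *request, RelFileNodeSketch rnode)
{
    request->rnode = rnode;
}

/* Variant 2: member-by-member copy (the form reported to pass reliably). */
static void
copy_fields(RequestSketch *request, RelFileNodeSketch rnode)
{
    request->rnode.spcNode = rnode.spcNode;
    request->rnode.dbNode = rnode.dbNode;
    request->rnode.relNode = rnode.relNode;
}

/* Returns 1 if both variants store the same three field values. */
static int
copies_agree(RelFileNodeSketch rnode)
{
    RequestSketch a;
    RequestSketch b;

    memset(&a, 0, sizeof(a));
    memset(&b, 0, sizeof(b));
    copy_whole(&a, rnode);
    copy_fields(&b, rnode);

    return a.rnode.spcNode == b.rnode.spcNode &&
           a.rnode.dbNode == b.rnode.dbNode &&
           a.rnode.relNode == b.rnode.relNode;
}
```

This only illustrates that the two forms are semantically equivalent in C; it
is not a reproducer for the miscompilation, which seems to depend on the
surrounding code in the real function.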
Greetings,
Andres Freund
-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services