Re: [sqlsmith] crashes in RestoreSnapshot on hot standby - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [sqlsmith] crashes in RestoreSnapshot on hot standby
Date
Msg-id CAA4eK1J9QOMhEAnOysFQrca_x3Ea4r+N6Vrz5EgYMmc0+zr67Q@mail.gmail.com
In response to Re: [sqlsmith] crashes in RestoreSnapshot on hot standby  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: [sqlsmith] crashes in RestoreSnapshot on hot standby  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
On Fri, Jul 1, 2016 at 8:48 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>
> On Fri, Jul 1, 2016 at 2:17 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
> > On Fri, Jul 1, 2016 at 6:26 AM, Andreas Seltenreich <seltenreich@gmx.de> wrote:
> >> #1  0x0000000000822032 in RestoreSnapshot (start_address=start_address@entry=0x7f2701d5a110 <error: Cannot access memory at address 0x7f2701d5a110>) at snapmgr.c:2020
> >
> >         memcpy(snapshot->subxip, serialized_xids + serialized_snapshot->xcnt,
> >                serialized_snapshot->subxcnt * sizeof(TransactionId));
> > So this is choking here? Is one of those pointers NULL?
>
> Theory 1:
> If serialized_snapshot->xcnt == 0, then snapshot->xip never gets
> initialized to a non-NULL value.  Then if
> serialized_snapshot->subxcnt > 0, we set snapshot->subxip =
> snapshot->xip + serialized_snapshot->xcnt (so that's NULL too).  Then
> in the line you show we call memcpy(snapshot->subxip, ...).  The fix
> might be something like the attached.
>

I was just typing a reply when I saw this mail.  I reached the same conclusion: this is the cause of the crash.  You can see how CopySnapshot calculates subxipoff; writing the code that way might be more consistent.  During recovery, I think serialized_snapshot->xcnt will always be zero, because we put everything in the subxip array (see the code below in GetSnapshotData).

GetSnapshotData()
{
/*
* We're in hot standby, so get XIDs from KnownAssignedXids.
..
..
}
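
To make the CopySnapshot-style suggestion concrete, here is a minimal standalone sketch.  It is not the snapmgr.c code and not the attached patch: the Sketch* types and sketch_restore_snapshot() are hypothetical, simplified stand-ins, and the buffer layout (header, then xids, then subxids) is an assumption for illustration.  The point is only that subxip is computed from a byte offset into the allocation rather than from xip, so it stays valid when xcnt == 0, which is the hot standby case.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint32_t TransactionId;

/* Hypothetical, simplified serialized header; not the real struct. */
typedef struct SketchSerializedSnapshot
{
    uint32_t    xcnt;       /* number of top-level XIDs that follow */
    uint32_t    subxcnt;    /* number of subxact XIDs after those */
} SketchSerializedSnapshot;

/* Hypothetical, simplified in-memory snapshot. */
typedef struct SketchSnapshot
{
    uint32_t        xcnt;
    TransactionId  *xip;
    uint32_t        subxcnt;
    TransactionId  *subxip;
} SketchSnapshot;

/*
 * Restore a snapshot from a buffer laid out as header + xids + subxids.
 * Both arrays live in the same allocation as the struct.  subxip is
 * located via subxipoff, independently of whether xip was ever set.
 * Returns NULL on allocation failure; the caller frees the result.
 */
SketchSnapshot *
sketch_restore_snapshot(const char *start_address)
{
    SketchSerializedSnapshot hdr;
    const char *serialized_xids;
    size_t      xip_bytes;
    size_t      subxip_bytes;
    size_t      subxipoff;
    SketchSnapshot *snapshot;

    assert(start_address != NULL);

    /* memcpy avoids assuming the buffer is aligned for the header. */
    memcpy(&hdr, start_address, sizeof(hdr));
    serialized_xids = start_address + sizeof(hdr);

    xip_bytes = (size_t) hdr.xcnt * sizeof(TransactionId);
    subxip_bytes = (size_t) hdr.subxcnt * sizeof(TransactionId);
    subxipoff = sizeof(SketchSnapshot) + xip_bytes;

    snapshot = malloc(subxipoff + subxip_bytes);
    if (snapshot == NULL)
        return NULL;

    snapshot->xcnt = hdr.xcnt;
    snapshot->subxcnt = hdr.subxcnt;
    snapshot->xip = NULL;
    snapshot->subxip = NULL;

    if (hdr.xcnt > 0)
    {
        snapshot->xip = (TransactionId *)
            ((char *) snapshot + sizeof(SketchSnapshot));
        memcpy(snapshot->xip, serialized_xids, xip_bytes);
    }

    if (hdr.subxcnt > 0)
    {
        /* Offset-based: does not depend on snapshot->xip being non-NULL. */
        snapshot->subxip = (TransactionId *)
            ((char *) snapshot + subxipoff);
        memcpy(snapshot->subxip, serialized_xids + xip_bytes, subxip_bytes);
    }

    return snapshot;
}
```

With xcnt == 0 and subxcnt > 0, xip stays NULL while subxip points into the allocation, so the second memcpy has a valid destination.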


--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
