Re: Corruption with duplicate primary key - Mailing list pgsql-hackers

From Alex Adriaanse
Subject Re: Corruption with duplicate primary key
Msg-id SN6PR03MB359873DE51E9CD69837E5117A95A0@SN6PR03MB3598.namprd03.prod.outlook.com
In response to Re: Corruption with duplicate primary key  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Thu, December 5, 2019 at 5:34 PM Peter Geoghegan wrote:
> > We have a Postgres 10 database that we recently upgraded to Postgres 12 using pg_upgrade. We recently discovered
> > that there are rows in one of the tables that have duplicate primary keys:
>
> What's the timeline here? In other words, does it look like these rows
> were updated and/or deleted before, around the same time as, or after
> the upgrade?

The Postgres 12 upgrade was performed on 2019-11-22, so the affected rows were modified after this upgrade (although
some of the rows were originally inserted before then, before they were modified/duplicated).

> > This database runs inside Docker, with the data directory bind-mounted to a reflink-enabled XFS filesystem. The VM
> > is running Debian's 4.19.16-1~bpo9+1 kernel inside an AWS EC2 instance. We have Debezium stream data from this database
> > via pgoutput.
>
> That seems suspicious, since reflink support for XFS is rather immature.

Good point. Looking at kernel commits since 4.19.16, it appears that there have been a few bug fixes in later kernel
versions that address XFS corruption issues. Regardless of whether FS bugs are responsible for this corruption, I'll
plan on upgrading to a newer kernel.

> How did you invoke pg_upgrade? Did you use the --link (hard link) option?

Yes, we first created a backup using "cp -a --reflink=always", ran initdb on the new directory, and then upgraded using
"pg_upgrade -b ... -B ... -d ... -D -k".

Alex

