Re: Corruption with duplicate primary key - Mailing list pgsql-hackers

From Alex Adriaanse
Subject Re: Corruption with duplicate primary key
Date
Msg-id SN6PR03MB3598D5C4ECDC97D0EF5360B6A95A0@SN6PR03MB3598.namprd03.prod.outlook.com
In response to Re: Corruption with duplicate primary key  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Corruption with duplicate primary key  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
On Thu., December 5, 2019 at 5:45 PM, Tomas Vondra wrote:
> At first I thought maybe this might be due to collations
> changing and breaking the index silently. What collation are you using?

We're using en_US.utf8. We did not make any collation changes to my knowledge.

> 1) When you do the queries, do they use index scan or sequential scan?
> Perhaps it does sequential scan, and if you force index scan (e.g. by
> rewriting the query) it'll only find one of those rows.

By default it used an index scan. When I re-ran the query today (and confirmed that the query used an index only scan)
I did not see any duplicates. If I force a sequential scan using "SET enable_index[only]scan = false" the duplicates
reappear.
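
For anyone wanting to reproduce this kind of comparison, here is a minimal sketch. It assumes a hypothetical table "t" with a primary key column "id"; these names are placeholders and not the actual schema from this report. The planner settings only discourage a plan type, so the EXPLAIN output should be checked to confirm which scan was actually used.

```sql
-- Hypothetical table "t" with primary key column "id" (placeholder names).

-- 1) Heap path: discourage all index-based plans so the table itself is read.
BEGIN;
SET LOCAL enable_indexscan = off;
SET LOCAL enable_indexonlyscan = off;
SET LOCAL enable_bitmapscan = off;
-- Confirm the plan shows a Seq Scan before trusting the result.
EXPLAIN SELECT id, count(*) FROM t GROUP BY id HAVING count(*) > 1;
SELECT id, count(*) FROM t GROUP BY id HAVING count(*) > 1;
ROLLBACK;

-- 2) Index path: discourage sequential and bitmap scans so the
--    primary key index is used instead.
BEGIN;
SET LOCAL enable_seqscan = off;
SET LOCAL enable_bitmapscan = off;
-- Confirm the plan shows an Index Scan or Index Only Scan.
EXPLAIN SELECT id, count(*) FROM t GROUP BY id HAVING count(*) > 1;
SELECT id, count(*) FROM t GROUP BY id HAVING count(*) > 1;
ROLLBACK;
```

If the first query returns rows and the second does not, the heap holds duplicate keys that the index no longer points to, which matches what is described above. SET LOCAL keeps the settings scoped to each transaction, so nothing leaks into the rest of the session.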

However, using a backup from a week ago I see duplicates in both the query that uses an index only scan as well as the
query that uses the sequential scan. So somehow over the past week the index got changed to eliminate duplicates.

> 2) Can you check in backups if this data corruption was present in the
> PG10 cluster, before running pg_upgrade?

Sure. I just checked and did not see any corruption in the PG10 pre-upgrade backup. I also re-upgraded that PG10 backup
to PG12, and right after the upgrade I did not see any corruption either. I checked using both index scans and
sequential scans.

Alex

