Re: Duplicated IDs - Mailing list pgsql-general

From Alban Hertroys
Subject Re: Duplicated IDs
Date
Msg-id 9264ABF6-AEC8-46A6-A606-6C2EF8AAE036@gmail.com
Whole thread Raw
In response to Re: Duplicated IDs  (Alexis Bernard <alexis@bernard.io>)
Responses Re: Duplicated IDs  (Alexis Bernard <alexis@bernard.io>)
List pgsql-general
On 09 Aug 2014, at 13:24, Alexis Bernard <alexis@bernard.io> wrote:

> Primary key definition: "tracks_pkey" PRIMARY KEY, btree (id)
> select version(): PostgreSQL 9.3.4 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.2-16ubuntu6) 4.8.2,
64-bit
> Ubuntu trusty with kernel 3.13.0-29-generic.
>
> I have a autovacuum process running and configuration is by default: ps aux | grep vacuum | grep -v grep
> postgres   587  0.0  0.0 1174304 3720 ?        Ss   juil.30   0:05 postgres: autovacuum launcher process
>
> I tried to vacuum manualy.

What’s the output of: VACUUM VERBOSE tracks;
Anything that looks abnormal?

What do below queries return? They’re from the manual at:
http://www.postgresql.org/docs/9.3/static/routine-vacuuming.html

SELECT c.oid::regclass as table_name,
       greatest(age(c.relfrozenxid),age(t.relfrozenxid)) as age
FROM pg_class c
LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
WHERE c.relkind IN ('r', 'm');

SELECT datname, age(datfrozenxid) FROM pg_database;

I’m wondering if perhaps autovacuum doesn’t run to completion, the above should determine that.

To get out of this pickle, you might want to try VACUUM FULL, but that puts an exclusive lock on the table and creates
anew copy of it while it’s running, so you better have enough disk space to contain that table twice. It does rewrite
thetable though, which is what I hope fixes your issue. 
The normal VACUUM should have updated the frozenid’s appropriately though, a VACUUM FULL should not actually be
necessary.

> I tried reindex:
>
> => reindex index tracks_pkey;
> ERROR:  failed to find parent tuple for heap-only tuple at (38802,116) in table “tracks

The problem is probably not in the index then, or it wouldn’t have trouble re-creating it...

> Here are more information about storage (raid 1):

Do I read that as disks 0 to 2 are part of an LSI SAS? Are you really using raid-1 then, as if I’m reading that
correctlyonly 1 of the 3 disks is actually attached to the raid array (namely the Dell disk) and the others are "not
claimed"(i.e. spares)? 

Anyway, from this it looks like you are using reliable storage that properly honours fsync commands. Or is this a
hardwareraid-controller with no battery backup unit and caching set to write-back instead of write-thru? 

Oh, one last thing, the convention in this mailing list is to reply inline instead of top-posting.

> lshw -class disk -class storage
>   *-storage
>        description: Serial Attached SCSI controller
>        produit: SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon]
>        fabriquant: LSI Logic / Symbios Logic
>        identifiant matériel: 0
>        information bus: pci@0000:01:00.0
>        nom logique: scsi0
>        version: 03
>        bits: 64 bits
>        horloge: 33MHz
>        fonctionnalités: storage pm pciexpress vpd msi msix bus_master cap_list rom
>        configuration: driver=mpt2sas latency=0
>        ressources: irq:16 portE/S:2000(taille=256) mémoire:c5140000-c514ffff mémoire:c5100000-c513ffff
mémoire:c5400000-c54fffff
>      *-disk:0 NON-RÉCLAMÉ
>           description: ATA Disk
>           produit: ST1000NM0033-9ZM
>           fabriquant: Seagate
>           identifiant matériel: 0.0.0
>           information bus: scsi@0:0.0.0
>           version: GA04
>           numéro de série: Z1W0HGB9
>           configuration: ansiversion=5
>      *-disk:1
>           description: SCSI Disk
>           produit: Virtual Disk
>           fabriquant: Dell
>           identifiant matériel: 1.0.0
>           information bus: scsi@0:1.0.0
>           nom logique: /dev/sda
>           version: 1028
>           taille: 931GiB (999GB)
>           capacité: 931GiB (999GB)
>           fonctionnalités: 15000rpm partitioned partitioned:dos
>           configuration: ansiversion=6 sectorsize=512 signature=000e59a7
>      *-disk:2 NON-RÉCLAMÉ
>           description: ATA Disk
>           produit: ST1000NM0033-9ZM
>           fabriquant: Seagate
>           identifiant matériel: 0.1.0
>           information bus: scsi@0:0.1.0
>           version: GA04
>           numéro de série: Z1W0HFYZ
>           configuration: ansiversion=5
>   *-storage
>        description: SATA controller
>        produit: 6 Series/C200 Series Chipset Family SATA AHCI Controller
>        fabriquant: Intel Corporation
>        identifiant matériel: 1f.2
>        information bus: pci@0000:00:1f.2
>        version: 04
>        bits: 32 bits
>        horloge: 66MHz
>        fonctionnalités: storage msi pm ahci_1.0 bus_master cap_list
>        configuration: driver=ahci latency=0
>        ressources: irq:43 portE/S:3048(taille=8) portE/S:3054(taille=4) portE/S:3040(taille=8) portE/S:3050(taille=4)
portE/S:3020(taille=32)mémoire:c5204000-c52047ff 
>
> Cheers,
> Alexis.
>
>
> 2014-08-09 12:35 GMT+02:00 Alban Hertroys <haramrae@gmail.com>:
> On 09 Aug 2014, at 11:38, Alexis Bernard <alexis@bernard.io> wrote:
>
> > Hi,
> >
> > I am having regurlarly duplicated rows with the same primary key.
> >
> > => select id, created_at, updated_at from tracks where created_at = '2014-07-03 15:09:16.336488';
> >    id   |         created_at         |         updated_at
> > --------+----------------------------+----------------------------
> >  331804 | 2014-07-03 15:09:16.336488 | 2014-07-03 15:37:55.253729
> >  331804 | 2014-07-03 15:09:16.336488 | 2014-07-03 15:37:55.253801
> >
> > => select id from tracks where id = 331804;
> >  id
> > ----
> > (0 rows)
> >
>
> First of all, what is the definition of that primary key?
> What exact version of PG are you using?: select version();
> What exact OS is this on? What kind of storage?
> To me it looks like you may have run into transaction wrap-around or a corrupted index.
>
> Before you do anything, make a backup.
>
> Theorising that the issue here indeed is transaction wrap-around, what you’re seeing may be data from older
transactionsthat has become newer because your current transaction txid is lower (due to the wraparound) than the txid
ofthe transactions those rows belong(ed) to. If those transactions were committed, then you’re possibly seeing deleted
orupdated rows that are still around. TX wraparound can occur if you do not vacuum frequently enough and another thing
thatvacuum does is mark old rows obsolete so that the DB can reclaim the space they use. Seeing data from rows that are
nolonger there or that has been modified since seems to fit the bill here. 
>
> Hence the question: When did you last (auto-)vacuum this table? Did you perhaps turn autovacuum off? Did it fall
behind?
>
> If you have been vacuuming and the issue is a corrupt index: Does it help to reindex that table? You said that you
havefsync on; what kind of storage is this database on? Something allowed that index to get corrupted. It is more
likelythat it’s caused by something in the underlying storage system (including the OS) than that it is a bug in PG. 
>
> > => delete from tracks where created_at = '2014-07-03 15:09:16.336488' and updated_at = '2014-07-03
15:37:55.253801';
> > ERROR:  update or delete on table "tracks" violates foreign key constraint "fk_sources_on_track_id" on table
"sources"
> > DETAIL:  Key (id)=(331804) is still referenced from table "sources”.
>
> Apparently there is a row from another table referencing this one. So either the referenced row does actually exist
(corruptindex theory) or it doesn’t and the referencing row is from an older transaction as well (TX wraparound
theory).
>
> Considering that you’re seeing this regularly, my bet is on TX wraparound.
>
> Alban Hertroys
> --
> If you can't see the forest for the trees,
> cut the trees and you'll find there is no forest.
>
>

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.



pgsql-general by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: postgresql referencing and creating types as record
Next
From: Seref Arikan
Date:
Subject: Disk space available, but getting error "could not write to hash-join temporary file: No space left on device"