BUG #8656: Duplicate data violating unique constraints - Mailing list pgsql-bugs
From | maciek@heroku.com |
---|---|
Subject | BUG #8656: Duplicate data violating unique constraints |
Date | |
Msg-id | E1VoIgB-0005Ok-3X@wrigleys.postgresql.org Whole thread Raw |
Responses |
Re: BUG #8656: Duplicate data violating unique constraints
Re: BUG #8656: Duplicate data violating unique constraints |
List | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 8656 Logged by: Maciek Sakrejda Email address: maciek@heroku.com PostgreSQL version: 9.3.1 Operating system: Ubuntu 12.04 LTS 64-bit Description: A customer has run into an issue where data in a single table was apparently duplicated somehow, violating the unique constraint imposed by the primary key: => select id, count(*) from post group by id having count(*) > 1; id | count ---------+------- 1836573 | 2 1855409 | 2 1855421 | 2 1855634 | 2 1855665 | 2 1866749 | 2 1877913 | 2 1877914 | 2 1886591 | 2 1897572 | 2 (10 rows) What's more, the table has 12 columns, and the data for each column for all of these duplicate rows is identical between the two versions. This table also has an updated_at column maintained by triggers, and the updated_at for all of these duplicated rows seems to have happened in a brief window between 2013-12-01 06:09:31.138317+00 and 2013-12-01 06:13:07.398258+00. As far as we can tell, nothing unusual was happening in the application around this time. In case the table schema itself is relevant, here it is (somewhat anonymized at the customer's request): Column | Type | Modifiers ----------------------+--------------------------+------------------------------------------------------------ id | integer | not null default nextval('post_id_seq'::regclass) col2 | integer | not null col3 | timestamp with time zone | not null col4 | character varying(100) | not null col5 | integer | not null col6 | integer | not null col7 | timestamp with time zone | not null updated_at | timestamp with time zone | not null col9 | timestamp with time zone | not null col10 | integer | not null col11 | integer | not null col12 | character varying(100) | not null Indexes: "post_pkey" PRIMARY KEY, btree (id) "post_col4_idx" UNIQUE CONSTRAINT, btree (col4) "post_col2_col3_idx" btree (col2, col3) CLUSTER Foreign-key constraints: "post_col2_fkey" FOREIGN KEY (col2) REFERENCES other_table(id) ON DELETE CASCADE Triggers: t1 BEFORE INSERT ON post FOR EACH ROW EXECUTE PROCEDURE on_insert() t2 BEFORE UPDATE ON post FOR EACH ROW EXECUTE PROCEDURE on_update() CREATE OR REPLACE FUNCTION on_insert() RETURNS trigger LANGUAGE plpgsql AS $function$ BEGIN NEW.version = 1; NEW.created_at = timezone('UTC', now()); NEW.updated_at = NEW.created_at; RETURN NEW; END; $function$ CREATE OR REPLACE FUNCTION on_update() RETURNS trigger LANGUAGE plpgsql AS $function$ BEGIN NEW.version = OLD.version + 1; NEW.created_at = OLD.created_at; NEW.updated_at = timezone('UTC', now()); RETURN NEW; END; $function$ The data has been deleted in the primary system, but it was captured in a pg_dump backup, and I was able to restore that (except for the unique constraints, obviously) in a separate database and can dig in further. This is a fresh database, restored from a pg_dump backup this past Saturday, *not* a promoted replica, so I don't think the recent replication issues come into play here. Any ideas?
pgsql-bugs by date: