Inexplicable duplicate rows with unique constraint - Mailing list pgsql-general
From | Richard van der Hoff |
---|---|
Subject | Inexplicable duplicate rows with unique constraint |
Date | |
Msg-id | 288488b7-d3b6-905b-3706-43f7a3c7bc52@matrix.org Whole thread Raw |
Responses |
Re: Inexplicable duplicate rows with unique constraint
Re: Inexplicable duplicate rows with unique constraint Re: Inexplicable duplicate rows with unique constraint Re: Inexplicable duplicate rows with unique constraint Re: Inexplicable duplicate rows with unique constraint |
List | pgsql-general |
I'm trying to track down the cause of some duplicate rows in a table which I would expect to be impossible due to a unique constraint. I'm hoping that somebody here will be able to suggest something I might have missed. The problem relates to a bug filed against our application (https://github.com/matrix-org/synapse/issues/6696). At first I put this down to random data corruption on a single user's postgres instance, but I've now seen three separate reports in as many days and am wondering if there is more to it. We have a table whose schema is as follows: synapse=# \d current_state_events Table "public.current_state_events" Column | Type | Modifiers ------------+------+----------- event_id | text | not null room_id | text | not null type | text | not null state_key | text | not null membership | text | Indexes: "current_state_events_event_id_key" UNIQUE CONSTRAINT, btree (event_id) "current_state_events_room_id_type_state_key_key" UNIQUE CONSTRAINT, btree (room_id, type, state_key) "current_state_events_member_index" btree (state_key) WHERE type = 'm.room.member'::text Despite the presence of the current_state_events_room_id_type_state_key_key constraint, several users have reported seeing errors which suggest that their tables have duplicate rows for the same (room_id, type, state_key) triplet and indeed querying confirms that to be the case: synapse=> select count(*), room_id, type, state_key from current_state_events group by 2,3,4 order by count(*) DESC LIMIT 2; count | room_id | type | state_key -------+-----------------------------------+---------------+------------------------------------- 3 | !ueLfVrSWYGMnFnoCbT:darkfasel.net | m.room.member | @irc_ebi_:darkfasel.net 3 | !HwocBmCtBcHQhILtYQ:matrix.org | m.room.member | @freenode_AlmarShenwan_:matrix.org (2 rows) Further investigation suggests that these are genuinely separate rows rather than duplicate entries in an index. The index appears to consider itself valid: synapse=> select i.* from pg_class c join pg_index i on i.indexrelid=c.oid where relname='current_state_events_room_id_type_state_key_key'; indexrelid | indrelid | indnatts | indisunique | indisprimary | indisexclusion | indimmediate | indisclustered | indisvalid | indcheckxmin | indisready | indislive | indisreplident | indkey | indcollation | indclass | indoption | indexprs | indpred ------------+----------+----------+-------------+--------------+----------------+--------------+----------------+------------+--------------+------------+-----------+----------------+--------+--------------+----------------+-----------+----------+--------- 17023 | 16456 | 3 | t | f | f | t | f | t | f | t | t | f | 2 3 4 | 100 100 100 | 3126 3126 3126 | 0 0 0 | | (1 row) So, question: what could we be doing wrong to get ourselves into this situation? Some other datapoints which may be relevant: * this has been reported by one user on postgres 9.6.15 and one on 10.10, though it's hard to be certain of the version that was running when the duplication occurred * the constraint is added when the table is first created (before any data is added) * At least one user reports that he has recently migrated his database from one server to another via a `pg_dump -C` and later piping into psql.
pgsql-general by date: