Re: Inexplicable duplicate rows with unique constraint - Mailing list pgsql-general
From | Adrian Klaver |
---|---|
Subject | Re: Inexplicable duplicate rows with unique constraint |
Date | |
Msg-id | 16bd51ab-69ef-04f9-d47a-3e692b1ff595@aklaver.com Whole thread Raw |
In response to | Inexplicable duplicate rows with unique constraint (Richard van der Hoff <richard@matrix.org>) |
List | pgsql-general |
On 1/16/20 8:50 AM, Richard van der Hoff wrote: > I'm trying to track down the cause of some duplicate rows in a table > which I would expect to be impossible due to a unique constraint. I'm > hoping that somebody here will be able to suggest something I might have > missed. > > The problem relates to a bug filed against our application > (https://github.com/matrix-org/synapse/issues/6696). At first I put this > down to random data corruption on a single user's postgres instance, but > I've now seen three separate reports in as many days and am wondering if > there is more to it. > > We have a table whose schema is as follows: > > synapse=# \d current_state_events > Table "public.current_state_events" > Column | Type | Modifiers > ------------+------+----------- > event_id | text | not null > room_id | text | not null > type | text | not null > state_key | text | not null > membership | text | > Indexes: > "current_state_events_event_id_key" UNIQUE CONSTRAINT, btree > (event_id) > "current_state_events_room_id_type_state_key_key" UNIQUE > CONSTRAINT, btree (room_id, type, state_key) > "current_state_events_member_index" btree (state_key) WHERE type = > 'm.room.member'::text > > Despite the presence of the > current_state_events_room_id_type_state_key_key constraint, several > users have reported seeing errors which suggest that their tables have > duplicate rows for the same (room_id, type, state_key) triplet and > indeed querying confirms that to be the case: > > synapse=> select count(*), room_id, type, state_key from > current_state_events group by 2,3,4 order by count(*) DESC LIMIT 2; > count | room_id | type | state_key > -------+-----------------------------------+---------------+------------------------------------- > > 3 | !ueLfVrSWYGMnFnoCbT:darkfasel.net | m.room.member | > @irc_ebi_:darkfasel.net > 3 | !HwocBmCtBcHQhILtYQ:matrix.org | m.room.member | > @freenode_AlmarShenwan_:matrix.org > (2 rows) > I'm assuming the above are obfuscated? > > Further investigation suggests that these are genuinely separate rows > rather than duplicate entries in an index. If you use length() on the values are they the same? > > The index appears to consider itself valid: > > synapse=> select i.* from pg_class c join pg_index i on > i.indexrelid=c.oid where > relname='current_state_events_room_id_type_state_key_key'; > indexrelid | indrelid | indnatts | indisunique | indisprimary | > indisexclusion | indimmediate | indisclustered | indisvalid | > indcheckxmin | indisready | indislive | indisreplident | indkey | > indcollation | indclass | indoption | indexprs | indpred > ------------+----------+----------+-------------+--------------+----------------+--------------+----------------+------------+--------------+------------+-----------+----------------+--------+--------------+----------------+-----------+----------+--------- > > 17023 | 16456 | 3 | t | f | f > | t | f | t | f | t > | t | f | 2 3 4 | 100 100 100 | 3126 > 3126 3126 | 0 0 0 | | > (1 row) > > So, question: what could we be doing wrong to get ourselves into this > situation? > > Some other datapoints which may be relevant: > * this has been reported by one user on postgres 9.6.15 and one on > 10.10, though it's hard to be certain of the version that was running > when the duplication occurred > * the constraint is added when the table is first created (before any > data is added) > * At least one user reports that he has recently migrated his database > from one server to another via a `pg_dump -C` and later piping into psql. > > -- Adrian Klaver adrian.klaver@aklaver.com
pgsql-general by date: