Thread: BUG #1242: Major bug in pgSQL

BUG #1242: Major bug in pgSQL

From

"PostgreSQL Bugs List"

Date:

07 September 2004, 08:57:24

The following bug has been logged online:

Bug reference:      1242
Logged by:

Email address:      freecontact@freenet.de

PostgreSQL version: 7.4.3

Operating system:   Linux Debian 3.1

Description:        Major bug in pgSQL

Details:

Hi,

apparently we have found a critical bug in pgSQL. At the present moment we
can not reproduce it, but here is the description:

Foreword: we have a very high loaded pgSQL-based application with thousands
of simultaneous users and overall 20 high performance server. Another
nice-to-know thing to mention - we have very large transaction blocks, with
some hunderds of SQL statements in each block.

Problem: in some cases we experience the following problem - we have found
in the database some _absolutely_ identical rows, despite the fact, that we
have defined some unique (!) indexes on some of the fields and even primary
(!) keys, we can see, that the rows are _exactly_ the same. In some cases we
have seen up to 7 absolutely identical rows, with the same primary keys and
the same unique indexed fields.

This problem is in our eyes absolutely critical. We are even considering
right now the change to another DBMS :(, even though we were in the past
always very satisfied with pgSQL...

We are looking forward to hear, if there are any known solutions for this
kind of problem. Thank you very much!

Re: BUG #1242: Major bug in pgSQL

From

Gaetano Mendola

Date:

07 September 2004, 14:37:50

PostgreSQL Bugs List wrote:

> Problem: in some cases we experience the following problem - we have found
> in the database some _absolutely_ identical rows, despite the fact, that we
> have defined some unique (!) indexes on some of the fields and even primary
> (!) keys, we can see, that the rows are _exactly_ the same. In some cases we
> have seen up to 7 absolutely identical rows, with the same primary keys and
> the same unique indexed fields.

I had the same experiences in 7.3 release and I realized that this is due to
some interaction between vacuum, reindex and update on the same table. See
this posts:

http://archives.postgresql.org/pgsql-bugs/2003-05/msg00060.php
http://www.mail-archive.com/pgsql-admin@postgresql.org/msg09025.html
http://archives.postgresql.org/pgsql-admin/2003-04/msg00407.php
http://archives.postgresql.org/pgsql-bugs/2003-11/msg00129.php

unfortunatelly I never was able to reproduce it.

When you are experiencing this show us the result of this query:

select cmax, cmin, xmax, xmin, * from <table> where <your condition>;

where <your condition> is a filter in order to obtain the rows wit the
primary key duplicated.

However I'm sure that you don't have two row with duplicated primary
key but two version of the same row, the result however is the same.

Are you reindexing your tables regulary ?

Regards
Gaetano Mendola

Re: BUG #1242: Major bug in pgSQL

From

Tom Lane

Date:

07 September 2004, 15:09:19

Gaetano Mendola <mendola@bigfoot.com> writes:
> When you are experiencing this show us the result of this query:
> select cmax, cmin, xmax, xmin, * from <table> where <your condition>;

Also, please, the ctid and oid columns (but leave out oid if you made
the table WITHOUT OIDS).

Also, if the condition is one that will normally use an index, try
the same query with and without "set enable_indexscan = off".  It
could be that a corrupted index would cause the query to visit the
same rows multiple times (or miss rows!).

It might be a good idea to REINDEX the primary-key index on the table,
but I would counsel not doing so until we have more data on what's
happening.  If the problem is index corruption then REINDEX would
destroy all the evidence ...

            regards, tom lane

Re: BUG #1242: Major bug in pgSQL

From

"Nobody"

Date:

11 September 2004, 05:30:37

Thank you very much for a quick answer. Here is our situation:

Query:
select cmax, cmin, xmax, xmin, ctid, oid, *
from unitcontainer where unitcontainername = 'Track';

Result:
oid        id_unitcontainer    id_basecomb    id_unitcontainertype    id_userinstance
unitcontainername
16.995.030    10.464        1.009        1                1.063            'Track'
cmax 102, cmin 126761922, xmax 126761922, xmin 121168748, ctid (84,107)

so, only ONE row. "unitcontainername" has an unique index ("index_12"). Here
is the result for the following query:

Query:
reindex index index_12;

Result:
ERROR:  could not create unique index
DETAIL:  Table contains duplicated values.

Unable to reindex as well :(.

Now comes the most interesting part.

Query:
select cmax, cmin, xmax, xmin, ctid, oid, * from ds_unitcontainer where oid
= 16995030

Result:
oid        id_unitcontainer    id_unitcontainertype    id_userinstance    unitcontainername
16.995.030    10.464        1                1.063            'Track'
cmax 121457766, cmin 21, xmax 121168748, xmin 121168748, ctid (83,79)

16.995.030    10.464        1                1.063            'Track'
cmax 102, cmin 126761922, xmax 126761922, xmin 121168748, ctid (84,107)

Now we get TWO rows! The most astonishing thing about it - there are whole 3
uniqueness violations:
1) primary key violation for "id_unitcontainer"
2) unique index violation for "unitcontainername"
3) the most "crazy" one - for "oid" !

Please let us know, what we can do to fix this kind of problems, reindexing
seems to be impossible too :(:( ... Thank you very much !


> Gaetano Mendola <mendola@bigfoot.com> writes:
> > When you are experiencing this show us the result of this query:
> > select cmax, cmin, xmax, xmin, * from <table> where <your condition>;
>
> Also, please, the ctid and oid columns (but leave out oid if you made
> the table WITHOUT OIDS).
>
> Also, if the condition is one that will normally use an index, try
> the same query with and without "set enable_indexscan = off".  It
> could be that a corrupted index would cause the query to visit the
> same rows multiple times (or miss rows!).
>
> It might be a good idea to REINDEX the primary-key index on the table,
> but I would counsel not doing so until we have more data on what's
> happening.  If the problem is index corruption then REINDEX would
> destroy all the evidence ...
>
>             regards, tom lane
>
>