Thread: BUG #8382: Duplicate primary key

BUG #8382: Duplicate primary key

From
curd.reinert@ppi.de
Date:
The following bug has been logged on the website:

Bug reference:      8382
Logged by:          Curd Reinert
Email address:      curd.reinert@ppi.de
PostgreSQL version: 9.0.5
Operating system:   Windows Server 2013
Description:

One of our customers encounters a problem because some tables contain
multiple rows with the same primary key. I've noticed a similar issue has
been reported in #3231 and #7760, but couldn't see how to proceed from
there.


The customer has about 1.000 installations of our software, each running
with its own Postgres instance, the vast majority running without any
problems. One system is causing most of the trouble, with the problem
occuring about every one or two weeks. Our software uses about 70 tables. In
two of these tables, multiple rows with the same primary key occure. These
tables are used to lock resources between threads. One of them, called
"cluster_ressource" has a constant set of rows which are updated frequently.
The other, called "sperre" is empty when the system is idle. Rows are
inserted to lock certain objects and removed when the locks are released.


The operating system is Windows Server 2003. apparently in a VMWare virtual
machine. The data folder is on D: which is a VMWare virtual disk. Compared
to the other systems, this one has quite a lot of traffic and therefore a
high probability of concurrent updates. According to our customer, there
were no problems with the same system as long as it was running with an
older version of our software which used Postgres 8.3. Of course there were
also changes in our software.


Once the problem is noticed, the customer fixes it by deleting both entries
and inserting a new one. In one case, a REINDEX was necessary to get the
system running again. Afterwards, the system is running smoothly for some
time, approximately one or two weeks. Then duplicate rows appear again.


I checked fsync, it has not been switched off.


Our application is written in Java, using a plain JDBC driver to access the
database. Of course we don't fiddle with the data model, indexes etc. during
runtime.


I do have a copy of the data directory with a duplicate row in
cluster_ressource, in case anyone wants to dive into it.


It seems that this not just one system. Most of their systems are nowadays
running on Postgres 9.0, and the majority without any problems. However, the
customer encountered similar problems with a handful of installations. Again
the problems started after updating to Postgres 9.0, and again these systems
have a higher workload than the average.


Please let me know if you need any further information. Thank you very much
in advance!

Re: BUG #8382: Duplicate primary key

From
Greg Stark
Date:
On Tue, Aug 13, 2013 at 2:50 PM,  <curd.reinert@ppi.de> wrote:
> The following bug has been logged on the website:
>
> Bug reference:      8382
> Logged by:          Curd Reinert
> Email address:      curd.reinert@ppi.de
> PostgreSQL version: 9.0.5

The current bug-fix release for Postgres 9 is 9.0.13. There was an
extremely high profile security vulnerability fixed in 9.0.13 so I
would strongly recommend updating.

I see bugs that could cause this in both 9.0.7 and 9.0.11. However
both are very rare and hard to believe you would see either of them
happen repeatedly. Unless you're calling CREATE INDEX CONCURRENTLY
*often* on this system?

Do you know anything about the history of how often the system was
rebooted -- either the host or the virtual machine? Also, have you run
a memory checker on the system?

--
greg

Re: BUG #8382: Duplicate primary key

From
Curd Reinert
Date:
Hello Greg,

gsstark@gmail.com schrieb am 13.08.2013 19:50:35:
> The current bug-fix release for Postgres 9 is 9.0.13. There was an
> extremely high profile security vulnerability fixed in 9.0.13 so I
> would strongly recommend updating.
Yes, I noticed that one. However, those installations are not exposed to
the outside, they do have backups, the data inside the database is not
very critical and there are about 1.000 installations to update. I don't
think I can make our customer update all of them. Should be no problem for
the one that is causing the trouble, though.

> I see bugs that could cause this in both 9.0.7 and 9.0.11.
Okay. I thought I checked the list of fixed bugs, but didn't notice them
then. I can see them now you've pointed me to the relevant patch levels. I
will tell our cutomer to update that installation to 9.0.13.

> Unless you're calling CREATE INDEX CONCURRENTLY
> *often* on this system?
Never. As we use practically the same code for Postgres, DB2 and Oracle,
we are stuck to plain old SQL and don't do anything but selects, inserts,
updates and deletes. And I'm pretty sure our customer isn't touching the
database, either.

> Do you know anything about the history of how often the system was
> rebooted -- either the host or the virtual machine?
I can't say for sure, but it seems that the VM is rebooted at least every
night. I have no idea about the host.

> Also, have you run a memory checker on the system?
I'm two levels remote from the machine, but I will pass it on.

Thank you very much.

Curd

Re: BUG #8382: Duplicate primary key

From
Greg Stark
Date:
On Tue, Aug 13, 2013 at 2:50 PM,  <curd.reinert@ppi.de> wrote:
> I do have a copy of the data directory with a duplicate row in
> cluster_ressource, in case anyone wants to dive into it.

Could you run the following?

select ctid,xmin,xmax,primarykey from tab where primarykey = 'duplicatevalue'

And also show the \d tab output?

And can you load the pageinspect contrib module in this database?


--
greg

Re: BUG #8382: Duplicate primary key

From
Curd Reinert
Date:
gsstark@gmail.com schrieb am 14.08.2013 12:44:40:
> select ctid,xmin,xmax,primarykey from tab where primarykey =3D=20
'duplicatevalue'
traviclinkengine=3D# select ctid,xmin,xmax,* from cluster=5Fressource where=
=20
clusterknoten=5Fid =3D 1 and ressource=5Fid =3D 10;
 ctid  |   xmin    | xmax | clusterknoten=5Fid | ressource=5Fid | anzahl
-------+-----------+------+------------------+--------------+--------
 (1,9) | 308561268 |    0 |                1 |           10 |      0
 (3,9) | 308561268 |    0 |                1 |           10 |      0
(2 Zeilen)

> And also show the \d tab output?
traviclinkengine=3D# \d cluster=5Fressource
Tabelle =BBpublic.cluster=5Fressource=AB
      Spalte      |   Typ    | Attribute
------------------+----------+-----------
 clusterknoten=5Fid | bigint   | not null
 ressource=5Fid     | bigint   | not null
 anzahl           | smallint | not null
Indexe:
    "cluster=5Fressource=5Fpkey" PRIMARY KEY, btree (clusterknoten=5Fid,=20
ressource=5Fid)
    "cl=5Fres=5Fres=5Ffkind" btree (ressource=5Fid)
Fremdschl=FCssel-Constraints:
    "cl=5Fres=5Fclkn=5Ffk" FOREIGN KEY (clusterknoten=5Fid) REFERENCES=20
clusterknoten(id) ON DELETE RESTRICT
    "cl=5Fres=5Fres=5Ffk" FOREIGN KEY (ressource=5Fid) REFERENCES ressource=
(id) ON=20
DELETE RESTRICT

> And can you load the pageinspect contrib module in this database?
Yes. At least I think I've just done it. :-)

Curd