Thread: pg primary key bug?

pg primary key bug?

From
pginfo
Date:
Hi,

I am using pg 7.4.5 on RedHat AS 3.0.

I am using it via jdbc and jboss.

I have found a big problem with the way pg supports primary keys.
The bug was reported by my customers (we have installed pg with our
ERP at ~500 customers) and I do not know how it happened.

sklad21=# \d a_constants_str
        Table "public.a_constants_str"
  Column    |         Type          | Modifiers
------------+-----------------------+-----------
 constname  | character varying(30) | not null
 fid        | integer               | not null
 constvalue | character varying(30) |
Indexes:
   "a_constants_str_pkey" primary key, btree (constname, fid)

sklad21=# select * from a_constants_str ;
 constname | fid | constvalue
-----------+-----+-------------
 AACCGRID  |   0 | SOF_3
 AKLTYPID  |   0 | SOF_3
 ADARID    |   0 | SOF_2
 AOBLASTID |   0 | SOF_6
 AUSERID   |   0 | SOF_17
 ANMGRID   |   0 | SOF_21
 LOCAID    |   0 | SOF_41
 DOCID     |   0 | SOF_1585254
 DOCPLAID  |   0 | SOF_1052900
 AKLIID    |   0 | SOF_18740
 DOCRID    |   0 | SOF_2268142
 DOCPOGPLA |   0 | SOF_324586
 DOCID     |   0 | SOF_1585254
 DOCID     |   0 | SOF_1585254
 DOCID     |   0 | SOF_1585254
 AKLGRID   |   0 | SOF_45
 DOCID     |   0 | SOF_1585254
 DOCID     |   0 | SOF_1585254
 ASETUPID  |   0 | SOF_4605
 DOCID     |   0 | SOF_1585254
 DOCID     |   0 | SOF_1585254
 TDOCID    |   0 | SOF_337
 TDOCRID   |   0 | SOF_19450
 DOCID     |   0 | SOF_1585254
 DOCID     |   0 | SOF_1585254
 AGRADID   |   0 | SOF_256
 DOCID     |   0 | SOF_1585254
 ASLUID    |   0 | SOF_46
 NASTRF    |   0 | SOF_88
 ANOMID    |   0 | SOF_1200
(30 rows)


Pls., see the records with 'DOCID' and note we have a primary key defined.
At this moment we will create a new db and dump and reload the data.
The old one will stay, and if needed we will be able to study the files
(access to this customer is not very simple, but possible).

regards,
ivan.





Re: pg primary key bug?

From
Michael Glaesemann
Date:
On Feb 3, 2005, at 21:53, pginfo wrote:

> I am using pg 7.4.5 on RedHat AS 3.0.

> sklad21=# \d a_constants_str
>         Table "public.a_constants_str"
>   Column   |         Type          | Modifiers
> ------------+-----------------------+-----------
> constname  | character varying(30) | not null
> fid        | integer               | not null
> constvalue | character varying(30) |
> Indexes:
>    "a_constants_str_pkey" primary key, btree (constname, fid)
>
> sklad21=# select * from a_constants_str ;
> constname | fid | constvalue -----------+-----+-------------
> AACCGRID  |   0 | SOF_3
> AKLTYPID  |   0 | SOF_3
> ADARID    |   0 | SOF_2
> AOBLASTID |   0 | SOF_6
> AUSERID   |   0 | SOF_17
> ANMGRID   |   0 | SOF_21
> LOCAID    |   0 | SOF_41
> DOCID     |   0 | SOF_1585254
> DOCPLAID  |   0 | SOF_1052900
> AKLIID    |   0 | SOF_18740
> DOCRID    |   0 | SOF_2268142
> DOCPOGPLA |   0 | SOF_324586
> DOCID     |   0 | SOF_1585254
> DOCID     |   0 | SOF_1585254
> DOCID     |   0 | SOF_1585254
> AKLGRID   |   0 | SOF_45
> DOCID     |   0 | SOF_1585254
> DOCID     |   0 | SOF_1585254
> ASETUPID  |   0 | SOF_4605
> DOCID     |   0 | SOF_1585254
> DOCID     |   0 | SOF_1585254
> TDOCID    |   0 | SOF_337
> TDOCRID   |   0 | SOF_19450
> DOCID     |   0 | SOF_1585254
> DOCID     |   0 | SOF_1585254
> AGRADID   |   0 | SOF_256
> DOCID     |   0 | SOF_1585254
> ASLUID    |   0 | SOF_46
> NASTRF    |   0 | SOF_88
> ANOMID    |   0 | SOF_1200
> (30 rows)
>
>
> Pls., see the records with 'DOCID' and note we have primary key 
> defined.


It's unclear from just this data, but there's a chance that there are 
varying numbers of spaces after 'DOCID', which would appear as separate 
values for the index, even though they aren't readily apparent. Could 
you show us the results of the following query?

select constname, length(constname) as constname_length
from a_constants_str;

Here's another one which would show if PostgreSQL is treating them 
equally:

select constname, count(constname)
from a_constants_str
group by constname;

The results of these queries might shed some light on the issue.
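A follow-up that would list only the duplicated keys directly, with their counts (a sketch against the same table):

```sql
-- show only constnames that occur more than once
select constname, count(*) as occurrences
from a_constants_str
group by constname
having count(*) > 1;
```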

Hope this helps.

Michael Glaesemann
grzm myrealbox com



Re: pg primary key bug?

From
pginfo
Date:
Hi,

It is not a trailing-spaces problem.

I needed to dump the data, so I deleted all the rows for 'DOCID' with
delete from a_constants_str where constname = 'DOCID';
and pg reported 10 rows deleted.

Sorry, I cannot execute the command now (I dropped the data, because I
needed unique constnames for the restore).

regards,
ivan.


Michael Glaesemann wrote:

>
> On Feb 3, 2005, at 21:53, pginfo wrote:
>
>> I am using pg 7.4.5 on RedHat AS 3.0.
>
>
>> sklad21=# \d a_constants_str
>>         Table "public.a_constants_str"
>>   Column   |         Type          | Modifiers
>> ------------+-----------------------+-----------
>> constname  | character varying(30) | not null
>> fid        | integer               | not null
>> constvalue | character varying(30) |
>> Indexes:
>>    "a_constants_str_pkey" primary key, btree (constname, fid)
>>
>> sklad21=# select * from a_constants_str ;
>> constname | fid | constvalue -----------+-----+-------------
>> AACCGRID  |   0 | SOF_3
>> AKLTYPID  |   0 | SOF_3
>> ADARID    |   0 | SOF_2
>> AOBLASTID |   0 | SOF_6
>> AUSERID   |   0 | SOF_17
>> ANMGRID   |   0 | SOF_21
>> LOCAID    |   0 | SOF_41
>> DOCID     |   0 | SOF_1585254
>> DOCPLAID  |   0 | SOF_1052900
>> AKLIID    |   0 | SOF_18740
>> DOCRID    |   0 | SOF_2268142
>> DOCPOGPLA |   0 | SOF_324586
>> DOCID     |   0 | SOF_1585254
>> DOCID     |   0 | SOF_1585254
>> DOCID     |   0 | SOF_1585254
>> AKLGRID   |   0 | SOF_45
>> DOCID     |   0 | SOF_1585254
>> DOCID     |   0 | SOF_1585254
>> ASETUPID  |   0 | SOF_4605
>> DOCID     |   0 | SOF_1585254
>> DOCID     |   0 | SOF_1585254
>> TDOCID    |   0 | SOF_337
>> TDOCRID   |   0 | SOF_19450
>> DOCID     |   0 | SOF_1585254
>> DOCID     |   0 | SOF_1585254
>> AGRADID   |   0 | SOF_256
>> DOCID     |   0 | SOF_1585254
>> ASLUID    |   0 | SOF_46
>> NASTRF    |   0 | SOF_88
>> ANOMID    |   0 | SOF_1200
>> (30 rows)
>>
>>
>> Pls., see the records with 'DOCID' and note we have primary key defined.
>
>
>
> It's unclear from just this data, but there's a chance that there are 
> varying numbers of spaces after 'DOCID', which would appear as 
> separate values for the index, even though they aren't readily 
> apparent. Could you show us the results of the following query?
>
> select constname, length(constname) as constname_length
> from a_constants_str;
>
> Here's another one which would show if PostgreSQL is treating them 
> equally:
>
> select constname, count(constname)
> from a_constants_str
> group by constname;
>
> The results of these queries might shed some light on the issue.
>
> Hope this helps.
>
> Michael Glaesemann
> grzm myrealbox com
>
>
>




Re: pg primary key bug?

From
Richard_D_Levine@raytheon.com
Date:
There are not enough spaces available in the column to allow for the number
of DOC_IDs.  There are three spaces, allowing for at most four occurrences
of DOC_ID, but there are eleven.  --Rick


Michael Glaesemann <grzm@myrealbox.com>
To:      pginfo <pginfo@t1.unisoftbg.com>
cc:      pgsql-sql@postgresql.org
Sent by: pgsql-sql-owner@postgresql.org
Subject: Re: [SQL] pg primary key bug?
02/03/2005 09:14 AM

On Feb 3, 2005, at 21:53, pginfo wrote:

> I am using pg 7.4.5 on RedHat AS 3.0.

> sklad21=# \d a_constants_str
>         Table "public.a_constants_str"
>   Column   |         Type          | Modifiers
> ------------+-----------------------+-----------
> constname  | character varying(30) | not null
> fid        | integer               | not null
> constvalue | character varying(30) |
> Indexes:
>    "a_constants_str_pkey" primary key, btree (constname, fid)
>
> sklad21=# select * from a_constants_str ;
> constname | fid | constvalue -----------+-----+-------------
> AACCGRID  |   0 | SOF_3
> AKLTYPID  |   0 | SOF_3
> ADARID    |   0 | SOF_2
> AOBLASTID |   0 | SOF_6
> AUSERID   |   0 | SOF_17
> ANMGRID   |   0 | SOF_21
> LOCAID    |   0 | SOF_41
> DOCID     |   0 | SOF_1585254
> DOCPLAID  |   0 | SOF_1052900
> AKLIID    |   0 | SOF_18740
> DOCRID    |   0 | SOF_2268142
> DOCPOGPLA |   0 | SOF_324586
> DOCID     |   0 | SOF_1585254
> DOCID     |   0 | SOF_1585254
> DOCID     |   0 | SOF_1585254
> AKLGRID   |   0 | SOF_45
> DOCID     |   0 | SOF_1585254
> DOCID     |   0 | SOF_1585254
> ASETUPID  |   0 | SOF_4605
> DOCID     |   0 | SOF_1585254
> DOCID     |   0 | SOF_1585254
> TDOCID    |   0 | SOF_337
> TDOCRID   |   0 | SOF_19450
> DOCID     |   0 | SOF_1585254
> DOCID     |   0 | SOF_1585254
> AGRADID   |   0 | SOF_256
> DOCID     |   0 | SOF_1585254
> ASLUID    |   0 | SOF_46
> NASTRF    |   0 | SOF_88
> ANOMID    |   0 | SOF_1200
> (30 rows)
>
>
> Pls., see the records with 'DOCID' and note we have primary key
> defined.


It's unclear from just this data, but there's a chance that there are
varying numbers of spaces after 'DOCID', which would appear as separate
values for the index, even though they aren't readily apparent. Could
you show us the results of the following query?

select constname, length(constname) as constname_length
from a_constants_str;

Here's another one which would show if PostgreSQL is treating them
equally:

select constname, count(constname)
from a_constants_str
group by constname;

The results of these queries might shed some light on the issue.

Hope this helps.

Michael Glaesemann
grzm myrealbox com


---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command   (send "unregister YourEmailAddressHere" to
majordomo@postgresql.org)





Re: pg primary key bug?

From
pginfo
Date:
Hi all,

Sorry for my post, but I think it is a pg primary key bug.

It is the second time that we have found the bug (see the replies to this message).
We have many servers with pg and use it over jdbc + jboss.

I am not able to stop this server for a long time, but I have dumped the problem
table. It is very important to know whether it is a bug, because we have many
servers running pg + our ERP and are continuing to install new ones.

01=# select * from a_constants_str order by constname;
 constname | fid | constvalue
-----------+-----+------------
 AACCGRID  |   0 | SOF_3
 ADARID    |   0 | SOF_2
 AGRADID   |   0 | SOF_165
 AKLGRID   |   0 | SOF_8
 AKLIID    |   0 | SOF_3513
 AKLTYPID  |   0 | SOF_3
 ANMGRID   |   0 | SOF_10
 ANOMID    |   0 | SOF_747
 AOBLASTID |   0 | SOF_3
 ASETUPID  |   0 | SOF_399
 ASLUID    |   0 | SOF_17
 AUSERID   |   0 | SOF_3
 DOCID     |   0 | SOF_25658
 DOCPLAID  |   0 | SOF_19738
 DOCPLAID  |   0 | SOF_19738
 DOCPOGPLA |   0 | SOF_24281
 DOCRID    |   0 | SOF_184547
 LOCAID    |   0 | SOF_13
 NASTRF    |   0 | SOF_1
 TDOCID    |   0 | SOF_47
 TDOCRID   |   0 | SOF_2439
(21 rows)

01=# select * from a_constants_str where constname='DOCPLAID' ;
 constname | fid | constvalue
-----------+-----+------------
 DOCPLAID  |   0 | SOF_19738
 DOCPLAID  |   0 | SOF_19738
(2 rows)


01=# \d a_constants_str;
         Table "public.a_constants_str"
  Column    |         Type          | Modifiers
------------+-----------------------+-----------
 constname  | character varying(30) | not null
 fid        | integer               | not null
 constvalue | character varying(30) |
Indexes:
    "a_constants_str_pkey" primary key, btree (constname, fid)


regards,
ivan


Richard_D_Levine@raytheon.com wrote:

> There are not enough spaces available in the column to allow for the number
> of DOC_IDs.  There are three spaces, allowing for at most four occurrences
> of DOC_ID, but there are eleven.  --Rick
>
> Michael Glaesemann <grzm@myrealbox.com> wrote on 02/03/2005 09:14 AM
> To: pginfo <pginfo@t1.unisoftbg.com>
> cc: pgsql-sql@postgresql.org
> Subject: Re: [SQL] pg primary key bug?
>
> On Feb 3, 2005, at 21:53, pginfo wrote:
>> I am using pg 7.4.5 on RedHat AS 3.0.
>
>> sklad21=# \d a_constants_str
>>        Table "public.a_constants_str"
>>  Column   |         Type          | Modifiers
>> ------------+-----------------------+-----------
>> constname  | character varying(30) | not null
>> fid        | integer               | not null
>> constvalue | character varying(30) |
>> Indexes:  "a_constants_str_pkey" primary key, btree (constname, fid)
>>
>> sklad21=# select * from a_constants_str ;
>> constname | fid | constvalue
>> -----------+-----+-------------
>> AACCGRID  |   0 | SOF_3
>> AKLTYPID  |   0 | SOF_3
>> ADARID    |   0 | SOF_2
>> AOBLASTID |   0 | SOF_6
>> AUSERID   |   0 | SOF_17
>> ANMGRID   |   0 | SOF_21
>> LOCAID    |   0 | SOF_41
>> DOCID     |   0 | SOF_1585254
>> DOCPLAID  |   0 | SOF_1052900
>> AKLIID    |   0 | SOF_18740
>> DOCRID    |   0 | SOF_2268142
>> DOCPOGPLA |   0 | SOF_324586
>> DOCID     |   0 | SOF_1585254
>> DOCID     |   0 | SOF_1585254
>> DOCID     |   0 | SOF_1585254
>> AKLGRID   |   0 | SOF_45
>> DOCID     |   0 | SOF_1585254
>> DOCID     |   0 | SOF_1585254
>> ASETUPID  |   0 | SOF_4605
>> DOCID     |   0 | SOF_1585254
>> DOCID     |   0 | SOF_1585254
>> TDOCID    |   0 | SOF_337
>> TDOCRID   |   0 | SOF_19450
>> DOCID     |   0 | SOF_1585254
>> DOCID     |   0 | SOF_1585254
>> AGRADID   |   0 | SOF_256
>> DOCID     |   0 | SOF_1585254
>> ASLUID    |   0 | SOF_46
>> NASTRF    |   0 | SOF_88
>> ANOMID    |   0 | SOF_1200
>> (30 rows)
>>
>>
>> Pls., see the records with 'DOCID' and note we have primary key
>> defined.
>
> It's unclear from just this data, but there's a chance that there are
> varying numbers of spaces after 'DOCID', which would appear as separate
> values for the index, even though they aren't readily apparent. Could
> you show us the results of the following query?
>
> select constname, length(constname) as constname_length
> from a_constants_str;
>
> Here's another one which would show if PostgreSQL is treating them
> equally:
>
> select constname, count(constname)
> from a_constants_str
> group by constname;
>
> The results of these queries might shed some light on the issue.
>
> Hope this helps.
>
> Michael Glaesemann
> grzm myrealbox com

Re: pg primary key bug?

From
Michael Fuhr
Date:
On Thu, Feb 17, 2005 at 04:12:38PM +0100, pginfo wrote:

> 01=# select * from a_constants_str where constname='DOCPLAID' ;
> constname | fid | constvalue
> -----------+-----+------------
> DOCPLAID  |   0 | SOF_19738
> DOCPLAID  |   0 | SOF_19738
> (2 rows)

Do you have any inherited tables?  What's the result of the following
query?

SELECT tableoid::regclass, *
FROM a_constants_str
WHERE constname = 'DOCPLAID';

Inherited tables are documented to have deficiencies regarding
constraints.  Observe:

CREATE TABLE parent (
    constname   varchar(30) NOT NULL,
    fid         integer NOT NULL,
    constvalue  varchar(30),
    PRIMARY KEY (constname, fid)
);
CREATE TABLE child () INHERITS (parent);

INSERT INTO parent VALUES ('DOCPLAID', 0, 'SOF_19738');

INSERT INTO parent VALUES ('DOCPLAID', 0, 'SOF_19738');
ERROR:  duplicate key violates unique constraint "parent_pkey"

INSERT INTO child VALUES ('DOCPLAID', 0, 'SOF_19738');

SELECT tableoid::regclass, * FROM parent;
 tableoid | constname | fid | constvalue
----------+-----------+-----+------------
 parent   | DOCPLAID  |   0 | SOF_19738
 child    | DOCPLAID  |   0 | SOF_19738
(2 rows)
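For completeness, whether anything inherits from the table can also be checked straight from the system catalogs rather than inferred from tableoid (a sketch; these catalogs exist in 7.4 as well):

```sql
-- list any tables that inherit from a_constants_str
SELECT c.relname AS child_table
FROM pg_inherits i
JOIN pg_class c ON c.oid = i.inhrelid
JOIN pg_class p ON p.oid = i.inhparent
WHERE p.relname = 'a_constants_str';
```

An empty result means there are no children, so every row really lives in that one table.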

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/


Re: pg primary key bug?

From
pginfo
Date:
Hi,

No, I do not have inherited tables.

The result of this query is:

01=# SELECT tableoid::regclass, *
01-# FROM a_constants_str
01-# WHERE constname = 'DOCPLAID';
    tableoid     | constname | fid | constvalue
-----------------+-----------+-----+------------
 a_constants_str | DOCPLAID  |   0 | SOF_19738
 a_constants_str | DOCPLAID  |   0 | SOF_19738
(2 rows)

regards,
ivan.


Michael Fuhr wrote:

> On Thu, Feb 17, 2005 at 04:12:38PM +0100, pginfo wrote:
>> 01=# select * from a_constants_str where constname='DOCPLAID' ;
>> constname | fid | constvalue
>> -----------+-----+------------
>> DOCPLAID  |   0 | SOF_19738
>> DOCPLAID  |   0 | SOF_19738
>> (2 rows)
>
> Do you have any inherited tables?  What's the result of the following
> query?
>
> SELECT tableoid::regclass, *
> FROM a_constants_str
> WHERE constname = 'DOCPLAID';
>
> Inherited tables are documented to have deficiencies regarding
> constraints.  Observe:
>
> CREATE TABLE parent (
>     constname   varchar(30) NOT NULL,
>     fid         integer NOT NULL,
>     constvalue  varchar(30),
>     PRIMARY KEY (constname, fid)
> );
> CREATE TABLE child () INHERITS (parent);
>
> INSERT INTO parent VALUES ('DOCPLAID', 0, 'SOF_19738');
>
> INSERT INTO parent VALUES ('DOCPLAID', 0, 'SOF_19738');
> ERROR:  duplicate key violates unique constraint "parent_pkey"
>
> INSERT INTO child VALUES ('DOCPLAID', 0, 'SOF_19738');
>
> SELECT tableoid::regclass, * FROM parent;
>  tableoid | constname | fid | constvalue
> ----------+-----------+-----+------------
>  parent   | DOCPLAID  |   0 | SOF_19738
>  child    | DOCPLAID  |   0 | SOF_19738
> (2 rows)

Re: pg primary key bug?

From
Tom Lane
Date:
pginfo <pginfo@t1.unisoftbg.com> writes:
> 01=# select * from a_constants_str where constname='DOCPLAID' ;
>  constname | fid | constvalue
> -----------+-----+------------
>  DOCPLAID  |   0 | SOF_19738
>  DOCPLAID  |   0 | SOF_19738
> (2 rows)

Could we see the system columns on these rows?
select oid,xmin,cmin,xmax,cmax,ctid,* from a_constants_str where ...
        regards, tom lane


Re: pg primary key bug?

From
pginfo
Date:
Hi,

Tom Lane wrote:

> pginfo <pginfo@t1.unisoftbg.com> writes:
>> 01=# select * from a_constants_str where constname='DOCPLAID' ;
>>  constname | fid | constvalue
>> -----------+-----+------------
>>  DOCPLAID  |   0 | SOF_19738
>>  DOCPLAID  |   0 | SOF_19738
>> (2 rows)
>
> Could we see the system columns on these rows?
> select oid,xmin,cmin,xmax,cmax,ctid,* from a_constants_str where ...
>         regards, tom lane

yes,

01=# select oid,xmin,cmin,xmax,cmax,ctid,* from a_constants_str;
  oid   |  xmin   |  cmin   |  xmax   |  cmax   |  ctid   | constname | fid | constvalue
--------+---------+---------+---------+---------+---------+-----------+-----+------------
  17910 |   22331 |       2 |       2 |   26679 |   (0,7) | ANMGRID   |   0 | SOF_10
  17908 |     985 |       6 |       6 |       0 |  (0,18) | AKLTYPID  |   0 | SOF_3
  17907 |     985 |      30 |      30 |       0 |  (0,21) | ADARID    |   0 | SOF_2
  17921 |     985 |      34 |      34 |       0 |  (0,22) | AOBLASTID |   0 | SOF_3
  17911 |    4640 |       6 |       6 |   26679 |  (0,24) | AACCGRID  |   0 | SOF_3
  17920 | 1220598 |       2 |       2 | 1475630 |  (0,47) | ASLUID    |   0 | SOF_17
  17917 |  643083 |       2 |       2 | 1475630 |  (0,49) | LOCAID    |   0 | SOF_13
  17918 |  762851 |       3 |       3 | 1475630 |  (0,50) | AUSERID   |   0 | SOF_3
  17923 |   35539 |     165 |     165 | 1475630 |  (0,51) | ASETUPID  |   0 | SOF_399
 283686 |  514327 |       3 |       3 | 1475630 |  (0,52) | NASTRF    |   0 | SOF_1
  17909 | 2156667 |       2 |       2 | 2193198 |  (0,54) | AKLGRID   |   0 | SOF_8
  17922 | 2103298 |       2 |       2 | 2193198 |  (0,55) | AGRADID   |   0 | SOF_165
  17913 | 2092705 |       2 |       2 | 2193198 |  (0,56) | ANOMID    |   0 | SOF_747
  63247 | 2226373 |       2 |       2 | 2233003 | (0,126) | TDOCID    |   0 | SOF_47
  17914 | 2232893 | 2235861 | 2235861 |       3 | (1,125) | DOCID     |   0 | SOF_25658
  17915 | 2232893 | 2235861 | 2235861 |      19 |  (3,38) | DOCRID    |   0 | SOF_184547
  17916 | 2232893 | 2235861 | 2235861 |      42 |  (4,71) | DOCPLAID  |   0 | SOF_19738
  17919 | 2232893 |     231 |     231 | 2233003 | (5,104) | DOCPOGPLA |   0 | SOF_24281
  17912 | 2231139 |       2 |       2 |       0 |   (6,1) | AKLIID    |   0 | SOF_3513
  17916 | 2232893 | 2235861 | 2235861 |      41 |  (7,62) | DOCPLAID  |   0 | SOF_19738
  63249 | 2226373 |     103 |     103 |       0 | (16,31) | TDOCRID   |   0 | SOF_2439
(21 rows)

and

01=# select oid,xmin,cmin,xmax,cmax,ctid,* from a_constants_str where constname='DOCPLAID';
  oid  |  xmin   |  cmin   |  xmax   | cmax |  ctid  | constname | fid | constvalue
-------+---------+---------+---------+------+--------+-----------+-----+------------
 17916 | 2232893 | 2235861 | 2235861 |   42 | (4,71) | DOCPLAID  |   0 | SOF_19738
 17916 | 2232893 | 2235861 | 2235861 |   41 | (7,62) | DOCPLAID  |   0 | SOF_19738
(2 rows)

regards,
ivan.

Re: pg primary key bug?

From
Tom Lane
Date:
pginfo <pginfo@t1.unisoftbg.com> writes:
> Tom Lane wrote:
>> Could we see the system columns on these rows?

> 01=# select oid,xmin,cmin,xmax,cmax,ctid,* from a_constants_str where 
> constname='DOCPLAID';
>   oid  |  xmin   |  cmin   |  xmax   | cmax |  ctid  | constname | fid | 
> constvalue
> -------+---------+---------+---------+------+--------+-----------+-----+------------
>  17916 | 2232893 | 2235861 | 2235861 |   42 | (4,71) | DOCPLAID  |   0 | 
> SOF_19738
>  17916 | 2232893 | 2235861 | 2235861 |   41 | (7,62) | DOCPLAID  |   0 | 
> SOF_19738
> (2 rows)

Given the identical OID and xmin values, it seems certain that these are
the "same" row, ie there was only one insertion event.  My bet is that
the one at (7,62) is the original, and that the one at (4,71) is a copy
that was made by VACUUM FULL trying to move the row to compact the
table.  So the question is how did both copies get to be marked
simultaneously valid?  That should be impossible, unless a disk write
got dropped.  Have you had any system crashes during VACUUM FULL
operations recently?
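Assuming that diagnosis is right, one possible cleanup is to delete the extra copy by its physical location and then rebuild the index (a sketch only; the ctid values come from the output above and change whenever a row is updated or moved, so they must be re-checked immediately before deleting):

```sql
-- remove one of the two identical copies by tuple location,
-- then rebuild the unique index so it agrees with the heap again
DELETE FROM a_constants_str WHERE ctid = '(4,71)';
REINDEX TABLE a_constants_str;
```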
        regards, tom lane


Re: pg primary key bug?

From
pginfo
Date:
Hi,

Tom Lane wrote:

> pginfo <pginfo@t1.unisoftbg.com> writes:
>> Tom Lane wrote:
>>> Could we see the system columns on these rows?
>
>> 01=# select oid,xmin,cmin,xmax,cmax,ctid,* from a_constants_str where
>> constname='DOCPLAID';
>>   oid  |  xmin   |  cmin   |  xmax   | cmax |  ctid  | constname | fid | constvalue
>> -------+---------+---------+---------+------+--------+-----------+-----+------------
>>  17916 | 2232893 | 2235861 | 2235861 |   42 | (4,71) | DOCPLAID  |   0 | SOF_19738
>>  17916 | 2232893 | 2235861 | 2235861 |   41 | (7,62) | DOCPLAID  |   0 | SOF_19738
>> (2 rows)
>
> Given the identical OID and xmin values, it seems certain that these are
> the "same" row, ie there was only one insertion event.  My bet is that
> the one at (7,62) is the original, and that the one at (4,71) is a copy
> that was made by VACUUM FULL trying to move the row to compact the
> table.  So the question is how did both copies get to be marked
> simultaneously valid?  That should be impossible, unless a disk write
> got dropped.  Have you had any system crashes during VACUUM FULL
> operations recently?
>
>         regards, tom lane

I do not know exactly, but it is possible.
This is a customer server and I do not have any info about a server/os crash, but I am not 100% sure.
I will ask as soon as possible and reply.
Also I will check the logs.
We start pg with pg_ctl ... -D ... -l /logfile, and if the needed info is in the pg log file I can check it (I do not know what to search for in this file).
We run vacuum once per 24 h.
The OS is RedHat AS 3.
PostgreSQL is 7.4.1 (but we have the same problem on 7.4.5).

Will upgrading to 8.0 solve this type of problem?

regards,
ivan.

Re: pg primary key bug?

From
Tom Lane
Date:
pginfo <pginfo@t1.unisoftbg.com> writes:
> Will upgrade to 8.0 solve this type of problems ?

The problem is probably not Postgres' fault at all.  I'm wondering about
disks with write cacheing enabled.  And you didn't turn off fsync,
I trust?
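The live settings can be checked from any session; fsync defaults to on, so a commented-out #fsync line in postgresql.conf means it was never disabled:

```sql
SHOW fsync;            -- should report "on" (the default) unless it was changed
SHOW wal_sync_method;  -- fsync, fdatasync, open_sync, or open_datasync
```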
        regards, tom lane


Re: pg primary key bug?

From
pginfo
Date:
Hi,

Tom Lane wrote:

> pginfo <pginfo@t1.unisoftbg.com> writes:
>> Will upgrade to 8.0 solve this type of problems ?
>
> The problem is probably not Postgres' fault at all.  I'm wondering about
> disks with write cacheing enabled.  And you didn't turn off fsync,
> I trust?

About fsync (part from postgresql.conf):

#---------------------------------------------------------------------------
# WRITE AHEAD LOG
#---------------------------------------------------------------------------

# - Settings -

#fsync = true                   # turns forced synchronization on or off
#wal_sync_method = fsync        # the default varies across platforms:
                                # fsync, fdatasync, open_sync, or open_datasync
#wal_buffers = 8                # min 4, 8KB each

# - Checkpoints -

#checkpoint_segments = 3        # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 300       # range 30-3600, in seconds
#checkpoint_warning = 30        # 0 is off, in seconds

Also part from the pg log file:

LOG:  statistics collector process (PID 2716) exited with exit code 1
LOG:  shutting down
LOG:  database system is shut down
LOG:  could not create IPv6 socket: Address family not supported by protocol
LOG:  database system was shut down at 2005-02-11 19:58:26 EET
LOG:  checkpoint record is at 2/BAC39188
LOG:  redo record is at 2/BAC39188; undo record is at 0/0; shutdown TRUE
LOG:  next transaction ID: 2221145; next OID: 826607
LOG:  database system is ready
LOG:  recycled transaction log file "00000002000000BA"
LOG:  recycled transaction log file "00000002000000BB"
LOG:  recycled transaction log file "00000002000000BC"
LOG:  recycled transaction log file "00000002000000BD"
LOG:  recycled transaction log file "00000002000000BE"
WARNING:  index "a_constants_str_pkey" contains 1449 row versions, but table contains 1422 row versions
HINT:  Rebuild the index with REINDEX.
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
LOG:  received smart shutdown request
FATAL:  terminating connection due to administrator command
[the same FATAL line repeated once for each open connection]
LOG:  statistics collector process (PID 2713) exited with exit code 1
LOG:  shutting down
LOG:  database system is shut down
LOG:  could not create IPv6 socket: Address family not supported by protocol
LOG:  database system was shut down at 2005-02-16 08:32:21 EET
LOG:  checkpoint record is at 2/BFAE09EC
LOG:  redo record is at 2/BFAE09EC; undo record is at 0/0; shutdown TRUE


Note we were informed about the problem on 2005-02-16 and rebooted the box.
As I see in the log file, the last restart was on 2005-02-11, and after it everything worked well.
The affected table is critical and is the most used table in this system; if inserts into it stop working, the whole system stops.
I will also note that in the first case when we found the same problem, this system had ~80 users working.

Is it possible that pg_dump or reindex table or vacuum analyze caused this problem?
We use them very regularly.

Will we need to stop using vacuum full at all?

regards,
ivan.

Re: pg primary key bug?

From
"Iain"
Date:
Hi Ivan,
 
Sorry, I can't remember all you said in earlier posts, but I was wondering, your log file says:
 
> HINT:  Rebuild the index with REINDEX.
Did you do that, and did it solve the problem?
 
regards
Iain
----- Original Message -----
From: pginfo
Sent: Friday, February 18, 2005 1:53 PM
Subject: Re: [SQL] pg primary key bug?

Hi,

Tom Lane wrote:
pginfo <pginfo@t1.unisoftbg.com> writes:  
Will upgrade to 8.0 solve this type of problems ?    
The problem is probably not Postgres' fault at all.  I'm wondering about
disks with write cacheing enabled.  And you didn't turn off fsync,
I trust?  
About fsync (part from postgresql.conf) :


#---------------------------------------------------------------------------
# WRITE AHEAD LOG
#---------------------------------------------------------------------------

# - Settings -

#fsync = true                   # turns forced synchronization on or off
#wal_sync_method = fsync        # the default varies across platforms:
                                # fsync, fdatasync, open_sync, or open_datasync
#wal_buffers = 8                # min 4, 8KB each

# - Checkpoints -

#checkpoint_segments = 3        # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 300       # range 30-3600, in seconds
#checkpoint_warning = 30        # 0 is off, in seconds


Also part from pg logfile:

LOG:  statistics collector process (PID 2716) exited with exit code 1
LOG:  shutting down
LOG:  database system is shut down
LOG:  could not create IPv6 socket: Address family not supported by protocol
LOG:  database system was shut down at 2005-02-11 19:58:26 EET
LOG:  checkpoint record is at 2/BAC39188
LOG:  redo record is at 2/BAC39188; undo record is at 0/0; shutdown TRUE
LOG:  next transaction ID: 2221145; next OID: 826607
LOG:  database system is ready
LOG:  recycled transaction log file "00000002000000BA"
LOG:  recycled transaction log file "00000002000000BB"
LOG:  recycled transaction log file "00000002000000BC"
LOG:  recycled transaction log file "00000002000000BD"
LOG:  recycled transaction log file "00000002000000BE"
WARNING:  index "a_constants_str_pkey" contains 1449 row versions, but table contains 1422 row versions
HINT:  Rebuild the index with REINDEX.
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
LOG:  received smart shutdown request
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command
LOG:  statistics collector process (PID 2713) exited with exit code 1
LOG:  shutting down
LOG:  database system is shut down
LOG:  could not create IPv6 socket: Address family not supported by protocol
LOG:  database system was shut down at 2005-02-16 08:32:21 EET
LOG:  checkpoint record is at 2/BFAE09EC
LOG:  redo record is at 2/BFAE09EC; undo record is at 0/0; shutdown TRUE


Note we were informed about the problem on 2005-02-16 and rebooted the box.
As I see in the log file, the last restart was on 2005-02-11 and after it all worked well.
The affected table is critical and the most used table in this system; if inserts stop working, the whole system stops too.
I will note also that in the first case when we found the same problem, ~80 users were working on this system.

Is it possible for pg_dump or reindex table or vacuum analyze to cause this problem?
We are using them very regularly.

Will we need to stop using vacuum full at all?

regards,
ivan.


			regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend
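The WARNING in the quoted log ("index … contains 1449 row versions, but table contains 1422 row versions") means the heap and the index disagree, and the SELECT output earlier in the thread shows the same logical row duplicated at several physical locations. A hedged sketch of the usual manual cleanup for that situation (the ctid value here is illustrative; take the real ones from the first query):

```sql
-- Find the physical addresses (ctid) of the duplicated key's copies.
SELECT ctid, xmin, constname, fid, constvalue
  FROM a_constants_str
 WHERE constname = 'DOCID' AND fid = 0;

-- Delete the surplus copies one by one, by physical address, keeping a
-- single row per key, then rebuild the primary-key index to match.
DELETE FROM a_constants_str WHERE ctid = '(9,107)';
REINDEX TABLE a_constants_str;
```

This only removes the symptom; as Tom notes later in the thread, the underlying cause still needs a reproducible test case.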

  

Re: pg primary key bug?

From
pginfo
Date:
Hi Iain,

Iain wrote:
Hi Ivan,
 
Sorry, I can't remember all you said in earlier posts, but I was wondering, your log file says:
 
> HINT:  Rebuild the index with REINDEX.
Did you do that, and did it solve the problem?
 
No, it did not solve the problem.
I sent the log only to show that we did not have any server crash nor pg restart.

regards,
ivan.
regards
Iain


Re: pg primary key bug?

From
pginfo
Date:
Hi,
sorry, but we now have case number 3 with the same problem.
Also this time we did not find any linux box crash nor pg stop or restart.
The pg version is 7.4.2 on dual xeon + scsi, also running RedHat 3.0 AS.
In all the cases we are running RedHat AS 3.0.
This system was running for over 12 months without any problems.
I also send the state of the problem table (also the same), and my questions are:
Do we need to stop using vacuum full for now?
And can plain vacuum analyze cause the same problem in pg?
As I understand it, the problem is in the OS when doing vacuum full analyze (as Tom wrote).
We did not find any problems in the OS, and the only solution we see is to stop using vacuum full analyze.
Also, we are using only jdbc to access pg. Is it possible that jdbc causes this problem?

regards,
ivan.

serv117=# select oid, xmin, cmin, xmax, cmax, ctid, * from a_constants_str;
    oid    |  xmin   |  cmin   |  xmax   |  cmax   |   ctid   | constname | fid | constvalue
-----------+---------+---------+---------+---------+----------+-----------+-----+-------------
 760807304 | 7357839 |       0 |       0 |       0 |    (0,1) | PARTID    |   0 | SOF_79
 760807305 | 7357839 |       0 |       0 |       0 |    (0,2) | AACCGRID  |   0 | SOF_29
 760807306 | 7357839 |       0 |       0 |       0 |    (0,3) | AKLTYPID  |   0 | SOF_47
 760807307 | 7357839 |       0 |       0 |       0 |    (0,4) | AOBLASTID |   0 | SOF_41
 760807308 | 7357839 |       0 |       0 |       0 |    (0,5) | ANMGRID   |   0 | SOF_102
 760807309 | 7357839 |       0 |       0 |       0 |    (0,6) | LOCAID    |   0 | SOF_112
 760807310 | 7357839 |       0 |       0 |       0 |    (0,7) | AKLGRID   |   0 | SOF_116
 760807311 | 7357839 |       0 |       0 |       0 |    (0,8) | ADARID    |   0 | SOF_33
 760807314 | 7357839 |       0 |       0 |       0 |   (0,11) | ASLUID    |   0 | SOF_86
 760807315 | 7357839 |       0 |       0 |       0 |   (0,12) | AUSERID   |   0 | SOF_28
 760807318 | 7357839 |       0 |       0 |       0 |   (0,15) | ANLIZPID  |   0 | SOF_100137
 760807316 | 7507505 |       3 |       3 |       0 |   (0,36) | ASETUPID  |   0 | SOF_4618
 760807324 | 7750088 | 7766293 | 7766293 |       2 |   (0,92) | DOCID     |   0 | SOF_836141
 760807319 | 7740812 |       2 |       2 |       0 |    (4,8) | ANOMID    |   0 | SOF_31353
 760807325 | 7750088 |      19 |      19 |       0 |  (4,111) | DOCRID    |   0 | SOF_2067257
 760807326 | 7750088 |      41 |      41 | 7750975 |   (6,27) | DOCPLAID  |   0 | SOF_44261
 760807327 | 7750088 |      46 |      46 | 7750975 |  (7,106) | DOCPOGPLA |   0 | SOF_58034
 760807324 | 7750088 | 7766293 | 7766293 |       1 |  (9,107) | DOCID     |   0 | SOF_836141
 760807313 | 7680519 |       2 |       2 |       0 |   (10,3) | NASTRF    |   0 | SOF_161
 760807312 | 7688072 |       2 |       2 |       0 |  (10,92) | AGRADID   |   0 | SOF_804
 760807324 | 7750088 | 7766293 | 7766293 |       1 |  (12,18) | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 |  (13,94) | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 |  (15,45) | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 |   (17,4) | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 |  (18,80) | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 |  (20,31) | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 | (21,109) | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 |  (23,58) | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 |   (25,9) | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 |  (26,85) | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 |  (28,36) | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 | (29,114) | DOCID     |   0 | SOF_836141
 760807317 | 7702028 |       2 |       2 |       0 |  (51,41) | AMITAID   |   0 | SOF_345
 760807320 | 7702064 |       2 |       2 |       0 |  (51,42) | ATRANSID  |   0 | SOF_458
 760807321 | 7707993 |       2 |       2 |       0 |   (57,8) | TDOCID    |   0 | SOF_546
 760807323 | 7753774 |       3 |       3 |       0 |   (59,7) | AKLIID    |   0 | SOF_22695
 760807322 | 7707993 |    2385 |    2385 |       0 |  (59,95) | TDOCRID   |   0 | SOF_105930
(37 rows)



serv117=# \d a_constants_str
        Table "public.a_constants_str"
   Column   |         Type          | Modifiers
------------+-----------------------+-----------
 constname  | character varying(30) | not null
 fid        | integer               | not null
 constvalue | character varying(30) |
Indexes:
    "a_constants_str_pkey" primary key, btree (constname, fid)


Tom Lane wrote:
pginfo <pginfo@t1.unisoftbg.com> writes:
Will upgrade to 8.0 solve this type of problems ?
The problem is probably not Postgres' fault at all.  I'm wondering about
disks with write cacheing enabled.  And you didn't turn off fsync,
I trust?
        regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Re: pg primary key bug?

From
Tom Lane
Date:
pginfo <pginfo@t1.unisoftbg.com> writes:
> sorry, but we have the case number 3 in with the same problem.
> Also this time we do not find any linux box crash nor pg stop or restart.

Hmm, well there's *something* mighty curious about the history of this
table.  The xmin values span a range of almost 400,000 transactions and
yet the oids are twenty-three consecutive values.  Is this the only
table you've inserted into during the last 400K transactions?

It's also odd that there's so much empty space (only 37 rows in 60
pages).  It's hard to see how that could be unless there were many
updates on the table, but judging from the name and contents of the
table I can hardly see a reason for lots of updates.  How is this table
used exactly?  Do you do UPDATEs on it?  DELETEs?  SELECT FOR UPDATE?
Do you do VACUUM, or VACUUM FULL, or both?
        regards, tom lane
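
Tom's rows-versus-pages observation ("only 37 rows in 60 pages") can be checked on any table with VACUUM VERBOSE, which reports tuple and page counts per relation (a sketch; the exact INFO wording varies by release):

```sql
VACUUM VERBOSE a_constants_str;
-- The INFO lines report, among other things, the number of row versions
-- found and the number of pages in the table and in each of its indexes,
-- so a small table bloated across many pages stands out immediately.
```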


Re: pg primary key bug?

From
pginfo
Date:
Hi Tom,

Tom Lane wrote:
pginfo <pginfo@t1.unisoftbg.com> writes:
sorry, but we have the case number 3 in with the same problem.
Also this time we do not find any linux box crash nor pg stop or restart.
Hmm, well there's *something* mighty curious about the history of this
table.  The xmin values span a range of almost 400,000 transactions and
yet the oids are twenty-three consecutive values.  Is this the only
table you've inserted into during the last 400K transactions?
No.

It's also odd that there's so much empty space (only 37 rows in 60
pages).  It's hard to see how that could be unless there were many
updates on the table, but judging from the name and contents of the
table I can hardly see a reason for lots of updates.  How is this table
used exactly?
In this table we store the last ID value used by some of the other tables.
For each table we have one constant in this table. We are using the table as a sequence.
For example, if we need to insert the next record into some table we do:
select constvalue from a_constants_str where constname ='...' for update;
increase the value, and then:
update a_constants_str set constvalue = (new value) where ...

It is not as simple as I described, but in general this is the case.
All this is connected with replication and data synchronisation and so on.

Do you do UPDATEs on it?
Yes, see the description.
DELETEs?
No, never.
SELECT FOR UPDATE?
Yes.

Do you do VACUUM, or VACUUM FULL, or both?
Only vacuum full analyze, once per day.
Also, once per 4h we run pg_dump (on pg 7.x we have no chance to restore data without a full dump; with 8.0 we will test the new solution and stop making pg_dump).

We have > 200 relatively big installs of this ERP and all are using pg.
In the last 3 years we have had one case of data corruption and these 3 cases of problems with pkeys.
All the time pg has been working relatively well.

regards,
ivan.
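
The allocation scheme Ivan describes can be sketched as a single transaction (table and column names are from the thread; the key and the computed next value are illustrative):

```sql
BEGIN;
-- Lock this key's counter row; concurrent allocators for the same key
-- queue up behind this row lock until we commit or roll back.
SELECT constvalue
  FROM a_constants_str
 WHERE constname = 'DOCID' AND fid = 0
   FOR UPDATE;
-- The application parses the value, increments it, and writes it back.
UPDATE a_constants_str
   SET constvalue = 'SOF_836142'
 WHERE constname = 'DOCID' AND fid = 0;
COMMIT;
```

The row lock is what makes the table behave like a sequence, but unlike a real sequence each allocation produces a dead row version, and any session that forgets to COMMIT blocks every later allocation for that key.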

Re: pg primary key bug?

From
Tom Lane
Date:
pginfo <pginfo@t1.unisoftbg.com> writes:
> In this table we store the last value for the ID of part from other tables.
> For each table we have one constant in this table. We are using the 
> table as sequence.
> For Example if we nee to insert the next record in some table we make:
> select constvalue from a_constants_str where constname ='...' for update;
> increase the value and make
> update a_constants_str set   constvalue= (new value) where...

> It is not so easy as I described, but in general this is the case.
> Al this is connected with replications and data syncronisation and so on.

"Connected"?  What exactly is hiding under that last comment?

One way I could take your report is that you've found a weird
interaction between SELECT FOR UPDATE and VACUUM FULL that no one else
has seen before.  Another way is that you're using some nonstandard
backend extension that has nasty bugs in it.

It is interesting that you say this system has been working well for
years and only recently have you seen problems.  To me the obvious
question is "what have you changed recently?"  It might not be a bogus
change in itself, but it could have triggered a bug at lower levels.

It's certainly possible that you have your finger on a backend bug,
but if so there's not nearly enough information here for anyone to
find and fix it.  You need to be thinking in terms of how to reproduce
the problem so that it can be studied and fixed.  "How can I avoid this
problem" is exactly the wrong question to be asking, because even if
avoiding it is all that concerns you, no one can answer with any
confidence until we understand what the failure mechanism is.
        regards, tom lane


Re: pg primary key bug?

From
pginfo
Date:
Tom Lane wrote:
pginfo <pginfo@t1.unisoftbg.com> writes:
In this table we store the last value for the ID of part from other tables.
For each table we have one constant in this table. We are using the
table as sequence.
For Example if we nee to insert the next record in some table we make:
select constvalue from a_constants_str where constname ='...' for update;
increase the value and make
update a_constants_str set   constvalue= (new value) where...

It is not so easy as I described, but in general this is the case.
Al this is connected with replications and data syncronisation and so on.
"Connected"?  What exactly is hiding under that last comment?
We are using a separate table for sequences, and not pg's sequences directly, because this is the application's built-in method for doing
replication and data synchronisation.
I wished only to clarify the reason for using the table and to explain the grounds for so many updates and selects for update.
Sorry for my bad english ):.

One way I could take your report is that you've found a weird
interaction between SELECT FOR UPDATE and VACUUM FULL that no one else
has seen before.  Another way is that you're using some nonstandard
backend extension that has nasty bugs in it.

It is interesting that you say this system has been working well for
years and only recently have you seen problems.
Yes, exactly.
To me the obvious
question is "what have you changed recently?"
If only I knew!
In general we have not made any global changes connected to the database access method.
We are using jdbc (the jdbc driver from pg) + jboss (java based application server) + a connection pool (built into jboss).
We are using jdbc with Statement.executeBatch(...) and also directly with Statement.executeUpdate(...).
We are using exactly the same setup with oracle without any problem (oracle has other problems, and I prefer pg).
It might not be a bogus
change in itself, but it could have triggered a bug at lower levels.

It's certainly possible that you have your finger on a backend bug,
but if so there's not nearly enough information here for anyone to
find and fix it.
I am fairly sure (not 100%) that it is a bug. That is the reason I reported the problem.

You need to be thinking in terms of how to reproduce
the problem so that it can be studied and fixed.
Your idea was that we have "vacuum full" + update or select for update at the same time.
I think that is not the case, because we start vacuum full at 1:00 AM and no one is working at that time.

Will vacuum full generate this problem if we have a locked table at that time? (It is possible, in theory, to have a locked table.)

At this time we have no info about how to reproduce the problem.
As a first step we will stop using "vacuum full" (if needed we will stop using vacuum analyze too) to try to collect more info.

"How can I avoid this
problem" is exactly the wrong question to be asking, because even if
avoiding it is all that concerns you, no one can answer with any
confidence until we understand what the failure mechanism is.
Can we set some log parameters to collect the needed data?
Can you describe in more detail the idea of the problem with "vacuum full" + "update", and can someone make a patch if this problem exists in theory (if I understand you right)?
We could start using such a patch and see if the problem appears again.

If you have another idea, we are ready to collect the needed data.
        regards, tom lane
regards,
ivan.

Re: pg primary key bug?

From
Ragnar Hafstað
Date:
On Tue, 2005-02-22 at 10:33 +0100, pginfo wrote:
> 
>   
> We are using  jdbc (jdbc driver from pg)  + jboss (java based
> application server) + connection pool (biult in jboss).
> ...
> Will vacuum full generate this problem if we have locked table in this
> time? (It is possible to have locked table in theory)

I do not know if this is relevant, but I have seen jboss applications
keep sessions in 'Idle in transaction' state, apparently with some
locks granted. Would such cases not interfere with vacuum?

gnari
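
Such stuck sessions can be listed from pg_stat_activity; this is a sketch against the 7.x/8.0-era catalog (stats_command_string must be enabled for current_query to be populated, and in modern releases the columns are pid and query instead):

```sql
-- Backends holding a transaction open without running anything show up
-- with this sentinel query text.
SELECT procpid, usename, current_query
  FROM pg_stat_activity
 WHERE current_query = '<IDLE> in transaction';
```

A long-lived idle transaction holds back VACUUM's ability to reclaim dead row versions, and any row locks it has acquired remain granted until it commits or aborts.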




Re: pg primary key bug?

From
pginfo
Date:
Ragnar Hafstað wrote:
On Tue, 2005-02-22 at 10:33 +0100, pginfo wrote:
We are using  jdbc (jdbc driver from pg)  + jboss (java based
application server) + connection pool (biult in jboss).
...
Will vacuum full generate this problem if we have locked table in this
time? (It is possible to have locked table in theory)
I do not know if this is relevant, but I have seen jboss applications
keep sessions in 'Idle in transaction' state, apparently with some
locks granted. Would such cases not interfere with vacuum?

gnari
Only to add:
we also possibly keep some transactions uncommitted for a long time.
regards,
ivan.

Re: pg primary key bug?

From
Markus Schaber
Date:
Hi, Ragnar,

Ragnar Hafstað wrote:

>>We are using  jdbc (jdbc driver from pg)  + jboss (java based
>>application server) + connection pool (biult in jboss).
>>...
>>Will vacuum full generate this problem if we have locked table in this
>>time? (It is possible to have locked table in theory)
> I do not know if this is relevant, but I have seen jboss applications
> keep sessions in 'Idle in transaction' state, apparently with some
> locks granted. Would such cases not interfere with vacuum?

Most of those "Idle in transaction" problems were caused by suboptimal
handling of BEGIN in the pgjdbc driver, this should be fixed in current
versions of postgres.jar (build 8.0-310).

Markus

--
markus schaber | dipl. informatiker
logi-track ag | rennweg 14-16 | ch 8001 zürich
phone +41-43-888 62 52 | fax +41-43-888 62 53
mailto:schabios@logi-track.com | www.logi-track.com


Re: pg primary key bug?

From
Tom Lane
Date:
pginfo <pginfo@t1.unisoftbg.com> writes:
> You idea was that we have "vacuum full" + update or select for update in 
> the same time.
> I think it is not the case, because we start vacuum full at 1:00 AM and 
> no one is working in this time.

Hmm.  AFAICT the duplicate row copies could only be produced by vacuum
full, so that's certainly part of the issue.  But if vacuum full in
isolation were broken, we'd surely know it; so there must be some other
contributing factor involved that your setup is exercising but other
people are (mostly) not doing.

I agree with the plan to use plain vacuum for awhile and see if that
makes the problem go away.  I think it would have to, but maybe I'm
all wet about that.

In the meantime I would suggest seeing if you can distill your
application down into a test case that other people can run to reproduce
the problem.  It doesn't matter if the test doesn't make the bug happen
very often, but we have to see the problem happening before we have much
hope of fixing it.

> Will vacuum full generate this problem if we have locked table in this 
> time? (It is possible to have locked table in theory)

No, that's hardly likely.  vacuum full deals with locks all the time.

> Can you describe more detailed the idea of  problem with "vacuum full" + 
> "update" and can some one make patch if this problem exists in theory 
> (if I  understand you right)?

I have no idea what the actual failure mechanism might be.
        regards, tom lane


Re: pg primary key bug?

From
pginfo
Date:
Tom Lane wrote:
> pginfo <pginfo@t1.unisoftbg.com> writes:
>> Your idea was that we have "vacuum full" + update or select for update
>> at the same time.
>> I think it is not the case, because we start vacuum full at 1:00 AM and
>> no one is working at this time.
>
> Hmm.  AFAICT the duplicate row copies could only be produced by vacuum
> full, so that's certainly part of the issue.  But if vacuum full in
> isolation were broken, we'd surely know it; so there must be some other
> contributing factor involved that your setup is exercising but other
> people are (mostly) not doing.
>
> I agree with the plan to use plain vacuum for awhile and see if that
> makes the problem go away.  I think it would have to, but maybe I'm
> all wet about that.

Ok, we can still use vacuum full on some installs (at the risk of causing
problems for customers).
I will ask whether it is possible to run some queries (I do not know which
queries) right before running vacuum full and save the results in a log
file. If so, we will be able to post the results to the list when a new
problem appears and have a starting point for reproducing it. My idea is
for someone more familiar with pg to send these queries (if they exist)
and we will install them in the vacuum scripts.

> In the meantime I would suggest seeing if you can distill your
> application down into a test case that other people can run to reproduce
> the problem.  It doesn't matter if the test doesn't make the bug happen
> very often, but we have to see the problem happening before we have much
> hope of fixing it.
>
>> Will vacuum full generate this problem if we have locked table in this
>> time? (It is possible to have locked table in theory)
>
> No, that's hardly likely.  vacuum full deals with locks all the time.
>
>> Can you describe in more detail the idea of the problem with "vacuum
>> full" + "update", and can someone make a patch if this problem exists
>> in theory (if I understand you right)?
>
> I have no idea what the actual failure mechanism might be.
>
>         regards, tom lane

regards,
ivan.

Re: pg primary key bug?

From
Tom Lane
Date:
pginfo <pginfo@t1.unisoftbg.com> writes:
> I will to ask if it will be possible to start some querys (I do not know 
> the query) exactly before running vacuum full and to save the results in 
> some log file. If it is possible, we will be able to post the results to 
> the list in case of  ne problem and to have some start point for 
> reproducing the problem.

Well, you should definitely turn on log_statement across the whole
installation so that you have a complete record of all SQL commands
being issued.  Make sure the log includes timestamps and PIDs.
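(In 7.4-era postgresql.conf that would be roughly the following sketch; 7.4 used a boolean log_statement and separate log_timestamp/log_pid settings, which later releases folded into log_line_prefix.)

```
# postgresql.conf -- statement logging on PostgreSQL 7.4
log_statement = true      # log every SQL statement issued
log_timestamp = true      # prefix each log line with a timestamp
log_pid = true            # prefix each log line with the backend PID
```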

I would suggest adding a simple probe for duplicate records to the
vacuum script.  Maybe something like
set enable_indexscan to off;
select constname, fid, count(*) from a_constants_str
group by constname, fid having count(*) > 1;

(The indexscan off bit is just paranoia --- I think that an indexscan
might mask the presence of multiple copies of what's supposedly a unique
key.)  Do this just before and just after the vacuum full command.  That
will at least nail down whether vacuum full is creating the dups, and
once we see it happen the trace of the day's SQL commands may give some
ideas where to look.
        regards, tom lane


Re: pg primary key bug?

From
pginfo
Date:
Tom Lane wrote:
> pginfo <pginfo@t1.unisoftbg.com> writes:
>> I will ask whether it is possible to run some queries (I do not know
>> which queries) right before running vacuum full and save the results
>> in a log file. If so, we will be able to post the results to the list
>> when a new problem appears and have a starting point for reproducing
>> the problem.
>
> Well, you should definitely turn on log_statement across the whole
> installation so that you have a complete record of all SQL commands
> being issued.  Make sure the log includes timestamps and PIDs.
>
> I would suggest adding a simple probe for duplicate records to the
> vacuum script.  Maybe something like
>
> set enable_indexscan to off;
> select constname, fid, count(*) from a_constants_str
> group by constname, fid having count(*) > 1;
>
> (The indexscan off bit is just paranoia --- I think that an indexscan
> might mask the presence of multiple copies of what's supposedly a unique
> key.)  Do this just before and just after the vacuum full command.  That
> will at least nail down whether vacuum full is creating the dups, and
> once we see it happen the trace of the day's SQL commands may give some
> ideas where to look.

Ok, that was my idea.
We will do it, install the script on ~100 servers, and see the result.

regards,
ivan.