Thread: pg primary key bug?
Hi , I am using pg 7.4.5 on RedHat AS 3.0. I am using it via jdbc and jboss. I have found big problem about the way pg supports primary keys. The bug was reported from my customers ( we are installed pg with our ERP on ~ 500 costomers) and I do not know hot it is happen. sklad21=# \d a_constants_str Table "public.a_constants_str" Column | Type | Modifiers ------------+-----------------------+-----------constname | character varying(30) | not nullfid | integer | not nullconstvalue | character varying(30) | Indexes: "a_constants_str_pkey" primary key, btree (constname, fid) sklad21=# select * from a_constants_str ;constname | fid | constvalue -----------+-----+-------------AACCGRID | 0 | SOF_3AKLTYPID | 0 | SOF_3ADARID | 0 | SOF_2AOBLASTID | 0 | SOF_6AUSERID | 0 | SOF_17ANMGRID | 0 | SOF_21LOCAID | 0 | SOF_41DOCID | 0 | SOF_1585254DOCPLAID | 0| SOF_1052900AKLIID | 0 | SOF_18740DOCRID | 0 | SOF_2268142DOCPOGPLA | 0 | SOF_324586DOCID | 0 | SOF_1585254DOCID | 0 | SOF_1585254DOCID | 0 | SOF_1585254AKLGRID | 0 | SOF_45DOCID | 0 | SOF_1585254DOCID | 0 | SOF_1585254ASETUPID | 0 | SOF_4605DOCID | 0 | SOF_1585254DOCID | 0 | SOF_1585254TDOCID | 0 | SOF_337TDOCRID | 0 | SOF_19450DOCID | 0 | SOF_1585254DOCID | 0 | SOF_1585254AGRADID | 0 | SOF_256DOCID | 0 | SOF_1585254ASLUID | 0 | SOF_46NASTRF | 0 | SOF_88ANOMID | 0 | SOF_1200 (30 rows) Pls., see the records with 'DOCID' and note we have primary key defined. At this moment we will create a new db and dump and reload the data. The old one will stay and if needet we will be able to study the files (the access to this customer is not very simple, but possible). regards, ivan.
On Feb 3, 2005, at 21:53, pginfo wrote:

> I am using pg 7.4.5 on RedHat AS 3.0.
>
> sklad21=# \d a_constants_str
>       Table "public.a_constants_str"
>    Column   |         Type          | Modifiers
> ------------+-----------------------+-----------
>  constname  | character varying(30) | not null
>  fid        | integer               | not null
>  constvalue | character varying(30) |
> Indexes:
>     "a_constants_str_pkey" primary key, btree (constname, fid)
>
> [30-row table output snipped; it shows eleven identical 'DOCID' rows]
>
> Pls., see the records with 'DOCID' and note we have primary key
> defined.

It's unclear from just this data, but there's a chance that there are varying numbers of spaces after 'DOCID', which would appear as separate values to the index, even though they aren't readily apparent. Could you show us the results of the following query?

select constname, length(constname) as constname_length
from a_constants_str;

Here's another one which would show if PostgreSQL is treating them equally:

select constname, count(constname)
from a_constants_str
group by constname;

The results of these queries might shed some light on the issue.

Hope this helps.

Michael Glaesemann
grzm myrealbox com
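The trailing-space effect Michael is asking about is easy to reproduce on a scratch database. This is only an illustrative sketch (the table name pad_demo is invented for the example): PostgreSQL's varchar comparisons treat trailing spaces as significant, so visually identical values can be distinct to a unique index.

```sql
-- These three values print almost identically in psql, but varchar
-- keeps trailing spaces, so they are distinct to the unique index.
CREATE TEMP TABLE pad_demo (constname varchar(30) PRIMARY KEY);

INSERT INTO pad_demo VALUES ('DOCID');
INSERT INTO pad_demo VALUES ('DOCID ');   -- one trailing space
INSERT INTO pad_demo VALUES ('DOCID  ');  -- two trailing spaces

-- All three inserts succeed with no duplicate-key error;
-- length() exposes the difference.
SELECT constname, length(constname) FROM pad_demo;
```

If ivan's data had this problem, the length() query Michael suggests would report lengths greater than 5 for some of the 'DOCID' rows.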
Hi,

It is not a spaces problem. I needed to dump the data, so I deleted all rows for 'DOCID' with

delete from a_constants_str where constname = 'DOCID';

and pg reported 10 rows deleted. Sorry, I cannot execute the commands now (I dropped the data, because I needed unique constnames for the restore).

regards,
ivan.

Michael Glaesemann wrote:
> [quoted message snipped]
There are not enough spaces available in the column to allow for that number of distinct 'DOCID' values. There are three spaces, allowing for at most four occurrences of 'DOCID', but there are eleven.

--Rick

Michael Glaesemann <grzm@myrealbox.com> wrote on 02/03/2005 09:14 AM (to pginfo <pginfo@t1.unisoftbg.com>, cc pgsql-sql@postgresql.org):
> [quoted message snipped]
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
Hi all,

Sorry for my post, but I think it is a pg primary key bug.

It is the second time we have found this bug (see the replies to this message). We have many servers with pg and use it over jdbc + jboss.

I am not able to stop this server for a long time, but I have dumped the problem table. It is very important to know if it is a bug, because we have many servers running pg + our ERP and are continuing to install new ones.

01=# select * from a_constants_str order by constname;
 constname | fid | constvalue
-----------+-----+------------
 AACCGRID  |   0 | SOF_3
 ADARID    |   0 | SOF_2
 AGRADID   |   0 | SOF_165
 AKLGRID   |   0 | SOF_8
 AKLIID    |   0 | SOF_3513
 AKLTYPID  |   0 | SOF_3
 ANMGRID   |   0 | SOF_10
 ANOMID    |   0 | SOF_747
 AOBLASTID |   0 | SOF_3
 ASETUPID  |   0 | SOF_399
 ASLUID    |   0 | SOF_17
 AUSERID   |   0 | SOF_3
 DOCID     |   0 | SOF_25658
 DOCPLAID  |   0 | SOF_19738
 DOCPLAID  |   0 | SOF_19738
 DOCPOGPLA |   0 | SOF_24281
 DOCRID    |   0 | SOF_184547
 LOCAID    |   0 | SOF_13
 NASTRF    |   0 | SOF_1
 TDOCID    |   0 | SOF_47
 TDOCRID   |   0 | SOF_2439
(21 rows)

01=# select * from a_constants_str where constname='DOCPLAID' ;
 constname | fid | constvalue
-----------+-----+------------
 DOCPLAID  |   0 | SOF_19738
 DOCPLAID  |   0 | SOF_19738
(2 rows)

01=# \d a_constants_str
      Table "public.a_constants_str"
   Column   |         Type          | Modifiers
------------+-----------------------+-----------
 constname  | character varying(30) | not null
 fid        | integer               | not null
 constvalue | character varying(30) |
Indexes:
    "a_constants_str_pkey" primary key, btree (constname, fid)

regards,
ivan

Richard_D_Levine@raytheon.com wrote:
> There are not enough spaces available in the column to allow for the
> number of DOCIDs.  There are three spaces, allowing for at most four
> occurrences of DOCID, but there are eleven.
> [rest of quoted message snipped]
On Thu, Feb 17, 2005 at 04:12:38PM +0100, pginfo wrote:

> 01=# select * from a_constants_str where constname='DOCPLAID' ;
>  constname | fid | constvalue
> -----------+-----+------------
>  DOCPLAID  |   0 | SOF_19738
>  DOCPLAID  |   0 | SOF_19738
> (2 rows)

Do you have any inherited tables?  What's the result of the following query?

SELECT tableoid::regclass, *
FROM a_constants_str
WHERE constname = 'DOCPLAID';

Inherited tables are documented to have deficiencies regarding constraints.  Observe:

CREATE TABLE parent (
    constname   varchar(30) NOT NULL,
    fid         integer NOT NULL,
    constvalue  varchar(30),
    PRIMARY KEY (constname, fid)
);

CREATE TABLE child () INHERITS (parent);

INSERT INTO parent VALUES ('DOCPLAID', 0, 'SOF_19738');
INSERT INTO parent VALUES ('DOCPLAID', 0, 'SOF_19738');
ERROR:  duplicate key violates unique constraint "parent_pkey"
INSERT INTO child VALUES ('DOCPLAID', 0, 'SOF_19738');

SELECT tableoid::regclass, * FROM parent;
 tableoid | constname | fid | constvalue
----------+-----------+-----+------------
 parent   | DOCPLAID  |   0 | SOF_19738
 child    | DOCPLAID  |   0 | SOF_19738
(2 rows)

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
Hi,

No, I do not have inherited tables.

The result of this query is:

01=# SELECT tableoid::regclass, *
01-# FROM a_constants_str
01-# WHERE constname = 'DOCPLAID';
    tableoid     | constname | fid | constvalue
-----------------+-----------+-----+------------
 a_constants_str | DOCPLAID  |   0 | SOF_19738
 a_constants_str | DOCPLAID  |   0 | SOF_19738
(2 rows)

regards,
ivan.

Michael Fuhr wrote:
> [quoted message snipped]
pginfo <pginfo@t1.unisoftbg.com> writes:

> 01=# select * from a_constants_str where constname='DOCPLAID' ;
>  constname | fid | constvalue
> -----------+-----+------------
>  DOCPLAID  |   0 | SOF_19738
>  DOCPLAID  |   0 | SOF_19738
> (2 rows)

Could we see the system columns on these rows?

select oid,xmin,cmin,xmax,cmax,ctid,* from a_constants_str where ...

regards, tom lane
Hi,

Tom Lane wrote:
> Could we see the system columns on these rows?
>
> select oid,xmin,cmin,xmax,cmax,ctid,* from a_constants_str where ...

yes,

01=# select oid,xmin,cmin,xmax,cmax,ctid,* from a_constants_str;
  oid   |  xmin   |  cmin   |  xmax   |  cmax   |  ctid   | constname | fid | constvalue
--------+---------+---------+---------+---------+---------+-----------+-----+------------
  17910 |   22331 |       2 |       2 |   26679 | (0,7)   | ANMGRID   |   0 | SOF_10
  17908 |     985 |       6 |       6 |       0 | (0,18)  | AKLTYPID  |   0 | SOF_3
  17907 |     985 |      30 |      30 |       0 | (0,21)  | ADARID    |   0 | SOF_2
  17921 |     985 |      34 |      34 |       0 | (0,22)  | AOBLASTID |   0 | SOF_3
  17911 |    4640 |       6 |       6 |   26679 | (0,24)  | AACCGRID  |   0 | SOF_3
  17920 | 1220598 |       2 |       2 | 1475630 | (0,47)  | ASLUID    |   0 | SOF_17
  17917 |  643083 |       2 |       2 | 1475630 | (0,49)  | LOCAID    |   0 | SOF_13
  17918 |  762851 |       3 |       3 | 1475630 | (0,50)  | AUSERID   |   0 | SOF_3
  17923 |   35539 |     165 |     165 | 1475630 | (0,51)  | ASETUPID  |   0 | SOF_399
 283686 |  514327 |       3 |       3 | 1475630 | (0,52)  | NASTRF    |   0 | SOF_1
  17909 | 2156667 |       2 |       2 | 2193198 | (0,54)  | AKLGRID   |   0 | SOF_8
  17922 | 2103298 |       2 |       2 | 2193198 | (0,55)  | AGRADID   |   0 | SOF_165
  17913 | 2092705 |       2 |       2 | 2193198 | (0,56)  | ANOMID    |   0 | SOF_747
  63247 | 2226373 |       2 |       2 | 2233003 | (0,126) | TDOCID    |   0 | SOF_47
  17914 | 2232893 | 2235861 | 2235861 |       3 | (1,125) | DOCID     |   0 | SOF_25658
  17915 | 2232893 | 2235861 | 2235861 |      19 | (3,38)  | DOCRID    |   0 | SOF_184547
  17916 | 2232893 | 2235861 | 2235861 |      42 | (4,71)  | DOCPLAID  |   0 | SOF_19738
  17919 | 2232893 |     231 |     231 | 2233003 | (5,104) | DOCPOGPLA |   0 | SOF_24281
  17912 | 2231139 |       2 |       2 |       0 | (6,1)   | AKLIID    |   0 | SOF_3513
  17916 | 2232893 | 2235861 | 2235861 |      41 | (7,62)  | DOCPLAID  |   0 | SOF_19738
  63249 | 2226373 |     103 |     103 |       0 | (16,31) | TDOCRID   |   0 | SOF_2439
(21 rows)

and

01=# select oid,xmin,cmin,xmax,cmax,ctid,* from a_constants_str where constname='DOCPLAID';
  oid  |  xmin   |  cmin   |  xmax   | cmax |  ctid  | constname | fid | constvalue
-------+---------+---------+---------+------+--------+-----------+-----+------------
 17916 | 2232893 | 2235861 | 2235861 |   42 | (4,71) | DOCPLAID  |   0 | SOF_19738
 17916 | 2232893 | 2235861 | 2235861 |   41 | (7,62) | DOCPLAID  |   0 | SOF_19738
(2 rows)

regards,
ivan.
pginfo <pginfo@t1.unisoftbg.com> writes:

> Tom Lane wrote:
>> Could we see the system columns on these rows?

> 01=# select oid,xmin,cmin,xmax,cmax,ctid,* from a_constants_str where constname='DOCPLAID';
>   oid  |  xmin   |  cmin   |  xmax   | cmax |  ctid  | constname | fid | constvalue
> -------+---------+---------+---------+------+--------+-----------+-----+------------
>  17916 | 2232893 | 2235861 | 2235861 |   42 | (4,71) | DOCPLAID  |   0 | SOF_19738
>  17916 | 2232893 | 2235861 | 2235861 |   41 | (7,62) | DOCPLAID  |   0 | SOF_19738
> (2 rows)

Given the identical OID and xmin values, it seems certain that these are the "same" row, ie there was only one insertion event.  My bet is that the one at (7,62) is the original, and that the one at (4,71) is a copy that was made by VACUUM FULL trying to move the row to compact the table.  So the question is how did both copies get to be marked simultaneously valid?  That should be impossible, unless a disk write got dropped.  Have you had any system crashes during VACUUM FULL operations recently?

regards, tom lane
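For readers hitting the same symptom: if one physical copy really is a leftover from an interrupted VACUUM FULL move, a common recovery sketch is to delete exactly one physical row version by its ctid and then rebuild the index. This is a hedged sketch, not advice from the thread; the ctid '(4,71)' is taken from the output quoted above and must be re-verified immediately before use, since ctids are not stable.

```sql
-- Remove exactly one physical row version, addressed by its ctid.
-- Re-check the ctid right before running this; any intervening
-- UPDATE or VACUUM FULL can change it.
DELETE FROM a_constants_str WHERE ctid = '(4,71)';

-- Rebuild the primary key index so it agrees with the heap again.
REINDEX TABLE a_constants_str;
```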
Hi,

Tom Lane wrote:
> Given the identical OID and xmin values, it seems certain that these
> are the "same" row, ie there was only one insertion event.  My bet is
> that the one at (7,62) is the original, and that the one at (4,71) is
> a copy that was made by VACUUM FULL trying to move the row to compact
> the table.  So the question is how did both copies get to be marked
> simultaneously valid?  That should be impossible, unless a disk write
> got dropped.  Have you had any system crashes during VACUUM FULL
> operations recently?

I do not know exactly, but it is possible. This is a customer server and I do not have any info about a server/os crash, but I cannot be 100% sure. I will ask as soon as possible and reply. Also I will check the logs. We start pg with pg_ctl ... -D ... -l /logfile, and if the needed info is in the pg log file I can check it (though I do not know what to search for in this file). We are running vacuum once per 24 h. The OS is RedHat AS 3. PostgreSQL is 7.4.1 (but we have the same problem on 7.4.5).

Will upgrading to 8.0 solve this type of problem?

regards,
ivan.
pginfo <pginfo@t1.unisoftbg.com> writes:

> Will upgrade to 8.0 solve this type of problems ?

The problem is probably not Postgres' fault at all.  I'm wondering about disks with write cacheing enabled.  And you didn't turn off fsync, I trust?

regards, tom lane
Hi,

Tom Lane wrote:
> The problem is probably not Postgres' fault at all.  I'm wondering
> about disks with write cacheing enabled.  And you didn't turn off
> fsync, I trust?

About fsync (part of postgresql.conf):

#---------------------------------------------------------------------------
# WRITE AHEAD LOG
#---------------------------------------------------------------------------

# - Settings -

#fsync = true                   # turns forced synchronization on or off
#wal_sync_method = fsync        # the default varies across platforms:
                                # fsync, fdatasync, open_sync, or open_datasync
#wal_buffers = 8                # min 4, 8KB each

# - Checkpoints -

#checkpoint_segments = 3        # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 300       # range 30-3600, in seconds
#checkpoint_warning = 30        # 0 is off, in seconds

Also part of the pg logfile:

LOG:  statistics collector process (PID 2716) exited with exit code 1
LOG:  shutting down
LOG:  database system is shut down
LOG:  could not create IPv6 socket: Address family not supported by protocol
LOG:  database system was shut down at 2005-02-11 19:58:26 EET
LOG:  checkpoint record is at 2/BAC39188
LOG:  redo record is at 2/BAC39188; undo record is at 0/0; shutdown TRUE
LOG:  next transaction ID: 2221145; next OID: 826607
LOG:  database system is ready
LOG:  recycled transaction log file "00000002000000BA"
LOG:  recycled transaction log file "00000002000000BB"
LOG:  recycled transaction log file "00000002000000BC"
LOG:  recycled transaction log file "00000002000000BD"
LOG:  recycled transaction log file "00000002000000BE"
WARNING:  index "a_constants_str_pkey" contains 1449 row versions, but table contains 1422 row versions
HINT:  Rebuild the index with REINDEX.
ERROR:  duplicate key violates unique constraint "a_constants_str_pkey"
    [the ERROR line above repeated 11 times in total]
LOG:  received smart shutdown request
FATAL:  terminating connection due to administrator command
    [the FATAL line above repeated ~30 times]
LOG:  statistics collector process (PID 2713) exited with exit code 1
LOG:  shutting down
LOG:  database system is shut down
LOG:  could not create IPv6 socket: Address family not supported by protocol
LOG:  database system was shut down at 2005-02-16 08:32:21 EET
LOG:  checkpoint record is at 2/BFAE09EC
LOG:  redo record is at 2/BFAE09EC; undo record is at 0/0; shutdown TRUE

Note we were informed about the problem on 2005-02-16 and rebooted the box. As I see in the log file, the last restart was on 2005-02-11 and everything worked well after it. The affected table is critical and the most used table in this system; if inserts into it stop working, the whole system stops. I will note also that in the first case when we found the same problem the system had ~80 users.

Is it possible for pg_dump, reindex table, or vacuum analyze to cause this problem? We use them very regularly.

Will we need to stop using vacuum full at all?

regards,
ivan.

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend
Hi Ivan,
Sorry, I can't remember all you said in earlier posts, but I was wondering, your log file says:
> HINT: Rebuild the index with REINDEX.
Did you do that, and did it solve the problem?
regards
Iain
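For reference, the REINDEX that the log's HINT suggests might look like the sketch below. This is not from the thread itself: note that rebuilding a unique index will itself fail with a duplicate-key error while duplicate heap rows are still present, so any duplicates would have to be removed first. The verification query is an assumed way to double-check afterwards.

```sql
REINDEX TABLE a_constants_str;

-- Afterwards, a healthy primary key means this returns zero rows.
SELECT constname, fid, count(*)
FROM a_constants_str
GROUP BY constname, fid
HAVING count(*) > 1;
```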
----- Original Message -----
From: pginfo
To: Tom Lane
Cc: Richard_D_Levine@raytheon.com; Michael Glaesemann; pgsql-sql@postgresql.org; pgsql-sql-owner@postgresql.org
Sent: Friday, February 18, 2005 1:53 PM
Subject: Re: [SQL] pg primary key bug?

Hi,
Tom Lane wrote:
> The problem is probably not Postgres' fault at all.  I'm wondering
> about disks with write cacheing enabled.  And you didn't turn off
> fsync, I trust?

[rest of quoted message snipped]
FATAL: terminating connection due to administrator command
FATAL: terminating connection due to administrator command
FATAL: terminating connection due to administrator command
FATAL: terminating connection due to administrator command
FATAL: terminating connection due to administrator command
FATAL: terminating connection due to administrator command
FATAL: terminating connection due to administrator command
FATAL: terminating connection due to administrator command
FATAL: terminating connection due to administrator command
LOG: statistics collector process (PID 2713) exited with exit code 1
LOG: shutting down
LOG: database system is shut down
LOG: could not create IPv6 socket: Address family not supported by protocol
LOG: database system was shut down at 2005-02-16 08:32:21 EET
LOG: checkpoint record is at 2/BFAE09EC
LOG: redo record is at 2/BFAE09EC; undo record is at 0/0; shutdown TRUE
Note: we were informed about the problem on 2005-02-16 and rebooted the box.
As I see in the log file, the last restart was on 2005-02-11, and everything worked well after it.
The affected table is critical and the most used table in this system; if inserts into it stop working, the whole system stops too.
I will also note that in the first case where we found the same problem, the system was serving ~80 users.
Is it possible that pg_dump, REINDEX TABLE, or VACUUM ANALYZE caused this problem?
We run them very regularly.
Will we need to stop using VACUUM FULL altogether?

regards,
ivan.
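[Editorial note: the WARNING/HINT pair in the log above reports more index entries than heap rows and suggests REINDEX. A hypothetical check-and-rebuild sequence for the table in question, not taken from the thread, could look like this; the sequential-scan forcing is there so the check does not rely on the possibly corrupt index:]

```sql
-- Force a sequential scan so the duplicate check bypasses the suspect index.
SET enable_indexscan TO off;

-- Any rows returned here are duplicated primary-key values in the heap.
SELECT constname, fid, count(*)
  FROM a_constants_str
 GROUP BY constname, fid
HAVING count(*) > 1;

-- After removing the surplus row versions, rebuild the index
-- (on 7.4, REINDEX takes an exclusive lock on the table).
REINDEX TABLE a_constants_str;
```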
Hi Iain,

Iain wrote:
> Hi Ivan,
> Sorry, I can't remember all you said in earlier posts, but I was
> wondering, your log file says:
> > HINT: Rebuild the index with REINDEX.
> Did you do that, and did it solve the problem?
> regards
> Iain

No, it did not solve the problem.
I sent the log only to show that we did not have any server crash or pg restart.

regards,
ivan.
Hi,
sorry, but we now have case number 3 with the same problem.
Again this time we did not find any linux box crash, nor a pg stop or restart.
The pg version is 7.4.2 on a dual xeon + scsi, also running RedHat 3.0 AS.
In all the cases we are running RedHat AS 3.0.
This system had been running for over 12 months without any problems.
I also send the state of the problem table (again the same), and my questions are:
Do we need to stop using vacuum full for now?
And can plain vacuum analyze cause the same problem in pg?
As I understand it, the problem is in the OS when running vacuum full analyze (as Tom wrote).
We did not find any problems in the OS, and the only solution we see is to stop using vacuum full analyze.
Also, we use only jdbc to access pg. Is it possible that jdbc causes this problem?

regards,
ivan.

serv117=# select oid, xmin, cmin, xmax, cmax, ctid, * from a_constants_str;
    oid    |  xmin   |  cmin   |  xmax   |  cmax   |   ctid   | constname | fid | constvalue
-----------+---------+---------+---------+---------+----------+-----------+-----+-------------
 760807304 | 7357839 |       0 |       0 |       0 | (0,1)    | PARTID    |   0 | SOF_79
 760807305 | 7357839 |       0 |       0 |       0 | (0,2)    | AACCGRID  |   0 | SOF_29
 760807306 | 7357839 |       0 |       0 |       0 | (0,3)    | AKLTYPID  |   0 | SOF_47
 760807307 | 7357839 |       0 |       0 |       0 | (0,4)    | AOBLASTID |   0 | SOF_41
 760807308 | 7357839 |       0 |       0 |       0 | (0,5)    | ANMGRID   |   0 | SOF_102
 760807309 | 7357839 |       0 |       0 |       0 | (0,6)    | LOCAID    |   0 | SOF_112
 760807310 | 7357839 |       0 |       0 |       0 | (0,7)    | AKLGRID   |   0 | SOF_116
 760807311 | 7357839 |       0 |       0 |       0 | (0,8)    | ADARID    |   0 | SOF_33
 760807314 | 7357839 |       0 |       0 |       0 | (0,11)   | ASLUID    |   0 | SOF_86
 760807315 | 7357839 |       0 |       0 |       0 | (0,12)   | AUSERID   |   0 | SOF_28
 760807318 | 7357839 |       0 |       0 |       0 | (0,15)   | ANLIZPID  |   0 | SOF_100137
 760807316 | 7507505 |       3 |       3 |       0 | (0,36)   | ASETUPID  |   0 | SOF_4618
 760807324 | 7750088 | 7766293 | 7766293 |       2 | (0,92)   | DOCID     |   0 | SOF_836141
 760807319 | 7740812 |       2 |       2 |       0 | (4,8)    | ANOMID    |   0 | SOF_31353
 760807325 | 7750088 |      19 |      19 |       0 | (4,111)  | DOCRID    |   0 | SOF_2067257
 760807326 | 7750088 |      41 |      41 | 7750975 | (6,27)   | DOCPLAID  |   0 | SOF_44261
 760807327 | 7750088 |      46 |      46 | 7750975 | (7,106)  | DOCPOGPLA |   0 | SOF_58034
 760807324 | 7750088 | 7766293 | 7766293 |       1 | (9,107)  | DOCID     |   0 | SOF_836141
 760807313 | 7680519 |       2 |       2 |       0 | (10,3)   | NASTRF    |   0 | SOF_161
 760807312 | 7688072 |       2 |       2 |       0 | (10,92)  | AGRADID   |   0 | SOF_804
 760807324 | 7750088 | 7766293 | 7766293 |       1 | (12,18)  | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 | (13,94)  | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 | (15,45)  | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 | (17,4)   | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 | (18,80)  | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 | (20,31)  | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 | (21,109) | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 | (23,58)  | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 | (25,9)   | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 | (26,85)  | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 | (28,36)  | DOCID     |   0 | SOF_836141
 760807324 | 7750088 | 7766293 | 7766293 |       1 | (29,114) | DOCID     |   0 | SOF_836141
 760807317 | 7702028 |       2 |       2 |       0 | (51,41)  | AMITAID   |   0 | SOF_345
 760807320 | 7702064 |       2 |       2 |       0 | (51,42)  | ATRANSID  |   0 | SOF_458
 760807321 | 7707993 |       2 |       2 |       0 | (57,8)   | TDOCID    |   0 | SOF_546
 760807323 | 7753774 |       3 |       3 |       0 | (59,7)   | AKLIID    |   0 | SOF_22695
 760807322 | 7707993 |    2385 |    2385 |       0 | (59,95)  | TDOCRID   |   0 | SOF_105930
(37 rows)

serv117=# \d a_constants_str
     Table "public.a_constants_str"
   Column   |         Type          | Modifiers
------------+-----------------------+-----------
 constname  | character varying(30) | not null
 fid        | integer               | not null
 constvalue | character varying(30) |
Indexes:
    "a_constants_str_pkey" primary key, btree (constname, fid)

Tom Lane wrote:
> pginfo <pginfo@t1.unisoftbg.com> writes:
> > Will upgrade to 8.0 solve this type of problems ?
> The problem is probably not Postgres' fault at all. I'm wondering about
> disks with write cacheing enabled. And you didn't turn off fsync, I trust?
pginfo <pginfo@t1.unisoftbg.com> writes:
> sorry, but we now have case number 3 with the same problem.
> Again this time we did not find any linux box crash, nor a pg stop or restart.

Hmm, well, there's *something* mighty curious about the history of this
table. The xmin values span a range of almost 400,000 transactions and
yet the oids are twenty-three consecutive values. Is this the only table
you've inserted into during the last 400K transactions?

It's also odd that there's so much empty space (only 37 rows in 60
pages). It's hard to see how that could be unless there were many
updates on the table, but judging from the name and contents of the
table I can hardly see a reason for lots of updates. How is this table
used exactly? Do you do UPDATEs on it? DELETEs? SELECT FOR UPDATE? Do
you do VACUUM, or VACUUM FULL, or both?

regards, tom lane
Hi Tom,

Tom Lane wrote:
> Hmm, well, there's *something* mighty curious about the history of this
> table. The xmin values span a range of almost 400,000 transactions and
> yet the oids are twenty-three consecutive values. Is this the only table
> you've inserted into during the last 400K transactions?

No.

> It's also odd that there's so much empty space (only 37 rows in 60
> pages). It's hard to see how that could be unless there were many
> updates on the table, but judging from the name and contents of the
> table I can hardly see a reason for lots of updates. How is this table
> used exactly?

In this table we store the last ID value used by some of the other
tables. For each table we have one constant in this table; we use the
table as a sequence. For example, if we need to insert the next record
into some table, we do:

    select constvalue from a_constants_str where constname = '...' for update;

increase the value, and then

    update a_constants_str set constvalue = (new value) where ...

It is not as simple as I described, but in general this is the case.
All this is connected with replication and data synchronisation and so on.

> Do you do UPDATEs on it?

Yes, see the description.

> DELETEs?

No, never.

> SELECT FOR UPDATE?

Yes.

> Do you do VACUUM, or VACUUM FULL, or both?

Only VACUUM FULL ANALYZE, once per day.
Also, once per 4 hours, pg_dump (with pg 7.x we have no way to restore
data other than a full dump; with 8.0 we will test the new solution and
stop making pg_dump).

We have > 200 relatively big installs of this ERP, all using pg.
In the last 3 years we have had one case of data corruption, and now
these 3 cases of problems with pkeys.
All the time pg works relatively well.

regards,
ivan.
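[Editorial note: the allocation pattern Ivan describes above can be sketched in SQL roughly as follows. The constant name and values are placeholders taken from his sample data; the real application logic is more involved, as he says.]

```sql
BEGIN;

-- Serialize allocators for this logical sequence: the row lock taken by
-- FOR UPDATE blocks concurrent sessions until this transaction ends.
SELECT constvalue
  FROM a_constants_str
 WHERE constname = 'DOCID' AND fid = 0
   FOR UPDATE;

-- The application computes the next value from the one it just read
-- (e.g. 'SOF_1585254' -> 'SOF_1585255') and writes it back.
UPDATE a_constants_str
   SET constvalue = 'SOF_1585255'
 WHERE constname = 'DOCID' AND fid = 0;

COMMIT;
```

Unlike a native sequence, this pattern is transactional (a rollback returns the value), which is presumably why it is used for replication and data synchronisation; the cost is that all writers to one counter serialize on the row lock.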
pginfo <pginfo@t1.unisoftbg.com> writes:
> In this table we store the last ID value used by some of the other
> tables. For each table we have one constant in this table; we use the
> table as a sequence. For example, if we need to insert the next record
> into some table, we do:
> select constvalue from a_constants_str where constname = '...' for update;
> increase the value, and then
> update a_constants_str set constvalue = (new value) where ...
> It is not as simple as I described, but in general this is the case.
> All this is connected with replication and data synchronisation and so on.

"Connected"? What exactly is hiding under that last comment?

One way I could take your report is that you've found a weird interaction
between SELECT FOR UPDATE and VACUUM FULL that no one else has seen
before. Another way is that you're using some nonstandard backend
extension that has nasty bugs in it.

It is interesting that you say this system has been working well for
years and only recently have you seen problems. To me the obvious
question is "what have you changed recently?" It might not be a bogus
change in itself, but it could have triggered a bug at lower levels.

It's certainly possible that you have your finger on a backend bug, but
if so there's not nearly enough information here for anyone to find and
fix it. You need to be thinking in terms of how to reproduce the problem
so that it can be studied and fixed. "How can I avoid this problem" is
exactly the wrong question to be asking, because even if avoiding it is
all that concerns you, no one can answer with any confidence until we
understand what the failure mechanism is.

regards, tom lane
Tom Lane wrote:
> "Connected"? What exactly is hiding under that last comment?

We use a separate table for sequences, rather than pg sequences
directly, because this is the application's built-in method for
replication and data synchronisation. I only wanted to clarify the
reason for using the table and to explain the ground for so many
updates and selects for update. Sorry for my bad English :(.

> One way I could take your report is that you've found a weird
> interaction between SELECT FOR UPDATE and VACUUM FULL that no one else
> has seen before. Another way is that you're using some nonstandard
> backend extension that has nasty bugs in it.
>
> It is interesting that you say this system has been working well for
> years and only recently have you seen problems.

Yes, exactly.

> To me the obvious question is "what have you changed recently?"

If only I knew!
In general we have not made any global changes to the database access
method. We use jdbc (the jdbc driver from pg) + jboss (a java-based
application server) + a connection pool (built into jboss). We use jdbc
both with Statement.executeBatch(...) and directly with
Statement.executeUpdate(...). We run exactly the same code with oracle
without any problem (oracle has other problems, and I prefer pg).

> It might not be a bogus change in itself, but it could have triggered
> a bug at lower levels. It's certainly possible that you have your
> finger on a backend bug, but if so there's not nearly enough
> information here for anyone to find and fix it.

I am sure (though not 100%) that it is a bug. That is the reason for
reporting the problem.

> You need to be thinking in terms of how to reproduce the problem so
> that it can be studied and fixed.

Your idea was that we have "vacuum full" + update or select for update
at the same time. I think that is not the case, because we start vacuum
full at 1:00 AM and no one is working at that time.

Will vacuum full cause this problem if we have a locked table at that
time? (In theory it is possible to have a locked table.)

At this time we have no information about how to reproduce the problem.
As a first step we will stop using "vacuum full" (and if needed we will
stop using vacuum analyze too) to try to collect more information.

> "How can I avoid this problem" is exactly the wrong question to be
> asking, because even if avoiding it is all that concerns you, no one
> can answer with any confidence until we understand what the failure
> mechanism is.

Can we set some log parameters to collect the needed data?
Can you describe in more detail the idea of the "vacuum full" +
"update" problem, and could someone make a patch if this problem exists
in theory (if I understand you correctly)? We could start using such a
patch and see if the problem occurs again.

If you have another idea, we are ready to collect the needed data.

regards,
ivan.
On Tue, 2005-02-22 at 10:33 +0100, pginfo wrote:
> We are using jdbc (jdbc driver from pg) + jboss (java based
> application server) + connection pool (built into jboss).
> ...
> Will vacuum full cause this problem if we have a locked table at that
> time? (In theory it is possible to have a locked table.)

I do not know if this is relevant, but I have seen jboss applications
keep sessions in 'Idle in transaction' state, apparently with some
locks granted. Would such cases not interfere with vacuum?

gnari
Ragnar Hafstað wrote:
> I do not know if this is relevant, but I have seen jboss applications
> keep sessions in 'Idle in transaction' state, apparently with some
> locks granted. Would such cases not interfere with vacuum?

Only to add: we also possibly keep some transactions uncommitted for a
long time.

regards,
ivan.
Hi, Ragnar,

Ragnar Hafstað schrieb:
> > We are using jdbc (jdbc driver from pg) + jboss (java based
> > application server) + connection pool (built into jboss).
> > ...
> > Will vacuum full generate this problem if we have locked table in
> > this time? (It is possible to have locked table in theory)
> I do not know if this is relevant, but I have seen jboss applications
> keep sessions in 'Idle in transaction' state, apparently with some
> locks granted. Would such cases not interfere with vacuum?

Most of those "Idle in transaction" problems were caused by suboptimal
handling of BEGIN in the pgjdbc driver; this should be fixed in current
versions of postgres.jar (build 8.0-310).

Markus
--
markus schaber | dipl. informatiker
logi-track ag | rennweg 14-16 | ch 8001 zürich
phone +41-43-888 62 52 | fax +41-43-888 62 53
mailto:schabios@logi-track.com | www.logi-track.com
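[Editorial note: a quick way to check for the lingering sessions Ragnar and Markus describe is to query the statistics views. This is a sketch for the pre-8.2 view layout discussed in this thread; it assumes stats_command_string = true is set in postgresql.conf, otherwise current_query is not populated.]

```sql
-- Backends sitting in an open transaction with no statement running
-- report the literal string '<IDLE> in transaction' as their query.
SELECT procpid, usename
  FROM pg_stat_activity
 WHERE current_query = '<IDLE> in transaction';
```

Any session listed here still holds whatever locks its transaction has acquired, which is exactly the kind of interference with nightly maintenance that Ragnar asks about.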
pginfo <pginfo@t1.unisoftbg.com> writes:
> Your idea was that we have "vacuum full" + update or select for update
> at the same time. I think that is not the case, because we start
> vacuum full at 1:00 AM and no one is working at that time.

Hmm. AFAICT the duplicate row copies could only be produced by vacuum
full, so that's certainly part of the issue. But if vacuum full in
isolation were broken, we'd surely know it; so there must be some other
contributing factor involved that your setup is exercising but other
people are (mostly) not doing.

I agree with the plan to use plain vacuum for awhile and see if that
makes the problem go away. I think it would have to, but maybe I'm all
wet about that. In the meantime I would suggest seeing if you can
distill your application down into a test case that other people can
run to reproduce the problem. It doesn't matter if the test doesn't
make the bug happen very often, but we have to see the problem
happening before we have much hope of fixing it.

> Will vacuum full cause this problem if we have a locked table at that
> time? (In theory it is possible to have a locked table.)

No, that's hardly likely. vacuum full deals with locks all the time.

> Can you describe in more detail the idea of the "vacuum full" +
> "update" problem, and could someone make a patch if this problem
> exists in theory (if I understand you correctly)?

I have no idea what the actual failure mechanism might be.

regards, tom lane
Tom Lane wrote:
> Hmm. AFAICT the duplicate row copies could only be produced by vacuum
> full, so that's certainly part of the issue. But if vacuum full in
> isolation were broken, we'd surely know it; so there must be some other
> contributing factor involved that your setup is exercising but other
> people are (mostly) not doing.
>
> I agree with the plan to use plain vacuum for awhile and see if that
> makes the problem go away. I think it would have to, but maybe I'm all
> wet about that.

Ok, we can keep using vacuum full on some installs (at the risk of
causing problems for customers).
I would like to ask whether it is possible to run some queries (I do
not know which) just before running vacuum full and save the results in
a log file. If so, we would be able to post the results to the list the
next time the problem occurs and have a starting point for reproducing
it. My idea is that someone more familiar with pg could send these
queries (if they exist) and we will install them in the vacuum scripts.

> In the meantime I would suggest seeing if you can distill your
> application down into a test case that other people can run to
> reproduce the problem. It doesn't matter if the test doesn't make the
> bug happen very often, but we have to see the problem happening before
> we have much hope of fixing it.
>
> > Will vacuum full cause this problem if we have a locked table at
> > that time? (In theory it is possible to have a locked table.)
> No, that's hardly likely. vacuum full deals with locks all the time.
>
> > Can you describe in more detail the idea of the "vacuum full" +
> > "update" problem, and could someone make a patch if this problem
> > exists in theory (if I understand you correctly)?
> I have no idea what the actual failure mechanism might be.

regards,
ivan.
pginfo <pginfo@t1.unisoftbg.com> writes:
> I would like to ask whether it is possible to run some queries (I do
> not know which) just before running vacuum full and save the results
> in a log file. If so, we would be able to post the results to the list
> the next time the problem occurs and have a starting point for
> reproducing it.

Well, you should definitely turn on log_statement across the whole
installation so that you have a complete record of all SQL commands
being issued. Make sure the log includes timestamps and PIDs.

I would suggest adding a simple probe for duplicate records to the
vacuum script. Maybe something like

    set enable_indexscan to off;
    select constname, fid, count(*) from a_constants_str
    group by constname, fid having count(*) > 1;

(The indexscan off bit is just paranoia --- I think that an indexscan
might mask the presence of multiple copies of what's supposedly a
unique key.) Do this just before and just after the vacuum full
command. That will at least nail down whether vacuum full is creating
the dups, and once we see it happen the trace of the day's SQL commands
may give some ideas where to look.

regards, tom lane
Tom Lane wrote:
> Well, you should definitely turn on log_statement across the whole
> installation so that you have a complete record of all SQL commands
> being issued. Make sure the log includes timestamps and PIDs.
>
> I would suggest adding a simple probe for duplicate records to the
> vacuum script. [...] Do this just before and just after the vacuum
> full command. That will at least nail down whether vacuum full is
> creating the dups, and once we see it happen the trace of the day's
> SQL commands may give some ideas where to look.

Ok, that was my idea too.
We will do it, install the script on ~100 servers, and watch the
results.

regards,
ivan.