Thread: Corruption or wrong results with 14.10?
Hi, I have a very strange behavior on 14.10. smrdbprod=# select version(); version ---------------------------------------------------------------------------------------------------------- PostgreSQL 14.10 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-20), 64-bit (1 row) This is the statement which is executed: insert into smrr_mgr.formula_composition_with_codes select crart_id_par, product_par.code_fin code_fin_par, crfrm_chemin, is_primary, seq, crart_id_comp, product_comp.code_fin code_fin_comp, qty_per from smrr_mgr.formula_composition, smrr_mgr.formula, smrr_mgr.product product_par, smrr_mgr.product product_comp where formula_composition.crart_id_par = formula.crart_id and formula_composition.crfrm_chemin = formula.chemin and formula_composition.crart_id_par = product_par.id and formula_composition.crart_id_comp = product_comp.id order by seq; Here are the table definitions: smrdbprod=# \d pg_class Table "pg_catalog.pg_class" Column | Type | Collation | Nullable | Default ---------------------+--------------+-----------+----------+--------- oid | oid | | not null | relname | name | | not null | relnamespace | oid | | not null | reltype | oid | | not null | reloftype | oid | | not null | relowner | oid | | not null | relam | oid | | not null | relfilenode | oid | | not null | reltablespace | oid | | not null | relpages | integer | | not null | reltuples | real | | not null | relallvisible | integer | | not null | reltoastrelid | oid | | not null | relhasindex | boolean | | not null | relisshared | boolean | | not null | relpersistence | "char" | | not null | relkind | "char" | | not null | relnatts | smallint | | not null | relchecks | smallint | | not null | relhasrules | boolean | | not null | relhastriggers | boolean | | not null | relhassubclass | boolean | | not null | relrowsecurity | boolean | | not null | relforcerowsecurity | boolean | | not null | relispopulated | boolean | | not null | relreplident | "char" | | not null | relispartition | boolean | | not null | relrewrite | oid | | not null | relfrozenxid | xid | | not null | relminmxid | xid | | not null | relacl | aclitem[] | | | reloptions | text[] | C | | relpartbound | pg_node_tree | C | | Indexes: "pg_class_oid_index" PRIMARY KEY, btree (oid) "pg_class_relname_nsp_index" UNIQUE CONSTRAINT, btree (relname, relnamespace) "pg_class_tblspc_relfilenode_index" btree (reltablespace, relfilenode) smrdbprod=# \d smrr_mgr.formula_composition_with_codes Table "smrr_mgr.formula_composition_with_codes" Column | Type | Collation | Nullable | Default ---------------+-----------------------+-----------+----------+--------- crart_id_par | numeric | | | code_fin_par | character varying(18) | | | crfrm_chemin | character varying(2) | | | is_primary | character varying(1) | | | seq | numeric(6,0) | | | crart_id_comp | numeric | | | code_fin_comp | character varying(18) | | | qty_per | numeric | | | Indexes: "formula_composition_with_codes_crart_id_comp_index" btree (crart_id_comp) "formula_composition_with_codes_unique_index" UNIQUE, btree (crart_id_par, is_primary, crfrm_chemin, seq, crart_id_comp) smrdbprod=# \d smrr_mgr.formula_composition Table "smrr_mgr.formula_composition" Column | Type | Collation | Nullable | Default ----------------+-----------------------------+-----------+----------+--------- crart_id_par | numeric | | not null | crfrm_chemin | character varying(2) | | not null | seq | numeric(6,0) | | not null | qty | numeric(7,0) | | | assay | numeric(6,2) | | | crart_id_comp | numeric | | | crart_id_solv | numeric | | | qty_per | numeric(21,20) | | | recid | numeric | | | update_date_dl | timestamp without time zone | | | Indexes: "formula_composition_pkey" PRIMARY KEY, btree (crart_id_par, crfrm_chemin, seq) "formula_composition_crart_id_comp2_index" btree (crart_id_comp) "formula_composition_crart_id_par2_index" btree (crart_id_par) "formula_composition_update_date_dl2_index" btree (update_date_dl) smrdbprod=# \d smrr_mgr.formula Table "smrr_mgr.formula" Column | Type | Collation | Nullable | Default --------------+-----------------------------+-----------+----------+--------- crart_id | numeric | | not null | chemin | character varying(2) | | not null | is_primary | character varying(1) | | not null | nb_part | numeric(7,0) | | not null | nb_art | numeric(3,0) | | not null | lower_level | numeric(2,0) | | not null | is_ch_ok | character varying(1) | | not null | dt_creat | timestamp without time zone | | | dt_modif | timestamp without time zone | | | dt_finalized | timestamp without time zone | | | recid | numeric | | | a_dt_ins | timestamp without time zone | | not null | a_dt_upd | timestamp without time zone | | | a_fc_ins | character varying(30) | | not null | a_fc_upd | character varying(30) | | | a_us_ins | character varying(30) | | not null | a_us_upd | character varying(30) | | | dt_validity | timestamp without time zone | | | r_mem_id | numeric | | | o_mem_id | numeric | | | is_active_dl | character varying(1) | | not null | inactive_ts | timestamp without time zone | | | Indexes: "formula_pkey" PRIMARY KEY, btree (crart_id, chemin) "formula_last_upd_dt_index" btree (COALESCE(a_dt_upd, a_dt_ins)) Table "smrr_mgr.product" Column | Type | Collation | Nullable | Default --------------------------+-----------------------------+-----------+----------+--------- id | numeric | | not null | code_fin | character varying(18) | | not null | code_lab | character varying(18) | | | fiipc_code | numeric(2,0) | | | is_experimental | character varying(1) | | not null | is_fl | character varying(1) | | not null | is_pe | character varying(1) | | not null | is_ch | character varying(1) | | not null | prod_src | character varying(1) | | not null | status | character varying(1) | | | grp_cat | character varying(1) | | | gcp | character varying(1) | | | collection | character varying(1) | | | pct_pe_prom | numeric(3,0) | | not null | pct_fl_prom | numeric(3,0) | | not null | pct_fab | numeric(3,0) | | not null | std_avail | numeric(10,0) | | not null | brand | character varying(60) | | | costc_code | numeric(2,0) | | not null | coton_code | numeric(6,0) | | | is_fl_for_blend | character varying(1) | | not null | dt_creat | timestamp without time zone | | | recid | numeric | | | a_dt_ins | timestamp without time zone | | not null | a_dt_upd | timestamp without time zone | | | a_fc_ins | character varying(30) | | not null | a_fc_upd | character varying(30) | | | a_us_ins | character varying(30) | | not null | a_us_upd | character varying(30) | | | remark | character varying(200) | | | dt_transf | timestamp without time zone | | | is_cnc | character varying(1) | | not null | r_mem_id | numeric | | | is_frm | character varying(1) | | not null | is_rest_pe | character varying(1) | | | is_rest_fl | character varying(1) | | | o_mem_id | numeric | | | prod_nat | character varying(1) | | not null | is_food_ingredient | character varying(1) | | not null | is_heart | character varying(1) | | not null | fl_solubility | character varying(20) | | | scipc_code | numeric(2,0) | | | mail_msg | character varying(200) | | | is_oc | character varying(1) | | not null | is_oc_dt_upd | timestamp without time zone | | | is_oc_fc_upd | character varying(30) | | | is_oc_us_upd | character varying(30) | | | ton_s_perception | character varying(2) | | not null | dt_finalized | timestamp without time zone | | | rd_coll_number | character varying(10) | | | fl_rm_intro_pid | character varying(4) | | | fl_rm_intro_zone | character varying(10) | | | fl_rm_intro_cat | character varying(9) | | | fl_rm_intro_sub_cat | character varying(9) | | | fl_rm_intro_is_sensitive | character varying(1) | | | is_active_dl | character varying(1) | | not null | inactive_ts | timestamp without time zone | | | Indexes: "product_pkey" PRIMARY KEY, btree (id) "product_code_fin" btree (code_fin) "product_code_fin_prefix" btree (substr(code_fin::text, 1, 6)) "product_code_fin_prefix_new_serie" btree (substr(code_fin::text, 1, 9)) "product_last_upd_dt_index" btree (COALESCE(a_dt_upd, a_dt_ins)) The target table is empty and when the statement is executed there is a unique key violation: smrdbprod=# truncate smrr_mgr.formula_composition_with_codes; TRUNCATE TABLE smrdbprod=# insert into smrr_mgr.formula_composition_with_codes smrdbprod-# select crart_id_par, smrdbprod-# product_par.code_fin code_fin_par, smrdbprod-# crfrm_chemin, smrdbprod-# is_primary, smrdbprod-# seq, smrdbprod-# crart_id_comp, smrdbprod-# product_comp.code_fin code_fin_comp, smrdbprod-# qty_per smrdbprod-# from smrr_mgr.formula_composition, smrdbprod-# smrr_mgr.formula, smrdbprod-# smrr_mgr.product product_par, smrdbprod-# smrr_mgr.product product_comp smrdbprod-# where formula_composition.crart_id_par = formula.crart_id smrdbprod-# and formula_composition.crfrm_chemin = formula.chemin smrdbprod-# and formula_composition.crart_id_par = product_par.id smrdbprod-# and formula_composition.crart_id_comp = product_comp.id smrdbprod-# order by seq; ERROR: duplicate key value violates unique constraint "formula_composition_with_codes_unique_index" DETAIL: Key (crart_id_par, is_primary, crfrm_chemin, seq, crart_id_comp)=(4775290000, Y, @, 0, 1765660000) already exists. Asking for the conditions in the key violation error message gives only one row: smrdbprod=# select crart_id_par, smrdbprod-# product_par.code_fin code_fin_par, smrdbprod-# crfrm_chemin, smrdbprod-# is_primary, smrdbprod-# seq, smrdbprod-# crart_id_comp, smrdbprod-# product_comp.code_fin code_fin_comp, smrdbprod-# qty_per smrdbprod-# from smrr_mgr.formula_composition, smrdbprod-# smrr_mgr.formula, smrdbprod-# smrr_mgr.product product_par, smrdbprod-# smrr_mgr.product product_comp smrdbprod-# where formula_composition.crart_id_par = formula.crart_id smrdbprod-# and formula_composition.crfrm_chemin = formula.chemin smrdbprod-# and formula_composition.crart_id_par = product_par.id smrdbprod-# and formula_composition.crart_id_comp = product_comp.id smrdbprod-# and crart_id_par = 4775290000 smrdbprod-# and is_primary = 'Y' smrdbprod-# and crfrm_chemin = '@' smrdbprod-# and seq = 0 smrdbprod-# and crart_id_comp = 1765660000 smrdbprod-# order by seq; crart_id_par | code_fin_par | crfrm_chemin | is_primary | seq | crart_id_comp | code_fin_comp | qty_per --------------+--------------+--------------+------------+-----+---------------+---------------+------------------------ 4775290000 | 475860 KN | @ | Y | 0 | 1765660000 | 908290 | 0.00078864353312302800 (1 row) Doing the same with enable_hashjoin = off, runs successfully: smrdbprod=# truncate smrr_mgr.formula_composition_with_codes; TRUNCATE TABLE smrdbprod=# set enable_hashjoin to off; SET smrdbprod=# insert into smrr_mgr.formula_composition_with_codes select crart_id_par, product_par.code_fin code_fin_par, crfrm_chemin, is_primary, seq, crart_id_comp, product_comp.code_fin code_fin_comp, qty_per from smrr_mgr.formula_composition, smrr_mgr.formula, smrr_mgr.product product_par, smrr_mgr.product product_comp where formula_composition.crart_id_par = formula.crart_id and formula_composition.crfrm_chemin = formula.chemin and formula_composition.crart_id_par = product_par.id and formula_composition.crart_id_comp = product_comp.id order by seq; INSERT 0 20756629 I've tried to reproduce this locally and dumped those tables. While importing them I got errors like this: psql:restore.sql:242: ERROR: could not create unique index "formula_pkey" DETAIL: Key (crart_id, chemin)=(12383610000, @) is duplicated. Checking the primary key on the live database gave this: smrdbprod=# reindex index CONCURRENTLY smrr_mgr.formula_pkey; ERROR: could not create unique index "formula_pkey_ccnew" DETAIL: Key (crart_id, chemin)=(21507180000, @) is duplicated. smrdbprod=# But: smrdbprod=# select count(*) from smrr_mgr.formula where crart_id = 21507180000 and chemin = '@'; count ------- 1 (1 row) smrdbprod=# select crart_id, chemin, count(*) from smrr_mgr.formula group by crart_id, chemin having count(*) > 1; crart_id | chemin | count ----------+--------+------- (0 rows) What do I see here? Corruption? Thanks for any help Regards Daniel
>smrdbprod=# select count(*) from smrr_mgr.formula where crart_id = 21507180000 and chemin = '@';
> count
>-------
> 1
>(1 row)
>
>smrdbprod=# select crart_id, chemin, count(*) from smrr_mgr.formula group by crart_id, chemin having count(*) > 1;
> crart_id | chemin | count
>----------+--------+-------
>(0 rows)
> count
>-------
> 1
>(1 row)
>
>smrdbprod=# select crart_id, chemin, count(*) from smrr_mgr.formula group by crart_id, chemin having count(*) > 1;
> crart_id | chemin | count
>----------+--------+-------
>(0 rows)
Small update:
smrdbprod=# set enable_indexscan to off;
SET
smrdbprod=# select crart_id, chemin, count(*) from smrr_mgr.formula group by crart_id, chemin having count(*) > 1;
crart_id | chemin | count
-------------+--------+-------
35054630000 | @ | 2
4737310000 | @ | 2
10632380000 | @ | 2
14680880000 | @ | 2
4627230000 | @ | 2
10993780000 | @ | 2
....
I think I know what I have to do.
On Thu, Nov 23, 2023 at 2:29 PM Daniel Westermann (DWE) <daniel.westermann@dbi-services.com> wrote:
smrdbprod=# select crart_id, chemin, count(*) from smrr_mgr.formula group by crart_id, chemin having count(*) > 1;crart_id | chemin | count-------------+--------+-------35054630000 | @ | 24737310000 | @ | 210632380000 | @ | 214680880000 | @ | 24627230000 | @ | 210993780000 | @ | 2....I think I know what I have to do.
If you have moved the database from a system with glibc <2.28 to >=2.28, that could be the reason.
On Thu, Nov 23, 2023 at 7:34 AM Torsten Förtsch <tfoertsch123@gmail.com> wrote:
On Thu, Nov 23, 2023 at 2:29 PM Daniel Westermann (DWE) <daniel.westermann@dbi-services.com> wrote:smrdbprod=# select crart_id, chemin, count(*) from smrr_mgr.formula group by crart_id, chemin having count(*) > 1;crart_id | chemin | count-------------+--------+-------35054630000 | @ | 24737310000 | @ | 210632380000 | @ | 214680880000 | @ | 24627230000 | @ | 210993780000 | @ | 2....I think I know what I have to do.If you have moved the database from a system with glibc <2.28 to >=2.28, that could be the reason.
Just one note here... the reason can be _any_ operating systems move or update. It can happen with ICU and it can happen with any version of glibc (this is easily reproducible and we've seen it happen on production PG deployments that didn't involve glibc 2.28)
glibc 2.28 has certainly been the most obvious and impactful case, so the focus is understandable, but there's a bit of a myth that the problem is only with glibc 2.28 (and not ICU or other glibc versions or data structures other than indexes)
The only truly safe way to update an operating system under PosgreSQL is with logical dump/load or logical replication, or continuing to compile and use the identical older version of ICU from the old OS (if you use ICU). I think the ICU folks are generally careful enough that it'll be unlikely for compiler changes and new compiler optimizations to inadvertently change collation on newer operating systems and build toolchains.
Ironically I just did a detailed talk on this topic at PASS Data Summit last week, but unfortunately the recording probably won't be released for awhile. 🙂
-Jeremy