Thread: bug in ALTER TABLE / TYPE
A coworker of mine reported a subtle issue in ATExecAlterColumnType() in tablecmds.c. Suppose we have the following DDL: CREATE TABLE pktable (a int primary key, b int); CREATE TABLE fktable (fk int references pktable, c int); ALTER TABLE pktable ALTER COLUMN a TYPE bigint; Circa line 4891 in current sources, we begin a system table scan to look for pg_depend rows that depend on the column we're modifying. In this case, there are two dependencies on the column: the pg_constraint row for pktable's PK constraint, and the pg_constraint row for fktable's FK constraint. The bug is that we require the systable scan to return the FK constraint before the PK constraint -- if we attempt to remove the PK constraint first, it will fail because the FK constraint still exists and depends on the PK constraint. This bug is difficult to reproduce with a normal postmaster and vanilla sources, as the systable scan is usually implemented via a B+-tree scan on (refclassid, refobjid, refobjsubid). For this particular test case these three fields have equal values for the PK and FK constraint pg_depend rows, so the order in which the two constraints are returned is undefined. It just so happens that the Postgres b+-tree implementation *usually* returns the FK constraint first (per comments in nbtinsert.c, we usually place an equal key value at the end of a chain of equal keys, but stop looking for the end of the chain with a probability of 1%). And of course there is no ordering constraint if the systable scan is implemented via a heap scan (which would happen if, say, we're ignoring indexes on system catalogs in a standalone backend). To reproduce, any of: (1) Run the above SQL in a standalone backend started with the "-P" flag (2) Change "true" to "false" in the third argument to systable_beginscan() at tablecmds.c:4891, which means a heap scan will be used by a normal backend. (3) I've attached a dirty kludge of a patch that shuffles the results of the systable scan with probability 50%. Applying the patch should repro the bug with a normal backend (approx. 1 in 2 times, naturally). I'm not too familiar with the pg_depend code, so I'm not sure the right fix. Comments? -Neil Index: src/backend/commands/tablecmds.c =================================================================== RCS file: /var/lib/cvs/pgsql/src/backend/commands/tablecmds.c,v retrieving revision 1.162 diff -c -r1.162 tablecmds.c *** src/backend/commands/tablecmds.c 28 Jun 2005 05:08:54 -0000 1.162 --- src/backend/commands/tablecmds.c 29 Jun 2005 07:15:07 -0000 *************** *** 4801,4806 **** --- 4801,4809 ---- ScanKeyData key[3]; SysScanDesc scan; HeapTuple depTup; + List *tmp_list = NIL; + List *tmp_list2 = NIL; + ListCell *lc; attrelation = heap_open(AttributeRelationId, RowExclusiveLock); *************** *** 4893,4901 **** while (HeapTupleIsValid(depTup = systable_getnext(scan))) { ! Form_pg_depend foundDep = (Form_pg_depend) GETSTRUCT(depTup); ObjectAddress foundObject; /* We don't expect any PIN dependencies on columns */ if (foundDep->deptype == DEPENDENCY_PIN) elog(ERROR, "cannot alter type of a pinned column"); --- 4896,4941 ---- while (HeapTupleIsValid(depTup = systable_getnext(scan))) { ! tmp_list = lappend(tmp_list, heap_copytuple(depTup)); ! } ! ! foreach (lc, tmp_list) ! { ! if (lnext(lc) != NULL) ! { ! Form_pg_depend foundDep = (Form_pg_depend) GETSTRUCT((HeapTuple) lfirst(lc)); ! Form_pg_depend foundDepNext = (Form_pg_depend) GETSTRUCT((HeapTuple) lfirst(lnext(lc))); ! ! if (foundDep->refclassid == foundDepNext->refclassid && ! foundDep->refobjid == foundDepNext->refobjid && ! foundDep->refobjsubid == foundDepNext->refobjsubid) ! { ! if (random() > (MAX_RANDOM_VALUE / 2)) ! { ! tmp_list2 = lappend(tmp_list2, lfirst(lnext(lc))); ! tmp_list2 = lappend(tmp_list2, lfirst(lc)); ! lc = lnext(lc); ! elog(NOTICE, "choosing to shuffle"); ! continue; ! } ! else ! elog(NOTICE, "choosing not to shuffle"); ! } ! } ! ! tmp_list2 = lappend(tmp_list2, lfirst(lc)); ! } ! ! Assert(list_length(tmp_list) == list_length(tmp_list2)); ! ! foreach (lc, tmp_list2) ! { ! Form_pg_depend foundDep; ObjectAddress foundObject; + depTup = lfirst(lc); + foundDep = (Form_pg_depend) GETSTRUCT(depTup); + /* We don't expect any PIN dependencies on columns */ if (foundDep->deptype == DEPENDENCY_PIN) elog(ERROR, "cannot alter type of a pinned column");
I realize this needs review, but I will put it the queue so we don't forget it. Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers reviews and approves it. --------------------------------------------------------------------------- Neil Conway wrote: > A coworker of mine reported a subtle issue in ATExecAlterColumnType() in > tablecmds.c. Suppose we have the following DDL: > > CREATE TABLE pktable (a int primary key, b int); > CREATE TABLE fktable (fk int references pktable, c int); > ALTER TABLE pktable ALTER COLUMN a TYPE bigint; > > Circa line 4891 in current sources, we begin a system table scan to look > for pg_depend rows that depend on the column we're modifying. In this > case, there are two dependencies on the column: the pg_constraint row > for pktable's PK constraint, and the pg_constraint row for fktable's FK > constraint. The bug is that we require the systable scan to return the > FK constraint before the PK constraint -- if we attempt to remove the PK > constraint first, it will fail because the FK constraint still exists > and depends on the PK constraint. > > This bug is difficult to reproduce with a normal postmaster and vanilla > sources, as the systable scan is usually implemented via a B+-tree scan > on (refclassid, refobjid, refobjsubid). For this particular test case > these three fields have equal values for the PK and FK constraint > pg_depend rows, so the order in which the two constraints are returned > is undefined. It just so happens that the Postgres b+-tree > implementation *usually* returns the FK constraint first (per comments > in nbtinsert.c, we usually place an equal key value at the end of a > chain of equal keys, but stop looking for the end of the chain with a > probability of 1%). And of course there is no ordering constraint if the > systable scan is implemented via a heap scan (which would happen if, > say, we're ignoring indexes on system catalogs in a standalone backend). > > To reproduce, any of: > > (1) Run the above SQL in a standalone backend started with the "-P" flag > > (2) Change "true" to "false" in the third argument to > systable_beginscan() at tablecmds.c:4891, which means a heap scan will > be used by a normal backend. > > (3) I've attached a dirty kludge of a patch that shuffles the results of > the systable scan with probability 50%. Applying the patch should repro > the bug with a normal backend (approx. 1 in 2 times, naturally). > > I'm not too familiar with the pg_depend code, so I'm not sure the right > fix. Comments? > > -Neil > > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian wrote: > I realize this needs review, but I will put it the queue so we don't > forget it. The patch does not need review, as it doesn't even attempt to fix the problem. (I just wrote the patch while analyzing the problem to make the error condition more easily reproduceable). I haven't had a chance to take a look at how this ought to be fixed... -Neil
Patch removed from queue. --------------------------------------------------------------------------- Neil Conway wrote: > A coworker of mine reported a subtle issue in ATExecAlterColumnType() in > tablecmds.c. Suppose we have the following DDL: > > CREATE TABLE pktable (a int primary key, b int); > CREATE TABLE fktable (fk int references pktable, c int); > ALTER TABLE pktable ALTER COLUMN a TYPE bigint; > > Circa line 4891 in current sources, we begin a system table scan to look > for pg_depend rows that depend on the column we're modifying. In this > case, there are two dependencies on the column: the pg_constraint row > for pktable's PK constraint, and the pg_constraint row for fktable's FK > constraint. The bug is that we require the systable scan to return the > FK constraint before the PK constraint -- if we attempt to remove the PK > constraint first, it will fail because the FK constraint still exists > and depends on the PK constraint. > > This bug is difficult to reproduce with a normal postmaster and vanilla > sources, as the systable scan is usually implemented via a B+-tree scan > on (refclassid, refobjid, refobjsubid). For this particular test case > these three fields have equal values for the PK and FK constraint > pg_depend rows, so the order in which the two constraints are returned > is undefined. It just so happens that the Postgres b+-tree > implementation *usually* returns the FK constraint first (per comments > in nbtinsert.c, we usually place an equal key value at the end of a > chain of equal keys, but stop looking for the end of the chain with a > probability of 1%). And of course there is no ordering constraint if the > systable scan is implemented via a heap scan (which would happen if, > say, we're ignoring indexes on system catalogs in a standalone backend). > > To reproduce, any of: > > (1) Run the above SQL in a standalone backend started with the "-P" flag > > (2) Change "true" to "false" in the third argument to > systable_beginscan() at tablecmds.c:4891, which means a heap scan will > be used by a normal backend. > > (3) I've attached a dirty kludge of a patch that shuffles the results of > the systable scan with probability 50%. Applying the patch should repro > the bug with a normal backend (approx. 1 in 2 times, naturally). > > I'm not too familiar with the pg_depend code, so I'm not sure the right > fix. Comments? > > -Neil > > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073