Thread: inner query bug
I was advised to send an e-mail after a discussion with neilc on #postgresql. The failure occurs randomly, never on the first time (at least that I've seen) and usually only after doing a bunch of other stuff first (big joins, views, etc). In one case, the backend terminated and restarted, this has only happened once. P4 w/1024 GB ram. RH9 uname -a Linux gecko.paycheckadv.com 2.6.0-0.test9.1.90 #1 Tue Nov 18 09:34:47 EST 2003 i686 i686 i386 GNU/Linux psql output follows: SC2test7=# select * from txstatus,tx txx where txstatus.txid = txx.txid AND txstatus.statuschangetime = (select max(txstatus.statuschangetime) from txstatus where txstatus.txid = txx.txid); FATAL: terminating connection due to administrator command server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. SC2test7=# \d txstatus no connection to the server The connection to the server was lost. Attempting reset: Succeeded. SC2test7=# \d txstatus Table "public.txstatus" Column | Type | Modifiers ------------------+--------------------------+------------------------------------------------------------------ txstatusid | integer | not null default nextval('public.txstatus_txstatusid_seq'::text) statusid | integer | not null txid | integer | not null statuschangetime | timestamp with time zone | not null employeeid | integer | not null Indexes: pk_txstatus_txstatusid primary key btree (txstatusid), idx_txstatus_statuschangetime btree (statuschangetime) Foreign Key constraints: fk_txstatus_statusid FOREIGN KEY (statusid) REFERENCES status(statusid), fk_txstatus_txid FOREIGN KEY (txid) REFERENCES tx(txid), fk_txstatus_employeeid FOREIGN KEY (employeeid) REFERENCES employee(employeeid) SC2test7=# \d tx Table "public.tx" Column | Type | Modifiers -------------------+--------------------------+------------------------------------------------------ txid | integer | not null default nextval('public.tx_txid_seq'::text) txtypeid | integer | not null statusid | integer | not null amount | numeric(6,2) | not null servicecharge | numeric(6,2) | not null customeraccountid | integer | checksid | integer | rent | numeric(6,2) | scheduledpayments | integer | txtimestamp | timestamp with time zone | not null Indexes: pk_tx_txid primary key btree (txid), idx_tx_statusid btree (statusid), idx_tx_txtypeid btree (txtypeid) Foreign Key constraints: fk_tx_txtypeid FOREIGN KEY (txtypeid) REFERENCES txtype(txtypeid), fk_tx_statusid FOREIGN KEY (statusid) REFERENCES status(statusid), fk_tx_customeraccountid FOREIGN KEY (customeraccountid) REFERENCES customeraccount(customeraccountid), fk_tx_checksid FOREIGN KEY (checksid) REFERENCES checks(checksid) select * from txstatus,tx txx where txstatus.txid = txx.txid AND txstatus.statuschangetime = (select max(txstatus.statuschangetime) from txstatus where txstatus.txid = txx.txid); ERROR: variable not found in subplan target list
Andrew Holm-Hansen <andrew@einer.org> writes: > ERROR: variable not found in subplan target list What PG version is this? We've fixed a number of bugs with that symptom, IIRC ... regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > What PG version is this? We've fixed a number of bugs with that > symptom, IIRC ... When we were discussing this on IRC, I made sure to check it was a recent version of PG -- I believe it is 7.4.0. The last reported instance I could see of a bug with this symptom was for 7.4 beta 1 (at which point we thought we'd fixed it). -Neil
Neil Conway <neilc@samurai.com> writes: > When we were discussing this on IRC, I made sure to check it was a > recent version of PG -- I believe it is 7.4.0. The last reported > instance I could see of a bug with this symptom was for 7.4 beta 1 (at > which point we thought we'd fixed it). Yeah, I don't believe it is the same bug. That particular error message suggests that some part of the planner has dropped the ball in building target lists for plan nodes, but that could cover a lot of territory. The fact that Andrew is finding the bug difficult to reproduce reliably is odd too; maybe an uninitialized-variable problem? regards, tom lane
Andrew Holm-Hansen <andrew@einer.org> writes: > SC2test7=# select * from txstatus,tx txx where txstatus.txid = txx.txid > AND txstatus.statuschangetime = (select max(txstatus.statuschangetime) > from txstatus where txstatus.txid = txx.txid); FATAL: terminating > connection due to administrator command The "due to administrator command" bit leaves me still scratching my head a little. AFAICS that message can only appear in response to a SIGTERM signal. You might want to root around and see if there is anything in your system that might try to SIGTERM backends unexpectedly. > select * from txstatus,tx txx where txstatus.txid = txx.txid AND > txstatus.statuschangetime = (select max(txstatus.statuschangetime) from > txstatus where txstatus.txid = txx.txid); > ERROR: variable not found in subplan target list This behavior, however, I do understand now; including the fact that it's not very repeatable. Your test case involved enough tables (after view expansion) to result in GEQO planning being invoked, which would cause the selected plan to vary somewhat randomly. The failure would occur if a SubPlan got used in a clause of a hash join --- and the symptoms would be different depending on whether it was on the inner or outer side of the join. This problem is new in 7.4 because prior releases wouldn't consider hash join for join clauses any more complicated than "var = var". I plan to apply the attached minimal patch to 7.4 branch, and a more extensive cleanup to HEAD. regards, tom lane *** src/backend/executor/nodeHashjoin.c.orig Thu Sep 25 02:57:59 2003 --- src/backend/executor/nodeHashjoin.c Tue Nov 25 13:59:45 2003 *************** *** 417,423 **** */ hjstate->hj_InnerHashKeys = (List *) ExecInitExpr((Expr *) hashNode->hashkeys, ! innerPlanState(hjstate)); ((HashState *) innerPlanState(hjstate))->hashkeys = hjstate->hj_InnerHashKeys; --- 417,423 ---- */ hjstate->hj_InnerHashKeys = (List *) ExecInitExpr((Expr *) hashNode->hashkeys, ! (PlanState *) hjstate); ((HashState *) innerPlanState(hjstate))->hashkeys = hjstate->hj_InnerHashKeys; *** src/backend/optimizer/plan/createplan.c.orig Wed Aug 27 08:44:12 2003 --- src/backend/optimizer/plan/createplan.c Tue Nov 25 14:00:42 2003 *************** *** 1093,1104 **** hashclauses = order_qual_clauses(root, hashclauses); /* ! * Extract the inner hash keys (right-hand operands of the ! * hashclauses) to put in the Hash node. */ innerhashkeys = NIL; foreach(hcl, hashclauses) ! innerhashkeys = lappend(innerhashkeys, get_rightop(lfirst(hcl))); /* We don't want any excess columns in the hashed tuples */ disuse_physical_tlist(inner_plan, best_path->jpath.innerjoinpath); --- 1093,1106 ---- hashclauses = order_qual_clauses(root, hashclauses); /* ! * Extract the inner hash keys (right-hand operands of the hashclauses) ! * to put in the Hash node. Must do a deep copy in case there are ! * subplans in the hash keys. */ innerhashkeys = NIL; foreach(hcl, hashclauses) ! innerhashkeys = lappend(innerhashkeys, ! copyObject(get_rightop(lfirst(hcl)))); /* We don't want any excess columns in the hashed tuples */ disuse_physical_tlist(inner_plan, best_path->jpath.innerjoinpath);
I tried copying the text of the patch into a file and running patch on it, but apparently I'm somehow impaired. Manually patched the relevant files, and it appears to be working just fine. Thanks for all your help, it was really much more than I expected. Andrew Holm-Hansen andrew@einer.org On Tue, 2003-11-25 at 13:11, Tom Lane wrote: > Andrew Holm-Hansen <andrew@einer.org> writes: > > SC2test7=# select * from txstatus,tx txx where txstatus.txid = txx.txid > > AND txstatus.statuschangetime = (select max(txstatus.statuschangetime) > > from txstatus where txstatus.txid = txx.txid); FATAL: terminating > > connection due to administrator command > > The "due to administrator command" bit leaves me still scratching my > head a little. AFAICS that message can only appear in response to a > SIGTERM signal. You might want to root around and see if there is > anything in your system that might try to SIGTERM backends unexpectedly. > > > select * from txstatus,tx txx where txstatus.txid = txx.txid AND > > txstatus.statuschangetime = (select max(txstatus.statuschangetime) from > > txstatus where txstatus.txid = txx.txid); > > ERROR: variable not found in subplan target list > > This behavior, however, I do understand now; including the fact that > it's not very repeatable. Your test case involved enough tables (after > view expansion) to result in GEQO planning being invoked, which would > cause the selected plan to vary somewhat randomly. The failure would > occur if a SubPlan got used in a clause of a hash join --- and the > symptoms would be different depending on whether it was on the inner or > outer side of the join. This problem is new in 7.4 because prior > releases wouldn't consider hash join for join clauses any more > complicated than "var = var". > > I plan to apply the attached minimal patch to 7.4 branch, and a more > extensive cleanup to HEAD. > > regards, tom lane > > *** src/backend/executor/nodeHashjoin.c.orig Thu Sep 25 02:57:59 2003 > --- src/backend/executor/nodeHashjoin.c Tue Nov 25 13:59:45 2003 > *************** > *** 417,423 **** > */ > hjstate->hj_InnerHashKeys = (List *) > ExecInitExpr((Expr *) hashNode->hashkeys, > ! innerPlanState(hjstate)); > ((HashState *) innerPlanState(hjstate))->hashkeys = > hjstate->hj_InnerHashKeys; > > --- 417,423 ---- > */ > hjstate->hj_InnerHashKeys = (List *) > ExecInitExpr((Expr *) hashNode->hashkeys, > ! (PlanState *) hjstate); > ((HashState *) innerPlanState(hjstate))->hashkeys = > hjstate->hj_InnerHashKeys; > > *** src/backend/optimizer/plan/createplan.c.orig Wed Aug 27 08:44:12 2003 > --- src/backend/optimizer/plan/createplan.c Tue Nov 25 14:00:42 2003 > *************** > *** 1093,1104 **** > hashclauses = order_qual_clauses(root, hashclauses); > > /* > ! * Extract the inner hash keys (right-hand operands of the > ! * hashclauses) to put in the Hash node. > */ > innerhashkeys = NIL; > foreach(hcl, hashclauses) > ! innerhashkeys = lappend(innerhashkeys, get_rightop(lfirst(hcl))); > > /* We don't want any excess columns in the hashed tuples */ > disuse_physical_tlist(inner_plan, best_path->jpath.innerjoinpath); > --- 1093,1106 ---- > hashclauses = order_qual_clauses(root, hashclauses); > > /* > ! * Extract the inner hash keys (right-hand operands of the hashclauses) > ! * to put in the Hash node. Must do a deep copy in case there are > ! * subplans in the hash keys. > */ > innerhashkeys = NIL; > foreach(hcl, hashclauses) > ! innerhashkeys = lappend(innerhashkeys, > ! copyObject(get_rightop(lfirst(hcl)))); > > /* We don't want any excess columns in the hashed tuples */ > disuse_physical_tlist(inner_plan, best_path->jpath.innerjoinpath);