Thread: PostgreSQL 8.3.4 reproducible crash
Hello.<br /><br />Here is the SQL to reproduce the server crash:<br /><br /><br />CREATE SCHEMA bug1 AUTHORIZATION postgres;<br/><br />SET search_path = bug1, pg_catalog;<br /><br />CREATE FUNCTION bug1.domain_check (integer) RETURNS boolean<br/> AS <br />$body$<br />SELECT $1 <> 0<br />$body$<br />LANGUAGE sql IMMUTABLE STRICT;<br /><br />CREATEDOMAIN bug1."domain" AS integer<br /> CONSTRAINT "check" CHECK (bug1.domain_check(VALUE));<br /><br />CREATETYPE bug1.composite AS (<br /> id domain<br />);<br /><br />select '(1)'::bug1.composite;<br /><br />
Dmitry Koterov wrote: > Hello. > > Here is the SQL to reproduce the server crash: > > > CREATE SCHEMA bug1 AUTHORIZATION postgres; > > SET search_path = bug1, pg_catalog; > > CREATE FUNCTION bug1.domain_check (integer) RETURNS boolean > AS > $body$ > SELECT $1 <> 0 > $body$ > LANGUAGE sql IMMUTABLE STRICT; > > CREATE DOMAIN bug1."domain" AS integer > CONSTRAINT "check" CHECK (bug1.domain_check(VALUE)); > > CREATE TYPE bug1.composite AS ( > id domain > ); > > select '(1)'::bug1.composite; Reproducible on 8.2 as well. It's crashing in execQual.c, on line 299 (on REL_8_3_STABLE):/* * In a read-only function, use the surrounding query's snapshot;* otherwise take a new snapshot for each query. The snapshot should * include a fresh command ID so that all workto date in this transaction * is visible. We copy in both cases so that postquel_end can * unconditionally do FreeSnapshot.*/if (fcache->readonly_func)>>> snapshot = CopySnapshot(ActiveSnapshot);else{ CommandCounterIncrement(); snapshot = CopySnapshot(GetTransactionSnapshot());} because ActiveSnapshot is NULL. ActiveSnapshot has not yet been set, because the input function and domain checking is done in the parse analyze phase. I propose that we simply add add a NULL-check there: *** src/backend/executor/functions.c 1 Jan 2008 19:45:49 -0000 1.120 --- src/backend/executor/functions.c 10 Dec 2008 13:13:25 -0000 *************** *** 295,301 **** * is visible. We copy in both cases so that postquel_end can * unconditionally do FreeSnapshot. */ ! if (fcache->readonly_func) snapshot = CopySnapshot(ActiveSnapshot); else { --- 295,301 ---- * is visible. We copy in both cases so that postquel_end can * unconditionally do FreeSnapshot. */ ! if (fcache->readonly_func && ActiveSnapshot != NULL) snapshot = CopySnapshot(ActiveSnapshot); else { 8.1 and before are not vulnerable to this, but only because the domain check wasn't invoked at all in cases like this: > postgres=# select '(0)'::bug1.composite; > composite > ----------- > (0) > (1 row) That was a known issue that was fixed in 8.2. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas wrote: > Dmitry Koterov wrote: >> select '(1)'::bug1.composite; > > Reproducible on 8.2 as well. And HEAD too. > because ActiveSnapshot is NULL. ActiveSnapshot has not yet been set, > because the input function and domain checking is done in the parse > analyze phase. I think the right fix is to ensure that ActiveSnapshot is set. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > I think the right fix is to ensure that ActiveSnapshot is set. The proximate cause is certainly lack of a snapshot, but I think there might be some other interesting issues here too. The crash comes when the SQL-language domain check function demands a snapshot --- but if you doselect '1'::bug1.domain; it works fine! The reason for the discrepancy is that parse_coerce.c handles a coercion to a domain by coercing the literal to the underlying type (int4 in this case) and then setting up a *runtime* domain coercion using a CoerceToDomain expression node. So in that case the domain check doesn't occur till runtime when a snapshot will be available. But in the crashing case we simply invoke record_in, which invokes domain_in, which invokes the SQL function. So there seem to be a couple of things we might want to do: 1. Ensure that a snapshot is set before doing parse analysis of any non-utility command. (We *must* not set a snap before LOCK or a transaction control command, and it seems best to not do it for any utility command.) One issue here is that this would change the behavior for mixed utility and non-utility commands in a single query string; though I'm not sure how much that matters. 2. Treat coercion to a record type as being something that should occur at runtime, ie build a RowExpr instead of a constant. This would produce more uniform behavior in terms of when domain constraints get applied. I think the core issue is really whether it is important to postpone domain constraint checks. Consider something like create domain d as int;create view v as select '-1'::d;alter domain d add constraint "c" check (value > 0);select * fromv; Right now you get an error at the SELECT, but that seems a bit surprising. It's even more surprising that the CREATE still works if you made the constraint first. And a novice might reasonably wonder why the domain check is postponed when the underlying type's checks occur instantly --- for example, this fails outright:create view v as select 'z'::d; So this is all a bit odd to start with, and then on top of that we have the issue that the check timing changes if you put the domain inside a record. Comments? regards, tom lane
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > I propose that we simply add add a NULL-check there: Certainly not. A read-only function has no business causing a snapshot to become established. regards, tom lane
On Wed, 2008-12-10 at 14:12 -0500, Tom Lane wrote: > create domain d as int; > create view v as select '-1'::d; > alter domain d add constraint "c" check (value > 0); > select * from v; > > Right now you get an error at the SELECT, but that seems a bit > surprising. It's even more surprising that the CREATE still works if > you made the constraint first. And a novice might reasonably wonder why > the domain check is postponed when the underlying type's checks occur > instantly --- for example, this fails outright: > create view v as select 'z'::d; > > So this is all a bit odd to start with, and then on top of that we have > the issue that the check timing changes if you put the domain inside a > record. > > Comments? > Does the standard provide guidance here? I took a look, and it's difficult to tell, because it uses words like "evaluation" (and I don't think that a view is required to actually evaluate anything). It also talks about deferrable and non-deferrable, which indicate that the constraint should apply at insertion time. Standard aside... To me, it seems reasonable that something like the CREATE VIEW above should fail, because you're specifying a literal of type "d" (invoking the type selector for "d" on a value representation of unknown type), and it is invalid in the domain "d". However, a similar construction: create view v as select cast('-1'::int AS d); seems slightly different to me, because the value already has a type, and the exception is raised from the explicit cast. It's the same as if it casted some variable "x" instead of '-1'::int, because variables already have types. Consider something like this: create view v as select 1::int/0::int; Here, I see division as a function that can raise an exception, similar to how an explicit cast can raise an exception. There's no expectation that the view will evaluate 1/0 at CREATE VIEW time, because 0 might be some variable "x" (ranging over some underlying table) that can't possibly be evaluated at view creation time. In other words, I see casts as functions that may or may not raise an exception during evaluation, and that should not be evaluated at view creation time. However, I do not see type selectors ('representation'::type) as functions, because they do not have an argument of a specific type. I think type selectors should be evaluated at view creation time, because the value must become a variable at that time. Regards,Jeff Davis
Tom Lane wrote: > 1. Ensure that a snapshot is set before doing parse analysis of any > non-utility command. (We *must* not set a snap before LOCK or a > transaction control command, and it seems best to not do it for any > utility command.) One issue here is that this would change the behavior > for mixed utility and non-utility commands in a single query string; > though I'm not sure how much that matters. I think this is the easiest way out, and the most robust -- we won't be bitten by some other operation that the parser may think of doing. (Note that utility commands have their snapshot set in PortalRunUtility). Also, perhaps this would let us clean the mess in pg_plan_queries. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes: > Tom Lane wrote: >> 1. Ensure that a snapshot is set before doing parse analysis of any >> non-utility command. > I think this is the easiest way out, and the most robust -- we won't be > bitten by some other operation that the parser may think of doing. Yeah. I think we probably have to do that in any case because we have an assumption that datatype input routines are allowed to make use of a snapshot (see comments in fastpath.c for instance). The fact that no one's noticed this crash before suggests that none of the common ones actually do, but I don't think we want to back off that assumption. There's still a question of whether we want to alter the treatment of record-type input to make the handling of embedded domains more uniform, but that's something for the future. regards, tom lane
Alvaro Herrera <alvherre@commandprompt.com> writes: > I think this is the easiest way out, and the most robust -- we won't be > bitten by some other operation that the parser may think of doing. > (Note that utility commands have their snapshot set in > PortalRunUtility). Also, perhaps this would let us clean the mess in > pg_plan_queries. After looking at this some more, it seems the problem is that the mess moves to the callers --- at least the ones that can't count on a snap being already set. Either that or we put the samw make-a-snap-if-needed logic into pg_analyze_and_rewrite; which doesn't seem very attractive because then we get two separate snapshot creation cycles for parse analysis and planning. It seems better to do it in the callers so that only one cycle is needed. Another possibility is to merge pg_analyze_and_rewrite and pg_plan_queries into a single function, but this doesn't appear to be convenient for all callers. The good news is that my concern about changing the snapshotting behavior of multiple-command query strings seems unfounded. That is only relevant to exec_simple_query and it's already set up so that parse analysis, planning, and execution happen one command at a time. So if a plannable statement comes before a utility statement, a transaction snapshot would've been acquired before the utility command executes anyway. regards, tom lane
Dmitry Koterov wrote: > Hello. > > Here is the SQL to reproduce the server crash: > > > CREATE SCHEMA bug1 AUTHORIZATION postgres; > > SET search_path = bug1, pg_catalog; > > CREATE FUNCTION bug1.domain_check (integer) RETURNS boolean > AS > $body$ > SELECT $1 <> 0 > $body$ > LANGUAGE sql IMMUTABLE STRICT; > > CREATE DOMAIN bug1."domain" AS integer > CONSTRAINT "check" CHECK (bug1.domain_check(VALUE)); > > CREATE TYPE bug1.composite AS ( > id domain > ); > > select '(1)'::bug1.composite; This has been fixed in CVS HEAD but I am unsure if and how far it was backpatched. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +