Thread: BUG #12106: pg_dump does not resolve circular dependency
The following bug has been logged on the website: Bug reference: 12106 Logged by: OndÅej Bouda Email address: bouda@edookit.com PostgreSQL version: 9.3.5 Operating system: Windows 7 Description: A function may reference itself via default values of some of its arguments. It is rather tricky to create one, but it is possible: -- first, define the function without argument defaults CREATE FUNCTION self_loop(a INT) RETURNS INT AS $$ SELECT 2*a; $$ LANGUAGE SQL; -- then, replace it with the argument defaults CREATE OR REPLACE FUNCTION self_loop(a INT = self_loop(1)) RETURNS INT AS $$ SELECT 2*a; $$ LANGUAGE SQL; Although the real usage is questionable, such a function is perfectly legal both in terms of documented limitations and the actual implementation. The problem is that pg_dump does not handle the circular dependency: ... pg_dump: reading dependency data pg_dump: [sorter] WARNING: could not resolve dependency loop among these items: pg_dump: [sorter] FUNCTION self_loop (ID 183 OID 41120) ... Indeed, in the resulting dump, it emits a straight CREATE statement with defaults right away - the same as the CREATE OR REPLACE above - which does not succeed, obviously: ERROR: function self_loop(integer) does not exist LINE 1: CREATE FUNCTION self_loop(a integer DEFAULT self_loop(1)) RE... ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. Even worse, pg_dump returns 0 as the exit code, even though the dump is not usable. It merely issues a warning on output. Generally, the problem might occur with multiple functions, creating a non-trivial circular dependency (e.g., foo(int = bar(1)) and bar(int = foo(2))). I suggest fixing pg_dump such that in case it finds a circular dependency, it cuts it by ignoring some of the attributes in the cycle (argument default, in this case) when emitting the CREATE statement, then emitting ALTER/REPLACE statements which will fix the ignored attributes. I don't know, however, whether it is even possible - currently, pg_depend does not contain information about the attribute causing the dependency; it could be extended (not sure about complexity, though). I will analyze whether this is the only problem of this kind, or whether there are more similar situations. Tested with pg_dump 9.3.5 and 9.4rc1 (which is not in the selectbox in the web form for reporting bugs, by the way).
bouda@edookit.com writes: > A function may reference itself via default values of some of its arguments. > It is rather tricky to create one, but it is possible: Unless you can demonstrate a credible use-case for this, I don't think it's worth adding complexity to pg_dump for. regards, tom lane
Tom Lane-2 wrote > bouda@ > writes: >> A function may reference itself via default values of some of its >> arguments. >> It is rather tricky to create one, but it is possible: > > Unless you can demonstrate a credible use-case for this, I don't think > it's worth adding complexity to pg_dump for. Assuming: "Even worse, pg_dump returns 0 as the exit code, even though the dump is not usable. It merely issues a warning on output." is confirmed I would assume that it would need to be fixed. I'm not against simply letting pg_dump error out but it should do so completely. I imagine it would be reasonably simple to disallow CREATE FUNCTION to reference the function being created in the default expressions but the indirect circular dependency would definite be trickier. The uninformed problem in both cases is dealing with function overloading - in addition to simply tracking the cycle in the later scenario. But ideally we wouldn't create something that pg_dump cannot restore...and not just allow pg_dump to fail and figure that the user is going to test their restore code. OTOH it is a function so unless itself is a dependency a warning would make sense because the user, upon being informed of the issue, could manually issue the same CREATE/REPLACE SQL command pair to re-construct the function manually if they really need the capability. David J. -- View this message in context: http://postgresql.nabble.com/BUG-12106-pg-dump-does-not-resolve-circular-dependency-tp5828685p5828698.html Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.
David G Johnston <david.g.johnston@gmail.com> writes: > Tom Lane-2 wrote >> Unless you can demonstrate a credible use-case for this, I don't think >> it's worth adding complexity to pg_dump for. > Assuming: "Even worse, pg_dump returns 0 as the exit code, even though > the dump is not usable. It merely issues a warning on output." is > confirmed I would assume that it would need to be fixed. "Not usable" is hyperbole. The actual effect would be that the particular function would fail to restore, but the rest of it would be fine. I should note that it's not like we've never seen any other cases of pg_dump failing to deal with circular dependencies. Issuing a warning and pushing on is *way* better than refusing the dump entirely. regards, tom lane
On Sun, 30 Nov 2014 17:31:14 +0100, Tom Lane <tgl@sss.pgh.pa.us> wrote: > bouda@edookit.com writes: >> A function may reference itself via default values of some of its >> arguments. >> It is rather tricky to create one, but it is possible: > > Unless you can demonstrate a credible use-case for this, I don't think > it's worth adding complexity to pg_dump for. > > regards, tom lane I admin I have no real world use case for the mentioned situation, although I think it should be documented as a pg_dump limitation, at least. I suggest adding a new section called Limitations on the pg_dump documentation page, which would list all currently known limitations. Looking at the object attributes and their possible dependencies on other objects, I found these cases not handled by pg_dump (i.e., resulting in an erroneous dump): 1) Function argument default referring to other objects, causing a circular dependency (even referring to the function itself) - this is the originally reported one. 2) Domain default calling a function - which therefore must return that domain type. E.g., CREATE DOMAIN dom AS INT; CREATE FUNCTION get_default_dom() RETURNS dom AS $$ SELECT 42::dom; $$ LANGUAGE sql; ALTER DOMAIN dom SET DEFAULT get_default_dom(); For these, pg_dump emits CREATE FUNCTION first, then CREATE DOMAIN. The CREATE FUNCTION fails as the domain has not been created yet. Worse, it even emits no warnings - the pg_dump output looks entirely clear. 3) View referring to itself (or other view which in turn refers to the first one...). E.g., CREATE VIEW selfview (o) AS SELECT NULL::regclass; CREATE OR REPLACE VIEW selfview (o) AS SELECT 'selfview'::regclass; Here, pg_dump emits CREATE VIEW selfview AS SELECT 'selfview'::regclass AS o; - which fails (ERROR: relation "selfview" does not exist). Similarly to 2), no warnings are given, i.e., one finds out only when actually trying to restore the dump. Generally, any object may be referenced by expressions like 'seq'::regclass within: - table column defaults - domain defaults - function argument defaults - view definitions Any of them may result in a circular dependency which pg_dump is unable to cut. I am aware of the fact all these cases might look silly (except for case 2), maybe), but still - mentioning them in the docs might help. In addition, it would be nice if pg_dump could at least detect the circular dependency in cases 2) and 3) and issue a warning, similarly to case 1). Ondřej Bouda P.S. Sorry for not coming out with a patch right away - I am a newbie here...
On 11/30/14 12:54 PM, Tom Lane wrote: > "Not usable" is hyperbole. The actual effect would be that the particular > function would fail to restore, but the rest of it would be fine. > > I should note that it's not like we've never seen any other cases of > pg_dump failing to deal with circular dependencies. Issuing a warning > and pushing on is *way* better than refusing the dump entirely. Maybe there should be an option to fail or not fail in these cases. Both cases are useful, but I tend to agree that it should fail hard by default.