Thread: pg_depend explained
Has anyone written a in-depth description on how to traverse the pg_depend tree? The 'a' and 'i' deptype really makes it hard to figure out the dependency order, a topological sort does not work. My latest attempt involved trying to group by all objects connected to each other via deptype 'a' or 'i', and replacing all such nodes in the tree with the "source node", i.e. the node which according to the topological order could be created first. Am I on the right path, trying to "fuse" the internal/auto objects together, replacing them with the top most object in the tree? Or is there a simplier way to figure out the order in which objects can be created? I need a general solution, not a custom-made query for each regclass, which is quite trivial but feels far from bullet proof, I want something only relying on pg_depend, since it should be the safest method of them all. -- Best regards, Joel Jacobson Glue Finance
Joel Jacobson <joel@gluefinance.com> writes: > Has anyone written a in-depth description on how to traverse the pg_depend tree? Try reading the code in src/backend/catalog/dependency.c. regards, tom lane
2011/1/11 Tom Lane <tgl@sss.pgh.pa.us>: > Try reading the code in src/backend/catalog/dependency.c. I've tried but failed to figure it out anyway. The focus in dependency.c is to find out dependencies of a given object. What I want to do is something slighly different. I need to figure out the order of creation of all objects, not just the dependencies for a single object. Basically, I want to do "order by oid", but since you cannot trust the oid order (like we did in pre-7.3), I need to get the sorted list using pg_depend somehow. I guess I could make findDependentObjects() callable from sql and call it for each and every object, but that's a quite big project, I was hoping it was possible to do it in plain sql, or at least only by relying on plpgsql/plperl. I've implemented tsort() in plperl, but that didn't really help me much, since you need to jump around in the tree when you encounter 'internal' and 'auto' nodes. My last resort is to sort by oid, but I really don't want to do that since it would render the entire solution unreliable and I would never feel comfortable using it in the production environment. This is the last component I need in order to complete the work on the pov-project http://www.github.com/gluefinance/pov I would highly appreciate your help, I feel a bit lame since I've spent over two days working on this. It's not difficult if you are allowed to build specialized queries for each class, but my goal is a general query only relying on pg_depend. -- Best regards, Joel Jacobson Glue Finance
Joel Jacobson <joel@gluefinance.com> writes: > I need to figure out the order of creation of all objects, not just > the dependencies for a single object. In that case try pg_dump's pg_dump_sort.c. You will never get "the" order of creation of objects, because that isn't tracked; but you can find out what a safe order would be. regards, tom lane
On Jan11, 2011, at 16:54 , Joel Jacobson wrote: > Has anyone written a in-depth description on how to traverse the pg_depend tree? > The 'a' and 'i' deptype really makes it hard to figure out the > dependency order, a topological sort does not work. Could you give an example of the kind of trouble you're experiencing trying to use a topological sort? best regards, Florian Pflug
2011/1/11 Florian Pflug <fgp@phlo.org>: > Could you give an example of the kind of trouble you're experiencing trying > to use a topological sort? Let's say you have a table t and a view v. The view v is defined as select * from t; If we put all objects in a tree, with the public schema as the root, both v and t will directly under the root, but in reality, v cannot be created before t. This is the reason why a normal topological sort doesn't work. You have to look at the deptype and sort nodes having "internal" edges between them differently. The pg_dump source code of course contains all the logic necessary to do the trick, but it's not that easy to follow. I guess it's time for plan B, sorting based on oid, no biggie, it will work for my purpose, but it's damn ugly. -- Best regards, Joel Jacobson Glue Finance
On Jan11, 2011, at 23:55 , Joel Jacobson wrote: > 2011/1/11 Florian Pflug <fgp@phlo.org>: >> Could you give an example of the kind of trouble you're experiencing trying >> to use a topological sort? > > Let's say you have a table t and a view v. > The view v is defined as select * from t; > If we put all objects in a tree, with the public schema as the root, > both v and t will directly under the root, but in reality, v cannot be > created before t. AFAICS, you get the following dependencies (apart from the obvious NORMAL dependencies from the pg_class entries of t and v on public) t (pg_class) <--[INTERNAL]-- t (pg_type) /°\ | [NORMAL] | _RETURN (pg_rewrite) | |[NORMAL] [INTERNAL] | | \./ \./ v (pg_class) <--[INTERNAL]-- v (pg_type) INTERNAL dependencies mark objects which spring into existence once the referenced (target in my diagram) object is created. You can thus fold a node I (the INTERNALly-depending object) into a node O (the object created by the user) if there is an INTERNAL dependency from I to O. The diagram then becomes v (pg_class) --[NORMAL]--> t (pg_class) Which correctly states that t must be created before v. > This is the reason why a normal topological sort doesn't work. > You have to look at the deptype and sort nodes having "internal" edges > between them differently. I suggest you try to node-folding strategy and see how far it gets you. > I guess it's time for plan B, sorting based on oid, no biggie, it will > work for my purpose, but it's damn ugly. That will probably crash and burn once the OIDs have wrapped around once. best regards, Florian Pflug
2011/1/12 Florian Pflug <fgp@phlo.org>: > I suggest you try to node-folding strategy and see how far it gets you. Good suggestion! :-) That's exactly what I've been trying to do, but failed miserably :-( I have written a thorough description of my problem and put it on my github: https://github.com/gluefinance/pov/tree/master/doc In the example, pg_depend.sql, we want to reach the following conclusions in an algorithmic way, only by using pg_depend as in-data: table t1, function f1, sequence s1 have no dependencies, other than the public schema table t2 depends on t1 and table t3 depends on f1 view v1 depends on t1 and view v2 depends on t2 view v3 depends on both v1 and v2 view v4 depends on v3 and f1 The topological tree automatically generated from the data in pg_depend is presented in pg_depend.dot, pg_depend.svg and pg_depend.png. The actual topological tree (possible order of creation) created manually is presented in pg_depend_actual.dot, pg_depend_actual.svg and pg_depend_actual.png. The automatically created objects, such as primary key indexes, constraints and triggers, have been ignored in this graph, as they are implicitly created when creating the "base objects". Objective: Define a general algorithm taking ONLY the pg_depend data as input, generating a valid topological directional graph, including at least the nodes in pg_depend_actual.dot, but might as well include all nodes, although it is not necessary, it's certainly won't hurt. It will be necessary to look not only at the nodes (objects) and edges (obj->refobj), but also the deptype for each edge. List of files: pg_depend.sql : A small but sufficient database model of tables, sequences, functions and views. pg_depend.dot : Digraph in DOT language (plaintext format), generated automatically only by using data from pg_data. pg_depend.png : PNG generated using GraphViz with pg_depend.dot as input pg_depend.svg : SVG generated using GraphViz with pg_depend.dot as input pg_depend_actual.dot : Digraph in DOT language (plaintext format), generated manually, shows the actual possible order of creation, only including the "base objects". pg_depend_actual.png : PNG generated using GraphViz with pg_depend_actual.dot as input pg_depend_actual.svg : SVG generated using GraphViz with pg_depend_actual.dot as input -- Best regards, Joel Jacobson Glue Finance
Excerpts from Joel Jacobson's message of mié ene 12 07:07:35 -0300 2011: > The automatically created objects, such as primary key indexes, > constraints and triggers, have been ignored in this graph, as they are > implicitly created when creating the "base objects". FWIW this idea fails when you consider stuff such as circular foreign keys (and I suppose there are other, more common cases). If you really want something general you need to break those apart. (This is the explanation for the “break the loop” code in pg_dump I imagine) -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
2011/1/12 Alvaro Herrera <alvherre@commandprompt.com>: > FWIW this idea fails when you consider stuff such as circular foreign > keys (and I suppose there are other, more common cases). If you really > want something general you need to break those apart. (This is the > explanation for the “break the loop” code in pg_dump I imagine) Good point. Yes, the algorithm must be able to "break the loop", i.e. remove the edges causing the loops. This has already been properly solved in pg_dump, but I frankly don't really understand exactly how this is done, at least not the "general rule" on how to do it, even after spending hours studying the source code. Also, circular dependencies seems impossible for some object classes, such as functions, views, constraints and triggers. None of them can possibly refer to them self. If you only need to create/drop objects of these classes (like in my case within the pov-project), it's not important to "break the loop" in a clever way, i.e. simply removing any of the edges, not necessarily the best suited one, will do for my purpose. I have updated the example with two circular relations: -- Circular dependencies: CREATE TABLE tselfref ( id int not null PRIMARY KEY, parentid int not null REFERENCES tselfref(id) ); CREATE TABLE tcircular ( id int not null PRIMARY KEY, id2 int not null REFERENCES tselfref(id) ); ALTER TABLE tselfref ADD COLUMN id2 int not null REFERENCES tcircular ( id ); I have also updated pd_depend.[sql|dot|png|svg] and pg_depend_actual.[dot|png|svg] with the circular references. The dotted edges in pg_depend_actual are the edges which must be removed to "break the loop". Any ideas on how to design an algorithm to transform the digraph pg_depend into pg_depend_actual are highly appreciated. -- Best regards, Joel Jacobson Glue Finance
Joel Jacobson <joel@gluefinance.com> writes: > Also, circular dependencies seems impossible for some object classes, > such as functions, views, constraints and triggers. regression=# create table tt(f1 int, f2 int); CREATE TABLE regression=# create view v1 as select * from tt; CREATE VIEW regression=# create view v2 as select * from v1; CREATE VIEW regression=# create or replace view v1 as select * from v2; CREATE VIEW regression=# drop view v1; ERROR: cannot drop view v1 because other objects depend on it DETAIL: view v2 depends on view v1 HINT: Use DROP ... CASCADE to drop the dependent objects too. regression=# drop view v2; ERROR: cannot drop view v2 because other objects depend on it DETAIL: view v1 depends on view v2 HINT: Use DROP ... CASCADE to drop the dependent objects too. This isn't particularly *useful*, maybe, but it's hardly "impossible". And if we analyzed function dependencies in any detail, circular dependencies among functions would be possible (and useful). regards, tom lane
2011/1/12 Tom Lane <tgl@sss.pgh.pa.us>: > This isn't particularly *useful*, maybe, but it's hardly "impossible". > And if we analyzed function dependencies in any detail, circular > dependencies among functions would be possible (and useful). Thanks Tom for clarifying, this makes me even more motivated into implementing the "creation order"-algorithm using only sql/plpgsql and pg_depend. If you have any ideas on how to do this, in addition to reading the dependency.c and pg_dump_sort.c source code, they would be highly appreciated. Any tips on articles on graph algorithms which not only take edges (node->node) as indata, but also a "edge type" as indata (i.e. the deptype in pg_depend) would also be very useful. I have only found algorithms to do sorting on "normal" directional graphs, where all edges are threated the same. -- Best regards, Joel Jacobson Glue Finance
Joel Jacobson <joel@gluefinance.com> writes: > 2011/1/12 Tom Lane <tgl@sss.pgh.pa.us>: >> This isn't particularly *useful*, maybe, but it's hardly "impossible". >> And if we analyzed function dependencies in any detail, circular >> dependencies among functions would be possible (and useful). > Thanks Tom for clarifying, this makes me even more motivated into > implementing the "creation order"-algorithm using only sql/plpgsql and > pg_depend. > If you have any ideas on how to do this, in addition to reading the > dependency.c and pg_dump_sort.c source code, they would be highly > appreciated. I've sometimes found it useful to think of internal dependencies as acting like normal dependencies pointing in the other direction. I'm not sure that would do much to solve your problem, but it might be worth trying. regards, tom lane
2011/1/12 Tom Lane <tgl@sss.pgh.pa.us>: > I've sometimes found it useful to think of internal dependencies as > acting like normal dependencies pointing in the other direction. > I'm not sure that would do much to solve your problem, but it might > be worth trying. Tom, you are a genious! No, seriously, I mean it, this is awesome, it worked! YES! You totally saved my day! Thank you! Finally! I'm so happy! :-) :-) :-) This was the little piece of code: CASE WHEN DepType ~ '^(a|ni|in|an|na)$' THEN --- Swap edges ELSE -- Do not swap edges END Look at the attached svg graph how beautiful the automatically generated graph look like now! :-) The tsort of the objects now sort all the normal objects in a creatable order! Here is the result of the tsort (only including the normal objects (the one I care about (I don't have to create the internal/auto objects, nor drop them))): The query below can both produce a DOT-format graph and a tsort of the creatable order of objects: WITH NewObjectOids AS ( SELECT * FROM pg_depend WHERE deptype <> 'p' EXCEPT SELECT * FROM pg_depend_before ), NewObjectOidsAggDepType AS ( SELECT classid,objid,objsubid,refclassid,refobjid,refobjsubid,array_to_string(array_agg(deptype),'') AS deptype FROM NewObjectOids GROUP BY classid,objid,objsubid,refclassid,refobjid,refobjsubid ), NewObjects AS ( SELECT CASE WHEN DepType ~ '^(a|ni|in|an|na)$' THEN pg_describe_object(classid,objid,0) || ' ' || classid || '.' || objid ELSE pg_describe_object(refclassid,refobjid,0) || ' ' || refclassid || '.' || refobjid END AS RefObj, CASE WHEN DepType ~ '^(a|ni|in|an|na)$' THEN pg_describe_object(refclassid,refobjid,0) || ' ' || refclassid || '.' || refobjid ELSE pg_describe_object(classid,objid,0) || ' ' || classid || '.' || objid END AS Obj, DepType FROM NewObjectOidsAggDepType ), DepDigraph AS ( SELECT DISTINCT RefObj, Obj, DepType FROM NewObjects WHERE RefObj <> Obj ), DotFormat AS ( SELECT 'digraph pg_depend {' AS diagraph UNION ALL SELECT ' "' || RefObj || '" -> "' || Obj || '" [' || CASE WHEN array_to_string(array_agg(DepType),'') = 'n' THEN 'color=black' WHEN array_to_string(array_agg(DepType),'') = 'i' THEN 'color=red' WHEN array_to_string(array_agg(DepType),'') = 'a' THEN 'color=blue' WHEN array_to_string(array_agg(DepType),'') ~ '^(ni|in)$' THEN 'color=green' WHEN array_to_string(array_agg(DepType),'') ~ '^(na|an)$' THEN 'color=yellow' ELSE 'style=dotted' END || ' label=' || array_to_string(array_agg(DepType),'') || ']' FROM DepDigraph GROUP BY RefObj, Obj UNION ALL SELECT '}' ), TopoSort AS (SELECT unnest FROM unnest((SELECT tsort(array_to_string(array_agg(RefObj || ';' || Obj),';'),';',2) FROM DepDigraph))) SELECT * FROM TopoSort; sequence s1 1259.23359 function f1(integer) 1255.23358 table t3 1259.23371 table t1 1259.23353 view v1 1259.23378 table t2 1259.23361 view v2 1259.23382 view v3 1259.23386 view v4 1259.23390 -- Best regards, Joel Jacobson Glue Finance
Attachment
On Wed, Jan 12, 2011 at 2:06 PM, Joel Jacobson <joel@gluefinance.com> wrote: > Tom, you are a genious! No, seriously, I mean it, this is awesome, it > worked! YES! You totally saved my day! Thank you! Finally! I'm so > happy! :-) :-) :-) <stage whisper> Hey, guys, I think it worked...! -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Excerpts from Joel Jacobson's message of mié ene 12 16:06:24 -0300 2011: > The query below can both produce a DOT-format graph and a tsort of the > creatable order of objects: > > WITH > NewObjectOids AS ( > SELECT * FROM pg_depend WHERE deptype <> 'p' > EXCEPT > SELECT * FROM pg_depend_before > ), I think this code should live in the Wiki somewhere: http://wiki.postgresql.org/wiki/Snippets -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Wed, Jan 12, 2011 at 08:06:24PM +0100, Joel Jacobson wrote: > 2011/1/12 Tom Lane <tgl@sss.pgh.pa.us>: > > I've sometimes found it useful to think of internal dependencies as > > acting like normal dependencies pointing in the other direction. > > I'm not sure that would do much to solve your problem, but it might > > be worth trying. > > Tom, you are a genious! No, seriously, I mean it, this is awesome, it > worked! YES! You totally saved my day! Thank you! Finally! I'm so > happy! :-) :-) :-) > > This was the little piece of code: > > CASE WHEN DepType ~ '^(a|ni|in|an|na)$' THEN > --- Swap edges > ELSE > -- Do not swap edges > END > > Look at the attached svg graph how beautiful the automatically > generated graph look like now! :-) > > The tsort of the objects now sort all the normal objects in a creatable order! > > Here is the result of the tsort (only including the normal objects > (the one I care about (I don't have to create the internal/auto > objects, nor drop them))): > > The query below can both produce a DOT-format graph and a tsort of the > creatable order of objects: > > WITH > NewObjectOids AS ( > SELECT * FROM pg_depend WHERE deptype <> 'p' > EXCEPT > SELECT * FROM pg_depend_before To what does pg_depend_before refer? Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
(sorry for top posting, iPhone + drunk) pg_depend_before is a select * from pg_depend before creating the test db model Sent from my iPhone On 12 jan 2011, at 20:36, David Fetter <david@fetter.org> wrote: > On Wed, Jan 12, 2011 at 08:06:24PM +0100, Joel Jacobson wrote: >> 2011/1/12 Tom Lane <tgl@sss.pgh.pa.us>: >>> I've sometimes found it useful to think of internal dependencies as >>> acting like normal dependencies pointing in the other direction. >>> I'm not sure that would do much to solve your problem, but it might >>> be worth trying. >> >> Tom, you are a genious! No, seriously, I mean it, this is awesome, it >> worked! YES! You totally saved my day! Thank you! Finally! I'm so >> happy! :-) :-) :-) >> >> This was the little piece of code: >> >> CASE WHEN DepType ~ '^(a|ni|in|an|na)$' THEN >> --- Swap edges >> ELSE >> -- Do not swap edges >> END >> >> Look at the attached svg graph how beautiful the automatically >> generated graph look like now! :-) >> >> The tsort of the objects now sort all the normal objects in a creatable order! >> >> Here is the result of the tsort (only including the normal objects >> (the one I care about (I don't have to create the internal/auto >> objects, nor drop them))): >> >> The query below can both produce a DOT-format graph and a tsort of the >> creatable order of objects: >> >> WITH >> NewObjectOids AS ( >> SELECT * FROM pg_depend WHERE deptype <> 'p' >> EXCEPT >> SELECT * FROM pg_depend_before > > To what does pg_depend_before refer? > > Cheers, > David. > -- > David Fetter <david@fetter.org> http://fetter.org/ > Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter > Skype: davidfetter XMPP: david.fetter@gmail.com > iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics > > Remember to vote! > Consider donating to Postgres: http://www.postgresql.org/about/donate
(sorry for top posting, iPhone + drunk) pg_depend_before is a select * from pg_depend before creating the test db model Sent from my iPhone On 12 jan 2011, at 20:36, David Fetter <david@fetter.org> wrote: > On Wed, Jan 12, 2011 at 08:06:24PM +0100, Joel Jacobson wrote: >> 2011/1/12 Tom Lane <tgl@sss.pgh.pa.us>: >>> I've sometimes found it useful to think of internal dependencies as >>> acting like normal dependencies pointing in the other direction. >>> I'm not sure that would do much to solve your problem, but it might >>> be worth trying. >> >> Tom, you are a genious! No, seriously, I mean it, this is awesome, it >> worked! YES! You totally saved my day! Thank you! Finally! I'm so >> happy! :-) :-) :-) >> >> This was the little piece of code: >> >> CASE WHEN DepType ~ '^(a|ni|in|an|na)$' THEN >> --- Swap edges >> ELSE >> -- Do not swap edges >> END >> >> Look at the attached svg graph how beautiful the automatically >> generated graph look like now! :-) >> >> The tsort of the objects now sort all the normal objects in a creatable order! >> >> Here is the result of the tsort (only including the normal objects >> (the one I care about (I don't have to create the internal/auto >> objects, nor drop them))): >> >> The query below can both produce a DOT-format graph and a tsort of the >> creatable order of objects: >> >> WITH >> NewObjectOids AS ( >> SELECT * FROM pg_depend WHERE deptype <> 'p' >> EXCEPT >> SELECT * FROM pg_depend_before > > To what does pg_depend_before refer? > > Cheers, > David. > -- > David Fetter <david@fetter.org> http://fetter.org/ > Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter > Skype: davidfetter XMPP: david.fetter@gmail.com > iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics > > Remember to vote! > Consider donating to Postgres: http://www.postgresql.org/about/donate
On Wed, Jan 12, 2011 at 09:09:31PM +0100, Joel Jacobson wrote: > (sorry for top posting, No worries. > iPhone + drunk) A dangerous combination indeed. I hear water, NSAIDs and time can help with the hangover ;) > pg_depend_before is a select * from pg_depend before creating the > test db model Please put a self-contained example on the snippets page, and please also to check that it actually runs before doing so. You'd mangled some aliases in the query you sent, which leads me to believe you hadn't actually tried running it. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
2011/1/13 David Fetter <david@fetter.org>: > Please put a self-contained example on the snippets page, and please > also to check that it actually runs before doing so. You'd mangled > some aliases in the query you sent, which leads me to believe you > hadn't actually tried running it. I actually hadn't really solved the problem at the time I wrote my last email, it turned out I had to do things a bit differently to avoid running into problems with corner cases. I will put together a self-contained example like you suggested and get back shortly :-) -- Best regards, Joel Jacobson Glue Finance
2011/1/12 Alvaro Herrera <alvherre@commandprompt.com>: > I think this code should live in the Wiki somewhere: > http://wiki.postgresql.org/wiki/Snippets This file contains only the relevant remapping of pg_depend, folding the internal linkages properly: https://github.com/gluefinance/pov/blob/master/sql/schema/pov/views/pg_depend_remapped.sql It can be tested stand-alone and does not require any other components from the pov project. Can I create a wiki snippet myself or do I need some kind of admin access? -- Best regards, Joel Jacobson Glue Finance
On Fri, Jan 14, 2011 at 09:39, Joel Jacobson <joel@gluefinance.com> wrote: > 2011/1/12 Alvaro Herrera <alvherre@commandprompt.com>: >> I think this code should live in the Wiki somewhere: >> http://wiki.postgresql.org/wiki/Snippets > > This file contains only the relevant remapping of pg_depend, folding > the internal linkages properly: > > https://github.com/gluefinance/pov/blob/master/sql/schema/pov/views/pg_depend_remapped.sql > > It can be tested stand-alone and does not require any other components > from the pov project. > > Can I create a wiki snippet myself or do I need some kind of admin access? Absolutely, no admin access required. As long as you have (or create) a community account, you can edit or create pages. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/