Thread: Finding referecing and referenced tables, adaptation from David Fetter's solution

A coworker of mine* was looking for a way to quickly and easily be
able to tell which tables were referencing particular table(s) she
wanted to load (for unit testing). Using the examples from David
Fetter**, she submitted a revised version that seems to work quite
well. With her permission, I have posted her changes and comments.

Here is what she had to say about her changes (lightly-edited to mask
table names etc)

"The queries linked in the original post by David Fetter do not appear
to take into account cases where table A references table B, and B
references A.  (Or longer cycles.) In our environment, we have a lot
of those.
Using the original query as a cheat sheet, I wrote a query that
essentially takes as parameter the name of the table whose "ancestors"
you want to find, and comes up with a list of everything it depends
on, but stops as soon as it detects a cycle.

To get a list of all the FK relationships that start at the table of interest:

SELECT referenced_table, depth, chain FROM table_dependencies WHERE
origin_table = '<table name>';

To get a simple list of all tables that a given table requires due to
FK dependencies (which is what I need for my unit tests):

SELECT DISTINCT referenced_table FROM table_dependencies WHERE
origin_table = '<table name>';

To get a partial ordering of the dependencies -- which should tell you
in what order the tables should be populated in order to avoid FK
violations -- I think you can do this:

SELECT referenced_table, MIN(depth) AS path_length
FROM table_dependencies
WHERE origin_table = '<table name>'
GROUP BY referenced_table
ORDER BY path_length DESC;"

-- Here is the view she came up with:

CREATE OR REPLACE VIEW table_dependencies AS (
WITH RECURSIVE t AS (
    SELECT
        c.oid AS origin_id,
        c.oid::regclass::text AS origin_table,
        c.oid AS referencing_id,
        c.oid::regclass::text AS referencing_table,
        c2.oid AS referenced_id,
        c2.oid::regclass::text AS referenced_table,
        ARRAY[c.oid::regclass,c2.oid::regclass] AS chain
    FROM pg_catalog.pg_constraint AS co
    INNER JOIN pg_catalog.pg_class AS c
    ON c.oid = co.conrelid
    INNER JOIN pg_catalog.pg_class AS c2
    ON c2.oid = co.confrelid
-- Add this line as "parameter" if you want to make a one-off query
-- or a function instead of a view
--        WHERE c.oid::regclass::text = '<table name>'

    UNION ALL
    SELECT
        t.origin_id,
        t.origin_table,
        t.referenced_id AS referencing_id,
        t.referenced_table AS referencing_table,
        c3.oid AS referenced_id,
        c3.oid::regclass::text AS referenced_table,
        t.chain || c3.oid::regclass AS chain
    FROM pg_catalog.pg_constraint AS co
    INNER JOIN pg_catalog.pg_class AS c3
    ON c3.oid = co.confrelid
    INNER JOIN t
    ON t.referenced_id = co.conrelid
    WHERE
        -- prevent infinite recursion by pruning paths where the last entry in
        -- the path already appears somewhere else in the path
        NOT (
            ARRAY[ t.chain[array_upper(t.chain, 1)] ] -- an array
containing the last element
            <@                                        -- "is contained by"
            t.chain[1:array_upper(t.chain, 1) - 1]    -- a slice of the chain,
                                                      -- from element 1 to n-1
        )
)
SELECT  origin_table,
        referenced_table,
        array_upper(chain,1) AS "depth",
        array_to_string(chain,',') as chain
FROM t
);


If anyone has any fixes or changes, or knows of a better way to get
the referencers/referencees, we'd be interested in hearing about them.

* Jenny van Hoof deserves the credit for the changes
** http://people.planetpostgresql.org/dfetter/index.php?/archives/27-Key-Tree.html

On 30/07/11 10:45, bricklen wrote:
> [...]
> CREATE OR REPLACE VIEW table_dependencies AS (
> WITH RECURSIVE t AS (
>      SELECT
>          c.oid AS origin_id,
>          c.oid::regclass::text AS origin_table,
>          c.oid AS referencing_id,
>          c.oid::regclass::text AS referencing_table,
>          c2.oid AS referenced_id,
>          c2.oid::regclass::text AS referenced_table,
>          ARRAY[c.oid::regclass,c2.oid::regclass] AS chain
>      FROM pg_catalog.pg_constraint AS co
>      INNER JOIN pg_catalog.pg_class AS c
>      ON c.oid = co.conrelid
>      INNER JOIN pg_catalog.pg_class AS c2
>      ON c2.oid = co.confrelid
> [...]
I am curious about the explicit use of INNER JOINs, I find them
cumbersome, so I rewrote the code to remove them, I know in some
situations that they can improve performance - but was this the case
here, or is there some other subtlety that I have missed?
[...]
   SELECT
         c1.oid::regclass::text  AS origin_table,
         c2.oid                  AS referenced_id,
         c2.oid::regclass::text  AS referenced_table,
         ARRAY[c1.oid::regclass,c2.oid::regclass] AS chain
     FROM
         pg_catalog.pg_constraint    AS co,
         pg_catalog.pg_class         AS c1,
         pg_catalog.pg_class         AS c2
     WHERE
         c1.oid = co.conrelid AND
         c2.oid = co.confrelid
[...]

Cheers,
Gavin

On 30/07/11 10:45, bricklen wrote:
> A coworker of mine* was looking for a way to quickly and easily be
> able to tell which tables were referencing particular table(s) she
> wanted to load (for unit testing). Using the examples from David
> Fetter**, she submitted a revised version that seems to work quite
> well. With her permission, I have posted her changes and comments.
>
> Here is what she had to say about her changes (lightly-edited to mask
> table names etc)
>
> "The queries linked in the original post by David Fetter do not appear
> to take into account cases where table A references table B, and B
> references A.  (Or longer cycles.) In our environment, we have a lot
> of those.
> Using the original query as a cheat sheet, I wrote a query that
> essentially takes as parameter the name of the table whose "ancestors"
> you want to find, and comes up with a list of everything it depends
> on, but stops as soon as it detects a cycle.
>
> To get a list of all the FK relationships that start at the table of interest:
>
> SELECT referenced_table, depth, chain FROM table_dependencies WHERE
> origin_table = '<table name>';
>
> To get a simple list of all tables that a given table requires due to
> FK dependencies (which is what I need for my unit tests):
>
> SELECT DISTINCT referenced_table FROM table_dependencies WHERE
> origin_table = '<table name>';
>
> To get a partial ordering of the dependencies -- which should tell you
> in what order the tables should be populated in order to avoid FK
> violations -- I think you can do this:
>
> SELECT referenced_table, MIN(depth) AS path_length
> FROM table_dependencies
> WHERE origin_table = '<table name>'
> GROUP BY referenced_table
> ORDER BY path_length DESC;"
>
> -- Here is the view she came up with:
>
> CREATE OR REPLACE VIEW table_dependencies AS (
> WITH RECURSIVE t AS (
>      SELECT
>          c.oid AS origin_id,
>          c.oid::regclass::text AS origin_table,
>          c.oid AS referencing_id,
>          c.oid::regclass::text AS referencing_table,
>          c2.oid AS referenced_id,
>          c2.oid::regclass::text AS referenced_table,
>          ARRAY[c.oid::regclass,c2.oid::regclass] AS chain
>      FROM pg_catalog.pg_constraint AS co
>      INNER JOIN pg_catalog.pg_class AS c
>      ON c.oid = co.conrelid
>      INNER JOIN pg_catalog.pg_class AS c2
>      ON c2.oid = co.confrelid
> -- Add this line as "parameter" if you want to make a one-off query
> -- or a function instead of a view
> --        WHERE c.oid::regclass::text = '<table name>'
>
>      UNION ALL
>      SELECT
>          t.origin_id,
>          t.origin_table,
>          t.referenced_id AS referencing_id,
>          t.referenced_table AS referencing_table,
>          c3.oid AS referenced_id,
>          c3.oid::regclass::text AS referenced_table,
>          t.chain || c3.oid::regclass AS chain
>      FROM pg_catalog.pg_constraint AS co
>      INNER JOIN pg_catalog.pg_class AS c3
>      ON c3.oid = co.confrelid
>      INNER JOIN t
>      ON t.referenced_id = co.conrelid
>      WHERE
>          -- prevent infinite recursion by pruning paths where the last entry in
>          -- the path already appears somewhere else in the path
>          NOT (
>              ARRAY[ t.chain[array_upper(t.chain, 1)] ] -- an array
> containing the last element
>              <@                                        -- "is contained by"
>              t.chain[1:array_upper(t.chain, 1) - 1]    -- a slice of the chain,
>                                                        -- from element 1 to n-1
>          )
> )
> SELECT  origin_table,
>          referenced_table,
>          array_upper(chain,1) AS "depth",
>          array_to_string(chain,',') as chain
> FROM t
> );
>
>
> If anyone has any fixes or changes, or knows of a better way to get
> the referencers/referencees, we'd be interested in hearing about them.
>
> * Jenny van Hoof deserves the credit for the changes
> ** http://people.planetpostgresql.org/dfetter/index.php?/archives/27-Key-Tree.html
>
I just realized that the 3rd & 4th line will always show the same values
as the 1st & 2nd lines, as only the column headings change!  Is this
intentional?

c.oid                   AS origin_id,
         c.oid::regclass::text   AS origin_table,
         c.oid                   AS referencing_id,
         c.oid::regclass::text   AS referencing_table,

Cheers,
Gavin

On 30/07/11 10:45, bricklen wrote:
> A coworker of mine* was looking for a way to quickly and easily be
> able to tell which tables were referencing particular table(s) she
> wanted to load (for unit testing). Using the examples from David
> Fetter**, she submitted a revised version that seems to work quite
> well. With her permission, I have posted her changes and comments.
>
> Here is what she had to say about her changes (lightly-edited to mask
> table names etc)
>
> "The queries linked in the original post by David Fetter do not appear
> to take into account cases where table A references table B, and B
> references A.  (Or longer cycles.) In our environment, we have a lot
> of those.
> Using the original query as a cheat sheet, I wrote a query that
> essentially takes as parameter the name of the table whose "ancestors"
> you want to find, and comes up with a list of everything it depends
> on, but stops as soon as it detects a cycle.
>
> To get a list of all the FK relationships that start at the table of interest:
>
> SELECT referenced_table, depth, chain FROM table_dependencies WHERE
> origin_table = '<table name>';
>
> To get a simple list of all tables that a given table requires due to
> FK dependencies (which is what I need for my unit tests):
>
> SELECT DISTINCT referenced_table FROM table_dependencies WHERE
> origin_table = '<table name>';
>
> To get a partial ordering of the dependencies -- which should tell you
> in what order the tables should be populated in order to avoid FK
> violations -- I think you can do this:
>
> SELECT referenced_table, MIN(depth) AS path_length
> FROM table_dependencies
> WHERE origin_table = '<table name>'
> GROUP BY referenced_table
> ORDER BY path_length DESC;"
>
> -- Here is the view she came up with:
>
> CREATE OR REPLACE VIEW table_dependencies AS (
> WITH RECURSIVE t AS (
>      SELECT
>          c.oid AS origin_id,
>          c.oid::regclass::text AS origin_table,
>          c.oid AS referencing_id,
>          c.oid::regclass::text AS referencing_table,
>          c2.oid AS referenced_id,
>          c2.oid::regclass::text AS referenced_table,
>          ARRAY[c.oid::regclass,c2.oid::regclass] AS chain
>      FROM pg_catalog.pg_constraint AS co
>      INNER JOIN pg_catalog.pg_class AS c
>      ON c.oid = co.conrelid
>      INNER JOIN pg_catalog.pg_class AS c2
>      ON c2.oid = co.confrelid
> -- Add this line as "parameter" if you want to make a one-off query
> -- or a function instead of a view
> --        WHERE c.oid::regclass::text = '<table name>'
>
>      UNION ALL
>      SELECT
>          t.origin_id,
>          t.origin_table,
>          t.referenced_id AS referencing_id,
>          t.referenced_table AS referencing_table,
>          c3.oid AS referenced_id,
>          c3.oid::regclass::text AS referenced_table,
>          t.chain || c3.oid::regclass AS chain
>      FROM pg_catalog.pg_constraint AS co
>      INNER JOIN pg_catalog.pg_class AS c3
>      ON c3.oid = co.confrelid
>      INNER JOIN t
>      ON t.referenced_id = co.conrelid
>      WHERE
>          -- prevent infinite recursion by pruning paths where the last entry in
>          -- the path already appears somewhere else in the path
>          NOT (
>              ARRAY[ t.chain[array_upper(t.chain, 1)] ] -- an array
> containing the last element
>              <@                                        -- "is contained by"
>              t.chain[1:array_upper(t.chain, 1) - 1]    -- a slice of the chain,
>                                                        -- from element 1 to n-1
>          )
> )
> SELECT  origin_table,
>          referenced_table,
>          array_upper(chain,1) AS "depth",
>          array_to_string(chain,',') as chain
> FROM t
> );
>
>
> If anyone has any fixes or changes, or knows of a better way to get
> the referencers/referencees, we'd be interested in hearing about them.
>
> * Jenny van Hoof deserves the credit for the changes
> ** http://people.planetpostgresql.org/dfetter/index.php?/archives/27-Key-Tree.html
>
Interesting, this is an area I just starting to get into. I have run the
code without error message in both pg9.1beta3 and pg8.4.8 - I will
investigate firther.

By the way, what version of pg are you using?  As I noticed somethings
that work in 9.0 do not work in either 8.4 or 9.1 - but I haven't had
time to investigate why yet.


Cheers,
Gavin

On 30 Jul 2011, at 13:49, Gavin Flower wrote:

> On 30/07/11 10:45, bricklen wrote:
>> [...]
>> CREATE OR REPLACE VIEW table_dependencies AS (
>> WITH RECURSIVE t AS (
>>     SELECT
>>         c.oid AS origin_id,
>>         c.oid::regclass::text AS origin_table,
>>         c.oid AS referencing_id,
>>         c.oid::regclass::text AS referencing_table,
>>         c2.oid AS referenced_id,
>>         c2.oid::regclass::text AS referenced_table,
>>         ARRAY[c.oid::regclass,c2.oid::regclass] AS chain
>>     FROM pg_catalog.pg_constraint AS co
>>     INNER JOIN pg_catalog.pg_class AS c
>>     ON c.oid = co.conrelid
>>     INNER JOIN pg_catalog.pg_class AS c2
>>     ON c2.oid = co.confrelid
>> [...]
> I am curious about the explicit use of INNER JOINs, I find them cumbersome, so I rewrote the code to remove them, I
knowin some situations that they can improve performance - but was this the case here, or is there some other subtlety
thatI have missed? 

Explicit inner joins provide a means to separate the join conditions from other result filtering conditions. Each join
iskept with its relevant conditions even, so it's immediately clear which conditions pertain to which joins. 

I find in general explicit inner joins improve readability of queries over implicit joins, especially when the joins
geta little more complicated. 

Perhaps what you find cumbersome about them is just a matter of formatting?

Alban Hertroys

--
The scale of a problem often equals the size of an ego.



!DSPAM:737,4e35223312092713185518!



On 30 Jul 2011, at 12:17, Gavin Flower wrote:

>> CREATE OR REPLACE VIEW table_dependencies AS (
>> WITH RECURSIVE t AS (
>>     SELECT
>>         c.oid AS origin_id,
>>         c.oid::regclass::text AS origin_table,
>>         c.oid AS referencing_id,
>>         c.oid::regclass::text AS referencing_table,
>>         c2.oid AS referenced_id,
>>         c2.oid::regclass::text AS referenced_table,
>>         ARRAY[c.oid::regclass,c2.oid::regclass] AS chain
>>     FROM pg_catalog.pg_constraint AS co
>>     INNER JOIN pg_catalog.pg_class AS c
>>     ON c.oid = co.conrelid
>>     INNER JOIN pg_catalog.pg_class AS c2
>>     ON c2.oid = co.confrelid
>> -- Add this line as "parameter" if you want to make a one-off query
>> -- or a function instead of a view
>> --        WHERE c.oid::regclass::text = '<table name>'
>>
>>     UNION ALL
>>     SELECT
>>         t.origin_id,
>>         t.origin_table,
>>         t.referenced_id AS referencing_id,
>>         t.referenced_table AS referencing_table,
>>         c3.oid AS referenced_id,
>>         c3.oid::regclass::text AS referenced_table,
>>         t.chain || c3.oid::regclass AS chain
>>     FROM pg_catalog.pg_constraint AS co
>>     INNER JOIN pg_catalog.pg_class AS c3
>>     ON c3.oid = co.confrelid
>>     INNER JOIN t
>>     ON t.referenced_id = co.conrelid

> I just realized that the 3rd & 4th line will always show the same values as the 1st & 2nd lines, as only the column
headingschange!  Is this intentional? 
>
> c.oid                   AS origin_id,
>        c.oid::regclass::text   AS origin_table,
>        c.oid                   AS referencing_id,
>        c.oid::regclass::text   AS referencing_table,


Only the 'root'-nodes of the recursive tree are going through that part of the UNION. Those don't have an origin. It's
amatter of choice what to do in that case. Common choices are to make root nodes reference themselves or to set their
originsto NULL. 
Either case has cons and pros that usually depend on how the query results are used.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:737,4e3523b412093530528260!



On 31/07/11 21:42, Alban Hertroys wrote:
> On 30 Jul 2011, at 12:17, Gavin Flower wrote:
>
>>> CREATE OR REPLACE VIEW table_dependencies AS (
>>> WITH RECURSIVE t AS (
>>>      SELECT
>>>          c.oid AS origin_id,
>>>          c.oid::regclass::text AS origin_table,
>>>          c.oid AS referencing_id,
>>>          c.oid::regclass::text AS referencing_table,
>>>          c2.oid AS referenced_id,
>>>          c2.oid::regclass::text AS referenced_table,
>>>          ARRAY[c.oid::regclass,c2.oid::regclass] AS chain
>>>      FROM pg_catalog.pg_constraint AS co
>>>      INNER JOIN pg_catalog.pg_class AS c
>>>      ON c.oid = co.conrelid
>>>      INNER JOIN pg_catalog.pg_class AS c2
>>>      ON c2.oid = co.confrelid
>>> -- Add this line as "parameter" if you want to make a one-off query
>>> -- or a function instead of a view
>>> --        WHERE c.oid::regclass::text = '<table name>'
>>>
>>>      UNION ALL
>>>      SELECT
>>>          t.origin_id,
>>>          t.origin_table,
>>>          t.referenced_id AS referencing_id,
>>>          t.referenced_table AS referencing_table,
>>>          c3.oid AS referenced_id,
>>>          c3.oid::regclass::text AS referenced_table,
>>>          t.chain || c3.oid::regclass AS chain
>>>      FROM pg_catalog.pg_constraint AS co
>>>      INNER JOIN pg_catalog.pg_class AS c3
>>>      ON c3.oid = co.confrelid
>>>      INNER JOIN t
>>>      ON t.referenced_id = co.conrelid
>> I just realized that the 3rd&  4th line will always show the same values as the 1st&  2nd lines, as only the column
headingschange!  Is this intentional? 
>>
>> c.oid                   AS origin_id,
>>         c.oid::regclass::text   AS origin_table,
>>         c.oid                   AS referencing_id,
>>         c.oid::regclass::text   AS referencing_table,
>
> Only the 'root'-nodes of the recursive tree are going through that part of the UNION. Those don't have an origin.
It'sa matter of choice what to do in that case. Common choices are to make root nodes reference themselves or to set
theirorigins to NULL. 
> Either case has cons and pros that usually depend on how the query results are used.
>
> Alban Hertroys
[...]
Thanks, I missed that...


On 31/07/11 21:36, Alban Hertroys wrote:
On 30 Jul 2011, at 13:49, Gavin Flower wrote:

On 30/07/11 10:45, bricklen wrote:
[...]
CREATE OR REPLACE VIEW table_dependencies AS (
WITH RECURSIVE t AS (   SELECT       c.oid AS origin_id,       c.oid::regclass::text AS origin_table,       c.oid AS referencing_id,       c.oid::regclass::text AS referencing_table,       c2.oid AS referenced_id,       c2.oid::regclass::text AS referenced_table,       ARRAY[c.oid::regclass,c2.oid::regclass] AS chain   FROM pg_catalog.pg_constraint AS co   INNER JOIN pg_catalog.pg_class AS c   ON c.oid = co.conrelid   INNER JOIN pg_catalog.pg_class AS c2   ON c2.oid = co.confrelid
[...]
I am curious about the explicit use of INNER JOINs, I find them cumbersome, so I rewrote the code to remove them, I know in some situations that they can improve performance - but was this the case here, or is there some other subtlety that I have missed?

Explicit inner joins provide a means to separate the join conditions from other result filtering conditions. Each join is kept with its relevant conditions even, so it's immediately clear which conditions pertain to which joins.

I find in general explicit inner joins improve readability of queries over implicit joins, especially when the joins get a little more complicated.

Perhaps what you find cumbersome about them is just a matter of formatting?

Alban Hertroys
[...]
Formatting, and habits gained before I fully realized the significance of explicit inner joins.  I appreciate your answer (keeping the joins separate from the filter conditions) - I think I might adopt the same approach, though I hate the cumbersome formatting.

Q. What is the difference between an Experienced Developer and a Trainee?
A. A
n Experienced Developer makes mistakes at a faster rate!
(Or am I being to cynical???)