Thread: Slow queries from information schema
I'm aware you already know that information_schema is slow [1] [2], so I just want to expose/document another case and tests I did. I'm using the following view to check what tables depend on what other tables. CREATE VIEW raw_relation_tree AS SELECT tc_p.table_catalog AS parent_catalog, tc_p.table_schema AS parent_schema, tc_p.table_name AS parent_table, tc_c.table_catalog AS child_catalog, tc_c.table_schema AS child_schema, tc_c.table_name AS child_table FROM information_schema.referential_constraints AS rc NATURAL JOIN information_schema.table_constraints AS tc_c LEFT JOIN information_schema.table_constraints AS tc_p ON rc.unique_constraint_catalog = tc_p.constraint_catalog AND rc.unique_constraint_schema = tc_p.constraint_schema AND rc.unique_constraint_name = tc_p.constraint_name ; test=# select count(*) from raw_relation_tree; count ------- 11 (1 row) An EXPLAIN ANALYZE for a simple SELECT on each of the FROM tables give: referential_constraints: ~9ms. table_constraints: ~24ms. The result, on the above view: ~80ms. Fair enough. But if I apply a condition: SELECT * FROM ___pgnui_relation_tree.raw_relation_tree WHERE parent_schema <> child_schema; it takes ~2 seconds (!) to complete. I tried using an alternate table_constraints definition by creating my own view and changing UNION to UNION ALL (as per [2]) The results were: table_constraints using UNION ALL has the same number of rows as the UNION version. table_constraints now take about 4 ms (as expected). VIEW raw_relation_tree is now 110 ms. VIEW raw_relation_tree WHERE parent_schema <> child_schema: 3.3 sec. EXPLAIN results are way too long to post here. If it is ok, I'll gladly post them. Using 8.3.6. [1] http://archives.postgresql.org/pgsql-bugs/2008-12/msg00144.php [2] http://archives.postgresql.org/pgsql-performance/2008-05/msg00062.php
Octavio Alvarez <alvarezp@alvarezp.ods.org> writes: > The result, on the above view: ~80ms. Fair enough. But if I apply a > condition: > SELECT * FROM ___pgnui_relation_tree.raw_relation_tree WHERE > parent_schema <> child_schema; > it takes ~2 seconds (!) to complete. I'm not sure I'm seeing the exact same case as you, but what I see here is that 8.3 puts the join condition involving _pg_keysequal() at the top of the tree where it will be executed quite a lot of times (way more than the planner expects, because of bad rowcount estimates below) ... and _pg_keysequal() is implemented in a depressingly inefficient way. CVS HEAD seems to avoid this trap in the same case, but I'm not entirely convinced whether it's getting better rowcount estimates or just got lucky. Anyway it seems to help a great deal if you use a less sucky definition of the function, such as create or replace function information_schema._pg_keysequal(smallint[], smallint[]) RETURNS boolean LANGUAGE sql STRICT IMMUTABLE AS 'select $1 <@ $2 and $2 <@ $1'; regards, tom lane
On Sat, 2009-02-14 at 15:02 -0500, Tom Lane wrote: > Octavio Alvarez <alvarezp@alvarezp.ods.org> writes: > > The result, on the above view: ~80ms. Fair enough. But if I apply a > > condition: > > SELECT * FROM ___pgnui_relation_tree.raw_relation_tree WHERE > > parent_schema <> child_schema; > > it takes ~2 seconds (!) to complete. > > I'm not sure I'm seeing the exact same case as you, but what I see here > is that 8.3 puts the join condition involving _pg_keysequal() at the > top of the tree where it will be executed quite a lot of times (way > more than the planner expects, because of bad rowcount estimates below) > ... and _pg_keysequal() is implemented in a depressingly inefficient way. > > CVS HEAD seems to avoid this trap in the same case, but I'm not entirely > convinced whether it's getting better rowcount estimates or just got > lucky. > > Anyway it seems to help a great deal if you use a less sucky definition > of the function, such as > > create or replace function information_schema._pg_keysequal(smallint[], smallint[]) RETURNS boolean > LANGUAGE sql STRICT IMMUTABLE AS > 'select $1 <@ $2 and $2 <@ $1'; Wow! Just tried it with the UNION (the original) version of information_schema.table_constraints and it drastically reduced the total runtime to 309 ms! I also tested it with UNION ALL and it took 1.6 sec. (and yet, 50% of the previous time with UNION ALL).