Thread: Very slow INFORMATION_SCHEMA

Very slow INFORMATION_SCHEMA

From
Ernesto
Date:
Hi,

I'm porting an application written with pretty portable SQL, but tested
almost exclusively on MySQL.

I'm wondering why would this query take about 90 seconds to return 74 rows?


SELECT INFORMATION_SCHEMA.TABLE_CONSTRAINTS.CONSTRAINT_NAME,
INFORMATION_SCHEMA.TABLE_CONSTRAINTS.TABLE_NAME,
INFORMATION_SCHEMA.TABLE_CONSTRAINTS.TABLE_NAME,
INFORMATION_SCHEMA.TABLE_CONSTRAINTS.CONSTRAINT_TYPE,
    INFORMATION_SCHEMA.KEY_COLUMN_USAGE.COLUMN_NAME,
INFORMATION_SCHEMA.KEY_COLUMN_USAGE.REFERENCED_TABLE_NAME,
INFORMATION_SCHEMA.KEY_COLUMN_USAGE.REFERENCED_COLUMN_NAME
    FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS,
INFORMATION_SCHEMA.KEY_COLUMN_USAGE
    WHERE
INFORMATION_SCHEMA.TABLE_CONSTRAINTS.CONSTRAINT_NAME=INFORMATION_SCHEMA.KEY_COLUMN_USAGE.CONSTRAINT_NAME
        AND
INFORMATION_SCHEMA.TABLE_CONSTRAINTS.CONSTRAINT_SCHEMA=INFORMATION_SCHEMA.KEY_COLUMN_USAGE.CONSTRAINT_SCHEMA
        AND
INFORMATION_SCHEMA.TABLE_CONSTRAINTS.CONSTRAINT_SCHEMA='mydbname'
        AND
INFORMATION_SCHEMA.TABLE_CONSTRAINTS.CONSTRAINT_TYPE='FOREIGN KEY'
        ORDER BY INFORMATION_SCHEMA.TABLE_CONSTRAINTS.TABLE_NAME,
INFORMATION_SCHEMA.KEY_COLUMN_USAGE.ORDINAL_POSITION

An equivalent query with the same data set on the same server takes a
couple of milliseconds on MySQL 5.
Is it something I'm doing wrong or it's just that PostgreSQL
INFORMATION_SCHEMA is not optimized for speed? BTW, what I'm trying to
do is get some info on every FOREIGN KEY in a database.

It's PostgreSQL 8.2.7 on Fedora 8 64, Athlon 64 X2 3600+.

Ernesto


Re: Very slow INFORMATION_SCHEMA

From
Tom Lane
Date:
Ernesto <equistango@gmail.com> writes:
> I'm wondering why would this query take about 90 seconds to return 74 rows?

EXPLAIN ANALYZE might tell you something.

Is this really the query you're running?  Because these two columns
don't exist:

> INFORMATION_SCHEMA.KEY_COLUMN_USAGE.REFERENCED_TABLE_NAME,
> INFORMATION_SCHEMA.KEY_COLUMN_USAGE.REFERENCED_COLUMN_NAME

Leaving those out, I get sub-second runtimes for 70-odd foreign key
constraints, on much slower hardware than I think you are using.

            regards, tom lane

Re: Very slow INFORMATION_SCHEMA

From
Mario Weilguni
Date:
Tom Lane schrieb:
> Ernesto <equistango@gmail.com> writes:
>
>> I'm wondering why would this query take about 90 seconds to return 74 rows?
>>
>
> EXPLAIN ANALYZE might tell you something.
>
> Is this really the query you're running?  Because these two columns
> don't exist:
>
>
>> INFORMATION_SCHEMA.KEY_COLUMN_USAGE.REFERENCED_TABLE_NAME,
>> INFORMATION_SCHEMA.KEY_COLUMN_USAGE.REFERENCED_COLUMN_NAME
>>
>
> Leaving those out, I get sub-second runtimes for 70-odd foreign key
> constraints, on much slower hardware than I think you are using.
>
I can confirm this for a quite larger result set (4020 rows) for a DB
with 410 tables and a lot of foreign key constraints.

                                    QUERY PLAN


----------------------------------------------------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------
 Sort  (cost=1062.00..1062.00 rows=1 width=192) (actual
time=30341.257..30343.195 rows=4020 loops=1)
   Sort Key: table_constraints.table_name,
((ss.x).n)::information_schema.cardinal_number
   ->  Nested Loop  (cost=889.33..1061.99 rows=1 width=192) (actual
time=308.004..30316.302 rows=4020 loops=1)
         ->  Nested Loop  (cost=889.33..1057.70 rows=1 width=132)
(actual time=307.984..30271.700 rows=4020 loops=1)
               Join Filter: ((table_constraints.constraint_name)::text =
((ss.conname)::information_schema.sql_identifier)::text)
               ->  Subquery Scan table_constraints  (cost=887.99..926.75
rows=1 width=128) (actual time=278.247..293.392 rows=554 loops=1)
                     Filter: (((constraint_schema)::text =
'public'::text) AND ((constraint_type)::text = 'FOREIGN KEY'::text))
                     ->  Unique  (cost=887.99..912.21 rows=969
width=259) (actual time=276.915..288.848 rows=4842 loops=1)
                           ->  Sort  (cost=887.99..890.41 rows=969
width=259) (actual time=276.911..279.536 rows=4842 loops=1)
                                 Sort Key: constraint_catalog,
constraint_schema, constraint_name, table_catalog, table_schema,
table_name, constraint_type, is_deferr
able, initially_deferred
                                 ->  Append  (cost=118.46..839.92
rows=969 width=259) (actual time=1.971..48.601 rows=4842 loops=1)
                                       ->  Subquery Scan "*SELECT* 1"
(cost=118.46..238.72 rows=224 width=259) (actual time=1.970..14.931
rows=1722 loops=1)
                                             ->  Hash Join
(cost=118.46..236.48 rows=224 width=259) (actual time=1.968..12.556
rows=1722 loops=1)
                                                   Hash Cond:
(c.connamespace = nc.oid)
                                                   ->  Hash Join
(cost=116.97..227.42 rows=224 width=199) (actual time=1.902..8.206
rows=1722 loops=1)
                                                         Hash Cond:
(r.relnamespace = nr.oid)
                                                         ->  Hash Join
(cost=115.50..222.49 rows=329 width=139) (actual time=1.839..5.760
rows=1722 loops=1)
                                                               Hash
Cond: (c.conrelid = r.oid)
                                                               ->  Seq
Scan on pg_constraint c  (cost=0.00..97.23 rows=1723 width=75) (actual
time=0.004..1.195 rows=1
723 loops=1)
                                                               ->  Hash
(cost=110.28..110.28 rows=418 width=72) (actual time=1.823..1.823
rows=458 loops=1)
                                                                     ->
Seq Scan on pg_class r  (cost=0.00..110.28 rows=418 width=72) (actual
time=0.012..1.437 rows=
458 loops=1)

Filter: ((relkind = 'r'::"char") AND (pg_has_role(relowner,
'USAGE'::text) OR has_table_pri
vilege(oid, 'INSERT'::text) OR has_table_privilege(oid, 'UPDATE'::text)
OR has_table_privilege(oid, 'DELETE'::text) OR has_table_privilege(oid,
'REFERENCES'::text) OR
 has_table_privilege(oid, 'TRIGGER'::text)))
                                                         ->  Hash
(cost=1.27..1.27 rows=15 width=68) (actual time=0.051..0.051 rows=15
loops=1)
                                                               ->  Seq
Scan on pg_namespace nr  (cost=0.00..1.27 rows=15 width=68) (actual
time=0.010..0.032 rows=15 l
oops=1)

Filter: (NOT pg_is_other_temp_schema(oid))
                                                   ->  Hash
(cost=1.22..1.22 rows=22 width=68) (actual time=0.049..0.049 rows=22
loops=1)
                                                         ->  Seq Scan on
pg_namespace nc  (cost=0.00..1.22 rows=22 width=68) (actual
time=0.008..0.022 rows=22 loops=1
)
                                       ->  Subquery Scan "*SELECT* 2"
(cost=125.45..601.21 rows=745 width=138) (actual time=2.621..29.380
rows=3120 loops=1)
                                             ->  Hash Join
(cost=125.45..593.76 rows=745 width=138) (actual time=2.618..24.938
rows=3120 loops=1)
                                                   Hash Cond:
(a.attrelid = r.oid)
                                                   ->  Seq Scan on
pg_attribute a  (cost=0.00..419.55 rows=5551 width=6) (actual
time=0.009..8.111 rows=3399 loops=1)
                                                         Filter:
(attnotnull AND (attnum > 0) AND (NOT attisdropped))
                                                   ->  Hash
(cost=121.78..121.78 rows=294 width=136) (actual time=2.578..2.578
rows=458 loops=1)
                                                         ->  Hash Join
(cost=1.46..121.78 rows=294 width=136) (actual time=0.073..2.085
rows=458 loops=1)
                                                               Hash
Cond: (r.relnamespace = nr.oid)
                                                               ->  Seq
Scan on pg_class r  (cost=0.00..115.76 rows=431 width=72) (actual
time=0.011..1.358 rows=458 lo
ops=1)

Filter: ((relkind = 'r'::"char") AND (pg_has_role(relowner,
'USAGE'::text) OR has_table_privilege
(oid, 'SELECT'::text) OR has_table_privilege(oid, 'INSERT'::text) OR
has_table_privilege(oid, 'UPDATE'::text) OR has_table_privilege(oid,
'DELETE'::text) OR has_table
_privilege(oid, 'REFERENCES'::text) OR has_table_privilege(oid,
'TRIGGER'::text)))
                                                               ->  Hash
(cost=1.27..1.27 rows=15 width=68) (actual time=0.051..0.051 rows=15
loops=1)

                                                                     ->
Seq Scan on pg_namespace nr  (cost=0.00..1.27 rows=15 width=68) (actual
time=0.010..0.033 row
s=15 loops=1)

Filter: (NOT pg_is_other_temp_schema(oid))
               ->  Nested Loop  (cost=1.34..130.80 rows=6 width=321)
(actual time=0.040..52.244 rows=1845 loops=554)
                     ->  Nested Loop  (cost=1.34..128.42 rows=8
width=261) (actual time=0.017..16.949 rows=1251 loops=554)
                           Join Filter: (pg_has_role(r.relowner,
'USAGE'::text) OR has_table_privilege(c.oid, 'SELECT'::text) OR
has_table_privilege(c.oid, 'INSERT'::
text) OR has_table_privilege(c.oid, 'UPDATE'::text) OR
has_table_privilege(c.oid, 'REFERENCES'::text))
                           ->  Hash Join  (cost=1.34..109.27 rows=46
width=193) (actual time=0.009..5.149 rows=1251 loops=554)
                                 Hash Cond: (c.connamespace = nc.oid)
                                 ->  Seq Scan on pg_constraint c
(cost=0.00..103.69 rows=1008 width=133) (actual time=0.007..2.765
rows=1302 loops=554)
                                       Filter: (contype = ANY
('{p,u,f}'::"char"[]))
                                 ->  Hash  (cost=1.33..1.33 rows=1
width=68) (actual time=0.022..0.022 rows=1 loops=1)
                                       ->  Seq Scan on pg_namespace nc
(cost=0.00..1.33 rows=1 width=68) (actual time=0.016..0.019 rows=1 loops=1)
                                             Filter: ('public'::text =
((nspname)::information_schema.sql_identifier)::text)
                           ->  Index Scan using pg_class_oid_index on
pg_class r  (cost=0.00..0.39 rows=1 width=76) (actual time=0.005..0.006
rows=1 loops=693054)
                                 Index Cond: (r.oid = c.conrelid)
                                 Filter: (relkind = 'r'::"char")
                     ->  Index Scan using pg_namespace_oid_index on
pg_namespace nr  (cost=0.00..0.28 rows=1 width=68) (actual
time=0.004..0.005 rows=1 loops=693054)
                           Index Cond: (nr.oid = r.relnamespace)
                           Filter: (NOT pg_is_other_temp_schema(oid))
         ->  Index Scan using pg_attribute_relid_attnum_index on
pg_attribute a  (cost=0.00..4.27 rows=1 width=70) (actual
time=0.006..0.007 rows=1 loops=4020)
               Index Cond: ((ss.roid = a.attrelid) AND (a.attnum =
(ss.x).x))
               Filter: (NOT attisdropped)
 Total runtime: 30346.174 ms
(60 rows)
X-AntiVirus: checked by AntiVir MailGuard (Version: 8.0.0.18; AVE: 8.1.0.37; VDF: 7.0.3.243)

This is Postgresql 8.2.4, on a Dual-Core XEON 3.6GHz. With nested_loops
off, I get a very fast response (330ms).

Regards,
    Mario Weilguni

Re: Very slow INFORMATION_SCHEMA

From
Tom Lane
Date:
Mario Weilguni <mweilguni@sime.com> writes:
> I can confirm this for a quite larger result set (4020 rows) for a DB
> with 410 tables and a lot of foreign key constraints.
> ...
> This is Postgresql 8.2.4, on a Dual-Core XEON 3.6GHz. With nested_loops
> off, I get a very fast response (330ms).

FWIW, it looks like 8.3 is significantly smarter about this example
--- it's able to push the toplevel conditions on CONSTRAINT_SCHEMA
and CONSTRAINT_TYPE down inside the UNION, where 8.2 fails to do so.

Which is not to say that there's not more left to do on optimizing
the information_schema views.  In this particular case, for example,
I wonder why the UNION in INFORMATION_SCHEMA.TABLE_CONSTRAINTS isn't a
UNION ALL.  There's probably a lot more such micro-optimizations that
could be done if anyone was motivated to look at it.

            regards, tom lane