Thread: problems with access into system catalogs

From: Thomas Markus

Hi,

I have 2 systems with CentOS 5.5 (2.6.18) x86_64, postgresql-9.0.6 64bit

Both systems contain the same content, but one of them is causing trouble:
some system tables (e.g. pg_catalog.pg_class or pg_attribute) contain many
dead rows, and even simple queries take a long time on that system. The
other one is fast.

One problematic query is this one:
psql MYDB -c "explain analyze SELECT column_name FROM information_schema.columns WHERE table_schema='public' AND table_name='person'"

fast system:
                                                                            QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=96.39..196.63 rows=1 width=64) (actual time=10.953..12.785 rows=24 loops=1)
   ->  Hash Left Join  (cost=96.39..196.34 rows=1 width=68) (actual time=10.937..12.541 rows=24 loops=1)
         Hash Cond: (t.typbasetype = bt.oid)
         Join Filter: (t.typtype = 'd'::"char")
         ->  Nested Loop  (cost=0.00..99.93 rows=1 width=73) (actual time=0.821..2.359 rows=24 loops=1)
               ->  Nested Loop  (cost=0.00..99.62 rows=1 width=68) (actual time=0.811..2.097 rows=24 loops=1)
                     Join Filter: (pg_has_role(c.relowner, 'USAGE'::text) OR has_column_privilege(c.oid, a.attnum, 'SELECT, INSERT, UPDATE, REFERENCES'::text))
                     ->  Nested Loop  (cost=0.00..85.94 rows=1 width=8) (actual time=0.749..1.904 rows=1 loops=1)
                           Join Filter: (c.relnamespace = nc.oid)
                           ->  Seq Scan on pg_namespace nc  (cost=0.00..4.00 rows=1 width=4) (actual time=0.082..0.502 rows=1 loops=1)
                                 Filter: ((NOT pg_is_other_temp_schema(oid)) AND (((nspname)::information_schema.sql_identifier)::text = 'public'::text))
                           ->  Seq Scan on pg_class c  (cost=0.00..81.91 rows=3 width=12) (actual time=0.658..1.389 rows=1 loops=1)
                                 Filter: ((c.relkind = ANY ('{r,v}'::"char"[])) AND (((c.relname)::information_schema.sql_identifier)::text = 'person'::text))
                     ->  Index Scan using pg_attribute_relid_attnum_index on pg_attribute a  (cost=0.00..13.61 rows=4 width=74) (actual time=0.048..0.103 rows=24 loops=1)
                           Index Cond: ((a.attrelid = c.oid) AND (a.attnum > 0))
                           Filter: (NOT a.attisdropped)
               ->  Index Scan using pg_type_oid_index on pg_type t  (cost=0.00..0.29 rows=1 width=13) (actual time=0.004..0.005 rows=1 loops=24)
                     Index Cond: (t.oid = a.atttypid)
         ->  Hash  (cost=77.14..77.14 rows=1540 width=4) (actual time=10.082..10.082 rows=1540 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 55kB
               ->  Hash Join  (cost=4.57..77.14 rows=1540 width=4) (actual time=0.317..7.597 rows=1540 loops=1)
                     Hash Cond: (bt.typnamespace = nbt.oid)
                     ->  Seq Scan on pg_type bt  (cost=0.00..51.40 rows=1540 width=8) (actual time=0.003..2.894 rows=1540 loops=1)
                     ->  Hash  (cost=3.14..3.14 rows=114 width=4) (actual time=0.294..0.294 rows=95 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 4kB
                           ->  Seq Scan on pg_namespace nbt  (cost=0.00..3.14 rows=114 width=4) (actual time=0.004..0.147 rows=95 loops=1)
   ->  Index Scan using pg_namespace_oid_index on pg_namespace nt  (cost=0.00..0.28 rows=1 width=4) (actual time=0.003..0.004 rows=1 loops=24)
         Index Cond: (nt.oid = t.typnamespace)
 Total runtime: 13.130 ms
(29 rows)



but on the problematic one:
                                                                            QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=153770.68..475719.06 rows=33 width=64) (actual time=716.941..1993.033 rows=24 loops=1)
   Hash Cond: (t.typbasetype = bt.oid)
   Join Filter: (t.typtype = 'd'::"char")
   ->  Hash Join  (cost=58705.27..374606.83 rows=33 width=69) (actual time=373.927..1649.818 rows=24 loops=1)
         Hash Cond: (t.typnamespace = nt.oid)
         ->  Nested Loop  (cost=58700.55..374601.66 rows=33 width=73) (actual time=373.519..1649.339 rows=24 loops=1)
               ->  Hash Join  (cost=58700.55..374338.49 rows=33 width=68) (actual time=373.496..1649.061 rows=24 loops=1)
                     Hash Cond: (a.attrelid = c.oid)
                     Join Filter: (pg_has_role(c.relowner, 'USAGE'::text) OR has_column_privilege(c.oid, a.attnum, 'SELECT, INSERT, UPDATE, REFERENCES'::text))
                     ->  Seq Scan on pg_attribute a  (cost=0.00..294729.54 rows=5575330 width=74) (actual time=0.011..1279.604 rows=8452 loops=1)
                           Filter: ((NOT attisdropped) AND (attnum > 0))
                     ->  Hash  (cost=58700.34..58700.34 rows=17 width=8) (actual time=358.699..358.699 rows=1 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 1kB
                           ->  Nested Loop  (cost=0.00..58700.34 rows=17 width=8) (actual time=0.786..358.693 rows=1 loops=1)
                                 Join Filter: (c.relnamespace = nc.oid)
                                 ->  Seq Scan on pg_namespace nc  (cost=0.00..4.12 rows=1 width=4) (actual time=0.184..0.523 rows=1 loops=1)
                                       Filter: ((NOT pg_is_other_temp_schema(oid)) AND (((nspname)::information_schema.sql_identifier)::text = 'public'::text))
                                 ->  Seq Scan on pg_class c  (cost=0.00..58671.00 rows=2018 width=12) (actual time=0.590..358.151 rows=1 loops=1)
                                       Filter: ((c.relkind = ANY ('{r,v}'::"char"[])) AND (((c.relname)::information_schema.sql_identifier)::text = 'person'::text))
               ->  Index Scan using pg_type_oid_index on pg_type t  (cost=0.00..7.96 rows=1 width=13) (actual time=0.005..0.006 rows=1 loops=24)
                     Index Cond: (t.oid = a.atttypid)
         ->  Hash  (cost=3.21..3.21 rows=121 width=4) (actual time=0.364..0.364 rows=121 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 5kB
               ->  Seq Scan on pg_namespace nt  (cost=0.00..3.21 rows=121 width=4) (actual time=0.009..0.189 rows=121 loops=1)
   ->  Hash  (cost=69682.20..69682.20 rows=1547136 width=4) (actual time=341.761..341.761 rows=1569 loops=1)
         Buckets: 131072  Batches: 2  Memory Usage: 29kB
         ->  Hash Join  (cost=4.72..69682.20 rows=1547136 width=4) (actual time=0.370..339.147 rows=1569 loops=1)
               Hash Cond: (bt.typnamespace = nbt.oid)
               ->  Seq Scan on pg_type bt  (cost=0.00..48404.36 rows=1547136 width=8) (actual time=0.005..334.474 rows=1569 loops=1)
               ->  Hash  (cost=3.21..3.21 rows=121 width=4) (actual time=0.342..0.342 rows=121 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 5kB
                     ->  Seq Scan on pg_namespace nbt  (cost=0.00..3.21 rows=121 width=4) (actual time=0.004..0.162 rows=121 loops=1)
 Total runtime: 1993.769 ms
(33 rows)



I tried CLUSTER, REINDEX and VACUUM (including FULL) on pg_class without success:
psql MYDB -c "vacuum verbose pg_catalog.pg_class"

fast system:
INFO:  "pg_class": found 1 removable, 2106 nonremovable row versions in 45 out of 45 pages
DETAIL:  0 dead row versions cannot be removed yet.

other:
INFO:  "pg_class": found 0 removable, 1547667 nonremovable row versions in 31587 out of 31587 pages
DETAIL:  1545530 dead row versions cannot be removed yet.

a "select count(*) from pg_catalog.pg_class" returns only 2137

how can i remove then dead rows and fasten acccess to system catalogs?


Thomas


Re: problems with access into system catalogs

From: Craig Ringer

First, thank you for an excellent, complete question with versions,
EXPLAIN ANALYZE output, and exact messages.

My reply is interleaved below.

On 07/26/2012 03:44 PM, Thomas Markus wrote:
> I have 2 systems with CentOS 5.5 (2.6.18) x86_64, postgresql-9.0.6 64bit
>
> Both systems contain the same content, but one of them is causing trouble:
> some system tables (e.g. pg_catalog.pg_class or pg_attribute) contain many
> dead rows, and even simple queries take a long time on that system. The
> other one is fast.

[snip]

> other:
> INFO:  "pg_class": found 0 removable, 1547667 nonremovable row versions in 31587 out of 31587 pages
> DETAIL:  1545530 dead row versions cannot be removed yet.
>
> a "select count(*) from pg_catalog.pg_class" returns only 2137
>

OK, so you have lots of bloat in the system catalogs. A few things to
check for:

- Do you have any uncommitted two phase transactions?  Run:
   SELECT * from pg_prepared_xacts ;

- Do you have any long-lived 'IDLE IN TRANSACTION' connections? Try:
   SELECT * FROM pg_stat_activity WHERE current_query = '<IDLE> in transaction'
     AND xact_start < current_timestamp - '1 minute'::interval;

Either of those can prevent vacuum from cleaning things up.
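
If the first query returns rows, a rough filter for the old ones that would
be holding back the xmin horizon looks something like this (just a sketch;
adjust the interval to taste):

   SELECT gid, prepared, owner, database
     FROM pg_prepared_xacts
    WHERE prepared < now() - interval '1 hour'
    ORDER BY prepared;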

Do you very frequently create and drop tables, indexes, etc? Say, using
a database unit testing framework?

--
Craig Ringer

Re: problems with access into system catalogs

From: Thomas Markus

Hi,

see below

On 26.07.2012 10:25, Craig Ringer wrote:
> First, thank you for an excellent, complete question with versions,
> EXPLAIN ANALYZE output, and exact messages.
;)
>
> - Do you have any uncommitted two phase transactions?  Run:
>   SELECT * from pg_prepared_xacts ;
Hm, yes. I stopped all applications this morning, but this query shows:
 transaction |                                             gid                                              |           prepared            | owner | database
-------------+----------------------------------------------------------------------------------------------+-------------------------------+-------+----------
    49737548 | 131075_MS03ZjAwMDAwMTpjZmVlOjRlZDg3MTk2OjY4NGJlN2I=_N2YwMDAwMDE6Y2ZlZTo0ZWQ4NzE5Njo2ODRiZTdm | 2012-01-05 07:49:30.78583+01  | xxx   | db1
    49737549 | 131075_MS03ZjAwMDAwMTpjZmVlOjRlZDg3MTk2OjY4NGJlN2I=_N2YwMDAwMDE6Y2ZlZTo0ZWQ4NzE5Njo2ODRiZTg0 | 2012-01-05 07:49:30.789382+01 | xxx   | db2

The system time is correct (Thu Jul 26 10:38:12 CEST 2012), so January 5th is
really old.
Should I restart the instance?

>
> - Do you have any long-lived 'IDLE IN TRANSACTION' connections? Try:
>   SELECT * FROM pg_stat_activity WHERE current_query = '<IDLE> in transaction'
>     AND xact_start < current_timestamp - '1 minute'::interval;
None, and all clients were disconnected this morning.
>
> Either of those can prevent vacuum from cleaning things up.
>
> Do you very frequently create and drop tables, indexes, etc? Say,
> using a database unit testing framework?
No, it's a live system with normal OLAP access.



Thomas

Re: problems with access into system catalogs

From: Craig Ringer

On 07/26/2012 04:39 PM, Thomas Markus wrote:
> Hi,
>
> see below
>
> On 26.07.2012 10:25, Craig Ringer wrote:
>> - Do you have any uncommitted two phase transactions?  Run:
>>   SELECT * from pg_prepared_xacts ;
> Hm, yes. I stopped all applications this morning, but this query shows:
>  transaction |                                             gid                                              |           prepared            | owner | database
> -------------+----------------------------------------------------------------------------------------------+-------------------------------+-------+----------
>     49737548 | 131075_MS03ZjAwMDAwMTpjZmVlOjRlZDg3MTk2OjY4NGJlN2I=_N2YwMDAwMDE6Y2ZlZTo0ZWQ4NzE5Njo2ODRiZTdm | 2012-01-05 07:49:30.78583+01  | xxx   | db1
>     49737549 | 131075_MS03ZjAwMDAwMTpjZmVlOjRlZDg3MTk2OjY4NGJlN2I=_N2YwMDAwMDE6Y2ZlZTo0ZWQ4NzE5Njo2ODRiZTg0 | 2012-01-05 07:49:30.789382+01 | xxx   | db2
>
> The system time is correct (Thu Jul 26 10:38:12 CEST 2012), so January 5th is really old.
> Should I restart the instance?

Nope, and it wouldn't help anyway. Prepared but uncommitted two phase transactions are a permanent and persistent part of the database. They only go away when a COMMIT PREPARED  or ROLLBACK PREPARED is issued. See:

  http://www.postgresql.org/docs/9.1/static/sql-prepare-transaction.html

I cannot advise you on what to do without knowing what created those transactions and why.
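
For reference, resolving one is just the following (a sketch; 'the-gid' is a
placeholder for whatever pg_prepared_xacts reports, and the command has to be
run while connected to the database the transaction was prepared in):

   COMMIT PREPARED 'the-gid';    -- keep the prepared transaction's work
   ROLLBACK PREPARED 'the-gid';  -- or discard it instead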

>> Do you very frequently create and drop tables, indexes, etc? Say, using a database unit testing framework?
> No, it's a live system with normal OLAP access.

Weird, then I don't know how the catalogs would get so fat.

Temporary tables do create writes to the catalog heap, but other than heavy
temp-table use I can't think what it'd be.

--
Craig Ringer

Re: problems with access into system catalogs

From: Thomas Markus

Hi Craig,

Thanks, that was the tip.

A ROLLBACK PREPARED on these does the job.
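
For the archives, roughly what that looked like (a sketch using the gids from
the earlier output; each ROLLBACK PREPARED has to be issued from the database
the transaction was prepared in, and the catalogs still need vacuuming
afterwards before the dead rows actually disappear):

   -- connected to db1:
   ROLLBACK PREPARED '131075_MS03ZjAwMDAwMTpjZmVlOjRlZDg3MTk2OjY4NGJlN2I=_N2YwMDAwMDE6Y2ZlZTo0ZWQ4NzE5Njo2ODRiZTdm';
   -- connected to db2:
   ROLLBACK PREPARED '131075_MS03ZjAwMDAwMTpjZmVlOjRlZDg3MTk2OjY4NGJlN2I=_N2YwMDAwMDE6Y2ZlZTo0ZWQ4NzE5Njo2ODRiZTg0';
   -- then, in the affected database, shrink the bloated catalogs:
   VACUUM FULL pg_catalog.pg_class;
   VACUUM FULL pg_catalog.pg_attribute;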

best regards
Thomas

Re: problems with access into system catalogs

From: Tom Lane

Thomas Markus <t.markus@proventis.net> writes:
> A ROLLBACK PREPARED on these does the job.

If you aren't intentionally using prepared transactions, it's a good
idea to disable them by setting max_prepared_transactions to zero
(DB restart required).  That prevents you from accidentally shooting
yourself in the foot like this.
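
For example, in postgresql.conf (a sketch; the change only takes effect after
a full server restart, not a reload):

    max_prepared_transactions = 0    # disables PREPARE TRANSACTION entirely

and afterwards, from psql, you can confirm it with:

    SHOW max_prepared_transactions;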

            regards, tom lane