Re: Databases vs Schemas - Mailing list pgsql-performance

From Chris Kratz
Subject Re: Databases vs Schemas
Date
Msg-id 3642025c0910100744i1019e3c3xfdb77f603a8a2cce@mail.gmail.com
Whole thread Raw
In response to Re: Databases vs Schemas  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Databases vs Schemas  (Scott Marlowe <scott.marlowe@gmail.com>)
List pgsql-performance
On Fri, Oct 9, 2009 at 11:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Scott Carey <scott@richrelevance.com> writes:
> I've got 200,000 tables in one db (8.4), and some tools barely work.  The
> system catalogs get inefficient when large and psql especially has trouble.
> Tab completion takes forever, even if I make a schema "s" with one table in
> it and type "s." and try and tab complete -- its as if its scanning all
> without a schema qualifier or using an index.

The tab-completion queries have never been vetted for performance
particularly :-(

Just out of curiosity, how much does this help?

alter function pg_table_is_visible(oid) cost 10;

(You'll need to do it as superuser --- if it makes things worse, just
set the cost back to 1.)

> Sometimes it does not match
> valid tables at all, and sometimes regex matching fails too ('\dt
> schema.*_*_*' intermittently flakes out if it returns a lot of matches).

There are some arbitrary "LIMIT 1000" clauses in those queries, which
probably explains this ... but taking them out would likely cause
libreadline to get indigestion ...

                       regards, tom lane
 
We ran into this exact situation with a pg 8.3 database and a very large number of tables.  psql would wait for 20 to 30 seconds if the user was unlucky enough to hit the tab key.  After doing some research with query logging, explain analyze and some trial and error, we came to the same conclusion.  Altering the cost for the pg_table_is_visible function to 10 fixed our performance problem immediately.  It appears that when the cost was set to 1, that the query optimizer first ran the function over the entire pg_class table.  By increasing the cost, it now only runs the function over the rows returned by the other items in the where clause.

-chris

pgsql-performance by date:

Previous
From: Merlin Moncure
Date:
Subject: Re: Databases vs Schemas
Next
From: Mark Mielke
Date:
Subject: Re: UUID as primary key