On Wed, 2008-10-08 at 09:34 -0400, Merlin Moncure wrote:
> On Wed, Oct 8, 2008 at 9:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > wstrzalka <wstrzalka@gmail.com> writes:
> >> the 15483 process is stats collector. At the moment server is almost
> >> idle but the stats collector is constantly taking 15-17% of CPU.
> >
> >> I don't know if it matters at all, but maybe the reason is that the
> >> cluster is very large in terms of relation count (many schemas with an
> >> identical table set).
> >
> >> select count(*) from pg_class;
> >> count
> >> --------
> >> 257477
> >
> > Ouch. You might want to consider a schema redesign. Usually, if you've
> > got a lot of tables with the same column-set, it's better to combine
> > them into one big table with an additional key column.
> >
> > I'm sure the stats collector runtime is directly tied to having so many
> > tables --- it's trying to keep stats on each one of them individually.
>
> Unfortunately there are other competing issues with huge tables, like
> long vacuums. There's been some work to mitigate this, but we haven't
> turned the corner yet. IMNSHO, though, table partitioning is a
> feature that should be used...cautiously.
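
To make Tom's suggestion above concrete, here is a minimal sketch of folding
many identical per-customer tables into one table keyed by an extra column
(all schema, table and column names below are made up):

-- Instead of customer_a.orders, customer_b.orders, ... with identical
-- columns, keep one table and carry the former schema name as a key.
CREATE TABLE orders (
    customer    text        NOT NULL,   -- replaces the per-customer schema
    order_id    integer     NOT NULL,
    ordered_at  timestamptz NOT NULL,
    amount      numeric     NOT NULL,
    PRIMARY KEY (customer, order_id)
);

-- What used to be "SELECT ... FROM customer_a.orders" becomes a filter:
SELECT * FROM orders WHERE customer = 'customer_a';

That collapses hundreds of thousands of pg_class entries, and the per-table
stats the collector has to maintain for them, into a handful.
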
Siebel has 20,000 tables in its data model, and that's the biggest I know
of. However, I've seen partitioned designs with more than 100,000 tables.
250,000 is a lot for Postgres, but we should be designing Postgres to cope
with up to 1,000,000 tables or partitions. There's lots of data out there,
and if it doesn't come to us it will go elsewhere.
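
For reference, partitioning here means table inheritance plus CHECK
constraints; a minimal sketch (table names are made up):

CREATE TABLE measurements (
    logdate  date    NOT NULL,
    value    numeric NOT NULL
);

CREATE TABLE measurements_2008_10 (
    CHECK (logdate >= DATE '2008-10-01' AND logdate < DATE '2008-11-01')
) INHERITS (measurements);

CREATE TABLE measurements_2008_11 (
    CHECK (logdate >= DATE '2008-11-01' AND logdate < DATE '2008-12-01')
) INHERITS (measurements);

-- With constraint exclusion enabled, the planner skips children whose
-- CHECK constraint rules them out of the query.
SET constraint_exclusion = on;
SELECT * FROM measurements WHERE logdate = DATE '2008-10-08';

Each child table is its own pg_class entry with its own stats, which is why
partition counts add up so quickly.
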
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support