Re: Thousands of tables versus on table? - Mailing list pgsql-performance

From	Tom Lane
Subject	Re: Thousands of tables versus on table?
Date	June 5, 2007 18:59:40
Msg-id	10525.1181080765@sss.pgh.pa.us
In response to	Re: Thousands of tables versus on table? (david@lang.hm)
List	pgsql-performance

david@lang.hm writes:
> however I really don't understand why it is more efficient to have a 5B
> line table that you do a report/query against 0.1% of than it is to have
> 1000 different tables of 5M lines each and do a report/query against 100%
> of.

Essentially what you are doing when you do that is taking the top few
levels of the index out of the database and putting it into the
filesystem; plus creating duplicative indexing information in the
database's system catalogs.
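
To put rough numbers on "the top few levels of the index", here is a
back-of-envelope sketch. It assumes an idealized B-tree with a uniform
fanout of 256 entries per page; that fanout is an illustrative guess, not
a measured PostgreSQL value, and btree_levels is a hypothetical helper.

```python
def btree_levels(rows: int, fanout: int) -> int:
    """Smallest number of levels such that fanout**levels >= rows.

    Idealized model: every page holds exactly `fanout` entries.
    """
    levels = 1
    capacity = fanout
    while capacity < rows:
        capacity *= fanout
        levels += 1
    return levels


FANOUT = 256  # assumed entries per index page, for illustration only

one_big_table = btree_levels(5_000_000_000, FANOUT)  # single 5B-row table
one_of_1000 = btree_levels(5_000_000, FANOUT)        # one 5M-row table

# Levels that splitting into 1000 tables moves out of the index
levels_moved = one_big_table - one_of_1000
print(one_big_table, one_of_1000, levels_moved)
```

Under these assumptions the single table's index is only a couple of
levels deeper than each small table's index, and those are the levels
that the table-per-subset layout hands to the filesystem and the system
catalogs instead.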

The degree to which this is a win is *highly* debatable, and certainly
depends on a whole lot of assumptions about filesystem performance.
You also need to assume that constraint-exclusion in the planner is
pretty doggone cheap relative to the table searches, which means it
almost certainly will lose badly if you carry the subdivision out to
the extent that the individual tables become small.  (This last could
be improved in some cases if we had a more explicit representation of
partitioning, but it'll never be as cheap as one more level of index
search.)
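
The cost comparison above can be illustrated with a toy model. This is
not the planner's actual code: it simply assumes that exclusion tests
each table's range constraint in turn, and contrasts that with a binary
search over the same boundaries, which stands in for one more level of
index search. The function names and the 1000 x 5M layout are
hypothetical.

```python
from bisect import bisect_right


def prune_by_constraints(bounds, key):
    """Test every table's (lo, hi) constraint; return survivors and check count."""
    checks = 0
    survivors = []
    for i, (lo, hi) in enumerate(bounds):
        checks += 1
        if lo <= key < hi:
            survivors.append(i)
    return survivors, checks


def locate_by_search(lower_bounds, key):
    """Find the table whose range contains key using binary search."""
    return bisect_right(lower_bounds, key) - 1


# 1000 tables of 5M keys each (hypothetical layout)
ROWS_PER_TABLE = 5_000_000
bounds = [(i * ROWS_PER_TABLE, (i + 1) * ROWS_PER_TABLE) for i in range(1000)]
lower_bounds = [lo for lo, _ in bounds]

key = 1_234_567_890
survivors, checks = prune_by_constraints(bounds, key)
print(survivors, checks)                    # one survivor, after testing every table
print(locate_by_search(lower_bounds, key))  # same table, in about 10 comparisons
```

In this model the per-query planning work grows linearly with the number
of tables, while the search grows only logarithmically, so the smaller
the individual tables get, the more the exclusion overhead dominates.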

I think the main argument for partitioning is when you are interested in
being able to drop whole partitions cheaply.

            regards, tom lane
