Re: Thousands of tables versus on table? - Mailing list pgsql-performance

From david@lang.hm
Subject Re: Thousands of tables versus on table?
Date
Msg-id Pine.LNX.4.64.0706041631490.11737@asgard.lang.hm
Whole thread Raw
In response to Re: Thousands of tables versus on table?  (Scott Marlowe <smarlowe@g2switchworks.com>)
Responses Re: Thousands of tables versus on table?  (Scott Marlowe <smarlowe@g2switchworks.com>)
List pgsql-performance
On Mon, 4 Jun 2007, Scott Marlowe wrote:

> Gregory Stark wrote:
>>  "Thomas Andrews" <tandrews@soliantconsulting.com> writes:
>>
>>
>> >  I guess my real question is, does it ever make sense to create thousands
>> >  of
>> >  tables like this?
>> >
>>
>>  Sometimes. But usually it's not a good idea.
>>
>>  What you're proposing is basically partitioning, though you may not
>>  actually
>>  need to put all the partitions together for your purposes. Partitioning's
>>  main
>>  benefit is in the management of the data. You can drop and load partitions
>>  in
>>  chunks rather than have to perform large operations on millions of
>>  records.
>>
>>  Postgres doesn't really get any faster by breaking the tables up like
>>  that. In
>>  fact it probably gets slower as it has to look up which of the thousands
>>  of
>>  tables you want to work with.
>>
>
> That's not entirely true.  PostgreSQL can be markedly faster using
> partitioning as long as you always access it by referencing the partitioning
> key in the where clause.  So, if you partition the table by date, and always
> reference it with a date in the where clause, it will usually be noticeably
> faster.  OTOH, if you access it without using a where clause that lets it
> pick partitions, then it will be slower than one big table.
>
> So, while this poster might originally think to have one table for each user,
> resulting in thousands of tables, maybe a compromise where you partition on
> userid ranges would work out well, and keep each partition table down to some
> 50-100 thousand rows, with smaller indexes to match.
>

what if he doesn't use the postgres internal partitioning, but instead
makes his code access the tables named responsesNNNNN where NNNNN is the
id of the customer?

this is what it sounded like he was asking initially.

David Lang

pgsql-performance by date:

Previous
From: "Gregory Stewart"
Date:
Subject: PostgreSQL not fully utilizing system resources?
Next
From: Bill Moran
Date:
Subject: Re: PostgreSQL not fully utilizing system resources?