Re: At what point does a big table start becoming too big? - Mailing list pgsql-general

From: Merlin Moncure
Subject: Re: At what point does a big table start becoming too big?
Msg-id: CAHyXU0yWrJEcd2rn_+9TEWHevQ=8njUcp_W0kRj0niKvJgSWUw@mail.gmail.com
In response to: At what point does a big table start becoming too big? (Nick <nboutelier@gmail.com>)
Responses: Re: At what point does a big table start becoming too big? (Chris Travers <chris.travers@gmail.com>)
List: pgsql-general
On Wed, Aug 22, 2012 at 6:06 PM, Nick <nboutelier@gmail.com> wrote:
> I have a table with 40 million rows and haven't had any performance issues yet.
>
> Are there any rules of thumb as to when a table starts getting too big?
>
> For example, maybe if the index size is 6x the amount of ram, if the table is 10% of total disk space, etc?

Well, that raises the question: ...and then do what?  I guess you
probably mean partitioning.

Partitioning doesn't reduce index size -- it makes total index size
*bigger*, since each partition has to duplicate the higher nodes of
the index -- unless you can exploit the table structure around the
partition key so that fewer fields have to be indexed.
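
For example, here's a rough sketch using inheritance-style
partitioning (table and column names are invented): the CHECK
constraint pins down the date range for the whole child table, so the
children can skip indexing the date column entirely:

CREATE TABLE events (id bigint, created date, payload text);

-- the CHECK constraint does the partition-pruning work a 'created'
-- index would otherwise do, via constraint exclusion:
CREATE TABLE events_2012_08 (
    CHECK (created >= '2012-08-01' AND created < '2012-09-01')
) INHERITS (events);

-- so each child only indexes the remaining search columns:
CREATE INDEX events_2012_08_id_idx ON events_2012_08 (id);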

Where partitioning helps is in speeding up certain classes of bulk
operations, like deleting a large batch of rows: if you can arrange
things so that the rows to be removed line up with a partition, you
can drop that partition instead, for a huge efficiency win.
Partitioning also helps by breaking up administrative operations such
as vacuum, analyze, cluster, create index, reindex, etc. So I'd argue
it's time to start thinking about plan 'b' when you find yourself
getting concerned about the performance of those operations.
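
To make that concrete, continuing the hypothetical sketch above
(and assuming your purge pattern lines up with partition boundaries):

-- retire a whole month instantly; no dead rows, nothing for vacuum
-- to clean up afterwards:
DROP TABLE events_2012_07;

-- versus the unpartitioned equivalent, which has to touch every row:
-- DELETE FROM events
--     WHERE created >= '2012-07-01' AND created < '2012-08-01';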

Partitioning aside, the way to reduce the number of rows you're
dealing with is to reorganize your data: classic normalization, or
using arrays to collapse repeated values, are a couple of things you
can try.
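
For instance (again with invented names), a detail table holding many
readings per sensor can sometimes be packed into one row per sensor:

CREATE TABLE readings (sensor_id int, reading numeric);

-- one array-valued row per sensor instead of millions of detail rows:
CREATE TABLE readings_packed AS
    SELECT sensor_id, array_agg(reading) AS readings
    FROM readings
    GROUP BY sensor_id;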

merlin

