Re: At what point does a big table start becoming too big? - Mailing list pgsql-general

From Bill Moran
Subject Re: At what point does a big table start becoming too big?
Date
Msg-id 20120823093726.77decb0dc8ac41460c95b677@potentialtech.com
Whole thread Raw
In response to Re: At what point does a big table start becoming too big?  ("Martin French" <Martin.French@romaxtech.com>)
Responses Re: At what point does a big table start becoming too big?  ("Martin French" <Martin.French@romaxtech.com>)
Re: At what point does a big table start becoming too big?  (Jeff Janes <jeff.janes@gmail.com>)
Re: At what point does a big table start becoming too big?  (Ondrej Ivanič <ondrej.ivanic@gmail.com>)
List pgsql-general
In response to "Martin French" <Martin.French@romaxtech.com>:
> >
> > I have a table with 40 million rows and haven't had any performance
> > issues yet.
> >
> > Are there any rules of thumb as to when a table starts getting too big?
> >
> > For example, maybe if the index size is 6x the amount of ram, if the
> > table is 10% of total disk space, etc?
>
> My rule here is that a table is too big when performance starts degrading beyond an acceptable level.

The challenge there is that if you wait until performance degrades
beyond an acceptable level, you've allowed yourself to get into a
situation where clients are upset and frustrated, and fixing the
problem is difficult because there's so much data to manipulate to
rearrange things.

And the advice I have along those lines is to establish now what
constitutes unacceptable performance, and put some sort of monitoring
and tracking in place to know what your performance degradation looks
like and predict when you'll have to react.  For example, a MRTG
graph that runs an experimental query once a day during off hours and
graphs the time it takes vs. the # of rows in the table will prove
a valuable tool that can sometimes predict exactly when you'll have
to change things before it becomes a problem.  Other tricks work as
well, such as having the application send an email any time a process
takes more than 50% of the allowable maximum time.

The key is to have visibility into what's going on so your guesses
are at least informed.  People will often point out that no monitoring
or trend tracking is 100% accurate, but if it allows you to predict
and plan for 90% of the future issues, you'll have that much more time
available to deal with the 10% that you don't expect.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/


pgsql-general by date:

Previous
From: "Gauthier, Dave"
Date:
Subject: Re: Can column name aliases be supported?
Next
From: Sébastien Lorion
Date:
Subject: Re: Amazon High I/O instances