Re: Some ideas for comment - Mailing list pgsql-performance

From Chris Hoover
Subject Re: Some ideas for comment
Date
Msg-id 1d219a6f05082413265e872134@mail.gmail.com
Whole thread Raw
In response to Re: Some ideas for comment  ("Merlin Moncure" <merlin.moncure@rcsonline.com>)
Responses Re: Some ideas for comment  (Jens-Wolfhard Schicke <ml+pgsql-performance@asco.de>)
List pgsql-performance
On 8/24/05, Merlin Moncure <merlin.moncure@rcsonline.com> wrote:
> > Ok, there is always a lot of talk about tuning PostgreSQL on linux and
> > how PostgreSQL uses the linux kernel cache to cache the tables and
> > indexes.
> [...]
> >
> > 1.  Implement a partition type layout using views and rules - This
> > will allow me to have one table in each view with the "active" data,
> > and the inactive data stored by year in other tables.
> >
> > So I would have the following (for each major table):
> >
> > Table View as
> > select * from active_table
> > union all
> > select * from table_2005
> > union all
> > select * from table_2004
> > etc.
>
> Linux does a pretty good job of deciding what to cache.  I don't think
> this will help much.  You can always look at partial indexes too.
>
Yes, but won't this help create the need to store less?  If I have
1,000.000 rows in a table, but only 4,000 are active, if I move those
4 to another table and link the tables via a view, should that not
help keep the 9,996,000 rows out of the kernel cache (the majority of
the time at least)?

This would mean I have more room for other objects and hopefully less
turn over in the cache, and less disk i/o.

Yes?
[...]
> I would strongly consider adding more memory :).
Unfortunately, it looks like 12GB is all our Dell servers can handle. :(

>
> > I don't have real numbers to give you, but we know that our systems
> > are hurting i/o wise and we are growing by about 2GB+ per week (net).
> > We actually grow by about 5GB/week/server.  However, when I run my
> > weekly maintenance of vacuum full, reindex, and the vacuum analyze, we
> > end up getting about 3GB back.  Unfortunately, I do not have the i/o
> > bandwidth to vacuum during the day as it causes major slowdowns on our
> > system.  Each night, I do run a vacuum analyze across all db's to try
> > and help.  I also have my fsm parameters set high (8000000 fsm pages,
> > and 5000 fsm relations) to try and compensate.
>
> Generally, you can reduce data turnover for the same workload by
> normalizing your database.  IOW, try and make your database more
> efficient in the way it stores data.
>
That's the ultimate goal, but this database structure was developed
and released into production before I started work here.  I'm trying
to slowly change it into a better db, but it is  a slow process.
Normalization does not make it at the top of the priority list,
unfortunately.

> > Right now, we are still on 7.3.4.  However, these ideas would be
> > implemented as part of an upgrade to 8.x (plus, we'll initialize the
> > new clusters with a C locale).
> > > 2.  I am also thinking of recommending we collapse all databases in a
> > cluster into one "mega" database.  I can then use schema's and views
> > to control database access and ensure that no customer can see another
> > customers data.
>
> hm. keep in mind views are tightly bound to the tables they are created
> with (views can't 'float' over tables in different schemas).  pl/pgsql
> functions can, though.  This is a more efficient use of server
> resources, IMO, but not a windfall.

This I know.  Each schema would have to have a "custom" set of views
replacing the tables with the view programmed to only return that
customers data.

I was thinking all of the tables in schema my_tables and the views all
querying the tables stored in the my_tables schema.  I would add an
identifying column to each table so that I can differentiate the data.

Chris

pgsql-performance by date:

Previous
From: Mark Fox
Date:
Subject: Performance indexing of a simple query
Next
From: Jignesh Shah
Date:
Subject: Re: Read/Write block sizes