Re: Table Partitioning in Postgres: - Mailing list pgsql-general

From Curt Sampson
Subject Re: Table Partitioning in Postgres:
Date
Msg-id Pine.NEB.4.51.0302181059461.452@angelic-vtfw.cvpn.cynic.net
Whole thread Raw
In response to Re: Table Partitioning in Postgres:  ("Bodanapu, Sravan" <Sravan.Bodanapu@NextelPartners.com>)
List pgsql-general
On Mon, 17 Feb 2003, Bodanapu, Sravan wrote:

> Thanks Curt!!! The data was actually taken out of Oracle database and then
> dumped into Postgres database
> using bulk copy. Most of the tables were very large ( having around 20-30
> million rows and around
> 200-300 columns in each ). In Oracle, these tables were partitioned into
> chunks to get maximum performance.

Oh, well that is indeed a reasonably large database, from the sound of
it. (Most people here who say "large" about large databases have these
little toy two-million row things like the little test one I keep on my
laptop. :-))

> 1. When a table is created in postgres, it will always create the
> datafile in /pgdata/base/16975 or 16976 directory. What does 16975 and
> 16976 mean ? Is there a way that the datafile(for table/data/index)
> gets generated in different directories instead of one. If yes, how ?

The 16975 and 16976 are the OIDs (Object IDs) of a couple of databases.
You can install and run the contrib/oid2name program to easily work out
which number corresponds to which database. In each of those directories,
you'll find files with similar numbers; those contain table data and index
data. The oid2name program will also help you work that out.

> 2.    Is there a way to limit a datafile size ( say 3GB ) ? This is a
> concept in Ingres that you can span the data
>     across different files.

In fact, you'll find that postgres limits data file size to 1 GB, and
creates new files for every 1 GB chunk of a table. So you might see a
pair of files So you might see a pair of files called 43561 and 43561.1
which are two parts of that table.

So if you wanted to move things around, you could always shut down
pgsql, move files off to another filesystem, and add symlinks pointing
to them. Note that doing a re-index on an index, or clustering a table,
will end up moving it back to the main directory, usually. The symlink
thing is really just a bit of a kluge that you can use until tablespaces
come.

> 3.    Please suggest us some tips for setting up a big database to acheive
> maximum performance ?

It's really very application dependent. Get a good understanding of
how disks work, how operating systems use disks, and how postgres uses
files, analyze your application to see just what it's likely to be doing
in the way of disk I/O (sizes, locations and proportions of reads and
writes on the physical disk) and then experiment with some tests to see
how you can do it better. I know this is all very vague, but there's no
"generic" advice that will get you anything near the best performance
for your particular application.

About a year ago, after not having touched postgresql much in about
three or four years, I did this for a large application. I'd already
come in with some database knowledge and a fair bit of OS knowledge,
and I found it took me about three weeks of work and experimentation
to figure out how to get good performance out of postgresql for the
application in question.

Note also that you can also learn a lot from studying the documentation
on other database systems, and comparing that with how postgresql works.
I found, for example, _Inside SQL Server 7_ to be very useful for
learning more about database performance in general. (I happened to be
running a large app on that in my last job.)

cjs
--
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC

pgsql-general by date:

Previous
From: Will Trillich
Date:
Subject: Re: DATABASE EXAMPLES?
Next
From: Will Trillich
Date:
Subject: Re: Inquiry From Form [pgsql]