Re: Table Partitioning in Postgres: - Mailing list pgsql-general
From | Curt Sampson
---|---
Subject | Re: Table Partitioning in Postgres:
Date | |
Msg-id | Pine.NEB.4.51.0302181059461.452@angelic-vtfw.cvpn.cynic.net
In response to | Re: Table Partitioning in Postgres: ("Bodanapu, Sravan" <Sravan.Bodanapu@NextelPartners.com>)
List | pgsql-general
On Mon, 17 Feb 2003, Bodanapu, Sravan wrote:

> Thanks Curt!!! The data was actually taken out of an Oracle database and
> then dumped into the Postgres database using bulk copy. Most of the
> tables were very large (having around 20-30 million rows and around
> 200-300 columns in each). In Oracle, these tables were partitioned into
> chunks to get maximum performance.

Oh, well that is indeed a reasonably large database, from the sound of it. (Most people here who say "large" about their databases have these little toy two-million-row things like the little test one I keep on my laptop. :-))

> 1. When a table is created in postgres, it will always create the
> datafile in the /pgdata/base/16975 or 16976 directory. What do 16975
> and 16976 mean? Is there a way that the datafile (for table/data/index)
> gets generated in different directories instead of one? If yes, how?

The 16975 and 16976 are the OIDs (Object IDs) of a couple of databases. You can install and run the contrib/oid2name program to easily work out which number corresponds to which database. In each of those directories, you'll find files with similar numbers; those contain table data and index data. The oid2name program will also help you work that out.

> 2. Is there a way to limit a datafile size (say 3 GB)? This is a
> concept in Ingres where you can span the data across different files.

In fact, you'll find that postgres limits data file size to 1 GB, and creates a new file for every 1 GB chunk of a table. So you might see a pair of files called 43561 and 43561.1 which are two parts of the same table. So if you wanted to move things around, you could always shut down postgres, move files off to another filesystem, and add symlinks pointing to them. Note that doing a re-index on an index, or clustering a table, will usually end up moving it back to the main directory. The symlink thing is really just a bit of a kluge that you can use until tablespaces come.

> 3.
> Please suggest us some tips for setting up a big database to achieve
> maximum performance?

It's really very application-dependent. Get a good understanding of how disks work, how operating systems use disks, and how postgres uses files; analyze your application to see just what it's likely to be doing in the way of disk I/O (sizes, locations, and proportions of reads and writes on the physical disk); and then experiment with some tests to see how you can do better. I know this is all very vague, but there's no "generic" advice that will get you anywhere near the best performance for your particular application.

About a year ago, after not having touched postgresql much in three or four years, I did this for a large application. I'd already come in with some database knowledge and a fair bit of OS knowledge, and I found it took me about three weeks of work and experimentation to figure out how to get good performance out of postgresql for the application in question.

Note also that you can learn a lot from studying the documentation on other database systems, and comparing that with how postgresql works. I found, for example, _Inside SQL Server 7_ to be very useful for learning more about database performance in general. (I happened to be running a large app on that at my last job.)

cjs
--
Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light. --XTC
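[Archive note: a toy sketch of the 1 GB segmenting described in the message. The 1 GB segment size is scaled down to 1 KB so the demo runs instantly; 43561 is just the example file number from the discussion, and the temp directory stands in for a database directory under $PGDATA/base.]

```shell
# Toy sketch: a table bigger than one segment is stored as files
# named OID, OID.1, OID.2, ... (here with 1 KB "segments").
dir=$(mktemp -d)

head -c 1024 /dev/zero > "$dir/43561"      # first (full) segment
head -c 1024 /dev/zero > "$dir/43561.1"    # second (full) segment
head -c 100  /dev/zero > "$dir/43561.2"    # partial last segment

# The table's total on-disk size is the sum of all its segment files.
total=$(cat "$dir"/43561* | wc -c)
echo "total bytes: $total"
```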
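[Archive note: a sketch of the symlink kluge from the message, using throwaway temp directories in place of a real $PGDATA. With a live cluster you would shut postgres down before moving anything.]

```shell
# Move a "table data file" to another filesystem and leave a symlink
# in its place, so the original path keeps working.
base=$(mktemp -d)
mkdir -p "$base/pgdata/base/16975" "$base/fastdisk"

# Pretend 43561 is a table's data file in the database directory.
echo "table data" > "$base/pgdata/base/16975/43561"

# Move it off to the other "filesystem" and symlink it back.
mv "$base/pgdata/base/16975/43561" "$base/fastdisk/43561"
ln -s "$base/fastdisk/43561" "$base/pgdata/base/16975/43561"

# The original path still resolves, so reads go through unchanged.
cat "$base/pgdata/base/16975/43561"
```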