RE: Question about databases in alternate locations... - Mailing list pgsql-general
From | Culberson, Philip |
---|---|
Subject | RE: Question about databases in alternate locations... |
Date | |
Msg-id | A95EFC3B707BD311986C00A0C9E95B6A01AB1CB9@datmail03.dat.com Whole thread Raw |
In response to | Question about databases in alternate locations... (Richard J Kuhns <rjk@grauel.com>) |
List | pgsql-general |
I've worked with various versions of Oracle for several years and can share some of my experiences with their "system catalog" implementation. They use a fairly simple design in which a database instance consists of 1 .. n tablespaces (that can contain any type of database object) which in turn consists of 1 .. n datafiles. There is a system table which essentially holds the name of the physical file (full path), file_id, tablespace it belongs to, and sizing information. Our database (110 Gig) is split up into 24 tablespaces and these are further split into 73 datafiles. Moving the physical location of a datafile is fairly straight forward. 1) ALTER TABLESPACE foo OFFLINE 2) Move the physical file using OS command 3) ALTER TABLESPACE foo RENAME DATAFILE '/old/file' TO '/new/file' 4) ALTER TABLESPACE foo ONLINE Most of our datafiles run about 2 Gig, so the longest part of this is actually doing the move. One headache is if you want to completely change the locations of ALL your files. This involves editing all of the paths and is definitely prone to error. This is also where you rabidly curse the DBA who decided to have path names that are 140 characters long! A second headache is moving databases from one server to another. You are then forced into having the exact same file structure on the second machine. This can be somewhat amusing if you have a different hardware configuration which doesn't have the same number of disks, etc. This second problem is further complicated by some of the backup solutions available for Oracle. The one that we have uses the system catalog to locate and backup the appropriate files. This again means that if you want to restore the backup to another server, it must be configured exactly as the first. I think that Thomas' fear of having thousands of entries in the system catalog for datafiles is alleviated in the Oracle implementation by the use of the tablespace. Tablespaces can contain any number of database objects, so by using a reasonable tablespace layout, one can keep the number of actual datafiles to a manageable level. One thing that has been definitely useful is the ability to do load balancing based on what tablespaces are "hot". Our system is somewhat of a cross between OLTP and a data warehouse (don't get me started) so the data becomes pretty static after, say, about 30 days. By monitoring which datafiles are being accessed the most, they can be moved to different locations on the storage array to avoid contention and maximize throughput. My first reaction to the suggestion of a pg_location like table was "ARGH, NO!", but after nursing my sprained back from that violent knee jerk reaction and actually thinking about it, I talked myself into thinking it'd probably be a good idea. If we had our online system built on top of Postgres, we would need a filesystem with 110+ Gig of disk space and there would be roughly 3,500 files in its single data directory. Having the ability to organize tables, indices, etc into tablespaces, and then distributing the datafiles in some quasi intelligent fashion is truly pretty powerful. Phil Culberson
pgsql-general by date: