Re: Tablespaces and NFS - Mailing list pgsql-performance
From | Peter Koczan
---|---
Subject | Re: Tablespaces and NFS
Date |
Msg-id | 4544e0330709201150m6c97e8c2hfbccd6a7991200b8@mail.gmail.com
In response to | Re: Tablespaces and NFS (Carlos Moreno <moreno_pg@mochima.com>)
Responses | Re: Tablespaces and NFS
List | pgsql-performance
> Anyway... One detail I don't understand --- why do you claim that
> "You can't take advantage of the shared file system because you can't
> share tablespaces among clusters or servers" ???

I say that because you can't set up two servers to point to the same
tablespace (i.e. you can't have server A and server B both point to the
tablespace in /mnt/nfs/postgres/), which defeats one of the main purposes
of using a shared file system: seeing, using, and editing files from
anywhere. This is ill-advised and probably won't work, for two reasons.

- Postgres tablespaces require empty directories for initialization. If
you create a tablespace on server A, it puts files in the previously
empty directory. If you then try to create a tablespace on server B
pointing to the same location, it won't work, since the directory is no
longer empty. You can get around this in theory, but you'd either have to
mess with the system tables directly or fool Postgres into thinking that
each server independently created that tablespace (to which anyone will
say, NO!!!!).

- If you do manage to fool Postgres into having two servers pointing at
the same tablespace, the servers really, REALLY won't play nice with
these shared resources, since they have no knowledge of each other (I
mean, even two clusters on the same server don't play nice with memory).
Basically, if they compete for the same file, either I/O will be
EXTREMELY slow because of file-locking mechanisms in the file system, or
you open things up to race conditions and data corruption. In other
words: BAD!!!!

I know this doesn't fully apply to you, but I thought I should explain my
points better since you asked so nicely :-)

> This seems to be the killer point --- mainly because the network
> connection is a 100Mbps (around 10 MB/sec --- less than 1/4 of
> the performance we'd expect from an internal hard drive). If at
> least it was a Gigabit connection, I might still be tempted to
> retry the experiment. I was thinking that *maybe* the latencies
> and contention due to heads movements (in the order of the millisec)
> would take precedence and thus, a network-distributed cluster of
> hard drives would end up winning.

If you get decently fast disks, or put some slower disks in RAID 10,
you'll easily get >100 MB/sec (and that's a conservative estimate). Even
with a Gbit network you'll get, in theory, 128 MB/sec, and that assumes
the NFS'd disks aren't a bottleneck.

> We're clear that that would be the *optimal* solution --- problem
> is, there's a lot of client-side software that we would have to
> change; I'm first looking for a "transparent" solution in which
> I could distribute the load at a hardware level, seeing the DB
> server as a single entity --- the ideal solution, of course,
> being the use of tablespaces with 4 or 6 *internal* hard disks
> (but that's not an option with our current web hoster).

I sadly don't know enough networking to tell the client software "no
really, I'm over here." However, one of the things I'm fond of is using a
module to store connection strings and dynamically loading that module on
the client side. For instance, with Perl I use:

    use DBI;
    use DBD::Pg;
    use My::DBs;

    my $dbh = DBI->connect($My::DBs::mydb);

Assuming that the module and its entries are kept up to date, it will
"just work." That way, there's only one module to change instead of n
client apps, and I can bring up a new server under a new name without
changing any client code.
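To make that concrete, here is a minimal sketch of what such a
connection-string module could look like (the My::DBs name follows the
snippet above, but the DSN, host, and database name are placeholders,
not values from this thread):

    # My/DBs.pm -- central place for connection strings (illustrative only)
    package My::DBs;

    use strict;
    use warnings;

    # One DSN per database; clients load this module instead of
    # hard-coding hostnames. Host and dbname below are made up.
    our $mydb = "dbi:Pg:dbname=mydb;host=db1.example.com;port=5432";

    1;

With something like this in place, moving the database to a new host
means editing one DSN and redeploying the module, with no changes to the
client applications themselves.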
> Anyway, I'll keep working on alternative solutions --- I think
> I have enough evidence to close this NFS door.

That's probably for the best.