Re: tablespaces and DB administration - Mailing list pgsql-hackers
From | pgsql@mohawksoft.com |
---|---|
Subject | Re: tablespaces and DB administration |
Date | |
Msg-id | 16436.24.91.171.78.1085679292.squirrel@mail.mohawksoft.com Whole thread Raw |
In response to | Re: tablespaces and DB administration (Andreas Pflug <pgadmin@pse-consulting.de>) |
Responses |
Re: tablespaces and DB administration
|
List | pgsql-hackers |
> In the age of inexpensive RAID, tablespaces have more or less lost their > relevance regarding performance. pgsql's philosophy respects this by > leaving the storage work up to the OS and disk subsystem. Even having > the xlog on a different spindle won't help too much; you'll probably be > better off if you stuff all your spindles in one raid on most systems. > For worse, splitting two disks into separate storage areas to have xlog > separated would degrade safety for very little performance gain. So the > advise is: one disk, no alternative. 2 to 20 disks: use a single raid. > more disks: examine your access patterns carefully before you believe > you can do the job better than your raid controller. > > This leaves table spaces as a mere administrative feature, many (most) > installations will happily live without that. > > Regards, > Andreas Sorry, I just can't leave this one alone. Having multiple spindles i.e. separate data paths to separate disks and disk systems makes a big difference. Take this simple program: >>>>>> testio.c >>>>>>>>>>>>>>>>>>>>>>>>>> #include <unistd.h> #include <stdio.h> /* gcc testio.c -o testio */ int main(int argc, char **argv) { int i; int blocks; FILE *files[16]; int block[512]; int foo[512]; int nblocks = atoi(argv[1]); printf("Using %d 2K blocks, total file size %d\n", nblocks, sizeof(block)*nblocks); for(i=2; i <argc; i++) files[i]=fopen(argv[i], "w+b"); for(blocks=0; blocks < nblocks; blocks++) for(i=2;i < argc; i++) fwrite(block, 1, sizeof(block), files[i]); for(i=2; i < argc; i++) fseek(files[i], 0, SEEK_SET); for(blocks=0; blocks < nblocks; blocks++) for(i=2; i < argc;i++) fread(foo, 1, sizeof(foo), files[i]); } <<<<<<<<<<<<<<<<<<<< The purpose of the program is to write out files, and read them back in. It is crude, obviously, and not a good general test, but it does show the effect of which I am writing. On my test system, I have a bunch of disks, but I'll use "/home" and "/vol01" as examples: vol01 is ext2 and home is reiserfs, and both are IBM ultra SCSI 10K RPM disks. [root@node1 markw]# time ./testio 100000 /home/tmp/test.dat Using 100000 2K blocks, total file size 204800000 real 0m6.790s user 0m0.290s sys 0m4.120s [root@node1 markw]# time ./testio 100000 /vol01/tmp/test.dat Using 100000 2K blocks, total file size 204800000 real 0m7.274s user 0m0.210s sys 0m1.940s As you can see, they are fairly well matched +- filesystem issues. Now, lets run the program across two disks: [root@node1 markw]# time ./testio 100000 /vol01/tmp/test.dat /home/tmp/test.dat Using 100000 2K blocks, total file size 204800000 real 0m12.012s user 0m0.610s sys 0m6.820s As you can see, it looks like almost double the time, and you might be looking at this as proof that you are right. Actually, it is mostly an artifact of program. Now, look ate the results if I write two files to the same volume: [root@node1 markw]# time ./testio 100000 /vol01/tmp/test.dat /vol01/tmp/test.dat1 Using 100000 2K blocks, total file size 204800000 real 0m19.298s user 0m0.680s sys 0m3.990s As you can see, it takes a LOT longer. That's because the disk has to do a lot more work, the hardware data path has to carry twice as much data, and you have to manage multiple accesses to a single object, the disk. It doesn't take much imagination to see what this would mean for pg_xlog.
pgsql-hackers by date: