Thread: tablespaces and DB administration

tablespaces and DB administration

From

pgsql@mohawksoft.com

Date:

26 May 2004, 17:11:52

Now that it looks like tablespaces will become a practical reality, I want
to suggest some changes in documented procedure and preferred setup. The
reason why I suggest these changes is to enlighten new PostgreSQL users to
the new features and, perhaps, make a more enterprise-familiar environment
for experienced DBAs.

First, we keep the standard PostgreSQL directory the way it has always
been with template0, template1, pg_xlog, pg_clog, etc. in the same place.
We can refer to this as the "system" directory. This makes sense because
all the system level stuff is there. User databases should be discouraged
from the system, and users should be encouraged to create and use separate
tablespaces for their databases.

This is usefull for a couple of reasons: 1st, it makes monitoring database
size easier while also simplifying xlog monitoring, as well as makes a
logical separation from the PostgreSQL operational core and its external
data. 2nd, this is very similar to how other databases view data.

What do you all think?

Re: tablespaces and DB administration

From

Tom Lane

Date:

26 May 2004, 17:33:54

pgsql@mohawksoft.com writes:
> First, we keep the standard PostgreSQL directory the way it has always
> been with template0, template1, pg_xlog, pg_clog, etc. in the same place.
> We can refer to this as the "system" directory. This makes sense because
> all the system level stuff is there. User databases should be discouraged
> from the system, and users should be encouraged to create and use separate
> tablespaces for their databases.

Why?

This seems to me to be pushing complexity onto users whether they
want/need it or not.  I think that only a relatively small number of
installations will have any use for tablespaces, and we should not
try to coerce average users into worrying about them.
        regards, tom lane

Re: tablespaces and DB administration

From

pgsql@mohawksoft.com

Date:

26 May 2004, 20:00:42

> pgsql@mohawksoft.com writes:
>> First, we keep the standard PostgreSQL directory the way it has always
>> been with template0, template1, pg_xlog, pg_clog, etc. in the same
>> place.
>> We can refer to this as the "system" directory. This makes sense because
>> all the system level stuff is there. User databases should be
>> discouraged
>> from the system, and users should be encouraged to create and use
>> separate
>> tablespaces for their databases.
>
> Why?
>
> This seems to me to be pushing complexity onto users whether they
> want/need it or not.  I think that only a relatively small number of
> installations will have any use for tablespaces, and we should not
> try to coerce average users into worrying about them.

I forgot to specify that tablepaces should be on separate volumes. (sorry)
If all they have is one volume, no worries, but instructing the use of
alternate volumes for system and data will improve performance by
separating WAL and data operations.

Tablespaces are a familiar construct to experienced DBAs who may not be
familiar with PostgreSQL. PostgreSQL being similar to other databases will
have it better "make sense" to new users.

Users are primarily, if not stupid, ignorant. They will read the absolute
minimum needed to achieve a goal and little else. I say this with the
utmost respect, because I and probably everyone else on this group is
guilty of the same thing. So, the "preferred" installation procedure, i.e.
the one with the easy to follow directions, should showcase features the
user should know, and leave the user in a good place. IMHO, the user's
database on one volume and pg_xlog on another is a better starting place.

BTW: Is there a public spec on what will be tablespace compatible and how?
For instance: will is be possible to create a table on a separate
tablespace than the DB? Will it be possible to create an index on a
separate tablespace than the table?

Re: tablespaces and DB administration

From

Christopher Kings-Lynne

Date:

26 May 2004, 22:31:51

> BTW: Is there a public spec on what will be tablespace compatible and how?
> For instance: will is be possible to create a table on a separate
> tablespace than the DB? Will it be possible to create an index on a
> separate tablespace than the table?

(Since Gavin hasn't replied yet)

1. There are two default tablespaces, global and default.  Global is for 
shared catalogs, default is for all other catalog data.

2. By default databases are created in the default tablespace, you can 
override this.

3. By default schemas are created in the database's tablepsace, but this 
can be overridden.

5. Tables and sequences are created in the schema's tablespace by 
default, but this can be overridden

6. Indexes and SERIAL sequences are created in the table's tablespace by 
default, but this can be overridden.

When pg_dumping, care is taken so that the tablespace qualification is 
only dumped if the object is in its 'non-default' tablespace.

Chris

Re: tablespaces and DB administration

From

James Robinson

Date:

27 May 2004, 00:43:59

On May 26, 2004, at 7:14 PM, pgsql@mohawksoft.com wrote:

>> pgsql@mohawksoft.com writes:
>>> First, we keep the standard PostgreSQL directory the way it has 
>>> always
>>> been with template0, template1, pg_xlog, pg_clog, etc. in the same
>>> place.
>>> We can refer to this as the "system" directory. This makes sense 
>>> because
>>> all the system level stuff is there. User databases should be
>>> discouraged
>>> from the system, and users should be encouraged to create and use
>>> separate
>>> tablespaces for their databases.
>>
>> Why?
>>
>> This seems to me to be pushing complexity onto users whether they
>> want/need it or not.  I think that only a relatively small number of
>> installations will have any use for tablespaces, and we should not
>> try to coerce average users into worrying about them.
>
> I forgot to specify that tablepaces should be on separate volumes. 
> (sorry)
> If all they have is one volume, no worries, but instructing the use of
> alternate volumes for system and data will improve performance by
> separating WAL and data operations.
>
> Tablespaces are a familiar construct to experienced DBAs who may not be
> familiar with PostgreSQL. PostgreSQL being similar to other databases 
> will
> have it better "make sense" to new users.
>
> Users are primarily, if not stupid, ignorant. They will read the 
> absolute
> minimum needed to achieve a goal and little else. I say this with the
> utmost respect, because I and probably everyone else on this group is
> guilty of the same thing. So, the "preferred" installation procedure, 
> i.e.
> the one with the easy to follow directions, should showcase features 
> the
> user should know, and leave the user in a good place. IMHO, the user's
> database on one volume and pg_xlog on another is a better starting 
> place.

Yes, that is generally the case (prefer pg_xlog on separate spindle), 
but no
need to *forcibly* overcomplicate things if the box has only one 
spindle,
or if they have only one single RAID'd partition configured. We should
continue to err on the side of keeping the path to a functional system
nice and simple, yet still offering superb functionality. Oracle gets 
this
wrong. pg_autovacuum is another good step in this direction.

----
James Robinson
Socialserve.com

Re: tablespaces and DB administration

From

Andreas Pflug

Date:

27 May 2004, 06:30:28

James Robinson wrote:

>>
>>
>> Users are primarily, if not stupid, ignorant. They will read the 
>> absolute
>> minimum needed to achieve a goal and little else. I say this with the
>> utmost respect, because I and probably everyone else on this group is
>> guilty of the same thing. So, the "preferred" installation procedure, 
>> i.e.
>> the one with the easy to follow directions, should showcase features the
>> user should know, and leave the user in a good place. IMHO, the user's
>> database on one volume and pg_xlog on another is a better starting 
>> place.
>
>
> Yes, that is generally the case (prefer pg_xlog on separate spindle), 
> but no
> need to *forcibly* overcomplicate things if the box has only one spindle,
> or if they have only one single RAID'd partition configured. We should
> continue to err on the side of keeping the path to a functional system
> nice and simple, yet still offering superb functionality. Oracle gets 
> this
> wrong. pg_autovacuum is another good step in this direction.

In the age of inexpensive RAID, tablespaces have more or less lost their 
relevance regarding performance. pgsql's philosophy respects this by 
leaving the storage work up to the OS and disk subsystem. Even having 
the xlog on a different spindle won't help too much; you'll probably be 
better off if you stuff all your spindles in one raid on most systems. 
For worse, splitting two disks into separate storage areas to have xlog 
separated would degrade safety for very little performance gain. So the 
advise is: one disk, no alternative. 2 to 20 disks: use a single raid. 
more disks: examine your access patterns carefully before you believe 
you can do the job better than your raid controller.

This leaves table spaces as a mere administrative feature, many (most) 
installations will happily live without that.

Regards,
Andreas

Re: tablespaces and DB administration

From

pgsql@mohawksoft.com

Date:

27 May 2004, 09:28:19

> Yes, that is generally the case (prefer pg_xlog on separate spindle),
> but no
> need to *forcibly* overcomplicate things if the box has only one
> spindle,
> or if they have only one single RAID'd partition configured. We should
> continue to err on the side of keeping the path to a functional system
> nice and simple, yet still offering superb functionality. Oracle gets
> this
> wrong. pg_autovacuum is another good step in this direction.
>
> ----
> James Robinson
> Socialserve.com
>
I wouldn't say "forcibly."

While I think everyone would agree, Oracle is the most arcane and
difficult system ever devised by man, one can look at it and bring away
the far too few good things it does do. Similarly, MSSQL likes to user
data and system data separated. Setting user expectation is important,
setting baseline usage by recommendation is important.

One of the places PostgreSQL errs is its default configuration. It is far
to conservitive. The default setup is typically how most users operate
their software. Oracle, while being the worst, has one benefit, it has no
recognizable default, every DBA must slog through endless options to set
it up. In the process they are made aware of the various tuning options
and the flexability of the system.

PostgreSQL, on the other hand, installs easily, and can be used almost
immediately. This is not a problem, per se', but the user is not exposed
to how to make the system perform well unless they read the manual beyond
the "installation." (which we know they won't)  Compounding the problem,
various parameters in the default configuration file are too small for
serious servers.

I would love to re-write the installation chapter and perhaps add a
section on generic database tuning. Specifically, like my "PostgreSQL for
Windows" CD a couple years ago, I like to see three sections: "Personal,"
"Workgroup," and "server" configuration tips. I think we've all seen bad
reviews because people used PostgreSQL's default configuration.

Re: tablespaces and DB administration

From

pgsql@mohawksoft.com

Date:

27 May 2004, 09:41:29

> James Robinson wrote:
>
>>>
>>>
>>> Users are primarily, if not stupid, ignorant. They will read the
>>> absolute
>>> minimum needed to achieve a goal and little else. I say this with the
>>> utmost respect, because I and probably everyone else on this group is
>>> guilty of the same thing. So, the "preferred" installation procedure,
>>> i.e.
>>> the one with the easy to follow directions, should showcase features
>>> the
>>> user should know, and leave the user in a good place. IMHO, the user's
>>> database on one volume and pg_xlog on another is a better starting
>>> place.
>>
>>
>> Yes, that is generally the case (prefer pg_xlog on separate spindle),
>> but no
>> need to *forcibly* overcomplicate things if the box has only one
>> spindle,
>> or if they have only one single RAID'd partition configured. We should
>> continue to err on the side of keeping the path to a functional system
>> nice and simple, yet still offering superb functionality. Oracle gets
>> this
>> wrong. pg_autovacuum is another good step in this direction.
>
>
> In the age of inexpensive RAID, tablespaces have more or less lost their
> relevance regarding performance. pgsql's philosophy respects this by
> leaving the storage work up to the OS and disk subsystem. Even having
> the xlog on a different spindle won't help too much; you'll probably be
> better off if you stuff all your spindles in one raid on most systems.
> For worse, splitting two disks into separate storage areas to have xlog
> separated would degrade safety for very little performance gain. So the
> advise is: one disk, no alternative. 2 to 20 disks: use a single raid.
> more disks: examine your access patterns carefully before you believe
> you can do the job better than your raid controller.
>
> This leaves table spaces as a mere administrative feature, many (most)
> installations will happily live without that.
>
> Regards,
> Andreas

I would say that this is almost completely misinformed. Depending on the
OS and the hardware, of course, a write on one spindle may not affect the
performance of another.

There are so many great things that happen when you have separate
spindles. The OS manages the file systems separately, the device drivers
may be separate, and if the low-level I/O device driver is even different,
then you get your own bus mastering I/O buffers. All around good things
happen when you have separate spindles.

A single postgresql process may not see much benefit, because it does not
do background I/O, but multiple postgresql processes will perform better
because multiple I/O requests can be issued and processed simultaneously.
If you got good SMP in your kernel, even better.

Re: tablespaces and DB administration

From

Josh Berkus

Date:

27 May 2004, 14:03:57

"Mohawksoft":

> I forgot to specify that tablepaces should be on separate volumes. (sorry)
> If all they have is one volume, no worries, but instructing the use of
> alternate volumes for system and data will improve performance by
> separating WAL and data operations.

... and the place for this is the documentation, maybe with a nice script to 
help automate it.   Twisting users' arms will just get us a lot of angry 
e-mail.  Plus force separation of tablespaces is not appropriate for many 
kinds of installations:
-- the 1MB 2-table database of someone's DVD collection;
-- the 700GB database running off a $75,000 NAS (which appears to the OS as a 
single volume)

Also, you're getting confused here ... Tablespaces has nothing to do with the 
location of pg_xlog.

> Tablespaces are a familiar construct to experienced DBAs who may not be
> familiar with PostgreSQL. PostgreSQL being similar to other databases will
> have it better "make sense" to new users.

I'll have to disagree with you there.  I used to be a certified MSSQL admin, 
and I can tell you that not one in 25 members of MSDN Database forum had any 
idea how to use the analogous feature on MSSQL -- despite it being operative 
since 1998.   So forcing new users to deal with tablespaces, even if they 
don't need them, is a great way to get new users to adopt MySQL.

> So, the "preferred" installation procedure, i.e.
> the one with the easy to follow directions, should showcase features the
> user should know, and leave the user in a good place. 

No, the "preferred" newbie installation is the one that gets them up and 
running and playing with PostgreSQL in the minimum amount of time.  Setup is 
boring, confusing, and often frustrating, and each step we add to the 
required minimum setup loses us another dozen newbies who weren't sure if 
they are ready to upgrade from MS Access or MySQL.  Heck, for the CDs we're 
making to hand out at conventions, we're running PostgreSQL on Knoppix so 
that users don't have to install *anything*.

Now, if you want to add a "power user setup" to the Tutorial in our official 
docs, please do!   We could use more guides.   But don't force the guy with 
the personal DVD database to set things up like he's running Ameritrade.

Also, consider that about half our users install from packages: RPMs and Deb 
packages (and soon MSI as well).   Those users aren't going to be going 
through a manual installation procedure at all, so your efforts to "educate" 
them through proper database setup won't get very far.

> IMHO, the user's
> database on one volume and pg_xlog on another is a better starting place.

For some setups, yes.   For others, no.  It depends on your hardware and 
application.   And, as I said above, Tablespaces will not determine the 
location of pg_xlog AFAIK.

> BTW: Is there a public spec on what will be tablespace compatible and how?
> For instance: will is be possible to create a table on a separate
> tablespace than the DB? Will it be possible to create an index on a
> separate tablespace than the table?

This is in the archives of this list.     The whole point of tablespaces is to 
allow placing individual tables and indexes on seperate volumes.   AFAIR, 
we're not that concerned about whole databases, which have always been easily 
re-locatable via symlinks and/or mounts.

P.S. if you signed your e-mails, I'd stop calling you "mohawksoft".

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

Re: tablespaces and DB administration

From

Andreas Pflug

Date:

27 May 2004, 14:10:50

pgsql@mohawksoft.com wrote:

>>James Robinson wrote:
>>
>>    
>>
>>>>Users are primarily, if not stupid, ignorant. They will read the
>>>>absolute
>>>>minimum needed to achieve a goal and little else. I say this with the
>>>>utmost respect, because I and probably everyone else on this group is
>>>>guilty of the same thing. So, the "preferred" installation procedure,
>>>>i.e.
>>>>the one with the easy to follow directions, should showcase features
>>>>the
>>>>user should know, and leave the user in a good place. IMHO, the user's
>>>>database on one volume and pg_xlog on another is a better starting
>>>>place.
>>>>        
>>>>
>>>Yes, that is generally the case (prefer pg_xlog on separate spindle),
>>>but no
>>>need to *forcibly* overcomplicate things if the box has only one
>>>spindle,
>>>or if they have only one single RAID'd partition configured. We should
>>>continue to err on the side of keeping the path to a functional system
>>>nice and simple, yet still offering superb functionality. Oracle gets
>>>this
>>>wrong. pg_autovacuum is another good step in this direction.
>>>      
>>>
>>In the age of inexpensive RAID, tablespaces have more or less lost their
>>relevance regarding performance. pgsql's philosophy respects this by
>>leaving the storage work up to the OS and disk subsystem. Even having
>>the xlog on a different spindle won't help too much; you'll probably be
>>better off if you stuff all your spindles in one raid on most systems.
>>For worse, splitting two disks into separate storage areas to have xlog
>>separated would degrade safety for very little performance gain. So the
>>advise is: one disk, no alternative. 2 to 20 disks: use a single raid.
>>more disks: examine your access patterns carefully before you believe
>>you can do the job better than your raid controller.
>>
>>This leaves table spaces as a mere administrative feature, many (most)
>>installations will happily live without that.
>>
>>Regards,
>>Andreas
>>    
>>
>
>I would say that this is almost completely misinformed. Depending on the
>OS and the hardware, of course, a write on one spindle may not affect the
>performance of another.
>
>There are so many great things that happen when you have separate
>spindles. The OS manages the file systems separately, the device drivers
>may be separate, and if the low-level I/O device driver is even different,
>then you get your own bus mastering I/O buffers. All around good things
>happen when you have separate spindles.
>
>A single postgresql process may not see much benefit, because it does not
>do background I/O, but multiple postgresql processes will perform better
>because multiple I/O requests can be issued and processed simultaneously.
>If you got good SMP in your kernel, even better.
>
>  
>
There are good white papers about DB IO performance, e.g from Microsoft. 
They are not read very often...If you dedicate drives to services, it's your responsibility to size 
everything to have a balanced load. You'll probably end with some drives 
being the bottleneck, while others are still almost idle. That's why 
RAID shoud be used in the first and second place, it will distribute the 
workload on all spindles equally until saturated. The recommendation to 
use separate disks for this and that originates from ancient days where 
performance had to be achieved by application level programming and 
configuration, implementing own file systems on raw devices. pgsql 
deliberately doesn't work like this.

If you may use lets say 10 disks, you'd probably something similar like
2x RAID1 for /
2x RAID1 for /tmp + swap
2x RAID1 for xlog
4x RAID5 for data

I bet you get better performance with all disks in one RAID5, because 
now the system disks not only have no negative impact on DB transfer 
performance, but add additional seek bandwidth to DB traffic.

Regards,
Andreas

Re: tablespaces and DB administration

From

pgsql@mohawksoft.com

Date:

27 May 2004, 14:21:13

> In the age of inexpensive RAID, tablespaces have more or less lost their
> relevance regarding performance. pgsql's philosophy respects this by
> leaving the storage work up to the OS and disk subsystem. Even having
> the xlog on a different spindle won't help too much; you'll probably be
> better off if you stuff all your spindles in one raid on most systems.
> For worse, splitting two disks into separate storage areas to have xlog
> separated would degrade safety for very little performance gain. So the
> advise is: one disk, no alternative. 2 to 20 disks: use a single raid.
> more disks: examine your access patterns carefully before you believe
> you can do the job better than your raid controller.
>
> This leaves table spaces as a mere administrative feature, many (most)
> installations will happily live without that.
>
> Regards,
> Andreas

Sorry, I just can't leave this one alone. Having multiple spindles i.e.
separate data paths to separate disks and disk systems makes a big
difference. Take this simple program:

>>>>>> testio.c >>>>>>>>>>>>>>>>>>>>>>>>>>
#include <unistd.h>
#include <stdio.h>
/* gcc testio.c -o testio */
int main(int argc, char **argv)
{       int i;       int blocks;       FILE *files[16];       int block[512];       int foo[512];       int nblocks =
atoi(argv[1]);
       printf("Using %d 2K blocks, total file size %d\n",             nblocks, sizeof(block)*nblocks);       for(i=2; i
<argc; i++)               files[i]=fopen(argv[i], "w+b");       for(blocks=0; blocks < nblocks; blocks++)
for(i=2;i < argc; i++)                       fwrite(block, 1, sizeof(block), files[i]);       for(i=2; i < argc; i++)
           fseek(files[i], 0, SEEK_SET);       for(blocks=0; blocks < nblocks; blocks++)               for(i=2; i <
argc;i++)                       fread(foo, 1, sizeof(foo), files[i]);
 
}
<<<<<<<<<<<<<<<<<<<<

The purpose of the program is to write out files, and read them back in.
It is crude, obviously, and not a good general test, but it does show the
effect of which I am writing.

On my test system, I have a bunch of disks, but I'll use "/home" and
"/vol01" as examples: vol01 is ext2 and home is reiserfs, and both are IBM
ultra SCSI 10K RPM disks.

[root@node1 markw]# time ./testio 100000 /home/tmp/test.dat
Using 100000 2K blocks, total file size 204800000

real    0m6.790s
user    0m0.290s
sys     0m4.120s
[root@node1 markw]# time ./testio 100000 /vol01/tmp/test.dat
Using 100000 2K blocks, total file size 204800000

real    0m7.274s
user    0m0.210s
sys     0m1.940s

As you can see, they are fairly well matched +- filesystem issues. Now,
lets run the program across two disks:

[root@node1 markw]# time ./testio 100000 /vol01/tmp/test.dat
/home/tmp/test.dat
Using 100000 2K blocks, total file size 204800000

real    0m12.012s
user    0m0.610s
sys     0m6.820s

As you can see, it looks like almost double the time, and you might be
looking at this as proof that you are right. Actually, it is mostly an
artifact of program.

Now, look ate the results if I write two files to the same volume:

[root@node1 markw]# time ./testio 100000 /vol01/tmp/test.dat
/vol01/tmp/test.dat1
Using 100000 2K blocks, total file size 204800000

real    0m19.298s
user    0m0.680s
sys     0m3.990s

As you can see, it takes a LOT longer. That's because the disk has to do a
lot more work, the hardware data path has to carry twice as much data, and
you have to manage multiple accesses to a single object, the disk.

It doesn't take much imagination to see what this would mean for pg_xlog.

Re: tablespaces and DB administration

From

pgsql@mohawksoft.com

Date:

27 May 2004, 14:34:38

> pgsql@mohawksoft.com wrote:
>
>>>James Robinson wrote:
>>>
>>>
>>>
>>>>>Users are primarily, if not stupid, ignorant. They will read the
>>>>>absolute
>>>>>minimum needed to achieve a goal and little else. I say this with the
>>>>>utmost respect, because I and probably everyone else on this group is
>>>>>guilty of the same thing. So, the "preferred" installation procedure,
>>>>>i.e.
>>>>>the one with the easy to follow directions, should showcase features
>>>>>the
>>>>>user should know, and leave the user in a good place. IMHO, the user's
>>>>>database on one volume and pg_xlog on another is a better starting
>>>>>place.
>>>>>
>>>>>
>>>>Yes, that is generally the case (prefer pg_xlog on separate spindle),
>>>>but no
>>>>need to *forcibly* overcomplicate things if the box has only one
>>>>spindle,
>>>>or if they have only one single RAID'd partition configured. We should
>>>>continue to err on the side of keeping the path to a functional system
>>>>nice and simple, yet still offering superb functionality. Oracle gets
>>>>this
>>>>wrong. pg_autovacuum is another good step in this direction.
>>>>
>>>>
>>>In the age of inexpensive RAID, tablespaces have more or less lost their
>>>relevance regarding performance. pgsql's philosophy respects this by
>>>leaving the storage work up to the OS and disk subsystem. Even having
>>>the xlog on a different spindle won't help too much; you'll probably be
>>>better off if you stuff all your spindles in one raid on most systems.
>>>For worse, splitting two disks into separate storage areas to have xlog
>>>separated would degrade safety for very little performance gain. So the
>>>advise is: one disk, no alternative. 2 to 20 disks: use a single raid.
>>>more disks: examine your access patterns carefully before you believe
>>>you can do the job better than your raid controller.
>>>
>>>This leaves table spaces as a mere administrative feature, many (most)
>>>installations will happily live without that.
>>>
>>>Regards,
>>>Andreas
>>>
>>>
>>
>>I would say that this is almost completely misinformed. Depending on the
>>OS and the hardware, of course, a write on one spindle may not affect the
>>performance of another.
>>
>>There are so many great things that happen when you have separate
>>spindles. The OS manages the file systems separately, the device drivers
>>may be separate, and if the low-level I/O device driver is even
>> different,
>>then you get your own bus mastering I/O buffers. All around good things
>>happen when you have separate spindles.
>>
>>A single postgresql process may not see much benefit, because it does not
>>do background I/O, but multiple postgresql processes will perform better
>>because multiple I/O requests can be issued and processed simultaneously.
>>If you got good SMP in your kernel, even better.
>>
>>
>>
> There are good white papers about DB IO performance, e.g from Microsoft.

Do not trust *anything* from Microsoft, they have very dubious motives and
research practices. They are compulsive liars and everything published by
them is specifically edited to present their agenda.


> They are not read very often...
>  If you dedicate drives to services, it's your responsibility to size
> everything to have a balanced load. You'll probably end with some drives
> being the bottleneck, while others are still almost idle. That's why
> RAID shoud be used in the first and second place, it will distribute the
> workload on all spindles equally until saturated. The recommendation to
> use separate disks for this and that originates from ancient days where
> performance had to be achieved by application level programming and
> configuration, implementing own file systems on raw devices. pgsql
> deliberately doesn't work like this.
>
> If you may use lets say 10 disks, you'd probably something similar like
> 2x RAID1 for /
> 2x RAID1 for /tmp + swap
> 2x RAID1 for xlog
> 4x RAID5 for data
>
> I bet you get better performance with all disks in one RAID5, because
> now the system disks not only have no negative impact on DB transfer
> performance, but add additional seek bandwidth to DB traffic.
>
> Regards,
> Andreas

You are absolutely wrong on all accounts here. A RAID5 system is slower
than a single spindle as it is only as fast as the slowest disk in the
stripe and the overhead of the RAID. RAID[3,5] are slower on writes
because they have to calculate the parity and write it to the parity disk.

One of the things you are failing to note is that different disks can
operate in parallel on an SMP box with good CPU interupt management. Two
writes to two different disks can take place at the same time. Two writes
to a single disk (or disk system) must happen serially.

Re: tablespaces and DB administration

From

pgsql@mohawksoft.com

Date:

27 May 2004, 14:47:20

> "Mohawksoft":
>
>> I forgot to specify that tablepaces should be on separate volumes.
>> (sorry)
>> If all they have is one volume, no worries, but instructing the use of
>> alternate volumes for system and data will improve performance by
>> separating WAL and data operations.
>
> ... and the place for this is the documentation, maybe with a nice script
> to
> help automate it.   Twisting users' arms will just get us a lot of angry
> e-mail.  Plus force separation of tablespaces is not appropriate for many
> kinds of installations:
> -- the 1MB 2-table database of someone's DVD collection;
> -- the 700GB database running off a $75,000 NAS (which appears to the OS
> as a
> single volume)
>
> Also, you're getting confused here ... Tablespaces has nothing to do with
> the
> location of pg_xlog.

I'm not "confused" but, it is an inverse logic thing. By persuading users
to create databases on a separate tablespace than the system (which
contains pg_xlog), you are by definition separating database and pg_xlog.


>
>> Tablespaces are a familiar construct to experienced DBAs who may not be
>> familiar with PostgreSQL. PostgreSQL being similar to other databases
>> will
>> have it better "make sense" to new users.
>
> I'll have to disagree with you there.  I used to be a certified MSSQL
> admin,
> and I can tell you that not one in 25 members of MSDN Database forum had
> any
> idea how to use the analogous feature on MSSQL -- despite it being
> operative
> since 1998.   So forcing new users to deal with tablespaces, even if they
> don't need them, is a great way to get new users to adopt MySQL.

That's food for thought. That's different than my experience. I've set up
a few MSSQL systems and recall it saying that you should create a
different database file from the system, but if you say that it will
confuse new users, that is something that should be considered.


>
>> So, the "preferred" installation procedure, i.e.
>> the one with the easy to follow directions, should showcase features the
>> user should know, and leave the user in a good place.
>
> No, the "preferred" newbie installation is the one that gets them up and
> running and playing with PostgreSQL in the minimum amount of time.  Setup
> is
> boring, confusing, and often frustrating, and each step we add to the
> required minimum setup loses us another dozen newbies who weren't sure if
> they are ready to upgrade from MS Access or MySQL.  Heck, for the CDs
> we're
> making to hand out at conventions, we're running PostgreSQL on Knoppix so
> that users don't have to install *anything*.

That's cool, PostgreSQL on knoppix? How do you do that? RAM disk?

>
> Now, if you want to add a "power user setup" to the Tutorial in our
> official
> docs, please do!   We could use more guides.   But don't force the guy
> with
> the personal DVD database to set things up like he's running Ameritrade.
>
> Also, consider that about half our users install from packages: RPMs and
> Deb
> packages (and soon MSI as well).   Those users aren't going to be going
> through a manual installation procedure at all, so your efforts to
> "educate"
> them through proper database setup won't get very far.

I agree that there is a section missing in the manual for this sort of
information.

>
>> IMHO, the user's
>> database on one volume and pg_xlog on another is a better starting
>> place.
>
> For some setups, yes.   For others, no.  It depends on your hardware and
> application.   And, as I said above, Tablespaces will not determine the
> location of pg_xlog AFAIK.

See above/.

>
>> BTW: Is there a public spec on what will be tablespace compatible and
>> how?
>> For instance: will is be possible to create a table on a separate
>> tablespace than the DB? Will it be possible to create an index on a
>> separate tablespace than the table?
>
> This is in the archives of this list.     The whole point of tablespaces
> is to
> allow placing individual tables and indexes on seperate volumes.   AFAIR,
> we're not that concerned about whole databases, which have always been
> easily
> re-locatable via symlinks and/or mounts.
>
> P.S. if you signed your e-mails, I'd stop calling you "mohawksoft".
>
> --
> Josh Berkus
> Aglio Database Solutions
> San Francisco
>

Re: tablespaces and DB administration

From

Andreas Pflug

Date:

27 May 2004, 15:03:11

pgsql@mohawksoft.com wrote:

>
>As you can see, it takes a LOT longer. That's because the disk has to do a
>lot more work, the hardware data path has to carry twice as much data, and
>you have to manage multiple accesses to a single object, the disk.
>
>It doesn't take much imagination to see what this would mean for pg_xlog.
>  
>

Well, writing off-reality benchmarks on hardware setups I'd never 
suggest for db server usage proves... what?
Additionally, do you care about safety?

Regards,
Andreas

Re: tablespaces and DB administration

From

pgsql@mohawksoft.com

Date:

27 May 2004, 17:24:31

> pgsql@mohawksoft.com wrote:
>
>>
>>As you can see, it takes a LOT longer. That's because the disk has to do
>> a
>>lot more work, the hardware data path has to carry twice as much data,
>> and
>>you have to manage multiple accesses to a single object, the disk.
>>
>>It doesn't take much imagination to see what this would mean for pg_xlog.
>>
>>
>
> Well, writing off-reality benchmarks on hardware setups I'd never
> suggest for db server usage proves... what?
> Additionally, do you care about safety?
>
> Regards,
> Andreas
>

It was a demostration of the effect that multiple drives has. There is a
very real and measurable I/O bandwidth advantage to be gained by putting
concurrently accessed data on separate data channels. Any test that is
capable fo utilizing multiple I/O channels will show it.

This is not the place to really discuss this, and if you want to persue
this discussion, lets take it off line.

As for RAID5, it has its advantages, as does RAID0 and RAID3, and yes,
RAID1 and higher are pretty safe. A *good* RAID system goes for thousands
or tens of thousands of dollars. A 24x7 redundant storage system may not
be required by everyone. With new IDE with DMA interface cards, it should
be possible to create a very high performance system. A system which will
perform better if the pg_xlog is on a different disk than the data.

As for "do I care about safety?" Yes, yes I do, but "safe" and "paranoid"
are two different things. There is always a cost/benefit analysis for any
system. RAID5 and RAID3 are calculated risks. The bet is that no more than
one drive will fail at any one time. It is a good bet, but not 100%. I've
seen RAID systems have two drive failures at the same time, during the
auto-rebuild to the spare, a second drive dies. Game over.

Since RAID5 is a risk, maybe we should mirror, RAID0+1. That's a risk too,
what if the two mirrored drives die? Yikes, then you're screwed again.

If you want to be safe, you may want RAID5+1, where you mirror two RAID5
systems. That's really safe. You should have each RAID5 system on its own
controller with its own independent battery backed up redundant power
supply. That will be safe. Wait.. What about earth quakes? Shock mounted
racks. Lightning? Surge suppressors on the SCSI cables. Flood? Wrap it in
a zip-lock bag. Al Qaeda? Wrap it in a kevlar!!

There is a cost/benefit analysis for everything. RAIDs are very good and
reasonably safe devices, but the argument that the performance will be the
same as multiple individual disks (which are equivilent to the ones in the
RAID) preferably on different I/O channels is proveable nonsense and you
should know that.

Re: tablespaces and DB administration

From

pgsql@mohawksoft.com

Date:

27 May 2004, 19:01:34

> pgsql@mohawksoft.com wrote:
>
>>You are absolutely wrong on all accounts here. A RAID5 system is slower
>>than a single spindle as it is only as fast as the slowest disk in the
>>stripe and the overhead of the RAID.
>>
> Huh, what kind of controller do you use... Sounds like some "value" IDE
> one. I'd never suggest IDE raid5 for DBMS purposes anyway.

Actually, my RAID system, currently on my test system, is fully UWLVD SCSI
with fast spindles.

Here is a logical factual question for you to answer: how can a set of
disks, lets say 7, 6 data drives with one parity, deliver results faster
than the slowest drive in the stripe?

If you say predictive and intelligent caching, yea, maybe, but *all* disks
today have caching, but the initial request still has to wait for the
longest seek time across all spindles and the slowest spindle position.
I've been dealing with RAID systems for almost a decade now, and they are
not a magic bullet.

RAID systems are always slower than their compnent disks. This is the
drawback to using them and a fundimental limitation. A single disk will
average 1/2 spindle seek, assuming its initial head placement is random,
and average 1/2 spindle revolution to track, assuming no out of order
sector access. A RAID system has to wait for the slowest disk, thus while
a single disk can average 1/2 seek and rotation, two disks will not. So,
your raid disk access will ALWAYS be slower or as slow as a single disk
access not including the additional RAID processing.

The advantage to a RAID is that a number of smaller disks can look like a
big disk with some redundency. The advantage to a RAID controller is that
the RAID processing and parity generation overhead is done on an external
device. Using a RAID controller that presents a SCSI LUN is great because
you don't need to trust third party drivers. All in all, RAID is a good
idea, but it isn't faster.

As for IDE RAID, IDE RAID is an awesome idea. SCSI disks are just too
expensive. Infortrend has a cool IDE to SCSI or Fibre RAID system that
rocks.

Re: tablespaces and DB administration

From

Darcy Buskermolen

Date:

27 May 2004, 20:50:52

On May 27, 2004 01:38 pm, pgsql@mohawksoft.com wrote:
> > pgsql@mohawksoft.com wrote:
> >>As you can see, it takes a LOT longer. That's because the disk has to do
> >> a
> >>lot more work, the hardware data path has to carry twice as much data,
> >> and
> >>you have to manage multiple accesses to a single object, the disk.
> >>
> >>It doesn't take much imagination to see what this would mean for pg_xlog.
> >
> > Well, writing off-reality benchmarks on hardware setups I'd never
> > suggest for db server usage proves... what?
> > Additionally, do you care about safety?
> >
> > Regards,
> > Andreas
>
> It was a demostration of the effect that multiple drives has. There is a
> very real and measurable I/O bandwidth advantage to be gained by putting
> concurrently accessed data on separate data channels. Any test that is
> capable fo utilizing multiple I/O channels will show it.
>
> This is not the place to really discuss this, and if you want to persue
> this discussion, lets take it off line.
>
> As for RAID5, it has its advantages, as does RAID0 and RAID3, and yes,
> RAID1 and higher are pretty safe. A *good* RAID system goes for thousands
> or tens of thousands of dollars. A 24x7 redundant storage system may not
> be required by everyone. With new IDE with DMA interface cards, it should
> be possible to create a very high performance system. A system which will
> perform better if the pg_xlog is on a different disk than the data.
>
> As for "do I care about safety?" Yes, yes I do, but "safe" and "paranoid"
> are two different things. There is always a cost/benefit analysis for any
> system. RAID5 and RAID3 are calculated risks. The bet is that no more than
> one drive will fail at any one time. It is a good bet, but not 100%. I've
> seen RAID systems have two drive failures at the same time, during the
> auto-rebuild to the spare, a second drive dies. Game over.
>
> Since RAID5 is a risk, maybe we should mirror, RAID0+1. That's a risk too,
> what if the two mirrored drives die? Yikes, then you're screwed again.
>
> If you want to be safe, you may want RAID5+1, where you mirror two RAID5
> systems. That's really safe. You should have each RAID5 system on its own
> controller with its own independent battery backed up redundant power
> supply. That will be safe. Wait.. What about earth quakes? Shock mounted
> racks. Lightning? Surge suppressors on the SCSI cables. Flood? Wrap it in
> a zip-lock bag. Al Qaeda? Wrap it in a kevlar!!
>
> There is a cost/benefit analysis for everything. RAIDs are very good and
> reasonably safe devices, but the argument that the performance will be the
> same as multiple individual disks (which are equivilent to the ones in the
> RAID) preferably on different I/O channels is proveable nonsense and you
> should know that.

One other huge advantage that tablespaces will bring, it the ability to place 
data based on "cost" ie, you can put your 10 most used tables on fast disk 
(or perhaps solid state devices), and move the seldom used data off onto the 
slower (lower cost) disks/storage array. 



-- 
Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx:  250.763.1759
http://www.wavefire.com

Re: tablespaces and DB administration

From

pgsql@mohawksoft.com

Date:

27 May 2004, 22:02:34

> One other huge advantage that tablespaces will bring, it the ability to
> place
> data based on "cost" ie, you can put your 10 most used tables on fast disk
> (or perhaps solid state devices), and move the seldom used data off onto
> the
> slower (lower cost) disks/storage array.

Great idea.

Re: tablespaces and DB administration

From

Gaetano Mendola

Date:

28 May 2004, 06:51:39

pgsql@mohawksoft.com wrote:
> A single postgresql process may not see much benefit, because it does not
> do background I/O, 

Not yet, I believe that in 7.5 there is a component that do it.
Am I wrong ?

Regards
Gaetano Mendola

Re: tablespaces and DB administration

From

Marty Scholes

Date:

28 May 2004, 12:52:41

Josh Berkus wrote:
> The whole point of tablespaces is to> allow placing individual tables and indexes> on seperate volumes.

That was one reason.  I seem to recall several more:

* Putting data on cost appropriate media
Mentioned previously in this thread

* Balancing I/O across spindles
Also mentioned previously, many times

* Inode stability
The free space bitmap never changes during production.  The inode table 
never changes during production (save for last modified and last 
accessed timestamps).  This makes the filesystem VERY likely to be 
functional in case of a crash.  Jounaled file systems have reduced the 
need for this.

* Device independence
Since a tablespace needs only some "file names" and the ability to 
lseek() to any point in a file, a tablespace file can be a file on the 
UFS (or its variantes), a disk partition, or even a tape drive device.

Re: tablespaces and DB administration

From

John Hansen

Date:

31 May 2004, 17:24:31

On Fri, 2004-05-28 at 08:15, pgsql@mohawksoft.com wrote:
> > pgsql@mohawksoft.com wrote:
> >
> >>You are absolutely wrong on all accounts here. A RAID5 system is slower
> >>than a single spindle as it is only as fast as the slowest disk in the
> >>stripe and the overhead of the RAID.
> >>
> > Huh, what kind of controller do you use... Sounds like some "value" IDE
> > one. I'd never suggest IDE raid5 for DBMS purposes anyway.
> 
> Actually, my RAID system, currently on my test system, is fully UWLVD SCSI
> with fast spindles.
> 
> Here is a logical factual question for you to answer: how can a set of
> disks, lets say 7, 6 data drives with one parity, deliver results faster
> than the slowest drive in the stripe?
> 
> If you say predictive and intelligent caching, yea, maybe, but *all* disks
> today have caching, but the initial request still has to wait for the
> longest seek time across all spindles and the slowest spindle position.
> I've been dealing with RAID systems for almost a decade now, and they are
> not a magic bullet.
> 
> RAID systems are always slower than their compnent disks. This is the
> drawback to using them and a fundimental limitation. A single disk will
> average 1/2 spindle seek, assuming its initial head placement is random,
> and average 1/2 spindle revolution to track, assuming no out of order
> sector access. A RAID system has to wait for the slowest disk, thus while
> a single disk can average 1/2 seek and rotation, two disks will not. So,
> your raid disk access will ALWAYS be slower or as slow as a single disk
> access not including the additional RAID processing.
> 

Some high end SCSI drives comes with an option for using an external
source for spindle syncronization. These drives will thus not have to
wait for rotation, as head positions are aligned.

> The advantage to a RAID is that a number of smaller disks can look like a
> big disk with some redundency. The advantage to a RAID controller is that
> the RAID processing and parity generation overhead is done on an external
> device. Using a RAID controller that presents a SCSI LUN is great because
> you don't need to trust third party drivers. All in all, RAID is a good
> idea, but it isn't faster.
> 
> As for IDE RAID, IDE RAID is an awesome idea. SCSI disks are just too
> expensive. Infortrend has a cool IDE to SCSI or Fibre RAID system that
> rocks.
> 
Addonics has these too, I've been using them with great results.
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

Re: tablespaces and DB administration

From

Andreas Pflug

Date:

31 May 2004, 17:28:50

pgsql@mohawksoft.com wrote:

>
>As for IDE RAID, IDE RAID is an awesome idea. SCSI disks are just too
>expensive. Infortrend has a cool IDE to SCSI or Fibre RAID system that
>rocks.
>  
>

Obviously, you're caught by those marketing geeks. You're taking 
bandwidth (MB/s)as performance index, which is irrelevant for database 
access. Limiting factor is average access time, and there's still no 3ms 
seek time ide disk. This is not a problem of the interface, it's just a 
fact that (for marketing reasons?) all server grade disks are not 
equipped with ide.
A good raid system will be able to have independend seeks issued on all 
disks in parallel, thus scaling by spindle number (only for parallel 
accessing processes of course, not for serialized access). What you're 
proposing is that the app should parallelize it, instead of leaving this 
to the instance that can (should) do this better.

Regards,
Andreas

Re: tablespaces and DB administration

From

pgsql@mohawksoft.com

Date:

31 May 2004, 17:37:49

> pgsql@mohawksoft.com wrote:
>
>>
>>As for IDE RAID, IDE RAID is an awesome idea. SCSI disks are just too
>>expensive. Infortrend has a cool IDE to SCSI or Fibre RAID system that
>>rocks.
>>
>>
>
> Obviously, you're caught by those marketing geeks. You're taking
> bandwidth (MB/s)as performance index, which is irrelevant for database
> access. Limiting factor is average access time, and there's still no 3ms
> seek time ide disk. This is not a problem of the interface, it's just a
> fact that (for marketing reasons?) all server grade disks are not
> equipped with ide.

Depending on your application,  IDE RAID is a very cost effective system.
Sometimes speed is not important.

> A good raid system will be able to have independend seeks issued on all
> disks in parallel, thus scaling by spindle number (only for parallel
> accessing processes of course, not for serialized access). What you're
> proposing is that the app should parallelize it, instead of leaving this
> to the instance that can (should) do this better.

I'm not suggesting this at all, and clearly you have not read what I
wrote. It is physically impossible for RAID to be faster than its
component disks. Period. To argue that a single RAID system is faster than
separate (comparable) disks managed independently is just not true. I have
even explained why.

Yes, RAID systems do scale by spindle, and seeks are issued in parallel,
but you STILL need to wait for all spindles to complete the operation.
Operations on a RAID system are at least as slow as the slowest disk.

What you are missing is that the RAID is dealing with the multiple drives
as one drive. Two operations have to happen serially, one after the other,
where as with separate disks, the two can happen simultaneously.

Re: tablespaces and DB administration

From

Andreas Pflug

Date:

31 May 2004, 17:39:44

pgsql@mohawksoft.com wrote:

>
>What you are missing is that the RAID is dealing with the multiple drives
>as one drive. Two operations have to happen serially,
>

You're kidding or vastly underestimating raid controllers. The average 
db access is well served with a single block of data, stored on a single 
drive. Nicely parallelizable by a raid controller if it has a minimum of 
smartness.

Regards,
Andreas

Re: tablespaces and DB administration

From

Andreas Pflug

Date:

31 May 2004, 17:44:16

pgsql@mohawksoft.com wrote:

>>pgsql@mohawksoft.com wrote:
>>
>>    
>>
>>>What you are missing is that the RAID is dealing with the multiple drives
>>>as one drive. Two operations have to happen serially,
>>>
>>>      
>>>
>>You're kidding or vastly underestimating raid controllers. The average
>>db access is well served with a single block of data, stored on a single
>>drive. Nicely parallelizable by a raid controller if it has a minimum of
>>smartness.
>>
>>    
>>
>
>The data contained on a RAID is spread across all the drives in the raid,
>is this not true?
>  
>
Data is spread *blockwise*, usually 32k or 64k blocks of data. This 
means, that typically 8 to 16 database blocks will reside on a *single* 
disk, with additional parity data on other disks.

>To access data on a drive, one must get the data off all of the drives at
>the same time, is this not true?
>
The data is usually completely on a single drive.

>
>If you perform two different operations on the RAID, you must access each
>RAID drive twice.
>
>If you perform different operations on multiple different drives, you can
>access the same amount of data as you would with the RAID, but have
>parallelized operations.
>
>This is a fact. It is *the* drawback to RAID system. If you do not
>understand this, then you do not understand RAID systems.
>
>  
>
You indicate clearly that it's you having strange opinions of raid 
controller/subsystem functionality executing multiple commands.

>Perform any benchmark you want. Take any RAID system you want. Or,
>actually, I have a factual reason why RAID systems perform worse than
>multiple single drives, I have written a quick program to show it. I have
>even double checked on my own RAID system here. 
>

As I said, the "benchmark" you wrote does by no means simulate DBMS 
access patterns, it might be good to show video streaming performance or 
so.
Please do read dbms disk io white papers, e.g. 
http://msdn.microsoft.com/archive/en-us/dnarsqlsg/html/sqlperftune.asp
Teaching hardware issues is OT for this list.

Regards,
Andreas

Re: tablespaces and DB administration

From

pgsql@mohawksoft.com

Date:

31 May 2004, 17:45:03

> pgsql@mohawksoft.com wrote:
>
>>
>>What you are missing is that the RAID is dealing with the multiple drives
>>as one drive. Two operations have to happen serially,
>>
>
> You're kidding or vastly underestimating raid controllers. The average
> db access is well served with a single block of data, stored on a single
> drive. Nicely parallelizable by a raid controller if it has a minimum of
> smartness.
>

The data contained on a RAID is spread across all the drives in the raid,
is this not true?

To access data on a drive, one must get the data off all of the drives at
the same time, is this not true? (yes there is flexibility with mirror and
ignoring parity on reads)

If you perform two different operations on the RAID, you must access each
RAID drive twice.

If you perform different operations on multiple different drives, you can
access the same amount of data as you would with the RAID, but have
parallelized operations.

This is a fact. It is *the* drawback to RAID system. If you do not
understand this, then you do not understand RAID systems.

Perform any benchmark you want. Take any RAID system you want. Or,
actually, I have a factual reason why RAID systems perform worse than
multiple single drives, I have written a quick program to show it. I have
even double checked on my own RAID system here. Now, your turn, show me
some facts that support your position, not just that "it should." Show me
how it can, show proof as to how a RAID system can be faster than its
component disks.

If you can't, I'm pretty sure I can safely ignore this part of the thread.

Re: tablespaces and DB administration

From

pgsql@mohawksoft.com

Date:

31 May 2004, 17:49:53

> pgsql@mohawksoft.com wrote:
>
>>>pgsql@mohawksoft.com wrote:
>>>
>>>
>>>
>>>>What you are missing is that the RAID is dealing with the multiple
>>>> drives
>>>>as one drive. Two operations have to happen serially,
>>>>
>>>>
>>>>
>>>You're kidding or vastly underestimating raid controllers. The average
>>>db access is well served with a single block of data, stored on a single
>>>drive. Nicely parallelizable by a raid controller if it has a minimum of
>>>smartness.
>>>
>>>
>>>
>>
>>The data contained on a RAID is spread across all the drives in the raid,
>>is this not true?
>>
>>
> Data is spread *blockwise*, usually 32k or 64k blocks of data. This
> means, that typically 8 to 16 database blocks will reside on a *single*
> disk, with additional parity data on other disks.

That may or may not be true depending on the RAID OEM, setup, and caching
parameters.

>
>>To access data on a drive, one must get the data off all of the drives at
>>the same time, is this not true?
>>
> The data is usually completely on a single drive.

That may or may not be true, and you *don't* know that because the RAID
shields you from it.

>
>>
>>If you perform two different operations on the RAID, you must access each
>>RAID drive twice.
>>
>>If you perform different operations on multiple different drives, you can
>>access the same amount of data as you would with the RAID, but have
>>parallelized operations.
>>
>>This is a fact. It is *the* drawback to RAID system. If you do not
>>understand this, then you do not understand RAID systems.
>>
>>
>>
> You indicate clearly that it's you having strange opinions of raid
> controller/subsystem functionality executing multiple commands.

Wait, it gets better.

>
>>Perform any benchmark you want. Take any RAID system you want. Or,
>>actually, I have a factual reason why RAID systems perform worse than
>>multiple single drives, I have written a quick program to show it. I have
>>even double checked on my own RAID system here.
>>
>
> As I said, the "benchmark" you wrote does by no means simulate DBMS
> access patterns, it might be good to show video streaming performance or
> so.
> Please do read dbms disk io white papers, e.g.
> http://msdn.microsoft.com/archive/en-us/dnarsqlsg/html/sqlperftune.asp
> Teaching hardware issues is OT for this list.

From the top of the very article you site:
"Archived content. No warranty is made as to technical accuracy"
Typical Microsoft hogwash, but they do have a few nuggets:

"Note   As a general rule of thumb, be sure to stripe across as many disks
as necessary to achieve solid performance. Windows NT/SQL Performance
Monitor will indicate if Windows NT disk I/O is bottlenecking on a
particular RAID array. Be ready to add disks and redistribute data across
RAID arrays and/or SCSI channels as necessary to balance disk I/O and
maximize performance."

They are suggesting that you use multiple RAID arrays or data channels.
Hmmm, sound familiar? Isn't that EXACTLY what I've been saying?

How about this heading title:
"Creating as Much Disk I/O Parallelism as Possible"
"Distinct disk I/O channels refer mainly to distinct sets of hard drives
or distinct RAID arrays, because hard drives are the most likely point of
disk I/O bottleneck. But also consider distinct sets of RAID or SCSI
controllers and distinct sets of PCI buses as ways to separate SQL Server
activity if additional RAID controllers and PCI buses are available."

Your own documents don't even support your claims.

Re: tablespaces and DB administration

From

Andreas Pflug

Date:

31 May 2004, 17:50:02

Dear anonymous,

This is really making me tired, and still OT. May anybody interested 
read the document you're citing abusively, or believe you that storage 
controllers are only capable of one command at a time or not.

Regards,
Andreas

Re: tablespaces and DB administration

From

pgsql@mohawksoft.com

Date:

31 May 2004, 17:54:14

> Dear anonymous,
>
> This is really making me tired, and still OT. May anybody interested
> read the document you're citing abusively, or believe you that storage
> controllers are only capable of one command at a time or not.

I would say this is totally off topic except that it does present opinions
of how to optimize a database.

What I find troubling, is you are not debating this with facts, you use
insults and try to impugn my opinion or abilities, rather than present a
solid reason why you hold the position you do. I know why I know what I
know, I've shown you examples, and explained why it acts as it does.

You present a Microsoft document that basically supports what I've been
saying, and accuse me of "citing abusively" (whatever the hell that
means.)

While I hate that this sort of exchange is on Hackers, the content, if
debated reasonably, should have been very useful for people stuggling with
these issues.