Re: tablespaces and DB administration - Mailing list pgsql-hackers

From Andreas Pflug
Subject Re: tablespaces and DB administration
Date
Msg-id 40B620FB.4090604@pse-consulting.de
In response to Re: tablespaces and DB administration  (pgsql@mohawksoft.com)
Responses Re: tablespaces and DB administration
List pgsql-hackers
pgsql@mohawksoft.com wrote:

>>James Robinson wrote:
>>
>>    
>>
>>>>Users are primarily, if not stupid, ignorant. They will read the
>>>>absolute minimum needed to achieve a goal and little else. I say this
>>>>with the utmost respect, because I and probably everyone else on this
>>>>group is guilty of the same thing. So, the "preferred" installation
>>>>procedure, i.e. the one with the easy to follow directions, should
>>>>showcase features the user should know, and leave the user in a good
>>>>place. IMHO, the user's database on one volume and pg_xlog on another
>>>>is a better starting place.
>>>>
>>>Yes, that is generally the case (prefer pg_xlog on separate spindle),
>>>but no need to *forcibly* overcomplicate things if the box has only
>>>one spindle, or if they have only one single RAID'd partition
>>>configured. We should continue to err on the side of keeping the path
>>>to a functional system nice and simple, yet still offering superb
>>>functionality. Oracle gets this wrong. pg_autovacuum is another good
>>>step in this direction.
>>>
>>In the age of inexpensive RAID, tablespaces have more or less lost their
>>relevance regarding performance. pgsql's philosophy respects this by
>>leaving the storage work up to the OS and disk subsystem. Even having
>>the xlog on a different spindle won't help too much; you'll probably be
>>better off if you stuff all your spindles in one raid on most systems.
>>Worse, splitting two disks into separate storage areas to have xlog
>>separated would degrade safety for very little performance gain. So the
>>advice is: one disk, no alternative. 2 to 20 disks: use a single raid.
>>more disks: examine your access patterns carefully before you believe
>>you can do the job better than your raid controller.
>>
>>This leaves table spaces as a mere administrative feature, many (most)
>>installations will happily live without that.
>>
>>Regards,
>>Andreas
>>    
>>
>
>I would say that this is almost completely misinformed. Depending on the
>OS and the hardware, of course, a write on one spindle may not affect the
>performance of another.
>
>There are so many great things that happen when you have separate
>spindles. The OS manages the file systems separately, the device drivers
>may be separate, and if the low-level I/O device driver is even different,
>then you get your own bus mastering I/O buffers. All around good things
>happen when you have separate spindles.
>
>A single postgresql process may not see much benefit, because it does not
>do background I/O, but multiple postgresql processes will perform better
>because multiple I/O requests can be issued and processed simultaneously.
>If you got good SMP in your kernel, even better.
>
>  
>
There are good white papers about DB I/O performance, e.g. from
Microsoft. They are not read very often...

If you dedicate drives to services, it's your responsibility to size
everything so that the load is balanced. You'll probably end up with
some drives being the bottleneck while others are still almost idle.
That's why RAID should be the first and second choice: it distributes
the workload equally across all spindles until they are saturated. The
recommendation to use separate disks for this and that originates from
ancient days, when performance had to be achieved by application-level
programming and configuration, implementing custom file systems on raw
devices. pgsql deliberately doesn't work like this.
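
To make the balancing argument concrete, here is a rough sketch in
Python. Everything in it is an assumption for illustration only: the
per-disk IOPS figure, the group names and the demand numbers are made
up, and the model treats every disk as contributing its full IOPS,
ignoring RAID level overhead and caching.

```python
from dataclasses import dataclass


@dataclass
class DriveGroup:
    name: str           # hypothetical service name
    disks: int          # number of spindles dedicated to the service
    demand_iops: float  # hypothetical I/O demand of the service


# Assumption: every spindle can deliver the same number of I/Os per second.
PER_DISK_IOPS = 100.0


def dedicated_utilization(groups, per_disk_iops=PER_DISK_IOPS):
    """Utilization of each group when its disks serve only that group."""
    return {g.name: g.demand_iops / (g.disks * per_disk_iops) for g in groups}


def pooled_utilization(groups, per_disk_iops=PER_DISK_IOPS):
    """Utilization when all disks form one array sharing the total demand."""
    total_demand = sum(g.demand_iops for g in groups)
    total_disks = sum(g.disks for g in groups)
    return total_demand / (total_disks * per_disk_iops)


# Made-up workload: a busy data area and a lightly loaded log area.
groups = [DriveGroup("data", 4, 420.0), DriveGroup("logs", 2, 60.0)]
per_group = dedicated_utilization(groups)
pooled = pooled_utilization(groups)
print(per_group, pooled)
```

With these made-up numbers the dedicated "data" disks are asked for
slightly more than they can deliver while the "logs" disks sit mostly
idle; pooled, the same six disks run at roughly 80%.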

If you have, let's say, 10 disks to use, you'd probably end up with
something like:
2x RAID1 for /
2x RAID1 for /tmp + swap
2x RAID1 for xlog
4x RAID5 for data

I bet you get better performance with all disks in one RAID5, because
now the system disks not only have no negative impact on DB transfer
performance, but also add seek bandwidth to DB traffic.
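
The arithmetic behind that bet can be sketched as follows. This is
only a back-of-the-envelope count of usable capacity and of spindles
that can serve DB data. The helper names are hypothetical, and the
sketch says nothing about RAID5 write cost or about how many disk
failures each layout survives.

```python
def usable_disks(level, disks):
    """Usable capacity of one array, counted in whole disks."""
    if level == "raid1":
        return 1          # every disk in a mirror holds the same data
    if level == "raid5":
        return disks - 1  # one disk's worth of space goes to parity
    raise ValueError(f"unsupported RAID level: {level}")


def summarize(layout, data_volumes):
    """Count total disks, usable disks, and spindles holding DB data."""
    total = sum(disks for _, _, disks in layout)
    usable = sum(usable_disks(level, disks) for _, level, disks in layout)
    data_spindles = sum(
        disks for name, _, disks in layout if name in data_volumes
    )
    return {"disks": total, "usable": usable, "data_spindles": data_spindles}


# The split layout from the list above: (volume, RAID level, disks).
split_layout = [
    ("/", "raid1", 2),
    ("/tmp + swap", "raid1", 2),
    ("xlog", "raid1", 2),
    ("data", "raid5", 4),
]
# The alternative: all ten disks in one RAID5.
single_layout = [("everything", "raid5", 10)]

split = summarize(split_layout, {"data"})
single = summarize(single_layout, {"everything"})
print(split)
print(single)
```

In the split layout only 4 of the 10 spindles ever seek for table
data and 6 disks' worth of space is usable; in the single RAID5 all
10 spindles take part and 9 disks' worth of space is usable.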

Regards,
Andreas





