Thread: Partitioning Advice
I have a postgres server I need to move onto a new OS (RHEL6) on a new VM and am looking for advice on how to partition the disks to gain some performance improvement.
In the current environment I am given a single VHDD which I have not partitioned at all. The SAN performance is pretty good, but we have noticed slowdowns at various times. The database does a lot of logging - constant small writes, with some probably insignificant reads of smaller tables. Delays in logging can affect the service which is doing the logging and cause problems upstream. Typically this does not happen, but there are hourly jobs which generate stats from the logs. Due to their complexity the query planner always chooses to do sequential scans on the main log table. This table is truncated monthly when the data is archived to another table, but peaks in size at around 10GB at the end of the month. Generally, any time the stats jobs are running there are delays which I would like to reduce/eliminate. There is also a fair bit of iowait on the CPU.
The new server has a great deal more memory which I am hoping will help (shared_buffers = 8GB, total RAM 20GB), but I am looking at what might be optimal for the storage configuration. From looking at previous conversations here I am thinking of something like this:
100GB OS (ext3)
50GB pg_xlog (ext2)
400GB pg_data (ext3 data=writeback noatime?)
Hopefully this would mean the small writes can continue while a large read is going. Currently there is no streaming replication so only a gig or so is actually needed for xlogs. We do however use slony to sync some smaller tables to a secondary which may or may not affect anything.
This is the first time I have needed to delve into the storage configuration of a database, so any advice or comments are welcome.
Ben
Ben Carbery wrote:
> I have a postgres server I need to move onto a new OS (RHEL6) on a new VM and am looking for advice
> on how to partition the disks to gain some performance improvement.
> [...]

Since you are on RHEL 6 I would use ext4 throughout.

You say you have I/O problems when "stats jobs" run. Can you describe those jobs and what they are doing? If you have a lot of sequential scans on a 10GB table, that will suck majorly no matter how you tune it.

Two more things that you can try out:

- Change the I/O scheduler to "deadline" by booting with "elevator=deadline".
- Split the 400GB LUN into several smaller LUNs and use tablespaces.

I'm not saying that it is guaranteed to help, but I have had good experiences with it.

Yours,
Laurenz Albe
> Since you are on RHEL 6 I would use ext4 throughout.
Great, I'll look into it. However, my research suggested a journalled filesystem is unnecessary for xlogs, and I assume ext4 is journalled?
> You say you have I/O problems when "stats jobs" run. Can you describe those jobs
> and what they are doing?
They summarise all the activity in the logs since the last run, essentially counting logs. Since there are around 30 columns in each log and I want to summarise in a number of ways - think multiple but separate group bys, with a regexp to filter out some of the logs which are not interesting - inevitably there are several passes over the table. Of course regexps are also very slow in postgres. I found various indices did not help at all; the query planner thinks a sequential scan is the way to go.
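For illustration, each pass ends up being a query shaped roughly like this (the column names here are made up, not the real schema; only log_active is a real table name):

    -- one of the hourly summary passes, roughly (illustrative column names)
    SELECT service,
           count(*) AS total_logs,
           sum(CASE WHEN status = 'error' THEN 1 ELSE 0 END) AS error_logs
    FROM   log_active
    WHERE  log_time >= now() - interval '1 hour'
    AND    message !~ 'heartbeat|keepalive'   -- regexp filter for the uninteresting logs
    GROUP  BY service;

Each of the other summaries is a similar query with a different GROUP BY, hence the repeated passes over the table.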
> If you have a lot of sequential scans on a 10GB table, that will suck majorly
> no matter how you tune it.
OK, but I don't care how long they take, only that they don't affect new writes to the table. So will splitting the xlogs off onto a different partition/LUN help?
> Two more things that you can try out:
>
> - Change the I/O scheduler to "deadline" by booting with "elevator=deadline".
I'll check it out.
> - Split the 400GB LUN into several smaller LUNs and use tablespaces.
This could be worth doing on the active logging table if splitting off the xlogs won't help. I archive (copy) these records off to a new table every month and then use inheritance to access the entire log set:
log_master
..inherits
log_active
log_201205
log_201204
..etc
This is how I got the active table down to 10GB :)
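Roughly, the layout is created along these lines (heavily simplified - the real table has ~30 columns, and log_time here is just a stand-in name). If the 400GB LUN were split up, I gather the active table could then be moved onto its own tablespace with something like the hypothetical last two statements:

    -- parent table holding the common definition (simplified sketch)
    CREATE TABLE log_master (
        log_time  timestamptz,
        service   text,
        message   text
        -- ... ~30 columns in reality
    );

    -- children inherit the parent's columns; queries on log_master see all of them
    CREATE TABLE log_active  () INHERITS (log_master);
    CREATE TABLE log_201205  () INHERITS (log_master);
    CREATE TABLE log_201204  () INHERITS (log_master);

    -- hypothetical: a tablespace on a separate, smaller LUN for the hot table
    CREATE TABLESPACE log_fast LOCATION '/pgdata_fast';
    -- note: SET TABLESPACE rewrites the table and holds an exclusive lock while it runs
    ALTER TABLE log_active SET TABLESPACE log_fast;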
> I'm not saying that it is guaranteed to help, but I have had good
> experiences with it.
>
> Yours,
> Laurenz Albe
thanks,
Ben
On 06/06/2012 01:07 AM, Ben Carbery wrote:
Latency on ext3 is better on RHEL6 than earlier versions, but it's still hard to keep it low with that filesystem. You should consider ext4 or xfs instead if you're already running into slow periods limited by disk I/O.
Large values of shared_buffers can also make write latency spikes worse, particularly when the underlying storage isn't very capable--which is likely to be the case in a VM environment. Most of the performance gain is from going from the tiny default (<=32MB) for shared_buffers to a moderate size. You'll probably get most of the performance gain setting that to around 1GB instead, and the worst case performance might improve.
If you already are seeing problems on your existing server, there are two things you could do to monitor what's going on:
-Turn on log_checkpoints on the server. If you see high numbers for the "sync=" section, that normally narrows your problem very specifically to the database's background checkpoints.
-Watch /proc/meminfo , specifically the "Dirty:" number. If that number gets very high during the same periods the slowdowns happen at, it might be possible to make things better by decreasing the amount of caching Linux does. There's some intro material on that subject at http://notemagnet.blogspot.com/2008/08/linux-write-cache-mystery.html and http://blog.2ndquadrant.com/tuning_linux_for_low_postgresq/ (note that some of the links in that second one, to the test pgbench results, are broken; http://www.highperfpostgres.com/pgbench-results/index.htm is the right URL now)
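If it's useful, a related thing you can do alongside log_checkpoints is to sample the cumulative counters in pg_stat_bgwriter from time to time; a rough sketch (the view and these columns are standard, the comments are only the rule of thumb I'd apply):

    -- compare two samples taken some minutes apart; these are counters since the last stats reset
    SELECT checkpoints_timed,      -- checkpoints triggered by checkpoint_timeout
           checkpoints_req,        -- checkpoints requested early, usually by xlog volume;
                                   -- lots of these suggests raising checkpoint_segments
           buffers_checkpoint,     -- buffers written out by checkpoints
           buffers_backend         -- buffers backends had to write themselves (a bad sign for latency)
    FROM   pg_stat_bgwriter;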
--
Greg Smith   2ndQuadrant US   greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.com
On 2012-06-06, Ben Carbery <ben.carbery@gmail.com> wrote:
> In the current environment I am given a single VHDD which I have not
> partitioned at all.
> [...]

do many of the statistical queries use the whole month's data?

have you considered partitioning the log table by day? if you can
reduce the number of rows involved in the table-scans by partitioning,
it'll help performance.

--
⚂⚃ 100% natural
> do many of the statistical queries use the whole month's data?
>
> have you considered partitioning the log table by day? if you can
> reduce the number of rows involved in the table-scans by partitioning,
> it'll help performance.
I am summarising by hour, day, week and month. So I guess partitioning by day would help for the daily and hourly summaries.
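If I go that way, I imagine the daily children would look roughly like this (sketch only - log_time is a stand-in for the real timestamp column, and inserts would still need to be routed to the right child, e.g. with a trigger on the parent):

    -- a daily partition with a CHECK constraint describing its date range
    CREATE TABLE log_20120605 (
        CHECK (log_time >= DATE '2012-06-05' AND log_time < DATE '2012-06-06')
    ) INHERITS (log_master);

    -- with constraint_exclusion = partition (the default in recent versions),
    -- the planner can skip children whose CHECK constraint can't match the WHERE clause
    SHOW constraint_exclusion;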
Thanks for the suggestions everyone, I'll look into all these ideas.
cheers