Thread: High cpu usage after many inserts
Hi list,
We are running postgresql 8.3.5 and are trying to stress test our LMS.
The problem is when our stress tester (Jmeter) inserts around 10,000 rows (in 3 hours) over 2 tables (5000 rows each table) the CPU of the sql server hits 100% over all 4 cores for all future inserts.
I have tried numerous things to get the cpu back down but so far the only thing that works is deleting the 10,000 rows Jmeter inserted.
For more information on the problem along with a time stamped list of test results and outcomes please see http://spreadsheets.google.com/pub?key=pu_k0R6vNvOVP26TRZdtdYw
Any help would be appreciated
Regards,
Jordan Tomkinson
System Administrator
Moodle HQ
On Sun, Feb 22, 2009 at 11:55 PM, Jordan Tomkinson <jordan@moodle.com> wrote:
> Hi list,
>
> We are running postgresql 8.3.5 and are trying to stress test our LMS.
> The problem is when our stress tester (Jmeter) inserts around 10,000 rows
> (in 3 hours) over 2 tables (5000 rows each table) the CPU of the sql server
> hits 100% over all 4 cores for all future inserts.
>
> I have tried numerous things to get the cpu back down but so far the only
> thing that works is deleting the 10,000 rows Jmeter inserted.
>
> For more information on the problem along with a time stamped list of test
> results and outcomes please see
> http://spreadsheets.google.com/pub?key=pu_k0R6vNvOVP26TRZdtdYw

Can you post the jmeter files? OR create a SQL test case? I haven't
had this problem myself, so I'm guessing something in your method or
something in your schema is setting something strange off. OR the
background writer is busy writing all the changes out after the fact
while the database is breathing from the heavy run. 10,000 rows over
three hours isn't really a whole lotta work unless those are really
wide rows.

Oh, what is an LMS?
On Sun, Feb 22, 2009 at 11:55 PM, Jordan Tomkinson <jordan@moodle.com> wrote:
> Hi list,
>
> We are running postgresql 8.3.5 and are trying to stress test our LMS.
> The problem is when our stress tester (Jmeter) inserts around 10,000 rows
> (in 3 hours) over 2 tables (5000 rows each table) the CPU of the sql server
> hits 100% over all 4 cores for all future inserts.

And just to clarify, this is user / system CPU usage, not IO wait, right?
One last thing. You were doing vacuum fulls but NOT reindexing, right?

I quote from the document at google docs:
13:50:00 vacuum full & analyze on all databases through pgadmin

1: Do you have evidence that regular autovacuum isn't keeping up?
2: If you have such evidence, and you have to vacuum full, vacuum full
doesn't really shrink indexes all that well.

For a heavily updated database, the 1, 2, 3 punch of autovacuum
(adjusted properly!), the background writer (adjusted properly)
smoothing things out, and the HOT updates reusing all that space
autovacuum is constantly reclaiming means you should be able to
avoid routine vacuum fulls. It's made a huge difference in db
maintenance for me.

Still I do find myself in vacuum full territory once or twice a year
(rogue update or something like that on a live database). If you do
have to vacuum full then reindex. OR cluster on your favorite index.
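[Editor's sketch of the "vacuum full then reindex" advice above in SQL. The table name is from the schema discussed later in the thread, but the index name is an assumed example; CLUSTER ... USING is the 8.3 syntax.]

```sql
-- VACUUM FULL compacts the table but tends to leave the indexes bloated,
-- so follow it with a REINDEX:
VACUUM FULL ANALYZE mdl_forum_posts;
REINDEX TABLE mdl_forum_posts;

-- Alternatively, CLUSTER rewrites the table and all its indexes in one
-- pass, physically ordering the heap by the chosen index
-- (mdl_forum_posts_pkey is a hypothetical index name):
CLUSTER mdl_forum_posts USING mdl_forum_posts_pkey;
```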
On Mon, Feb 23, 2009 at 4:03 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Sun, Feb 22, 2009 at 11:55 PM, Jordan Tomkinson <jordan@moodle.com> wrote:
>> Hi list,
>>
>> We are running postgresql 8.3.5 and are trying to stress test our LMS.
>> The problem is when our stress tester (Jmeter) inserts around 10,000 rows
>> (in 3 hours) over 2 tables (5000 rows each table) the CPU of the sql server
>> hits 100% over all 4 cores for all future inserts.
>
> And just to clarify, this is user / system CPU usage, not IO wait, right?
I am unable to post the jmeter file as it contains sensitive user/pass details, but the tests simply log in to a forum, create a new forum post, then log out.
SQL-wise this performs several SELECTs and 3 INSERTs over 3 different tables.
How does one create an SQL test case?
LMS is Learning Management System, in this case Moodle (moodle.org)
Yes this is user space CPU usage.
Running iostat -k 2 shows:

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda              31.50         0.00       456.00          0        912

so not a lot of disk writes.
On Mon, Feb 23, 2009 at 4:08 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> One last thing. You were doing vacuum fulls but NOT reindexing, right?
>
> I quote from the document at google docs:
> 13:50:00 vacuum full & analyze on all databases through pgadmin
>
> 1: Do you have evidence that regular autovacuum isn't keeping up?
> 2: If you have such evidence, and you have to vacuum full, vacuum full
> doesn't really shrink indexes all that well.
>
> For a heavily updated database, the 1, 2, 3 punch of autovacuum
> (adjusted properly!), the background writer (adjusted properly)
> smoothing things out, and the HOT updates reusing all that space
> autovacuum is constantly reclaiming, meaning you should be able to
> avoid routine vacuum fulls. It's made a huge difference in db
> maintenance for me.
>
> Still I do find myself in vacuum full territory once or twice a year
> (rogue update or something like that on a live database). If you do
> have to vacuum full then reindex. OR cluster on your favorite index.
I have no evidence of autovacuum not working, the manual full was done for purpose of elimination.
On Mon, Feb 23, 2009 at 12:18 AM, Jordan Tomkinson <jordan@moodle.com> wrote:
> On Mon, Feb 23, 2009 at 4:08 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>> One last thing. You were doing vacuum fulls but NOT reindexing, right?
>> [snip]
>> Still I do find myself in vacuum full territory once or twice a year
>> (rogue update or something like that on a live database). If you do
>> have to vacuum full then reindex. OR cluster on your favorite index.
>
> I have no evidence of autovacuum not working, the manual full was done for
> purpose of elimination.

Oh, ok. If you're trying to make a fair benchmark, you should
probably reindex after vacuum full.
Oh yeah, what OS is this? Version and all that.
On Mon, Feb 23, 2009 at 4:29 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> Oh yeah, what OS is this? Version and all that.
Red Hat Enterprise Linux 5.3 x64, kernel 2.6.18-128.el5
OS and hardware details are in the google spreadsheet, you might have to refresh it.
I'm working on getting the SQL log for you now.
On Mon, Feb 23, 2009 at 4:29 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> Oh yeah, what OS is this? Version and all that.
I should probably clarify that the high CPU only exists while the jmeter tests are running; once the tests are finished the CPU returns to 0% (this isn't a production server yet, so there are no queries other than my tests).
I have not yet tried other SQL queries to see if they are affected. I suspect it may only be related to the two forum tables the test focuses on, but I may be incorrect - the database is filling up with data again now so I can test this tomorrow.
Scott,
DB Schema: http://demo.moodle.org/db_schema.txt
SQL Query log: http://demo.moodle.org/querylog.txt
There are _many_ more queries than I anticipated :/
Jordan
On Mon, Feb 23, 2009 at 12:42 AM, Jordan Tomkinson <jordan@moodle.com> wrote:
> On Mon, Feb 23, 2009 at 4:29 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>> Oh yeah, what OS is this? Version and all that.
>
> I should probably clarify that the high cpu only exists while the jmeter
> tests are running, once the tests are finished the cpu returns to 0% (this
> isnt a production server yet, so no other queries other than my tests)
> I have not yet tried other SQL queries to see if they are affected, i
> suspect it may only be related to the two forum tables the test focuses on
> but I may be incorrect - the database is filling up with data again now so I
> can test this tomorrow.

Sorry, I had gotten the impression the CPU usage continued after the
test. That it's 100% during the test is quite understandable. So
does it start lower than 4x100%? Then climb during the tests? Is the
throughput dropping off over time?
On Mon, Feb 23, 2009 at 5:20 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Mon, Feb 23, 2009 at 12:42 AM, Jordan Tomkinson <jordan@moodle.com> wrote:
>> On Mon, Feb 23, 2009 at 4:29 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>>> Oh yeah, what OS is this? Version and all that.
>>
>> I should probably clarify that the high cpu only exists while the jmeter
>> tests are running, once the tests are finished the cpu returns to 0% (this
>> isnt a production server yet, so no other queries other than my tests)
>> I have not yet tried other SQL queries to see if they are affected, i
>> suspect it may only be related to the two forum tables the test focuses on
>> but I may be incorrect - the database is filling up with data again now so I
>> can test this tomorrow.
>
> Sorry, I had gotten the impression the CPU usage continued after the
> test. That it's 100% during the test is quite understandable. So
> does it start lower than 4x100%? Then climb during the tests? Is the
> throughput dropping off over time?
As per the spreadsheet (http://spreadsheets.google.com/pub?key=pu_k0R6vNvOVP26TRZdtdYw) CPU usage is around 50% and starts climbing over 3 hours until we have just under 10,000 rows of data then stays at 99% for the duration of all future tests.
Once the rows are removed the tests start back down at 50% usage again.
On Mon, Feb 23, 2009 at 1:29 AM, Jordan Tomkinson <jordan@moodle.com> wrote:
> On Mon, Feb 23, 2009 at 5:20 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>> Sorry, I had gotten the impression the CPU usage continued after the
>> test. That it's 100% during the test is quite understandable. So
>> does it start lower than 4x100%? Then climb during the tests? Is the
>> throughput dropping off over time?
>
> As per the spreadsheet
> (http://spreadsheets.google.com/pub?key=pu_k0R6vNvOVP26TRZdtdYw) CPU usage
> is around 50% and starts climbing over 3 hours until we have just under
> 10,000 rows of data then stays at 99% for the duration of all future tests.
> Once the rows are removed the tests start back down at 50% usage again.

Oh, ok. Well that's pretty normal as the indexes grow large enough to
not fit in cache, then not fit in memory, etc... Are you noticing a
sharp dropoff in performance?
Hi,

Scott Marlowe wrote:
> Oh, what is an LMS?

A Learning Management System, not to be confused with a CMS, which might
also stand for a Course Management System ;-)

Regards

Markus Wanner
On Tue, Feb 24, 2009 at 12:35 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Mon, Feb 23, 2009 at 1:29 AM, Jordan Tomkinson <jordan@moodle.com> wrote:
>> As per the spreadsheet
>> (http://spreadsheets.google.com/pub?key=pu_k0R6vNvOVP26TRZdtdYw) CPU usage
>> is around 50% and starts climbing over 3 hours until we have just under
>> 10,000 rows of data then stays at 99% for the duration of all future tests.
>> Once the rows are removed the tests start back down at 50% usage again.
>
> Oh, ok. Well that's pretty normal as the indexes grow large enough to
> not fit in cache, then not fit in memory, etc... Are you noticing a
> sharp dropoff in performance?
Again as per the spreadsheet, you can see the tests normally take around 130 seconds to complete, but after many rows are inserted they start to take upwards of 500 seconds. I can leave the server idle for days (over the weekend in fact) and start a new test to reproduce the results (500+ seconds), so I don't think it's a delayed write issue (surely 2 days is enough?)
What configuration options should I be looking at to make sure it fits in the cache? I have 8GB available and no matter how much I tweak I cannot get it using any more than 2GB. The DB is almost 4GB in size on disk, and as this is a dedicated sql server for just 1 database, I'd really like to have the whole thing in memory if possible.
On Mon, 23 Feb 2009, Scott Marlowe wrote:

> well that's pretty normal as the indexes grow large enough to not fit in
> cache, then not fit in memory, etc...

Right, the useful thing to do in this case is to take a look at how big
all the relations (tables, indexes) involved are at each of the steps in
the process. The script at http://wiki.postgresql.org/wiki/Disk_Usage
will show you that. That will give some feedback on whether the
vacuum/reindex methodology is really doing what you expect, and it will
also let you compare the size of the table/index with how much RAM is in
the system.

Have you done any tuning of the postgresql.conf file? If you haven't
increased shared_buffers substantially, you could be seeing buffer cache
churn as the CPU spends all its time shuffling buffers between PostgreSQL
and the OS once the working set involved exceeds around 32MB.

Shouldn't someone have ranted about RAID-5 by this point in the thread?

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
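[Editor's sketch: the wiki script Greg refers to boils down to a catalog query along these lines. This is a sketch in the same spirit, not the exact script from that page.]

```sql
-- List the 20 largest relations (tables and indexes) with readable sizes.
SELECT n.nspname || '.' || c.relname AS relation,
       pg_size_pretty(pg_relation_size(c.oid)) AS size
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE n.nspname NOT IN ('pg_catalog', 'information_schema')
 ORDER BY pg_relation_size(c.oid) DESC
 LIMIT 20;
```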
On Tue, Feb 24, 2009 at 11:42 AM, Greg Smith <gsmith@gregsmith.com> wrote:
> On Mon, 23 Feb 2009, Scott Marlowe wrote:
>> well that's pretty normal as the indexes grow large enough to not fit in
>> cache, then not fit in memory, etc...
>
> Right, the useful thing to do in this case is to take a look at how big all
> the relations (tables, indexes) involved are at each of the steps in the
> process. The script at http://wiki.postgresql.org/wiki/Disk_Usage will show
> you that. That will give some feedback on whether the vacuum/reindex
> methodology is really doing what you expect, and it will also let you
> compare the size of the table/index with how much RAM is in the system.
>
> Have you done any tuning of the postgresql.conf file? If you haven't
> increased shared_buffers substantially, you could be seeing buffer cache
> churn as the CPU spends all its time shuffling buffers between PostgreSQL
> and the OS once the working set involved exceeds around 32MB.
>
> Shouldn't someone have ranted about RAID-5 by this point in the thread?
Right, I have done some more testing and I think it's pretty conclusive.
1. Start with a known good copy of the database (some 3GB in size).
2. Run the Jmeter tests until ~7000 new rows are inserted equally over 3 tables. At this point performance goes to hell.
3. Delete the ~7000 rows from the db without re-indexing, (manually) analyzing or anything of the sort.
4. Performance instantly returns to that of before the tests began (optimum).
So I'm thinking, as Scott said, it could be buffer/cache size filling up?
Here is my postgresql.conf, perhaps someone can offer a few pointers.
The hardware is a quad Xeon 2.0GHz with 8GB RAM and 15K RPM SAS drives in RAID 5 (I know, RAID 5, don't tell me).
max_connections = 400
shared_buffers = 2048MB
temp_buffers = 8MB
max_prepared_transactions = 10
work_mem = 8MB
maintenance_work_mem = 128MB
max_stack_depth = 4MB
vacuum_cost_delay = 0
vacuum_cost_page_hit = 1
vacuum_cost_page_miss = 10
vacuum_cost_page_dirty = 20
vacuum_cost_limit = 200
bgwriter_delay = 200ms
bgwriter_lru_maxpages = 100
bgwriter_lru_multiplier = 2.0
fsync = on
synchronous_commit = on
wal_sync_method = fsync
full_page_writes = on
wal_buffers = 128kB
wal_writer_delay = 200ms
commit_delay = 0
commit_siblings = 5
log_destination = 'stderr'
logging_collector = on
log_directory = 'pg_log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 0
track_counts = on
log_parser_stats = off
log_planner_stats = off
log_executor_stats = off
log_statement_stats = off
autovacuum = on
log_autovacuum_min_duration = 0
autovacuum_max_workers = 3
autovacuum_naptime = 1min
autovacuum_vacuum_threshold = 50
autovacuum_analyze_threshold = 50
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.1
autovacuum_freeze_max_age = 200000000
autovacuum_vacuum_cost_delay = 20
autovacuum_vacuum_cost_limit = -1
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
default_text_search_config = 'pg_catalog.english'
escape_string_warning = off
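[Editor's note: one cache-related setting absent from the excerpt above is effective_cache_size, which defaults to a small value on 8.3. A sketch of the relevant lines for an 8GB dedicated box might look like this; the values are illustrative assumptions, not a recommendation from the thread.]

```
# Illustrative values only, for an 8GB dedicated server.
shared_buffers = 2048MB         # PostgreSQL's own buffer cache
effective_cache_size = 6GB      # planner hint: shared_buffers plus expected
                                # OS page cache; allocates no memory itself,
                                # but influences index-scan cost estimates
```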
On Tue, Feb 24, 2009 at 11:42 AM, Greg Smith <gsmith@gregsmith.com> wrote:
> Right, the useful thing to do in this case is to take a look at how big all
> the relations (tables, indexes) involved are at each of the steps in the
> process. The script at http://wiki.postgresql.org/wiki/Disk_Usage will show
> you that. That will give some feedback on whether the vacuum/reindex
> methodology is really doing what you expect, and it will also let you
> compare the size of the table/index with how much RAM is in the system.
taken before the ~7000 rows were entered.
           relation            |  size
-------------------------------+--------
 public.mdl_log                | 595 MB
 public.mdl_forum_posts        | 375 MB
 public.mdl_log_coumodact_ix   | 197 MB
 public.mdl_user               | 191 MB
 public.mdl_cache_text         | 162 MB
 public.mdl_log_usecou_ix      | 137 MB
 public.mdl_log_act_ix         | 119 MB
 public.mdl_log_cmi_ix         | 97 MB
 public.mdl_log_tim_ix         | 97 MB
 public.mdl_log_id_pk          | 97 MB
 public.mdl_question_states    | 48 MB
 public.mdl_stats_user_daily   | 48 MB
 public.mdl_hotpot_responses   | 47 MB
 public.mdl_register_downloads | 45 MB
 public.mdl_message_read       | 37 MB
 public.mdl_course_display     | 37 MB
 public.mdl_stats_user_weekly  | 31 MB
 public.mdl_mnet_log           | 27 MB
 public.mdl_user_ema_ix        | 26 MB
 public.mdl_regidown_url_ix    | 23 MB
(20 rows)
On Tue, Feb 24, 2009 at 12:40 AM, Jordan Tomkinson <jordan@moodle.com> wrote:
> taken before the ~7000 rows were entered.
>
>            relation            |  size
> -------------------------------+--------
>  public.mdl_log                | 595 MB
>  public.mdl_forum_posts        | 375 MB
> [snip]
> (20 rows)

What's more interesting is how quickly they grow during your test.
I'm betting that as public.mdl_log and public.mdl_forum_posts grow,
you get a dataset larger than memory. There are two levels of caching
that pgsql uses: the highest and closest to pgsql is the shared_buffer
cache, and the next is the kernel level file system cache. While it's
still way faster to hit the kernel level of file cache than to hit the
actual hard drives, the pg shared_buffers is the fastest.
You may be in a situation where giving a bit more memory to pg will
help, but with a 4G dataset and 8G of ram you're cutting it close. You
need a few gig for sorts and processes and such like that. Going to
16Gig you could set shared_buffers at somewhere in the 4 to 8Gig range
and it might work out.

If you're looking at scaling to large amounts of data, you can't plan
on it all fitting into memory, and you have to start planning for
faster Disk I/O. This means more disks, fast RAID controllers with
optional battery backed cache (not really optional) and / or kernel
level RAID, for read mostly stuff it's quite fast. As expensive as 16
or 24 or 32 fast hard drives are, they're cheaper than servers with a
half terabyte of ram or whatever you'd need for a big dataset.

First things first, I'd try increasing shared_buffers to the just over
4G range. I'd check after each run with vacuum verbose (NOT FULL) to
see how bloated my db was getting.
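[Editor's sketch of the "vacuum verbose (NOT FULL)" check Scott suggests, run per table. The table names are taken from the size listing earlier in the thread; the exact output format varies by version.]

```sql
-- Reports how many dead row versions were found and removed, and how many
-- pages the table and its indexes occupy -- a rough bloat gauge between runs.
VACUUM VERBOSE public.mdl_forum_posts;
VACUUM VERBOSE public.mdl_log;
```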
* Greg Smith <gsmith@gregsmith.com> [090201 00:00]:

> Shouldn't someone have ranted about RAID-5 by this point in the thread?

What? Sorry, I wasn't paying attention...

You mean someone's actually still using RAID-5?

;-)

--
Aidan Van Dyk                        Create like a god,
aidan@highrise.ca                    command like a king,
http://www.highrise.ca/              work like a slave.
On Wed, Feb 25, 2009 at 12:05 AM, Aidan Van Dyk <aidan@highrise.ca> wrote:
> * Greg Smith <gsmith@gregsmith.com> [090201 00:00]:
>> Shouldn't someone have ranted about RAID-5 by this point in the thread?
>
> What? Sorry, I wasn't paying attention...
>
> You mean someone's actually still using RAID-5?
>
> ;-)
What exactly is wrong with RAID5 and what should we have gone with?
On Wed, 2009-02-25 at 09:21 +0900, Jordan Tomkinson wrote:
> What exactly is wrong with RAID5 and what should we have gone with?

RAID5 outside of RAID 0 is the worst possible RAID level to run with a
database (of the commonly used raid levels, that is).

It is very, very slow on random writes, which is what databases do.
Switch to RAID 10.

Sincerely,

Joshua D. Drake

--
PostgreSQL - XMPP: jdrake@jabber.postgresql.org
Consulting, Development, Support, Training
503-667-4564 - http://www.commandprompt.com/
The PostgreSQL Company, serving since 1997
--- On Wed, 2/25/09, Jordan Tomkinson <jordan@moodle.com> wrote:
<...>
> What exactly is wrong with RAID5 and what should we have
> gone with?

RAID10 is often used. As others have pointed out, RAID5 is very slow
for random writes. It also has issues that can expose your data to
total loss, see for instance
<http://www.miracleas.com/BAARF/RAID5_versus_RAID10.txt>.

HTH,

Greg Williamson
On Wed, Feb 25, 2009 at 9:23 AM, Joshua D. Drake <jd@commandprompt.com> wrote:
> RAID5 outside of RAID 0 is the worst possible RAID level to run with a
> database. (of the commonly used raid level's that is).
> It is very, very slow on random writes which is what databases do.
> Switch to RAID 10.
Surely, being (real) hardware RAID with 15k RPM disks, this wouldn't be a huge issue unless a large amount of data was being written?
On Tue, Feb 24, 2009 at 5:21 PM, Jordan Tomkinson <jordan@moodle.com> wrote:
> What exactly is wrong with RAID5 and what should we have gone with?

RAID 5 is only suitable for situations where you need maximum storage
for minimum cost and the database is mostly / all read all the time.
Like large reporting databases. It's slow on writes, and it has a low
tolerance for dead drives (2 and it's all gone).

HOWEVER. RAID-10, which is theoretically MUCH better, is only better
if it's implemented right, and a lot of cheap RAID controllers don't
do any better running RAID-10. Many of these can be put into JBOD mode
where you do RAID-10 in the kernel, or you can do RAID-1 on the card
(x sets) and RAID-0 in the kernel.

RAID-10 is almost always the right choice when you're buying good
controllers and fast drives and you want maximum performance. If you
REALLY need a lot of storage, and you have to use something like RAID
5, at least look at RAID 6.
On Wed, 2009-02-25 at 09:44 +0900, Jordan Tomkinson wrote:
> surely being (real) hardware raid with 15k rpm disks this wouldn't be
> a huge issue unless a large amount of data was being written ?

Tests done by Mark Wong on a 3 disk 15k scsi versus 4 disk raid 10
scsi show that RAID 10 is on average 30% faster.

Sincerely,

Joshua D. Drake
On Tue, Feb 24, 2009 at 4:40 PM, Jordan Tomkinson <jordan@moodle.com> wrote:
> taken before the ~7000 rows were entered.
>
>            relation            |  size
> -------------------------------+--------
>  public.mdl_log                | 595 MB
>  public.mdl_forum_posts        | 375 MB
>  public.mdl_log_coumodact_ix   | 197 MB
>  public.mdl_user               | 191 MB
>  public.mdl_cache_text         | 162 MB
>  public.mdl_log_usecou_ix      | 137 MB
>  public.mdl_log_act_ix         | 119 MB
>  public.mdl_log_cmi_ix         | 97 MB
>  public.mdl_log_tim_ix         | 97 MB
>  public.mdl_log_id_pk          | 97 MB
>  public.mdl_question_states    | 48 MB
>  public.mdl_stats_user_daily   | 48 MB
>  public.mdl_hotpot_responses   | 47 MB
>  public.mdl_register_downloads | 45 MB
>  public.mdl_message_read       | 37 MB
>  public.mdl_course_display     | 37 MB
>  public.mdl_stats_user_weekly  | 31 MB
>  public.mdl_mnet_log           | 27 MB
>  public.mdl_user_ema_ix        | 26 MB
>  public.mdl_regidown_url_ix    | 23 MB
> (20 rows)
Taken after 9000 rows entered, by this stage performance is terrible.
           relation            |  size
-------------------------------+--------
 public.mdl_log                | 597 MB
 public.mdl_forum_posts        | 389 MB
 public.mdl_log_coumodact_ix   | 198 MB
 public.mdl_user               | 193 MB
 public.mdl_cache_text         | 162 MB
 public.mdl_log_usecou_ix      | 137 MB
 public.mdl_log_act_ix         | 119 MB
 public.mdl_log_cmi_ix         | 98 MB
 public.mdl_log_tim_ix         | 97 MB
 public.mdl_log_id_pk          | 97 MB
 public.mdl_question_states    | 48 MB
 public.mdl_stats_user_daily   | 48 MB
 public.mdl_hotpot_responses   | 47 MB
 public.mdl_register_downloads | 45 MB
 public.mdl_message_read       | 37 MB
 public.mdl_course_display     | 37 MB
 public.mdl_stats_user_weekly  | 31 MB
 public.mdl_mnet_log           | 27 MB
 public.mdl_user_ema_ix        | 26 MB
 public.mdl_regidown_url_ix    | 23 MB
(20 rows)
As you can see, the tables aren't growing by much (only a few MB), so is this really to do with buffer/cache size?
I set shared_buffers to 3072MB (from 2048MB) and it hasn't made much improvement; requests are still taking longer and longer to execute.
Joshua D. Drake wrote:
> On Wed, 2009-02-25 at 09:21 +0900, Jordan Tomkinson wrote:
>> What exactly is wrong with RAID5 and what should we have gone with?
>
> RAID5 outside of RAID 0 is the worst possible RAID level to run with a
> database. (of the commonly used raid level's that is).
>
> It is very, very slow on random writes which is what databases do.
> Switch to RAID 10.

On top of the stuff Joshua wrote, there's also the "RAID 5 Write Hole".

Quoting Wikipedia:
"In the event of a system failure while there are active writes, the
parity of a stripe may become inconsistent with the data. If this is
not detected and repaired before a disk or block fails, data loss may
ensue as incorrect parity will be used to reconstruct the missing block
in that stripe. This potential vulnerability is sometimes known as the
write hole. Battery-backed cache and similar techniques are commonly
used to reduce the window of opportunity for this to occur."

And in more detail from http://blogs.sun.com/bonwick/entry/raid_z:
"RAID-5 write hole... What's worse, it will do so silently -- it has no
idea that it's giving you corrupt data."

I sometimes wonder if postgres should refuse to start up on RAID-5 in
the same way it does on VFAT or running as root. :-)