Re: Vacuums on large busy databases - Mailing list pgsql-performance
From | Jeff Davis |
---|---|
Subject | Re: Vacuums on large busy databases |
Date | |
Msg-id | 1158284062.29889.234.camel@dogma.v10.wvs Whole thread Raw |
In response to | Re: Vacuums on large busy databases (Francisco Reyes <lists@stringsutils.com>) |
List | pgsql-performance |
On Thu, 2006-09-14 at 19:50 -0400, Francisco Reyes wrote: > > regarding shared buffers I'd make this much bigger, like 2GB or more > > Will do 2GB on the weekend. From what I read this requires shared memory so > have to restart my machine (FreeBSD). > You should be able to do: # sysctl -w kern.ipc.shmmax=2147483647 > if I plan to give shared buffers 2GB, how much more over that should I give > the total shared memory kern.ipc.shmmax? 2.5GB? > To get it higher than 2GB, you may need to recompile the kernel, but you should be able to get 2GB without a restart. > Also will shared buffers impact inserts/updates at all? > I wish the postgresql.org site docs would mention what will be impacted. > They will not have a real impact on INSERTs, because an INSERT still has to be logged in the WAL before commit. Technically, it may make a difference, but I would not expect much. shared_buffers has a big impact on UPDATEs, because an UPDATE needs to find the record to UPDATE first. An UPDATE is basically a DELETE and an INSERT in one transaction. > Comments like: This setting must be at least 16, as well as at least twice > the value of max_connections; however, settings significantly higher than > the minimum are usually needed for good performance. > > Are usefull, but could use some improvement.. increase on what? All > performance? inserts? updates? selects? More shared_buffers means fewer reads from disk. If you have 10MB worth of tables, having 100MB worth of shared buffers is useless because they will be mostly empty. However, if you have 100MB of shared buffers and you access records randomly from a 100 petabyte database, increasing shared_buffers to 200MB doesn't help much, because the chances that the record you need is in a shared buffer already are almost zero. Shared buffers are a cache, pure and simple. When you have "locality of reference", caches are helpful. Sometimes that's temporal locality (if you are likely to access data that you recently accessed), and sometimes that's spatial locality (if you access block 10, you're likely to access block 11). If you have "locality of referece" -- and almost every database does -- shared_buffers help. > For instance, increasing effective_cache_size has made a noticeable > difference in selects. However as I talk to the developers we are still > doing marginally in the inserts. About 150/min. effective_cache_size affects only the plan generated. INSERTs aren't planned because, well, it's an INSERT and there's only one thing to do and only one way to do it. > There is spare CPU cycles, both raid cards are doing considerably less they > can do.. so next I am going to try and research what parameters I need to > bump to increase inserts. Today I increased checkpoint_segments from the > default to 64. Now looking at wall_buffers. You won't see any amazing increases from those. You can improve INSERTs a lot if you have a battery-backed cache on your RAID card and set it to WriteBack mode (make sure to disable disk caches though, those aren't battery backed and you could lose data). If you do this, you should be able to do 1000's of inserts per second. Another thing to look at is "commit_delay". If you are trying to commit many INSERTs at once, normally they will be fsync()d individually, which is slow. However, by adding a commit delay, postgres can batch a few inserts into one fsync() call, which can help a lot. > It would be most helpfull to have something on the docs to specify what each > setting affects most such as reads, writes, updates, inserts, etc.. I agree that they could be improved. It gets complicated quickly though, and it's hard to generalize the effect that a performance setting will have. They are all very interdependent. Regards, Jeff Davis
pgsql-performance by date: