Thread: priority on a process
Running PostgreSQL 7.2.1 on RedHat Linux 7.2
I have a process running and it seems that it is not
using very much of the available resources on the server
(if I run 'top' to see what's running and check the
CPU, memory and system time usage, they all appear very
low).
[example]
<top>
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
11467 postgres 9 0 3256 3176 2572 D 1.3 0.3 12:00 postmaster
<ps info>
postgres 11467 3.4 0.3 8204 3176 ? D 07:58 12:02 postgres: charles testdb [local] UPDATE
<query>
update tm_wm_hedis_num_wei
set
product = b.product,
prov_num = b.prov_num
from
ref_memmonth_spans b
where
substr(tm_wm_hedis_num_wei.contract,1,9) = b.contract
and
substr(tm_wm_hedis_num_wei.mbr_num,1,2) = b.mbr_num
and
tm_wm_hedis_num_wei.date between b.effdt and b.enddt;
[/example]
Q1: how can I move the priority of the job up so that
it can take advantage of the CPU / memory allotted?
Q2: why do some jobs hog system resources while others,
like this one, don't do much of anything?
Thanks!
-X
Shaunn Johnson writes:

> I have a process running and it seems that it is not
> using very much of the available resources on the server

I have the same non-problem. :) Basically Postgres is pretty darn
efficient. On my server, Postgres is using around 16MB of memory (very
small db so far), and my Java Servlet engine is using around 225MB. Yeah.

You can change the priority of any process with the nice and renice
commands. Read the man pages; they're pretty straightforward. However, if
your database server is doing nothing but Postgres (which is generally
how you should have it set up for production use), changing the priority
won't help.

The big thing is filesystem buffering and caching. As long as you have
plenty of RAM, Pg can cache filesystem data in RAM, which is far faster
than fetching it from disk. Ideally you will have enough RAM to hold your
entire data set. The next best thing is to have a fast storage array with
many spindles.

Basically, don't worry about performance until it's a problem, and then
worry about the specific bottleneck that exists in your situation.
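A minimal sketch of the renice approach, reusing PID 11467 from the top
output above (the -5 value is only illustrative, and raising a process's
priority generally requires root):

    # check the backend's current nice value
    ps -o pid,ni,comm -p 11467

    # bump its priority (more negative = higher priority); needs root
    renice -5 -p 11467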
On Fri, 16 May 2003, Johnson, Shaunn wrote:

> Running PostgreSQL 7.2.1 on RedHat Linux 7.2
>
> Q1: how can I move the priority of the job up so that
> it can take advantage of the CPU / memory allotted?
>
> Q2: why do some jobs hog system resources while others,
> like this one, don't do much of anything?

More than likely you are I/O bound, i.e. postgresql is waiting on the
disk subsystem to return data.

See if you can raise shared_buffers a bit; the default of 64 is awfully
small, and 500 to 1000 is a nice place to start. They're measured in 8k
blocks, by the way, so 1024 of them is only 8 megs of cache. That little
change alone speeds up postgresql quite a bit.

Also, if your machine has memory to spare, look at increasing
effective_cache_size. If this is set too low, postgresql will favor seq
scans, which are I/O intensive but not usually CPU intensive.

If postgresql is running fast enough, then you can ignore all my advice;
if it's running a little slow, then this might help.
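A sketch of what that change looks like in postgresql.conf (the 1024 is
just an illustrative starting value, and the data directory path is a
placeholder; on 7.2 shared_buffers is read at postmaster start, so a
restart is needed):

    # postgresql.conf -- illustrative starting point, not a tuned value
    shared_buffers = 1024    # 1024 x 8k blocks = 8 megs of shared cache

    # restart the postmaster so the new shared memory size takes effect
    pg_ctl restart -D /path/to/your/data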
Scott Marlowe writes:

> Also, if your machine has memory to spare, look at increasing
> effective_cache_size.

Doesn't Linux automatically grow the fs cache/buffer to most of available
memory? The OP said he was using Linux.

On e.g. OpenBSD you would have to set the size in the kernel config, but
ISTR Linux doesn't need that. Granted, I haven't followed Linux closely
since the 2.2 days, and I know there has been some VM hilarity in the
"stable" 2.4 series. So maybe someone can update my old understanding...
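For what it's worth, a quick way to see that on Linux is free (a sketch;
the exact column layout varies with the procps version):

    free -m
    # "buffers" and "cached" are memory the kernel is using for
    # filesystem caching; the "-/+ buffers/cache" row shows how much
    # is really free for applications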
On Fri, 16 May 2003, Chris Palmer wrote:

> Scott Marlowe writes:
>
> > Also, if your machine has memory to spare, look at increasing
> > effective_cache_size.
>
> Doesn't Linux automatically grow the fs cache/buffer to most of
> available memory? The OP said he was using Linux.
>
> On e.g. OpenBSD you would have to set the size in the kernel config,
> but ISTR Linux doesn't need that. Granted, I haven't followed Linux
> closely since the 2.2 days, and I know there has been some VM hilarity
> in the "stable" 2.4 series. So maybe someone can update my old
> understanding...

Yes, effective_cache_size is the postgresql setting that tells the
postmaster we have "about this much kernel cache" on average. If it's set
low, then postgresql assumes the kernel isn't caching much; if it's
higher, then it assumes it's more likely for data to be "in memory" and
makes index scans more likely than seq scans.

Sorry, I should have pointed out I was talking about a postgresql
configuration parameter and not a Linux one...
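As a worked example (the numbers are made up): if free shows the kernel
holding roughly 400 megs in buffers/cache, effective_cache_size is
measured in 8k pages, so 400MB / 8kB = 51200:

    # postgresql.conf -- hypothetical box with ~400 megs of kernel cache
    effective_cache_size = 51200    # 51200 x 8k pages = 400 megs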
"Johnson, Shaunn" <SJohnson6@bcbsm.com> writes: > I have a process running and it seems that it is not > using very much of the available resources on the server > PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND > postgres 11467 3.4 0.3 8204 3176 ? D 07:58 12:02 postgres: > charles testdb [local] UPDATE It's waiting for disk I/O --- that's what the D means. > Q2: why will some jobs hog system resouces and others, > like this one, won't do much of anything? You have too narrow a view of what system resources are ;-). I feel sure that this query is banging the heck out of your disk. When the disk is the bottleneck, worrying about improving CPU usage is pointless. regards, tom lane
Scott Marlowe writes:

> Yes, effective_cache_size is the postgresql setting that
> tells the postmaster we have "about this much kernel cache"
> on average.
>
> If it's set low, then postgresql assumes the kernel isn't
> caching much; if it's higher, then it assumes it's more
> likely for data to be "in memory"
> and makes index scans more likely than seq scans.
>
> Sorry, I should have pointed out I was talking about a
> postgresql configuration parameter and not a Linux one...

Ahh. Thanks for the tip (I am still new to Pg).

I guess the reason Pg can't ask the running kernel how much cache mem it
has is that any such thing would be completely non-portable.

Now branching the thread:

The documentation (doc/runtime-config.html in my 7.3.2 source tree) says
the value is a number of disk pages, and that disk pages are usually 8KB.
My filesystem has 4KB blocks; are blocks and pages the same thing in this
context, or does "page" refer to the in-memory copy of a disk block? Are
bigger blocks/pages on the filesystem used for Pg a good idea? I would
guess yes, since Pg's data files are few and large, instead of many and
small. Should I just crank my fs block size as big as it will go, on a
partition dedicated to pg_data?

Thanks.
On Fri, 16 May 2003, Chris Palmer wrote:

> Ahh. Thanks for the tip (I am still new to Pg).
>
> I guess the reason Pg can't ask the running kernel how much cache mem
> it has is that any such thing would be completely non-portable.
>
> Now branching the thread:
>
> The documentation (doc/runtime-config.html in my 7.3.2 source tree)
> says the value is a number of disk pages, and that disk pages are
> usually 8KB. My filesystem has 4KB blocks; are blocks and pages the
> same thing in this context, or does "page" refer to the in-memory copy
> of a disk block? Are bigger blocks/pages on the filesystem used for Pg
> a good idea? I would guess yes, since Pg's data files are few and
> large, instead of many and small. Should I just crank my fs block size
> as big as it will go, on a partition dedicated to pg_data?

They mean it's measured in postgresql disk blocks, so it's the block size
postgresql was compiled with that matters there.

Linux is currently limited to a maximum filesystem block size of whatever
a memory page is, and under x86 that's 4k. On bigger hardware you can
have larger blocks: Itanium can do 8k, Itanium2 can do 16 to 64k, and
most other 64 bit archs do up to a 64k page size.

Now, postgresql has its own block size, which defaults to 8k. This is a
good compromise in size. If you're dealing with really large rows all the
time (say 32k and above) then going to a larger block size in postgresql
will help that. However, this will come at the expense of performance on
smaller accesses. Also, some subtle bugs have been known to creep into
postgresql when increasing the block size in the past, so careful testing
is called for if you're going to try increasing the block size.

The idea behind blocks of 4k, as opposed to the default 512 byte sector
size of most hard drives, is that on a typical 9ms access hard drive
you'll spend 9ms moving the head, another 4 or 5 waiting for the head to
settle, and another 4 or so waiting for the sector you want to spin
around under the head you're using. Since we're waiting on the drive most
of this time, we might as well make the best of it and read in a bit
more. Linux sets its block size to a default of 4k, while postgresql uses
8k blocks, which means the OS will make two "high level" requests for
those 4k blocks. Since a lot of what gets accessed next will likely be in
the same block as what we're reading, it's usually a win. Note that once
the sectors are under the head the actual read time is tiny, say 1 ms or
so.

But when block size starts rising, the read time of each block rises too,
and if on average you only use a small part of each block (say 1 or 2k)
then you're reading more than you need. Hence, optimal block size may be
influenced by the type of load, but there is a maximum size past which
you really don't see much return, even with a load that reads large
chunks at a time.
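To check what you actually have (a sketch; the data directory and device
names are placeholders, and tune2fs applies to ext2/ext3 filesystems):

    # postgresql's compiled-in block size
    pg_controldata /path/to/your/data | grep -i 'block size'

    # the filesystem's block size on an ext2/ext3 partition
    tune2fs -l /dev/hda1 | grep 'Block size'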
Note also that the maximum size of an index entry is about 1/3 of a block
(a pgsql block), so if you are indexing LARGE fields a larger block size
might be helpful. But indexes on large fields tend to be suboptimal
compared to a full text indexing engine, and postgresql has two of those
available that work quite well, so the argument for a larger block size
there kind of goes away for about 90% of all applications.
On Fri, May 16, 2003 at 02:16:04PM -0400, Johnson, Shaunn wrote:

> <ps info>
> postgres 11467 3.4 0.3 8204 3176 ? D 07:58 12:02 postgres:
> charles testdb [local] UPDATE

As other people have pointed out, your disk is probably quite busy. Run
"vmstat 1" and look where the activity is.

-- 
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> "the West won the world not by the superiority of its ideas or values or
> religion but rather by its superiority in applying organized violence.
> Westerners often forget this fact, non-Westerners never do."
>     - Samuel P. Huntington
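A hypothetical snapshot of what that might look like during a disk-bound
update (the numbers are invented to show the pattern; on a 2.4-era
vmstat, watch the b column for blocked processes and bi/bo for disk
blocks in and out per second):

    vmstat 1
    #    procs           memory     swap       io     system      cpu
    #  r b w swpd  free  buff cache si so   bi   bo   in   cs us sy id
    #  0 1 0    0  4520 12044 95032  0  0 2800  640  300  500  5  3 92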