Thread: priority on a process

priority on a process

From
"Johnson, Shaunn"
Date:

Running PostgreSQL 7.2.1 on RedHat Linux 7.2

I have a process running and it seems that it is not
using very much of the available resources on the server
(when I run 'top' to see what's running and check the
CPU, memory, and system time usage, they all appear very
low).

[example]

<top>
PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND

11467 postgres   9   0  3256 3176  2572 D     1.3  0.3  12:00 postmaster

<ps info>
postgres 11467  3.4  0.3  8204 3176 ?        D    07:58  12:02 postgres: charles  testdb [local] UPDATE

<query>

update tm_wm_hedis_num_wei
set
        product = b.product,
        prov_num = b.prov_num
from
        ref_memmonth_spans b
where
        substr(tm_wm_hedis_num_wei.contract,1,9) = b.contract
and
        substr(tm_wm_hedis_num_wei.mbr_num,1,2) = b.mbr_num
and
        tm_wm_hedis_num_wei.date between b.effdt and b.enddt;

[/example]

Q1: how can I move the priority of the job up so that
it can take advantage of the CPU / memory allotted?

Q2: why will some jobs hog system resources while others,
like this one, won't do much of anything?

Thanks!

-X

Re: priority on a process

From
"Chris Palmer"
Date:
Shaunn Johnson writes:

> I have a process running and it seems that it is not
> using very much of the available resources on the server

I have the same non-problem. :)

Basically Postgres is pretty darn efficient. On my server, Postgres is
using around 16MB of memory (very small db so far), and my Java Servlet
engine is using around 225MB. Yeah.

You can change the priority of any process with the nice and renice
commands. Read the man pages; they're pretty straightforward.
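If you do want to experiment, here is a minimal renice sketch. It uses a
throwaway background job rather than the backend PID from the thread
(11467), since that PID is specific to Shaunn's box; note that only root
can set negative (higher-priority) values.

```shell
# Lower the scheduling priority of a running process with renice.
sleep 30 &
pid=$!                       # stand-in for the postmaster backend PID
renice 10 "$pid"             # positive = lower priority; negative needs root
ps -o pid,ni,comm -p "$pid"  # the NI column should now read 10
kill "$pid"
```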

However, if your database server is doing nothing but Postgres (which is
generally how you should have it set up for production use), changing
the priority won't help.

The big thing is filesystem buffering and caching. As long as you have
plenty of RAM, Pg can cache filesystem data in RAM which is far faster
than fetching it from disk. Ideally you will have enough RAM to hold
your entire data set. The next best thing is to have a fast storage
array with many spindles.

Basically, don't worry about performance until it's a problem, and then
worry about the specific bottleneck that exists in your situation.


Re: priority on a process

From
"scott.marlowe"
Date:
On Fri, 16 May 2003, Johnson, Shaunn wrote:

> Running PostgreSQL 7.2.1 on RedHat Linux 7.2
>
> Q1: how can I move the priority of the job up so that
> it can take advantage of the CPU / memory allotted?
>
> Q2: why will some jobs hog system resources while others,
> like this one, won't do much of anything?

More than likely you are I/O bound, i.e. postgresql is waiting on the
disk subsystem to return data.  See if you can raise shared_buffers a
bit; the default of 64 is awfully small, and 500 to 1000 is a nice place
to start.  They're measured in 8k blocks, by the way, so 1024 of them is
only 8 megs of cache.  That little change alone speeds up postgresql
quite a bit.

Also, if your machine has memory to spare, look at increasing
effective_cache_size.  If this is set too low, postgresql will favor seq
scans, which are I/O intensive, but not usually CPU intensive.
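In postgresql.conf terms that might look like the fragment below (the
actual numbers are illustrative, not recommendations; shared_buffers
requires a server restart, and a large value may also require raising
the kernel's SHMMAX shared-memory limit):

```
# postgresql.conf -- both settings are counted in 8k disk blocks
shared_buffers = 1000           # 1000 * 8k = ~8 MB of shared buffer cache
effective_cache_size = 65536    # tells the planner to expect ~512 MB of OS cache
```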

If postgresql is running fast enough, then you can ignore all my advice;
if it's running a little slow, then this might help.


Re: priority on a process

From
"Chris Palmer"
Date:
Scott Marlowe writes:

> Also, if your machine has memory to spare, look at increasing
> effective_cache_size.

Doesn't Linux automatically grow the fs cache/buffer to most of
available memory? The OP said he was using Linux.

On e.g. OpenBSD you would have to set the size in the kernel config, but
ISTR Linux doesn't need that. Granted, I haven't followed Linux closely
since the 2.2 days, and I know there has been some VM hilarity in the
"stable" 2.4 series. So maybe someone can update my old understanding...


Re: priority on a process

From
"scott.marlowe"
Date:
On Fri, 16 May 2003, Chris Palmer wrote:

> Scott Marlowe writes:
>
> > Also, if your machine has memory to spare, look at increasing
> > effective_cache_size.
>
> Doesn't Linux automatically grow the fs cache/buffer to most of
> available memory? The OP said he was using Linux.
>
> On e.g. OpenBSD you would have to set the size in the kernel config, but
> ISTR Linux doesn't need that. Granted, I haven't followed Linux closely
> since the 2.2 days, and I know there has been some VM hilarity in the
> "stable" 2.4 series. So maybe someone can update my old understanding...

Yes, effective_cache_size is the postgresql setting that tells the
postmaster we have "about this much kernel cache" on average.

If it's set low, then postgresql assumes the kernel isn't caching much, if
it's higher, then it assumes it's more likely for data to be "in memory"
and makes index scans more likely than seq scans.

Sorry, I should have pointed out I was talking about a postgresql
configuration parameter and not a Linux one...


Re: priority on a process

From
Tom Lane
Date:
"Johnson, Shaunn" <SJohnson6@bcbsm.com> writes:
> I have a process running and it seems that it is not
> using very much of the available resources on the server

> PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND

> postgres 11467  3.4  0.3  8204 3176 ?        D    07:58  12:02 postgres:
> charles  testdb [local] UPDATE

It's waiting for disk I/O --- that's what the D means.

> Q2: why will some jobs hog system resources while others,
> like this one, won't do much of anything?

You have too narrow a view of what system resources are ;-).  I feel
sure that this query is banging the heck out of your disk.  When the
disk is the bottleneck, worrying about improving CPU usage is pointless.

            regards, tom lane

Re: priority on a process

From
"Chris Palmer"
Date:
Scott Marlowe writes:

> Yes, effective_cache_size is the postgresql setting that
> tells the postmaster we have "about this much kernel cache"
> on average.
>
> If it's set low, then postgresql assumes the kernel isn't
> caching much, if it's higher, then it assumes it's more
> likely for data to be "in memory"
> and makes index scans more likely than seq scans.
>
> Sorry, I should have pointed out I was talking about a
> postgresql configuration parameter and not a Linux one...

Ahh. Thanks for the tip (I am still new to Pg).

I guess the reason Pg can't ask the running kernel how much cache mem it
has is that any such thing would be completely non-portable.


Now branching the thread:

The documentation (doc/runtime-config.html in my 7.3.2 source tree) says
the value is a number of disk pages, and that disk pages are usually 8KB.
My filesystem has 4KB blocks; are blocks and pages the same thing in
this context, or does "page" refer to the in-memory copy of a disk
block? Are bigger blocks/pages on the filesystem used for Pg a good
idea? I would guess yes, since Pg's data files are few and large,
instead of many and small. Should I just crank my fs block size as big
as it will go, on a partition dedicated to pg_data?

Thanks.


Re: priority on a process

From
"scott.marlowe"
Date:
On Fri, 16 May 2003, Chris Palmer wrote:

>  Scott Marlowe writes:
>
> > Yes, effective_cache_size is the postgresql setting that
> > tells the postmaster we have "about this much kernel cache"
> > on average.
> >
> > If it's set low, then postgresql assumes the kernel isn't
> > caching much, if it's higher, then it assumes it's more
> > likely for data to be "in memory"
> > and makes index scans more likely than seq scans.
> >
> > Sorry, I should have pointed out I was talking about a
> > postgresql configuration parameter and not a Linux one...
>
> Ahh. Thanks for the tip (I am still new to Pg).
>
> I guess the reason Pg can't ask the running kernel how much cache mem it
> has is that any such thing would be completely non-portable.
>
>
> Now branching the thread:
>
> The documentation (doc/runtime-config.html in my 7.3.2 source tree) says
> the value is a number of disk pages, and that disk pages are usually 8KB.
> My filesystem has 4KB blocks; are blocks and pages the same thing in
> this context, or does "page" refer to the in-memory copy of a disk
> block? Are bigger blocks/pages on the filesystem used for Pg a good
> idea? I would guess yes, since Pg's data files are few and large,
> instead of many and small. Should I just crank my fs block size as big
> as it will go, on a partition dedicated to pg_data?

They mean it's measured in postgresql disk blocks, so it's the block
size postgresql was compiled with that matters here.  Linux is currently
limited to a maximum filesystem block size of whatever a memory page is,
and under x86 that's 4k.  On bigger hardware you can have larger blocks:
Itanium can do 8k, Itanium2 can do 16 to 64k, and most other 64 bit
archs do up to 64k page sizes.

Now, postgresql has its own block size, which defaults to 8k.

This is a good compromise in size.  If you're dealing with really large
rows all the time (say 32k and above) then going to a larger block size in
postgresql will help that.  However, this will come at the expense of
performance on smaller accesses.  Also, some subtle bugs have been known
to creep into postgresql when increasing the block size in the past, so
careful testing is called for if you're going to try increasing the block
size.

The idea behind blocks of 4k, as opposed to the default 512 byte sector
size of most hard drives, is that on a typical 9ms access hard drive
you'll spend 9ms moving the head, another 4 or 5 ms waiting for the head
to settle, and another 4 or so waiting for the sector you want to spin
around under the head you're using.  Since we're waiting on the drive
most of this time, we might as well make the best of it and read in a
bit more.  Linux sets its block size to a default of 4k, while
postgresql uses 8k blocks, which means the OS will make two "high level"
requests for those 4k blocks.  Since a lot of what gets accessed next
will likely be in the same block as what we're reading, it's usually a
win.  Note that once the sectors are under the head the actual read time
is tiny, say 1 ms or so.
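Plugging in the figures above (the two transfer times are my own rough
guesses, not from the post) makes the point concrete:

```shell
# Positioning cost dwarfs transfer cost, so the larger read is nearly free.
awk 'BEGIN {
    seek = 9; settle = 4.5; rot = 4      # ms, figures from the paragraph above
    xfer4k = 0.5; xfer8k = 1.0           # assumed transfer times, ms
    printf "4k read: %.1f ms, 8k read: %.1f ms\n",
           seek + settle + rot + xfer4k,
           seek + settle + rot + xfer8k
}'
# prints: 4k read: 18.0 ms, 8k read: 18.5 ms
```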

But, when block size starts rising, then the read time of each block
rises, and if you are only using on average a small part of each block
(say 1 or 2k) then you're reading more than you need.

Hence, optimal block size may be influenced by the type of load, but there
is a maximum size at which you really don't see much return, even with a
load that reads large chunks at a time.

Note also that the maximum size of an index entry is about 1/3 of a
(pgsql) block, so if you are indexing LARGE fields a larger block size
might be helpful.  But indexes on large fields tend to be suboptimal
compared to a full text indexing engine, and postgresql has two of those
available that work quite well, so the argument for a larger block size
kind of goes away for about 90% of all applications.


Re: priority on a process

From
Martijn van Oosterhout
Date:
On Fri, May 16, 2003 at 02:16:04PM -0400, Johnson, Shaunn wrote:
> <ps info>
> postgres 11467  3.4  0.3  8204 3176 ?        D    07:58  12:02 postgres:
> charles  testdb [local] UPDATE

As other people have pointed out, your disk is probably quite busy. Run
"vmstat 1" and look where the activity is.

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> "the West won the world not by the superiority of its ideas or values or
> religion but rather by its superiority in applying organized violence.
> Westerners often forget this fact, non-Westerners never do."
>   - Samuel P. Huntington
