Thread: "nice"/low priority Query
The short question: Is there any ways to give postgresql a hint that a particular SQL call should be run at lower priority? Since every db connection has a pid, I can manually run "renice" to scheduele it by the OS - but of course I can't do it manually all the time. The long story: We have a constantly growing database, and hence also a constantly growing load on the database server. A hardware upgrade has for different reasons been postponed, and it's still beeing postponed. We were hitting the first capacity problems in June, though so far I've managed to keep the situation in check by tuning the configuration, adding indices, optimizing queries, doing cacheing in the application, and at one point in the code I'm even asking the database for "explain plan", grepping out the estimated cost number, and referring the user to take contact with the IT-dept if he really needs the report. But I digress. Still there are lots of CPU power available - normally the server runs with 50-80% of the CPUs idle, it's just the spikes that kills us. We basically have two kind of queries that are significant - an ever-ongoing "critical" rush of simple queries, both reading and writing to the database, plus some few heavy "non-critical" read-only queries that may cause significant iowait. The problem comes when we are so unlucky that two or three heavy queries are run simultaneously; we get congestion problems - instead of the applications just running a bit slower, they run _much_ slower. Ideally, if it was trivial to give priorities, it should be possible to keep the CPUs running at 100% for hours without causing critical problems...? -- Tobias Brox, +47-91700050 Tromso, Norway
Tobias Brox <tobias@nordicbet.com> writes: > Is there any ways to give postgresql a hint that a > particular SQL call should be run at lower priority? Since every db > connection has a pid, I can manually run "renice" to scheduele it by the OS > - but of course I can't do it manually all the time. And it won't help you anyway, because renice only affects CPU priority not I/O scheduling ... which, by your description, is the real problem. I think the only thing that's likely to help much is trying to arrange that the "simple" queries only need to touch pages that are already in memory. Some playing around with shared_buffer sizing might help. Also, if you're not on PG 8.0.*, an update might help. regards, tom lane
Tom Lane wrote: > Tobias Brox <tobias@nordicbet.com> writes: > >>Is there any ways to give postgresql a hint that a >>particular SQL call should be run at lower priority? Since every db >>connection has a pid, I can manually run "renice" to scheduele it by the OS >>- but of course I can't do it manually all the time. > > And it won't help you anyway, because renice only affects CPU priority > not I/O scheduling ... which, by your description, is the real problem. > > I think the only thing that's likely to help much is trying to arrange > that the "simple" queries only need to touch pages that are already in > memory. Some playing around with shared_buffer sizing might help. > Also, if you're not on PG 8.0.*, an update might help. Would it be useful to be able to re-use the vacuum_cost_xxx settings in 8.0 for this sort of thing? I'm thinking a long-running report query isn't that different from a vacuum. -- Richard Huxton Archonet Ltd
On Tue, Aug 02, 2005 at 12:19:30PM -0400, Tom Lane wrote: > Tobias Brox <tobias@nordicbet.com> writes: > > Is there any ways to give postgresql a hint that a > > particular SQL call should be run at lower priority? Since every db > > connection has a pid, I can manually run "renice" to scheduele it by the OS > > - but of course I can't do it manually all the time. > > And it won't help you anyway, because renice only affects CPU priority > not I/O scheduling ... which, by your description, is the real problem. Actually, from what I've read 4.2BSD actually took priority into account when scheduling I/O. I don't know if this behavior is still present in FreeBSD or the like, though. So depending on the OS, priority could play a role in determining I/O scheduling. -- Jim C. Nasby, Database Consultant decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
[Tobias Brox - Tue at 06:04:34PM +0200] > (...) and at one > point in the code I'm even asking the database for "explain plan", grepping > out the estimated cost number, and referring the user to take contact with > the IT-dept if he really needs the report. But I digress. I just came to think about some more "dirty" tricks I can do. I have turned on stats collection in the configuration; now, if I do: select count(*) from pg_stat_activity where not current_query like '<IDLE>%'; or, eventually: select count(*) from pg_stat_activity where not current_query like '<IDLE>%' and query_start+'1 second'<now(); it will give a hint about how busy the database server is, thus I can eventually let the application sleep and retry if there are any other heavy queries in progress. -- Tobias Brox, +47-91700050 Nordicbet, IT dept
Tobias Brox wrote: > [Tobias Brox - Tue at 06:04:34PM +0200] > >>(...) and at one >>point in the code I'm even asking the database for "explain plan", grepping >>out the estimated cost number, and referring the user to take contact with >>the IT-dept if he really needs the report. But I digress. > > > I just came to think about some more "dirty" tricks I can do. I have turned > on stats collection in the configuration; now, if I do: > > select count(*) from pg_stat_activity where not current_query like '<IDLE>%'; > > or, eventually: > > select count(*) from pg_stat_activity > where not current_query like '<IDLE>%' and query_start+'1 second'<now(); > > it will give a hint about how busy the database server is, thus I can > eventually let the application sleep and retry if there are any other heavy > queries in progress. Or - create a table with an estimated_cost column, when you start a new "heavy" query, insert that query's cost, then sleep SUM(estimated_cost)/100 secs or something. When the query ends, delete the cost-row. Hmm - actually rather than dividing by 100, perhaps make it a tunable value. -- Richard Huxton Archonet Ltd
Jim C. Nasby wrote: > Actually, from what I've read 4.2BSD actually took priority into account > when scheduling I/O. FWIW, you can set I/O priority in recent versions of the Linux kernel using ionice, which is part of RML's schedutils package (which was recently merged into util-linux). -Neil