Thread: 7.4.1 upgrade issues

7.4.1 upgrade issues

From
"Gavin M. Roy"
Date:
I upgraded my main production db from 7.3.4 last night to 7.4.1.  I'm
running into an issue where a big query that may take 30-40 seconds to
reply is holding up all other backends from performing their queries.
Once the big query is finished, all the tiny ones fly through.  This is
seemingly new behavior on the box, as with previous versions things would
slow down, but not wait for the cpu/resource hog queries to finish.  The
box is Slackware 8.1, on a fairly decent box with plenty of ram, cpu,
and disk speed.  I've considered renicing the processes, I was wondering
if anyone had a different suggestion.

TIA,
Gavin

Re: 7.4.1 upgrade issues

From
Andrew Sullivan
Date:
On Sat, Mar 06, 2004 at 01:12:57PM -0800, Gavin M. Roy wrote:
> I upgraded my main production db from 7.3.4 last night to 7.4.1.  I'm
> running into an issue where a big query that may take 30-40 seconds to
> reply is holding up all other backends from performing their queries.

By "holding up", do you mean that it's causing the other transactions
to block (INSERT WAITING, for instance), or that it's making
everything real slow?

It could be your sort_mem is set too high.  Remember that the
new-in-7.4 hash behaviour works with the sort_mem setting, and if
it's set too high and you have enough cases of this, you might
actually cause your box to start swapping.
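
As a rough back-of-the-envelope check (a sketch with invented numbers, not
measurements from this box): sort_mem is given in kilobytes and applies to
each sort or hash step separately, so the worst case is roughly sort_mem
times the number of such steps per query times the number of busy backends.
The function names and figures below are hypothetical.

```python
def worst_case_sort_mem_kb(sort_mem_kb, steps_per_query, busy_backends):
    """Upper bound, in kilobytes, on memory claimed by sort/hash steps."""
    return sort_mem_kb * steps_per_query * busy_backends


def may_swap(sort_mem_kb, steps_per_query, busy_backends, ram_kb):
    """True if the worst case alone would exceed physical RAM."""
    return worst_case_sort_mem_kb(sort_mem_kb, steps_per_query, busy_backends) > ram_kb


# Hypothetical settings: sort_mem = 65536 (64 MB), 3 sort/hash steps per
# query, 20 busy backends, 2 GB of RAM.
RAM_KB = 2 * 1024 * 1024
print(worst_case_sort_mem_kb(65536, 3, 20), "kB worst case")
print("could exceed RAM:", may_swap(65536, 3, 20, RAM_KB))
```

This ignores shared buffers and everything else on the machine, so it only
shows whether the setting is in a dangerous range at all.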

> and disk speed.  I've considered renicing the processes, I was wondering

That is unlikely to help, and certainly won't if the queries are
actually blocked.

--
Andrew Sullivan  | ajs@crankycanuck.ca
The plural of anecdote is not data.
        --Roger Brinner

Re: 7.4.1 upgrade issues

From
Mike Mascari
Date:
Gavin M. Roy wrote:
> I upgraded my main production db from 7.3.4 last night to 7.4.1.  I'm
> running into an issue where a big query that may take 30-40 seconds to
> reply is holding up all other backends from performing their queries.
> Once the big query is finished, all the tiny ones fly through.  This is
> seemingly new behavior on the box, as with previous versions things would
> slow down, but not wait for the cpu/resource hog queries to finish.  The
> box is Slackware 8.1, on a fairly decent box with plenty of ram, cpu,
> and disk speed.  I've considered renicing the processes, I was wondering
> if anyone had a different suggestion.

Hi Gavin.

Assuming a VACUUM ANALYZE after reload, one possibility is that the
query in question contains >= 11 joins. I forgot to adjust the GEQO
settings during an upgrade and experienced the associated
sluggishness in planning time.

Mike Mascari



Re: 7.4.1 upgrade issues

From
"Gavin M. Roy"
Date:
It's not WAITING; the larger queries are eating cpu (99%) and the rest
are running so slow it would seem they're waiting for processing time.
My sort mem is fairly high, but this is a dedicated box, and there is no
swapping going on afaik.

Gavin

Andrew Sullivan wrote:

>On Sat, Mar 06, 2004 at 01:12:57PM -0800, Gavin M. Roy wrote:
>
>
>>I upgraded my main production db from 7.3.4 last night to 7.4.1.  I'm
>>running into an issue where a big query that may take 30-40 seconds to
>>reply is holding up all other backends from performing their queries.
>>
>>
>
>By "holding up", do you mean that it's causing the other transactions
>to block (INSERT WAITING, for instance), or that it's making
>everything real slow?
>
>It could be your sort_mem is set too high.  Remember that the
>new-in-7.4 hash behaviour works with the sort_mem setting, and if
>it's set too high and you have enough cases of this, you might
>actually cause your box to start swapping.
>
>
>
>>and disk speed.  I've considered renicing the processes, I was wondering
>>
>>
>
>That is unlikely to help, and certainly won't if the queries are
>actually blocked.
>
>
>


Re: 7.4.1 upgrade issues

From
"Gavin M. Roy"
Date:
It is using indexes, not seqscans, and there was an analyze after
reload... I'll play with GEQO, thanks.

Gavin

Mike Mascari wrote:

> Gavin M. Roy wrote:
>
>> I upgraded my main production db from 7.3.4 last night to 7.4.1.  I'm
>> running into an issue where a big query that may take 30-40 seconds
>> to reply is holding up all other backends from performing their
>> queries.  Once the big query is finished, all the tiny ones fly
>> through.  This is seemingly new behavior on the box, as with previous
>> versions things would slow down, but not wait for the cpu/resource
>> hog queries to finish.  The box is Slackware 8.1, on a fairly decent
>> box with plenty of ram, cpu, and disk speed.  I've considered
>> renicing the processes, I was wondering if anyone had a different
>> suggestion.
>
>
> Hi Gavin.
>
> Assuming a VACUUM ANALYZE after reload, one possibility is that the
> query in question contains >= 11 joins. I forgot to adjust the GEQO
> settings during an upgrade and experienced the associated sluggishness
> in planning time.
>
> Mike Mascari
>
>


Re: 7.4.1 upgrade issues

From
"Jim Wilson"
Date:
"Gavin M. Roy" said:

> I upgraded my main production db from 7.3.4 last night to 7.4.1.  I'm
> running into an issue where a big query that may take 30-40 seconds to
> reply is holding up all other backends from performing their queries.
> Once the big query is finished, all the tiny ones fly through.  This is
> seemingly new behavior on the box, as with previous versions things would
> slow down, but not wait for the cpu/resource hog queries to finish.  The
> box is Slackware 8.1, on a fairly decent box with plenty of ram, cpu,
> and disk speed.  I've considered renicing the processes, I was wondering
> if anyone had a different suggestion.
>

It sounds like you are suggesting this same system and data worked fine on
7.3.4.  Just the same, you might want to provide more detail anyway.  EIDE
drives when used (not recommended for servers IMO) are often not configured
properly and can cause similar issues in a system with tons of ram and cpu.

Best,

Jim

--
Jim Wilson - IT Manager
Kelco Industries
PO Box 160
58 Main Street
Milbridge, ME 04658
207-546-7989 - FAX 207-546-2791
http://www.kelcomaine.com



Re: 7.4.1 upgrade issues

From
Tom Lane
Date:
"Gavin M. Roy" <gmr@ehpg.net> writes:
> It's not WAITING, the larger queries are eating cpu (99%) and the rest
> are running so slow it would seem they're waiting for processing time.

Could we see EXPLAIN ANALYZE output for the large query?  (Also the
usual supporting evidence, ie table schemas for all the tables
involved.)

            regards, tom lane

Re: 7.4.1 upgrade issues

From
"Gavin M. Roy"
Date:
I'll post it if you want, but the issue isn't with the optimizer, index
usage, or seq scan, the issue seems to be more revolving around the
backend getting so much cpu priority it's not allowing other backends to
process, or something along those lines.   For the hardware question
asked, it's an adaptec 7899 Ultra 160 SCSI card w/ accompanying fast
drives...

Again, I'll send the explain, etc if you think it would help answer my
question, but from my perspective, the amount of time the query takes to
execute isn't my issue, but the fact that nothing else can seemingly
execute while it's running.

Gavin

Tom Lane wrote:

>"Gavin M. Roy" <gmr@ehpg.net> writes:
>
>
>>It's not WAITING, the larger queries are eating cpu (99%) and the rest
>>are running so slow it would seem they're waiting for processing time.
>>
>>
>
>Could we see EXPLAIN ANALYZE output for the large query?  (Also the
>usual supporting evidence, ie table schemas for all the tables
>involved.)
>
>            regards, tom lane


Re: 7.4.1 upgrade issues

From
Tom Lane
Date:
"Gavin M. Roy" <gmr@ehpg.net> writes:
> ... the issue seems to be more revolving around the
> backend getting so much cpu priority it's not allowing other backends to
> process, or something along those lines.

I can't think of any difference between 7.3 and 7.4 that would create
a problem of that sort where there was none before.  For that matter,
since Postgres runs nonprivileged it's hard to see how it could create
a priority problem in any version.  I thought the previous suggestion
about added use of hashtables was a pretty good idea.  We could
confirm or disprove it by looking at EXPLAIN output.
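
A minimal sketch of that check, assuming the plan is available as plain
EXPLAIN text: scan it for hash-based steps, which are the ones sized by
sort_mem in 7.4. The marker list, the function name, and the sample plan
are illustrative assumptions, not output from the query in this thread.

```python
# Labels that mark hash-based steps in EXPLAIN text output (assumed list).
HASH_MARKERS = ("HashAggregate", "Hash Join", "hashed subplan")


def find_hash_steps(explain_text):
    """Return (marker, plan line) pairs for every hash-based step found."""
    hits = []
    for line in explain_text.splitlines():
        for marker in HASH_MARKERS:
            if marker in line:
                hits.append((marker, line.strip()))
    return hits


# Invented plan text, for illustration only.
SAMPLE_PLAN = """\
HashAggregate  (cost=25.00..27.50 rows=200 width=4)
  ->  Seq Scan on orders  (cost=0.00..20.00 rows=1000 width=4)
"""

for marker, line in find_hash_steps(SAMPLE_PLAN):
    print(marker, "->", line)
```

If the 7.4 plan shows hash steps that a 7.3 plan for the same query would
not have, the sort_mem theory becomes more plausible; if there are none, it
can probably be ruled out.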

            regards, tom lane

Re: 7.4.1 upgrade issues

From
"Karl O. Pinc"
Date:
This reminds me of the scheduler optimizations that have been flying
around Linux kernel development over the last year or so.  There are
cases apparently where this kind of behavior can come up.  IIRC it's
fixed in later kernels but don't take my word for it, I'm just writing
to give a heads-up.  Take a look at the Linux kernel mailing list,
and you'll probably find good articles at Linux Weekly News (lwn.net).

On 2004.03.06 23:32 Gavin M. Roy wrote:
> I'll post it if you want, but the issue isn't with the optimizer,
> index usage, or seq scan, the issue seems to be more revolving around
> the backend getting so much cpu priority it's not allowing other
> backends to process, or something along those lines.   For the
> hardware question asked, it's an adaptec 7899 Ultra 160 SCSI card w/
> accompanying fast drives...
>
> Again, I'll send the explain, etc if you think it would help answer
> my question, but from my perspective, the amount of time the query
> takes to execute isn't my issue, but the fact that nothing else can
> seemingly execute while it's running.
>
> Gavin
>
> Tom Lane wrote:
>
>> "Gavin M. Roy" <gmr@ehpg.net> writes:
>>
>>> It's not WAITING, the larger queries are eating cpu (99%) and the
>>> rest are running so slow it would seem they're waiting for
>>> processing time.
>>
>> Could we see EXPLAIN ANALYZE output for the large query?  (Also the
>> usual supporting evidence, ie table schemas for all the tables
>> involved.)

Karl <kop@meme.com>
Free Software:  "You don't pay back, you pay forward."
                  -- Robert A. Heinlein

Re: 7.4.1 upgrade issues

From
"Gavin M. Roy"
Date:
Thanks, I'll take a look, we've rewritten the queries and indexes to
avoid the issue, but I'd like to get an ultimate solution to the issue,
and the concept that it's a linux kernel scheduling thing is probably
dead on.

Gavin

Karl O. Pinc wrote:

> This reminds me of the scheduler optimizations that have been flying
> around Linux kernel development over the last year or so.  There are
> cases apparently where this kind of behavior can come up.  IIRC it's
> fixed in later kernels but don't take my word for it, I'm just writing
> to give a heads-up.  Take a look at the Linux kernel mailing list,
> and you'll probably find good articles at Linux Weekly News (lwn.net).
>
> On 2004.03.06 23:32 Gavin M. Roy wrote:
>
>> I'll post it if you want, but the issue isn't with the optimizer,
>> index usage, or seq scan, the issue seems to be more revolving around
>> the backend getting so much cpu priority it's not allowing other
>> backends to process, or something along those lines.   For the
>> hardware question asked, it's an adaptec 7899 Ultra 160 SCSI card w/
>> accompanying fast drives...
>>
>> Again, I'll send the explain, etc if you think it would help answer
>> my question, but from my perspective, the amount of time the query
>> takes to execute isn't my issue, but the fact that nothing else can
>> seemingly execute while it's running.
>>
>> Gavin
>>
>> Tom Lane wrote:
>>
>>> "Gavin M. Roy" <gmr@ehpg.net> writes:
>>>
>>>> It's not WAITING, the larger queries are eating cpu (99%) and the
>>>> rest are running so slow it would seem they're waiting for
>>>> processing time.
>>>
>>>
>>> Could we see EXPLAIN ANALYZE output for the large query?  (Also the
>>> usual supporting evidence, ie table schemas for all the tables
>>> involved.)
>>
>
> Karl <kop@meme.com>
> Free Software:  "You don't pay back, you pay forward."
>                  -- Robert A. Heinlein