Thread: Re: [PERFORM] Hanging queries on dual CPU windows

Re: [PERFORM] Hanging queries on dual CPU windows

From
"Magnus Hagander"
Date:
> > If so,
> > we could perhaps recode that part using a Mutex instead of a
> > critical section - since it's not a performance critical path, the
> > difference shouldn't be large. If I code up a patch for that, can
> > you re-apply SP1 and test it? Or is this a production system you
> > can't really touch?
>
> I can do whatever the hell I want with it, so if you could
> cook up a patch that would be great.
>
> As a BTW: I reinstalled SP1 and turned stats collection off.
> That also seems to work, but is not really a solution since
> we want to use autovacuuming.

Ok, I've coded up a patch that changes the code to use a mutex instead.
Patch attached. You can get a precompiled postgres.exe at
http://www.hagander.net/download/postgres.exe_mutex.zip. You need to
copy this file to postmaster.exe as well - they are supposed to be
identical. It's based off a snapshot of 8.1-stable.
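
For readers without the attachment, here is a minimal sketch of the kind of
change being described: taking a lock through a Win32 mutex handle
(CreateMutex / WaitForSingleObject / ReleaseMutex) where a critical section
would otherwise be entered and left. This is not the attached patch. The
names sketch_lock, bump_guarded_counter and guarded_counter are hypothetical,
and the pthread branch is only an assumption added so the sketch also builds
off Windows.

```c
#include <assert.h>

#ifdef _WIN32
#include <windows.h>

/* Hypothetical wrapper type; the real patch is not reproduced here. */
typedef struct { HANDLE handle; } sketch_lock;

/* Create an unnamed mutex that no thread owns yet. Returns 1 on success. */
int sketch_lock_init(sketch_lock *lock)
{
    lock->handle = CreateMutex(NULL, FALSE, NULL);
    return lock->handle != NULL;
}

/* Block until the mutex is ours. WAIT_ABANDONED also grants ownership. */
int sketch_lock_acquire(sketch_lock *lock)
{
    DWORD rc = WaitForSingleObject(lock->handle, INFINITE);
    return rc == WAIT_OBJECT_0 || rc == WAIT_ABANDONED;
}

/* Give the mutex back. Returns 1 on success. */
int sketch_lock_release(sketch_lock *lock)
{
    return ReleaseMutex(lock->handle) != 0;
}

void sketch_lock_destroy(sketch_lock *lock)
{
    CloseHandle(lock->handle);
}
#else
#include <pthread.h>

/* Assumed fallback so the sketch builds off Windows; not from the thread. */
typedef struct { pthread_mutex_t mutex; } sketch_lock;

int sketch_lock_init(sketch_lock *lock)
{
    return pthread_mutex_init(&lock->mutex, NULL) == 0;
}

int sketch_lock_acquire(sketch_lock *lock)
{
    return pthread_mutex_lock(&lock->mutex) == 0;
}

int sketch_lock_release(sketch_lock *lock)
{
    return pthread_mutex_unlock(&lock->mutex) == 0;
}

void sketch_lock_destroy(sketch_lock *lock)
{
    pthread_mutex_destroy(&lock->mutex);
}
#endif

/* Illustrative shared state protected by the lock. */
int guarded_counter = 0;

/* Increment the counter under the lock; returns the new value, or -1. */
int bump_guarded_counter(sketch_lock *lock)
{
    int value;

    if (!sketch_lock_acquire(lock))
        return -1;
    value = ++guarded_counter;
    if (!sketch_lock_release(lock))
        return -1;
    return value;
}
```

A mutex is a kernel object, so every acquire goes through a wait call,
whereas an uncontended critical section stays in user mode. That cost is
why the swap only makes sense on a path that is not performance critical.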

Looking at my system while testing this, it still looked like it was
hanging at that place in the code, even though I saw no problems. So I'm
not convinced we can actually trust the stacktrace from the non-default
threads. So I don't think this patch will actually work :-( But it's
worth a try.

(Oh, and I moved the thread over to -hackers, seems more correct at this
time)

//Magnus



Re: [PERFORM] Hanging queries on dual CPU windows

From
Jan de Visser
Date:
On Sunday 12 March 2006 09:40, Magnus Hagander wrote:
> > > If so,
> > > we could perhaps recode that part using a Mutex instead of a
> > > critical section - since it's not a performance critical path, the
> > > difference shouldn't be large. If I code up a patch for that, can
> > > you re-apply SP1 and test it? Or is this a production system you
> > > can't really touch?
> >
> > I can do whatever the hell I want with it, so if you could
> > cook up a patch that would be great.
> >
> > As a BTW: I reinstalled SP1 and turned stats collection off.
> > That also seems to work, but is not really a solution since
> > we want to use autovacuuming.
>
> Ok, I've coded up a patch that changes the code to use a mutex instead.
> Patch attached. You can get a precompiled postgres.exe at
> http://www.hagander.net/download/postgres.exe_mutex.zip. You need to
> copy this file to postmaster.exe as well - they are supposed to be
> identical. It's based off a snapshot of 8.1-stable.
>
> Looking at my system while testing this, it still looked like it was
> hanging at that place in the code, even though I saw no problems. So I'm
> not convinced we can actually trust the stacktrace from the non-default
> threads. So I don't think this patch will actually work :-( But it's
> worth a try.
>
> (Oh, and I moved the thread over to -hackers, seems more correct at this
> time)

Thanks Magnus,

I'll try tomorrow. Will let you know ASAP (8:30 EST I guess :).

If this doesn't work, how do we progress?

>
> //Magnus

jan

--
--------------------------------------------------------------
Jan de Visser                     jdevisser@digitalfairway.com

                Baruk Khazad! Khazad ai-menu!
--------------------------------------------------------------


Re: [PERFORM] Hanging queries on dual CPU windows

From
"Qingqing Zhou"
Date:
""Magnus Hagander"" <mha@sollentuna.net> wrote
> Ok, I've coded up a patch that changes the code to use a mutex instead.

Are we asserting the problem is caused by the spinlock's random wake-up order?
I am not sure why this would fix the problem. If my memory serves, a
critical section might be a problem if one process aborts unexpectedly while
it is inside. The other waiting processes would then never get a chance to
enter it (nor a chance to handle SIGQUIT) -- so this patch may solve this.
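
To illustrate the difference being pointed at: per the Win32 documentation,
if the owning thread terminates while holding a mutex, the next waiter is
handed ownership and WaitForSingleObject returns WAIT_ABANDONED, whereas the
state of a critical section whose owner terminated is undefined. The sketch
below only classifies the wait result. The enum and function names are
hypothetical, and the #define fallbacks are an assumption that mirrors the
documented SDK values so the code also builds off Windows.

```c
#include <assert.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Off Windows, mirror the documented SDK values so the sketch builds. */
#define WAIT_OBJECT_0  0x00000000UL
#define WAIT_ABANDONED 0x00000080UL
#define WAIT_TIMEOUT   0x00000102UL
#define WAIT_FAILED    0xFFFFFFFFUL
#endif

/* Hypothetical outcome names, introduced only for this illustration. */
typedef enum {
    LOCK_OK,            /* acquired normally */
    LOCK_OK_ABANDONED,  /* acquired, but the previous owner exited while
                           holding it, so guarded state may be inconsistent */
    LOCK_TIMED_OUT,     /* not acquired within the timeout */
    LOCK_ERROR          /* WAIT_FAILED or anything unexpected */
} lock_outcome;

/* Map a WaitForSingleObject() result on a mutex handle to an outcome. */
lock_outcome classify_wait(unsigned long wait_result)
{
    switch (wait_result) {
    case WAIT_OBJECT_0:
        return LOCK_OK;
    case WAIT_ABANDONED:
        return LOCK_OK_ABANDONED;
    case WAIT_TIMEOUT:
        return LOCK_TIMED_OUT;
    default:
        return LOCK_ERROR;
    }
}
```

On Windows the argument would be the return value of
WaitForSingleObject(mutex_handle, timeout_ms). With a finite timeout, a
waiter that would otherwise hang forever gets LOCK_TIMED_OUT and can at
least log where it was stuck.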

There is another suspect in http://www.devisser-siderius.com/stack1.jpg,
i.e., process 3 does shmctl. I once filed a bug about a server core dump on
win32 that reported WSAEWOULDBLOCK
(http://archives.postgresql.org/pgsql-bugs/2006-02/msg00185.php). AFAICS, it
is actually a mistranslated EINTR. There seems to be some relation between
these issues, but I haven't come up with a complete theory of it.

Regards,
Qingqing




Re: [PERFORM] Hanging queries on dual CPU windows

From
Jan de Visser
Date:
On Sunday 12 March 2006 09:40, Magnus Hagander wrote:
> Looking at my system while testing this, it still looked like it was
> hanging at that place in the code, even though I saw no problems. So I'm
> not convinced we can actually trust the stacktrace from the non-default
> threads. So I don't think this patch will actually work :-( But it's
> worth a try.

I'm afraid you're right. Hangs again :(

jan


--
--------------------------------------------------------------
Jan de Visser                     jdevisser@digitalfairway.com

                Baruk Khazad! Khazad ai-menu!
--------------------------------------------------------------


Re: [PERFORM] Hanging queries on dual CPU windows

From
Jan de Visser
Date:
On Monday 13 March 2006 09:26, Jan de Visser wrote:
> On Sunday 12 March 2006 09:40, Magnus Hagander wrote:
> > Looking at my system while testing this, it still looked like it was
> > hanging at that place in the code, even though I saw no problems. So I'm
> > not convinced we can actually trust the stacktrace from the non-default
> > threads. So I don't think this patch will actually work :-( But it's
> > worth a try.
>
> I'm afraid you're right. Hangs again :(

I now have the toolchain set up, so if you want me to try stuff, please let me
know. Resolving this is important to us.

On a whim, I replaced InitializeCriticalSection with
InitializeCriticalSectionAndSpinCount, since MSDN told me that would be
better for SMP. No joy.
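
A minimal sketch of the swap described above, assuming a single
file-scope critical section. The names sketch_crit_sec and
sketch_crit_sec_init are hypothetical and are not the identifiers used in
the PostgreSQL source; the pthread branch is only an assumption added so
the sketch also builds off Windows, where the spin count has no direct
equivalent and is ignored.

```c
#include <assert.h>

#ifdef _WIN32
#include <windows.h>

static CRITICAL_SECTION sketch_crit_sec;

/*
 * Initialize the critical section with a spin count. On a multiprocessor
 * machine a contended EnterCriticalSection() spins up to spin_count times
 * before falling back to a kernel wait; on a single processor the spin
 * count is ignored. Returns 1 on success.
 */
int sketch_crit_sec_init(unsigned long spin_count)
{
    return InitializeCriticalSectionAndSpinCount(&sketch_crit_sec,
                                                 (DWORD) spin_count) != 0;
}

void sketch_crit_sec_enter(void) { EnterCriticalSection(&sketch_crit_sec); }
void sketch_crit_sec_leave(void) { LeaveCriticalSection(&sketch_crit_sec); }
#else
#include <pthread.h>

/* Assumed fallback so the sketch builds off Windows; not from the thread. */
static pthread_mutex_t sketch_crit_sec;

int sketch_crit_sec_init(unsigned long spin_count)
{
    (void) spin_count;  /* no direct equivalent here; ignored */
    return pthread_mutex_init(&sketch_crit_sec, NULL) == 0;
}

void sketch_crit_sec_enter(void) { pthread_mutex_lock(&sketch_crit_sec); }
void sketch_crit_sec_leave(void) { pthread_mutex_unlock(&sketch_crit_sec); }
#endif
```

A call such as sketch_crit_sec_init(4000) would replace the plain
InitializeCriticalSection() call; 4000 is only an illustrative value. The
spin count changes how a contended lock waits, not whether it is ever
released, so it would not be expected to cure a genuine hang.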

jan

--
--------------------------------------------------------------
Jan de Visser                     jdevisser@digitalfairway.com

                Baruk Khazad! Khazad ai-menu!
--------------------------------------------------------------