Re: Cygwin PostgreSQL Regression Test Problems (Revisited) - Mailing list pgsql-ports

From Jason Tishler
Subject Re: Cygwin PostgreSQL Regression Test Problems (Revisited)
Msg-id 20010328173449.E510@dothill.com
In response to Re: Cygwin PostgreSQL Regression Test Problems (Revisited)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Cygwin PostgreSQL Regression Test Problems (Revisited)
List pgsql-ports
Tom,

On Wed, Mar 28, 2001 at 04:40:30PM -0500, Tom Lane wrote:
> Jason Tishler <Jason.Tishler@dothill.com> writes:
> > I've done the above and it seems to indicate that all backends exited
> > with a status of 0.  So, I still don't know why some backends "aborted."
>
> Hm.  So what exactly is the failure mode?  Do the psql processes report
> any errors?  Have they produced (any/all of) the expected output?

The failure mode is always something like the following:

The regression test proceeds normally until one of the larger parallel
groups is running.  Then it will hang after output such as:

parallel group (18 tests):  point lseg box path circle date polygon time abstime inet interval reltime type_sanity oidjoins opr_sanity timestamp...

If I do a ps, I will see the postmaster process and one or more psql
processes.  The corresponding postgres processes are no longer running.
(Were they ever running?)  The NT Task Manager shows essentially 100% idle.

I usually kill the psql processes with the following command:

    kill $(ps | fgrep psql | awk '{print $1}')
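A slightly more defensive variant of the command above, as a sketch only.
It assumes that the PID is the first column of the ps output and the
command is the last column, the same layout the one-liner relies on.  The
psql_pids function name is hypothetical; it matches only lines whose
command is psql (or psql.exe), so the PIDs can be reviewed before
anything is killed.

```shell
# Print the PID (first column) of every ps line whose command
# (last column) is psql or psql.exe.  Reads ps-style output on stdin.
# Assumption: PID is column 1 and the command path is the last column.
psql_pids() {
    awk '$NF ~ /(^|\/)psql(\.exe)?$/ { print $1 }'
}

# Usage (not executed here): list first, then kill if the list looks right.
#   ps | psql_pids
#   kill $(ps | psql_pids)
```

Matching on the command column instead of the whole line avoids picking up
unrelated processes that merely have "psql" somewhere else in their ps
entry.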

Then the regression test will continue with output like the following:

                                                             ...Signal 15
Signal 15
 comments tinterval
     point                ... ok
     lseg                 ... ok
     box                  ... ok
     path                 ... ok
     polygon              ... ok
     circle               ... ok
     date                 ... ok
     time                 ... ok
     timestamp            ... ok
     interval             ... ok
     abstime              ... ok
     reltime              ... ok
     tinterval            ... FAILED
     inet                 ... ok
     comments             ... FAILED
     oidjoins             ... ok
     type_sanity          ... ok
     opr_sanity           ... ok
test geometry             ... ok
..

I believe that the "failures" above correspond to the psql processes
that I killed.

Sometimes the regression test will run to completion without any more
hangs.  Sometimes it will hang at one or more large parallel groups.  If
I continue to kill the psql processes as above, the regression test will
eventually complete (with more "failures").

I've tried another experiment: killing a postgres backend to see if
the psql process notices the backend dying.  It does, but I was only able
to kill the postgres backend with kill -9.  Otherwise, postgres ignored
the signal.  So, I don't know if my experiment was valid.  If a backend
exits normally while a psql is connected, will the psql process notice
this event?

Any other suggestions?  Or, should I just run the serial_schedule and
stop my head banging?

Thanks,
Jason
--
Jason Tishler
Director, Software Engineering       Phone: +1 (732) 264-8770 x235
Dot Hill Systems Corp.               Fax:   +1 (732) 264-8798
82 Bethany Road, Suite 7             Email: Jason.Tishler@dothill.com
Hazlet, NJ 07730 USA                 WWW:   http://www.dothill.com
