Received: (qmail 22280 invoked by uid 2012); 7 May 1999 23:14:56 -0000 Message-Id: <19990507231456.22276.qmail@hyperreal.org> Date: 7 May 1999 23:14:56 -0000 From: Karyn Joseph Reply-To: karyn.joseph@teradyne.com To: apbugs@hyperreal.org Subject: [notice] child pid 1525 exit signal Segmentation Fault (11) X-Send-Pr-Version: 3.2 >Number: 4374 >Category: os-solaris >Synopsis: [notice] child pid 1525 exit signal Segmentation Fault (11) >Confidential: no >Severity: serious >Priority: medium >Responsible: apache >State: open >Class: sw-bug >Submitter-Id: apache >Arrival-Date: Fri May 7 16:20:00 PDT 1999 >Last-Modified: Sun Jun 13 05:19:54 PDT 1999 >Originator: karyn.joseph@teradyne.com >Organization: >Release: 1.3.4 >Environment: binder# uname -a SunOS binder 5.7 Generic sun4u sparc SUNW,Ultra-5_10 binder# >Description: At least once a week (not including the once a week cron job I'm now running) I have to reboot the server due to slow proxy response, and hung jobs. I run the ../bin/apachectl stop but sometimes after I run this there are still httpd processes running. Initially they will be running off of the "killed" httpd process, but then they will be showing as children of the system process "1". When I try to do a kill -9 there is no response, and then I cannot start httpd, because it shows that port 80 is already "occupied." This results in me having to do a reboot of the server which really is not a good fix or work-a-round. Any help on this matter would be GREATLY appreciated. This is really starting to be a real problem for me. Another thing I noticed was that most of the http child processes show time stamps of a pretty recent hour, but the ones that won't die tend to be very "old" processes, sometimes hours but most of the time days. >How-To-Repeat: >Fix: I need to find a way to unconditionally kill these hung processes, and release control over port 80. >Audit-Trail: State-Changed-From-To: open-feedback State-Changed-By: lars State-Changed-When: Sun May 9 09:48:58 PDT 1999 State-Changed-Why: Do you use any custom modules? Do you know in what situations the segmentation fault occurs (can you reproduce it)? Release-Changed-From-To: version 1.3.4-1.3.4 Release-Changed-By: lars Release-Changed-When: Sun May 9 09:48:58 PDT 1999 From: karyn joseph To: apache-bugdb@apache.org, karyn.joseph@teradyne.com, lars@apache.org Cc: apbugs@apache.org Subject: Re: os-solaris/4374: [notice] child pid 1525 exit signal Segmentation Fault (11) Date: Wed, 19 May 1999 12:06:52 -0700 (PDT) ------------- Begin Forwarded Message ------------- Date: Mon, 10 May 1999 10:29:07 -0700 (PDT) From: karyn joseph Subject: Re: os-solaris/4374: [notice] child pid 1525 exit signal Segmentation Fault (11) To: apache-bugdb@apache.org, karyn.joseph@teradyne.com, lars@apache.org MIME-Version: 1.0 Content-MD5: 84Ax6mn1meWYK2nXEbDYQg== No custom modules in place. The segmentation fault happens periodically while the server is up and running. As I mentioned in the initial report, I will see jobs that have never released for over a period of time, and these jobs will not die, even after the initial pid has been killed. In other words, it is reproduced constantly. Although I don't know what is happening to hold the specific jobs in their "open" state. Karyn > Date: 9 May 1999 16:48:59 -0000 > To: apache-bugdb@apache.org, karyn.joseph@teradyne.com, lars@apache.org > From: lars@apache.org > Subject: Re: os-solaris/4374: [notice] child pid 1525 exit signal Segmentation Fault (11) > Mime-Version: 1.0 > > [In order for any reply to be added to the PR database, ] > [you need to include in the Cc line ] > [and leave the subject line UNCHANGED. This is not done] > [automatically because of the potential for mail loops. ] > [If you do not include this Cc, your reply may be ig- ] > [nored unless you are responding to an explicit request ] > [from a developer. ] > [Reply only with text; DO NOT SEND ATTACHMENTS! ] > > > Synopsis: [notice] child pid 1525 exit signal Segmentation Fault (11) > > State-Changed-From-To: open-feedback > State-Changed-By: lars > State-Changed-When: Sun May 9 09:48:58 PDT 1999 > State-Changed-Why: > > Do you use any custom modules? > Do you know in what situations the segmentation fault > occurs (can you reproduce it)? > > Release-Changed-From-To: version 1.3.4-1.3.4 > Release-Changed-By: lars > Release-Changed-When: Sun May 9 09:48:58 PDT 1999 > ------------- End Forwarded Message ------------- From: karyn joseph To: apache-bugdb@apache.org, karyn.joseph@teradyne.com, lars@apache.org Cc: apbugs@apache.org Subject: Re: os-solaris/4374: [notice] child pid 1525 exit signal Segmentation Fault (11) Date: Wed, 19 May 1999 12:09:59 -0700 (PDT) ------------- Begin Forwarded Message ------------- Date: Tue, 11 May 1999 17:55:33 -0700 (PDT) From: karyn joseph Subject: Re: os-solaris/4374: [notice] child pid 1525 exit signal Segmentation Fault (11) To: apache-bugdb@apache.org, karyn.joseph@teradyne.com, lars@apache.org MIME-Version: 1.0 Content-MD5: mdhfD4op2qIx1OuKc80B8A== Here is a sample of the hung jobs. This was done at 17:40. Notice all the jobs running since 05:00, etc: binder# ps -ef | grep httpd nobody 4595 291 0 05:38:28 ? 0:00 /usr/local/apache_www/bin/httpd nobody 12399 291 0 06:38:35 ? 0:02 /usr/local/apache_www/bin/httpd nobody 11668 291 0 17:29:31 ? 0:00 /usr/local/apache_www/bin/httpd root 291 1 0 00:17:12 ? 0:20 /usr/local/apache_www/bin/httpd nobody 1013 291 0 05:08:42 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4373 291 0 05:33:58 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4241 291 0 05:30:45 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4714 291 0 05:40:24 ? 0:00 /usr/local/apache_www/bin/httpd nobody 17797 291 0 04:08:08 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4686 291 0 05:39:56 ? 0:00 /usr/local/apache_www/bin/httpd nobody 5804 291 0 05:52:23 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4129 291 0 05:28:55 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4371 291 0 05:33:55 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4131 291 0 05:28:55 ? 0:00 /usr/local/apache_www/bin/httpd nobody 6050 291 0 05:54:10 ? 0:00 /usr/local/apache_www/bin/httpd nobody 5794 291 0 05:52:22 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4531 291 0 05:37:35 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4471 291 0 05:37:12 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4710 291 0 05:40:23 ? 0:00 /usr/local/apache_www/bin/httpd nobody 5005 291 0 05:43:26 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4381 291 0 05:34:25 ? 0:00 /usr/local/apache_www/bin/httpd nobody 9090 291 0 03:07:28 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4337 291 0 05:32:24 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4589 291 0 05:38:27 ? 0:00 /usr/local/apache_www/bin/httpd nobody 6214 291 0 05:55:54 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4153 291 0 05:29:46 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4355 291 0 05:32:57 ? 0:00 /usr/local/apache_www/bin/httpd nobody 24799 291 0 04:35:13 ? 0:00 /usr/local/apache_www/bin/httpd nobody 17259 291 0 04:06:02 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4133 291 0 05:28:55 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4164 291 0 05:30:00 ? 0:00 /usr/local/apache_www/bin/httpd nobody 13712 291 0 17:32:57 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4194 291 0 05:30:26 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4351 291 0 05:32:55 ? 0:00 /usr/local/apache_www/bin/httpd nobody 6134 291 0 05:54:54 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4146 291 0 05:29:13 ? 0:00 /usr/local/apache_www/bin/httpd nobody 7310 291 0 06:07:42 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4529 291 0 05:37:34 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4123 291 0 05:28:54 ? 0:00 /usr/local/apache_www/bin/httpd nobody 5042 291 0 05:43:30 ? 0:00 /usr/local/apache_www/bin/httpd nobody 4119 291 0 05:28:53 ? 0:00 /usr/local/apache_www/bin/httpd . . . . nobody 14623 291 0 17:35:39 ? 0:00 /usr/local/apache_www/bin/httpd nobody 2816 291 0 15:44:16 ? 0:00 /usr/local/apache_www/bin/httpd nobody 11233 291 0 17:28:54 ? 0:00 /usr/local/apache_www/bin/httpd nobody 9088 291 0 17:25:28 ? 0:00 /usr/local/apache_www/bin/httpd nobody 14303 291 0 17:34:00 ? 0:00 /usr/local/apache_www/bin/httpd nobody 10659 291 0 17:27:50 ? 0:00 /usr/local/apache_www/bin/httpd nobody 17740 291 0 13:58:13 ? 0:00 /usr/local/apache_www/bin/httpd nobody 22397 291 0 15:29:11 ? 0:08 /usr/local/apache_www/bin/httpd nobody 9948 291 0 17:26:41 ? 0:00 /usr/local/apache_www/bin/httpd nobody 12116 291 0 17:30:05 ? 0:00 /usr/local/apache_www/bin/httpd nobody 19447 291 0 15:25:32 ? 0:06 /usr/local/apache_www/bin/httpd nobody 12185 291 0 17:30:09 ? 0:00 /usr/local/apache_www/bin/httpd nobody 14794 291 0 17:36:13 ? 0:00 /usr/local/apache_www/bin/httpd nobody 14062 291 0 17:33:31 ? 0:00 /usr/local/apache_www/bin/httpd nobody 6883 291 0 13:43:15 ? 0:00 /usr/local/apache_www/bin/httpd nobody 9851 291 0 17:26:37 ? 0:00 /usr/local/apache_www/bin/httpd nobody 23369 291 0 11:36:30 ? 0:00 /usr/local/apache_www/bin/httpd nobody 14436 291 0 17:34:39 ? 0:00 /usr/local/apache_www/bin/httpd nobody 14913 291 0 17:36:57 ? 0:00 /usr/local/apache_www/bin/httpd nobody 5691 291 0 13:41:07 ? 0:00 /usr/local/apache_www/bin/httpd nobody 19262 291 0 15:25:18 ? 0:06 /usr/local/apache_www/bin/httpd nobody 7882 291 0 13:44:48 ? 0:00 /usr/local/apache_www/bin/httpd nobody 10693 291 0 17:27:53 ? 0:00 /usr/local/apache_www/bin/httpd nobody 17028 291 0 13:57:26 ? 0:00 /usr/local/apache_www/bin/httpd nobody 1541 291 0 15:42:33 ? 0:00 /usr/local/apache_www/bin/httpd nobody 5749 291 0 13:41:10 ? 0:00 /usr/local/apache_www/bin/httpd nobody 2161 291 0 15:43:23 ? 0:00 /usr/local/apache_www/bin/httpd nobody 12921 291 0 17:31:10 ? 0:00 /usr/local/apache_www/bin/httpd nobody 29123 291 0 15:38:34 ? 0:00 /usr/local/apache_www/bin/httpd binder# Then I run a stop: binder# stophttpd /usr/local/apache_www/bin/apachectl stop: httpd stopped After running that I still get: binder# !! ps -ef | grep httpd nobody 1013 1 0 05:08:42 ? 0:00 /usr/local/apache_www/bin/httpd nobody 17797 1 0 04:08:08 ? 0:00 /usr/local/apache_www/bin/httpd nobody 9090 1 0 03:07:28 ? 0:00 /usr/local/apache_www/bin/httpd nobody 17259 1 0 04:06:02 ? 0:00 /usr/local/apache_www/bin/httpd nobody 548 1 0 17:11:02 ? 0:00 /usr/local/apache_www/bin/httpd root 15595 14175 0 17:51:50 pts/2 0:00 grep httpd nobody 2816 1 0 15:44:16 ? 0:00 /usr/local/apache_www/bin/httpd nobody 19262 1 0 15:25:18 ? 0:06 /usr/local/apache_www/bin/httpd nobody 1541 1 0 15:42:33 ? 0:00 /usr/local/apache_www/bin/httpd nobody 2161 1 0 15:43:23 ? 0:00 /usr/local/apache_www/bin/httpd nobody 29123 1 0 15:38:34 ? 0:00 /usr/local/apache_www/bin/httpd binder# These I cannot kill: binder# kill -9 1013 binder# !ps ps -ef | grep httpd nobody 1013 1 0 05:08:42 ? 0:00 /usr/local/apache_www/bin/httpd nobody 17797 1 0 04:08:08 ? 0:00 /usr/local/apache_www/bin/httpd nobody 9090 1 0 03:07:28 ? 0:00 /usr/local/apache_www/bin/httpd nobody 17259 1 0 04:06:02 ? 0:00 /usr/local/apache_www/bin/httpd nobody 548 1 0 17:11:02 ? 0:00 /usr/local/apache_www/bin/httpd root 15694 14175 0 17:54:51 pts/2 0:00 grep httpd nobody 2816 1 0 15:44:16 ? 0:00 /usr/local/apache_www/bin/httpd nobody 19262 1 0 15:25:18 ? 0:06 /usr/local/apache_www/bin/httpd nobody 1541 1 0 15:42:33 ? 0:00 /usr/local/apache_www/bin/httpd nobody 2161 1 0 15:43:23 ? 0:00 /usr/local/apache_www/bin/httpd nobody 29123 1 0 15:38:34 ? 0:00 /usr/local/apache_www/bin/httpd binder# The log has: binder# tail /net/aftp/webserver/logs/error_log httpd: [Tue May 11 17:51:46 1999] [error] could not make child process 9090 exit, attempting to continue anyway httpd: [Tue May 11 17:51:46 1999] [error] could not make child process 17259 exit, attempting to continue anyway httpd: [Tue May 11 17:51:46 1999] [error] could not make child process 1013 exit, attempting to continue anyway httpd: [Tue May 11 17:51:46 1999] [error] could not make child process 1541 exit, attempting to continue anyway httpd: [Tue May 11 17:51:46 1999] [error] could not make child process 19262 exit, attempting to continue anyway httpd: [Tue May 11 17:51:46 1999] [error] could not make child process 29123 exit, attempting to continue anyway httpd: [Tue May 11 17:51:46 1999] [error] could not make child process 2161 exit, attempting to continue anyway httpd: [Tue May 11 17:51:46 1999] [error] could not make child process 548 exit, attempting to continue anyway httpd: [Tue May 11 17:51:46 1999] [error] could not make child process 2816 exit, attempting to continue anyway httpd: [Tue May 11 17:51:46 1999] [notice] caught SIGTERM, shutting down binder# At this point the server is in a basic stop state, and is running jobs. So if I reboot the machine it will cause major problems. If I could just somehow take over control of the port and make these jobs "belong" somewhere else it would be so helpful. I hope this gives you some more information to go on. HELP!! Please, Karyn > Date: 9 May 1999 16:48:59 -0000 > To: apache-bugdb@apache.org, karyn.joseph@teradyne.com, lars@apache.org > From: lars@apache.org > Subject: Re: os-solaris/4374: [notice] child pid 1525 exit signal Segmentation Fault (11) > Mime-Version: 1.0 > > [In order for any reply to be added to the PR database, ] > [you need to include in the Cc line ] > [and leave the subject line UNCHANGED. This is not done] > [automatically because of the potential for mail loops. ] > [If you do not include this Cc, your reply may be ig- ] > [nored unless you are responding to an explicit request ] > [from a developer. ] > [Reply only with text; DO NOT SEND ATTACHMENTS! ] > > > Synopsis: [notice] child pid 1525 exit signal Segmentation Fault (11) > > State-Changed-From-To: open-feedback > State-Changed-By: lars > State-Changed-When: Sun May 9 09:48:58 PDT 1999 > State-Changed-Why: > > Do you use any custom modules? > Do you know in what situations the segmentation fault > occurs (can you reproduce it)? > > Release-Changed-From-To: version 1.3.4-1.3.4 > Release-Changed-By: lars > Release-Changed-When: Sun May 9 09:48:58 PDT 1999 > ------------- End Forwarded Message ------------- State-Changed-From-To: feedback-open State-Changed-By: lars State-Changed-When: Sun Jun 13 05:19:54 PDT 1999 State-Changed-Why: >Unformatted: [In order for any reply to be added to the PR database, ] [you need to include in the Cc line ] [and leave the subject line UNCHANGED. This is not done] [automatically because of the potential for mail loops. ] [If you do not include this Cc, your reply may be ig- ] [nored unless you are responding to an explicit request ] [from a developer. ] [Reply only with text; DO NOT SEND ATTACHMENTS! ]