| Plugged In Online Gaming, and Technology |
| | #1 (permalink) |
| Active Member Join Date: Aug 2006 Location: Boise, Idaho | Odd server lockup/reboot/shutdown problem
So I've built myself a server with the following parts:

4U Rackmount case w/ 1x 120mm front fan and 2x 80mm rear fans
iStarUSA TC-500PD1 500W Server PSU
Asus M2N-LR motherboard
3 x Western Digital 160GB SATA 3.0 in RAID 5
AMD Opteron 1214 Santa Ana 2.2GHz AM2
4x Kingston 1GB DDR2 SDRAM 667 ECC Unbuffered
Samsung DVD-RW SATA
Windows 2003 Enterprise x64 Ed.

The system went together fine, and seems to operate fine. All parts have checked out OK when tested using various methods (Ultimate Boot CD, burn-in software, etc.). Under a load the system has no issues and responds quickly. When left idle the system may stop responding, reboot, or power off. Since the day it was built on Feb. 7th, it has stopped responding 3 times, rebooted 2 times and powered off 1 time.

I'm at a loss. I can't catch the machine actually locking up/rebooting/shutting down, I only find it after the fact. I've checked the event logs and nothing looks hinky. It only happens when idle. I've checked and re-checked that.

I only have one theory, and am testing that now. The motherboard can control the CPU fan speed, and seems to have it spin down quite low when the machine is not under a load. I'm trying to see if it is possible that the CPU fan can idle so low that the motherboard thinks it's dead and tries to shut the system down. I'm turning off that feature so the CPU fan is full bore all the time.

Any other thoughts? I currently have it running another 5 hour test.
__________________ -- Sayonara, not to be confused with cyanide, which is, of course, goodbye in any language. |
| | #3 (permalink) | |
| AKA Doughnut Holeschtein Join Date: Mar 2006 |
You may need to turn off apcmi support. Some implementations are buggy. Disable this in the BIOS and see if the issue persists.

Regards,
Eric
__________________ Rogue Cell #5
| | #4 (permalink) | |
| I have a Phantom |
A quick scan of the Asus web site shows there is a new BIOS dated Jan. 30.
Last edited by Rayodder; 02-11-2008 at 07:06 PM. | |
| | #6 (permalink) |
| Active Member Join Date: Aug 2006 Location: Boise, Idaho |
Lrrpie-CT - None that I see. I can see the comments that I make when the machine comes back up. Those have a reason code in the form of "0x00000".

ehalcik - I'll have to check that.

Rayodder - Good thought. That slipped my mind.
| | #7 (permalink) |
| Active Member Join Date: Aug 2006 Location: Boise, Idaho |
Ah ha! Got a stop error!

Event Type: Warning
Event Source: USER32
Event Category: None
Event ID: 1076
Date: 2/11/2008
Time: 5:08:43 PM
User: CAMPUS-BBC\administrator
Computer: XXXXXXX
Description: The reason supplied by user XXXXXXXXX or for the last unexpected shutdown of this computer is: System Failure: Stop error
Reason Code: 0x805000f
Bug ID: Stop error
Bugcheck String: 0x0000007e (0xffffffffc0000005, 0x0000000000000000, 0xfffffadf856e6b80, 0xfffffadf856e6590)
Comment: 0x0000007e (0xffffffffc0000005, 0x0000000000000000, 0xfffffadf856e6b80, 0xfffffadf856e6590)
For more information, see Help and Support Center at TechNet Events And Errors Message Center: Basic Search.
Data: 0000: 0f 00 05 08 ....

------------------------------------

Then when I chose to send the information on the error to Microsoft, I got this information:

Problem caused by Device Driver

You received this message because a device driver installed on your computer caused the Windows operating system to stop unexpectedly. This type of error is referred to as a "stop error." A stop error requires you to restart your computer.

More information
--------------------------------------------------------------------------------
Problem report summary

Problem type: Windows stop error (a message appears on a blue screen with error code information)
Solution available? No
What does this problem mean? Windows has encountered a problem it cannot recover from and it needs to be restarted
Cause: Unknown
Computer symptoms: A message appears on a blue screen with error code information (for example: 0x0000001E, KMODE_EXCEPTION_NOT_HANDLED)
Additional steps for you to take: Please continue to send problem reports so analysts at Microsoft can study and try to correct the problem as quickly as possible
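For anyone reading the event above later, here is a rough Python sketch of how the numbers in it can be picked apart. It is only an illustration: the function names (`parse_bugcheck`, `describe`, `reason_code_from_data`) are made up for this post, and the lookup tables hold just the codes that appear in this thread. It assumes the usual meaning of stop 0x7E, where the first parameter is the exception code (here 0xC0000005, an access violation), and it shows that the `Data` bytes are the Reason Code stored little-endian.

```python
import re

# Only the codes that show up in this thread; extend as needed.
BUGCHECK_NAMES = {
    0x0000007E: "SYSTEM_THREAD_EXCEPTION_NOT_HANDLED",
    0x0000001E: "KMODE_EXCEPTION_NOT_HANDLED",
}
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def parse_bugcheck(text):
    """Split a 'Bugcheck String' value into (stop_code, [parameters])."""
    match = re.match(r"\s*(0x[0-9a-fA-F]+)\s*\(([^)]*)\)", text)
    if match is None:
        raise ValueError("not a bugcheck string: %r" % text)
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    return code, params


def describe(text):
    """Return a one-line summary of the stop code and exception code."""
    code, params = parse_bugcheck(text)
    name = BUGCHECK_NAMES.get(code, "unknown stop code")
    # The log shows the exception code sign-extended to 64 bits,
    # so keep only the low 32 bits before looking it up.
    status = params[0] & 0xFFFFFFFF
    status_name = NTSTATUS_NAMES.get(status, "unknown exception code")
    return "0x%08X %s, exception 0x%08X %s" % (code, name, status, status_name)


def reason_code_from_data(hex_bytes):
    """Interpret the event's Data bytes as a little-endian integer."""
    return int.from_bytes(bytes.fromhex(hex_bytes), "little")


if __name__ == "__main__":
    bugcheck = ("0x0000007e (0xffffffffc0000005, 0x0000000000000000, "
                "0xfffffadf856e6b80, 0xfffffadf856e6590)")
    print(describe(bugcheck))
    print(hex(reason_code_from_data("0f 00 05 08")))
```

The summary only names the stop code and exception; it does not say which driver faulted. That would need the memory dump itself.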
| | #8 (permalink) |
| Active Member Join Date: Aug 2006 Location: Boise, Idaho |
Thanks for all the help, folks! I was able to narrow down the issue. Updating the BIOS and drivers kept it from crashing, which then allowed the event log to capture an error: hard disk failure! One of the drives is defective and kept breaking the RAID array. With the new BIOS and RAID drivers the system no longer crashes when the drive wonks out. RMA time!
| | #9 (permalink) |
| Will Code For Schlitz |
Good catch! Better get on that drive though! I just spent the better part of a Friday night (Grrrrrr...always on Fridays!) chasing down a similar ghost. Turned out the embedded RAID controller was getting flaky and would just randomly drop the entire container whenever it felt moody. Our 4 hour parts guarantee from HP is usually fine, but starting the 4 hour clock at 10pm on a Friday isn't my idea of a fun night...you'd be surprised how quickly another system becomes "un-necessary" when you need a part for the company owner's private server!
__________________ CHAPPY FOR GENERAL: Because he'll shoot your mom to win a game. Strive for that moment when you're only a slice of pizza and a hooker away from paradise. |