Friday, February 3, 2017

Fixing the Error: The event log has overflowed and needs to be cleared.

I recently bought two Dell R710 servers and needed to test them to see whether they were functional or had issues to resolve before trying to sell them.  In the BIOS I confirmed the R710s had dual Xeon 5530s running at 2.4 GHz with 2.66 GHz turbo.  However, I didn't see any mention of Hyper-Threading in the BIOS.  I remembered that an easy way to see how many real cores and Hyper-Threading cores you have is to run the MpMemory test, which uses every core to put your RAM through its paces.  So I loaded it up, went to custom scan, selected all the short tests rather than just the default of 7/8, hit Enter, and got an error.

The event log has overflowed and needs to be cleared.
No existing memory related problems were found
Test(s) Failed

Here's how I went about fixing it.

The first thing was to google the problem.  I remembered seeing the event log in one of the early menus but couldn't quite place where, and I found my answer in this posting.
http://en.community.dell.com/support-forums/servers/f/906/t/19470371


The posting below is for legacy servers. It will probably be more useful to me when I start testing my 2950s, and it applies to any 9G (9th generation) Dell server.
http://www.dell.com/support/article/us/en/19/SLN132517


Following along with the accepted Dell support answer, we need to access the iDRAC.  However, on a newly acquired system the iDRAC probably needs to be reconfigured, so first we're going to configure the iDRAC and check out the error log.  Configuring the iDRAC will let us access it remotely through an IP address, which we're going to make static.  My weird Linksys home router seems to use 192.168.1.1 as the gateway address by default, while the default Dell IP settings are usually all 192.168.0.xxx, so the iDRAC needs an address on the router's network.  We need to boot the system and hit Ctrl+E to set up the iDRAC with our own settings, and we'll clear out the log from within the same iDRAC menu.
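To see why the default address won't work behind this router, here is a minimal Python sketch that checks whether two addresses sit on the same network.  The /24 prefix length, the 192.168.0.120 default, and the 192.168.1.120 static address are all assumptions picked for illustration; use whatever your own router and iDRAC actually show.

```python
import ipaddress


def same_subnet(ip_a: str, ip_b: str, prefix_len: int = 24) -> bool:
    """Return True if both addresses fall inside the same network.

    prefix_len=24 (netmask 255.255.255.0) is an assumption that matches
    a typical home router; adjust it for your own network.
    """
    network = ipaddress.ip_network(f"{ip_a}/{prefix_len}", strict=False)
    return ipaddress.ip_address(ip_b) in network


gateway = "192.168.1.1"          # router gateway address from the post
idrac_default = "192.168.0.120"  # assumed example of a 192.168.0.xxx default
idrac_static = "192.168.1.120"   # hypothetical static address for the iDRAC

print(same_subnet(gateway, idrac_default))  # False: router can't reach it
print(same_subnet(gateway, idrac_static))   # True: reachable on the LAN
```

If the check comes back False for the address currently set in the iDRAC, that is the setting to change in the Ctrl+E menu.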

You can read the post on setting up your iDRAC or just look at the picture below to see where I changed the IP address.  Keep in mind that my Dell R710 currently only has iDRAC Express, so the iDRAC is using a shared LOM: the RJ-45 port 1 on the Dell R710 motherboard.



Now that we've configured the iDRAC for remote use, let's check the error log.

1. Read the Log
Arrow down to the System Event Log (SEL) menu at the bottom and hit Enter.  Dell alternates between calling this the SEL and the Embedded System Management (ESM) log.  This should go without saying, but read what problems your new surplus server has had in the past.  The log starts at the most recent record, but you can also jump backwards to an earlier one.  The event log caps out at 512 entries.



Reading my log, it looks like I've had a lot of errors from the OS making the server turn off, which I didn't know was possible, and a lot of errors with PS 2 (Power Supply 2) from 2011 to 2015, but none since then, which could mean my error log has been full since 2015.  I did a quick check by pulling the power cord for PS 1, and the system didn't flinch.  A Google search says to swap PS 2 with PS 1 and watch the error log to determine whether the PSU is broken or whether it's actually the motherboard that might be a little wonky.  Something to watch in the future, which is why we read the log.


2. Clear the Log
There is no way to delete only some of the messages; it's all or nothing.

3. Exit iDRAC
Escape out of the system, saving settings.

Saving the iDRAC settings does not force a full reboot; the system simply continues its boot process.
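With the iDRAC reachable over the network, the same read-then-clear routine could in principle be done remotely with the ipmitool command-line utility instead of the Ctrl+E menu.  This is not what I did above; it is only a rough Python sketch that assumes ipmitool is installed on another machine and that IPMI over LAN is enabled in the iDRAC.  The host address and credentials in the usage lines are placeholders.

```python
import subprocess

ALLOWED_ACTIONS = ("info", "list", "clear")


def ipmi_sel_command(host: str, user: str, password: str, action: str) -> list:
    """Build the ipmitool argument list for a System Event Log action."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported SEL action: {action}")
    return ["ipmitool", "-I", "lanplus", "-H", host,
            "-U", user, "-P", password, "sel", action]


def run_sel(host: str, user: str, password: str, action: str) -> str:
    """Run the command and return its output (requires ipmitool on PATH)."""
    cmd = ipmi_sel_command(host, user, password, action)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


# Hypothetical usage: read the log first, then clear it (all or nothing).
# print(run_sel("192.168.1.120", "root", "your-password", "list"))
# print(run_sel("192.168.1.120", "root", "your-password", "clear"))
```

Listing before clearing keeps the same order as the steps above, since the clear wipes every record at once.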



You're done now; this next section is just what happened to me on my new system without hard drives.

My system has no hard drives or boot partitions, so when it continues to boot it runs into a problem.  It gives a few options:
F1 to retry boot
F2 for System Setup, which is the BIOS
F11 for BIOS Boot Manager



Select F11.
At the bottom, after listing Normal, my optical drive, and the network card configured for PXE boot, you have the option to select System Setup (this goes to the BIOS) or System Services.  Choose System Services, which causes the system to reboot into the normal F10 System Services UEFI boot menu, which is where the memory test is accessed.


Go to Hardware Diagnostics, select the MpMemory test, and run your MpMemory tests.  Follow that up with your diagnostics as shown in the Diagnostics Post.
