Dazed and confused… have you seen this… in the logs


Dazed and confused, but trying to continue… Have you seen something like this in your Linux server logs, especially on a machine running CentOS 5? Here is the full kernel log:

Nov 17 16:59:25 localhost kernel: CIFS VFS: server <Source machine IP> of type Samba 3.0.28-0.el4.9 returned unexpected error on SMB posix open, disabling posix open support. Check if server update available.
Nov 17 17:34:44 localhost kernel: Uhhuh. NMI received for unknown reason 38.
Nov 17 17:34:44 localhost kernel: Do you have a strange power saving mode enabled?
Nov 17 17:34:44 localhost kernel: Dazed and confused, but trying to continue
Nov 17 17:34:44 localhost kernel: Uhhuh. NMI received for unknown reason 38.
Nov 17 17:34:44 localhost kernel: Do you have a strange power saving mode enabled?
Nov 17 17:34:44 localhost kernel: Dazed and confused, but trying to continue
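If you want to check how often your own machine logs these messages, a small script can pull them out of the syslog. This is only a rough sketch: the function name `find_nmi_events` is made up for this post, the regular expression is modelled on the log lines above, and the log path you pass in (on CentOS usually /var/log/messages, but that is an assumption about your setup) may differ on your system.

```python
import re
import sys

# Pattern modelled on the syslog lines shown above (an assumption:
# other kernels or syslog formats may word or lay this out differently).
NMI_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"Uhhuh\. NMI received for unknown reason\s+([0-9a-fA-F]+)"
)


def find_nmi_events(lines):
    """Return a list of (timestamp, reason) for each unknown-NMI line."""
    events = []
    for line in lines:
        match = NMI_RE.match(line)
        if match:
            events.append((match.group(1), match.group(2)))
    return events


if __name__ == "__main__":
    # Usage: python find_nmi.py /var/log/messages
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            for stamp, reason in find_nmi_events(handle):
                print(f"{path}: {stamp} unknown NMI, reason {reason}")
```

Many hits that cluster around your heavy jobs (in my case the nightly rsync) are a hint that the load is what triggers the problem.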

Here is one of the related posts on the CentOS forum: https://www.centos.org/modules/newbb/viewtopic.php?viewmode=flat&topic_id=23350&forum=38. There are many suggested solutions, such as going into the BIOS and disabling the power saving mode. I tried that, but it didn't work. Here is my complete situation and my solution:

OK… I had set up a blank machine with CentOS 5, following a pretty normal installation as explained on the CentOS forum. I wanted to use this machine as a backup server that would sync (rsync) huge amounts of data, so I scheduled a cron job to rsync a huge folder of approx 700 to 800 GB every night. It used to hang partway through the sync, and there was no option left but a hard reboot. I tried googling and found a blog answer which said "try updating your BIOS". I saw that my server's BIOS was old, so I updated it and did a clean installation of CentOS 5 again, just to be on the safer side. But still the same problem… I also tried turning the power saving mode off… same problem… so finally this is what I did…

The solution was damn simple… it was bad RAM!!! It drove me crazy for almost 2 weeks!!!

As you may know, rsync computes the difference between the source folder tree and the destination folder tree, and to do that it needs to hold the whole folder trees in memory. Working from that idea, I checked all 4 of my RAM modules. I did this by removing one RAM module at a time and running the rsync to see if it hangs (I know this sounds crazy… there is probably a better way, maybe memtest86). Doing this for each module, I finally found the culprit… 1 RAM module out of the 4 was bad… I replaced it with a new one, and so far it is working like a charm!!!
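For a quick first look before you start pulling modules, a crude pattern test can be run from user space. This is only a sketch and not what I actually did: the function name `check_memory` and its parameters are invented for illustration, the operating system decides which physical RAM the buffers land on, and a clean run proves very little. A bootable tester such as memtest86 is far more thorough.

```python
def check_memory(total_mb=64, chunk_mb=16, patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Fill buffers with fixed byte patterns and read them back.

    Returns a list of (chunk_index, pattern) pairs that did not verify.
    Hypothetical helper: it cannot target specific physical addresses.
    """
    chunk_size = chunk_mb * 1024 * 1024
    failures = []
    for pattern in patterns:
        expected = bytes([pattern]) * chunk_size
        # Allocate all chunks first so they occupy memory at the same time.
        chunks = [bytearray(expected) for _ in range(total_mb // chunk_mb)]
        for index, chunk in enumerate(chunks):
            if chunk != expected:
                failures.append((index, pattern))
    return failures


if __name__ == "__main__":
    bad = check_memory()
    print("mismatches:", bad if bad else "none found")
```

Raise `total_mb` towards the amount of free RAM if you want it to touch more memory, but keep it below what the machine can spare or it will just start swapping.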


About Dominic

J for JAVA more about me : http://about.me/dominicdsouza