I suspect my system's drive controller is failing. Can anyone use these
symptoms (below) to confirm my suspicion? I know a definitive diagnosis
is a long shot, given the many possible variables:
Setup:
Two 36GB SCSI drives set up as LVM giving 72GB as one volume, running
from a 64-bit PCI-X slot using an LSI Logic 2-channel U320 controller
through a terminated internal 68-pin ribbon cable.
Symptoms:
After running for a random length of time, the system may start behaving
oddly in the way it runs its apps. Running 'top' as a quick process
check always reports "input/output error" at this point.
Alternatively, it can already be in that state right from a cold boot.
A forced reboot (occasionally two) sees the system recover its journal
and be okay for days on end.
I had a similar problem a few years back with a similar setup. In that
case I finally realised that it always coincided with very hot ambient
temps and the discs overheating. I have since moved to an identical
setup but with a different controller card and motherboard. I made a
point of adding a fan blowing directly on the drives, and in any case
ambient temps are still quite low in my workspace.
Any ideas of what to test for?
Gavin
_______________________________________________
PLUG discussion list: plug@???
http://www.plug.org.au/mailman/listinfo/plug
Committee e-mail: committee@???