Saturday, August 13, 2016

linux - SAS disk error. Should I replace my drive ASAP?



When I try to reboot my Ubuntu server, the boot screen reports that it found a "Foreign configuration" on the adapter.



When I boot from the secondary disk (/dev/sdc) and run fsck, it shows errors on the first disk, such as:
"Buffer I/O error on device sdb1, logical block block_number_here"



dmesg also shows multiple errors on this disk, such as:





  1. lost page write due to I/O error on sdb1

  2. journal commit I/O error

  3. etc.



I have a Toshiba MK1001TRKB SAS disk.
Should I replace all the other Toshiba disks in my server ASAP?



Sorry, but I can't post screenshots here.



Answer



You have an HDD which is giving multiple errors, through various channels, and you're asking us if you should replace it?



Yes, of course you should. As soon as possible.



If you're asking whether or not to replace the other drives, that's trickier. If they've thrown no errors yet, you will probably end up having to pay to replace them (any support contract you've got is unlikely to cover pre-emptive replacement of non-failed components).



If the drives are accessible to the OS (i.e., not hidden behind hardware RAID), you might want to consider running smartctl -t long against each drive in turn. If they pass that test, you really have no valid reason to be suspicious of them.
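
As a rough sketch of that check, the shell functions below start a long self-test on each drive passed to them and later print the results. The function names, the SMARTCTL override variable, and the device names in the usage note are illustrative assumptions, not part of the original answer. Only the smartctl options -t long, -H, and -l selftest are used.

```shell
#!/bin/sh
# Sketch only: the helper names and the SMARTCTL override are hypothetical.
# Set SMARTCTL to another command to do a dry run without touching any disk.
SMARTCTL="${SMARTCTL:-smartctl}"

# Start a long (extended) self-test on every device given as an argument.
# The drive runs the test in the background, so each call returns at once.
start_long_tests() {
    for dev in "$@"; do
        $SMARTCTL -t long "$dev"
    done
}

# Print the overall health status (-H) and the self-test log (-l selftest)
# for every device given as an argument.
show_results() {
    for dev in "$@"; do
        $SMARTCTL -H -l selftest "$dev"
    done
}
```

Run as root, something like start_long_tests /dev/sda /dev/sdc kicks off the tests, and show_results with the same device names can be called once the drives have had time to finish. A long test on a large disk can take hours, and smartctl prints its own estimate when the test starts. To test the drives strictly one at a time, wait for each drive's test to finish before starting the next.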



Consider running that smartctl test from crontab once a quarter or so, and make sure to check the output.
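
A quarterly schedule might look like the following entries in root's crontab. This is a sketch under assumptions: the device name /dev/sda, the path /usr/sbin/smartctl, and the chosen times are placeholders to adapt. The long test runs in the background on the drive, so a second entry a day later prints the result. Cron normally mails any command output to the crontab owner (or to MAILTO, if set), provided the server can deliver local mail.

```
# m h dom mon dow  command
# Start a long self-test at 03:00 on the 1st of January, April, July and October.
0 3 1 */3 * /usr/sbin/smartctl -t long /dev/sda
# One day later, print the health status and self-test log so cron can mail it.
0 3 2 */3 * /usr/sbin/smartctl -H -l selftest /dev/sda
```

Repeat the pair of lines for each drive you want to cover.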


