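S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology) status is commonly used to judge a hard drive's health and to give early warning of failure. On Ubuntu / Debian you can read this information with smartctl, a program that ships in the smartmontools package. I keep forgetting how to use it, so here are some notes …
Install it with apt-get: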
$ sudo apt-get install -y smartmontools
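For drives that are not managed by a RAID controller, you can usually read information or run tests by giving the device name directly, such as /dev/sda (or /dev/hda on older systems). For example,
show the health status (-H/--health):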
$ sudo smartctl -H /dev/sda
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-4.4.0-70-lowlatency] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
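When this check goes into a script, usually only the verdict on the last line matters. Below is a minimal sketch that pulls it out of the text; the function name smart_health_verdict is made up for this note, and it assumes the "test result:" wording shown above, which other device types may phrase differently.

```shell
#!/bin/sh
# Extract the verdict (e.g. PASSED) from `smartctl -H` output read on stdin.
# Assumption: the verdict follows the "test result:" label, as in the output above.
smart_health_verdict() {
    sed -n 's/^SMART overall-health self-assessment test result: *//p'
}

# Demo on the sample output above. On a real machine, pipe in
# `sudo smartctl -H /dev/sda` instead of the here-document.
smart_health_verdict <<'EOF'
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
EOF
```

If the label is missing the function prints nothing, so an empty result is best treated as "unknown" rather than as healthy.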
Show the drive's specification information (-i/--info):
$ sudo smartctl -i /dev/sdb
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-4.4.0-70-lowlatency] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: WDC WD4002FYYZ-01B7CB0
Serial Number: K4HXYJVB
LU WWN Device Id: 5 000cca 25ddb40cc
Firmware Version: 01.01M02
User Capacity: 4,000,787,030,016 bytes [4.00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Fri Apr 14 21:27:52 2017 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
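The information section is a list of "Label: value" lines, so a single field such as the model or serial number is easy to pick out. This is only a sketch: smart_info_field is a hypothetical helper name, and it assumes the label starts at the beginning of the line, as in the output above.

```shell
#!/bin/sh
# Print the value of one "Label: value" field from `smartctl -i` output on stdin.
# $1 is the label without the trailing colon, e.g. "Device Model".
smart_info_field() {
    awk -v label="$1" '
        index($0, label ":") == 1 {
            value = substr($0, length(label) + 2)   # text after "Label:"
            sub(/^[ \t]+/, "", value)               # drop the padding
            print value
            exit                                    # first match only
        }'
}

# Demo on two lines of the sample output above. On a real machine, pipe in
# `sudo smartctl -i /dev/sdb` instead of the here-document.
smart_info_field "Serial Number" <<'EOF'
Device Model: WDC WD4002FYYZ-01B7CB0
Serial Number: K4HXYJVB
EOF
```

Only the first matching line is printed, which matters for labels that appear twice, such as "SMART support is".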
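Other commonly used options include -a / --all, which shows all SMART information; -x / --xall, which shows all information about the device; --scan, which scans for devices; and --test, which runs a self-test.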
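As a rough illustration of --test: a self-test is started with -t (the short form of --test) followed by a test type such as short or long, and the result is read back later from the self-test log with -l selftest. The helper below, selftest_commands, is a made-up name and only prints the two commands, so it is safe to run anywhere.

```shell
#!/bin/sh
# Print (not run) the smartctl commands for one self-test cycle.
# $1 is the device; $2 is the test type, short or long (defaults to short).
selftest_commands() {
    dev=$1
    kind=${2:-short}
    echo "sudo smartctl -t $kind $dev"      # starts the self-test on the drive
    echo "sudo smartctl -l selftest $dev"   # later: shows the self-test log
}

selftest_commands /dev/sda
```

The test runs inside the drive itself, so the first command returns right away and the log has to be checked after the test has had time to finish.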
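For drives managed by a RAID controller, you can check their health with the RAID card's own management tool, but smartctl can read their SMART information as well (see the list of supported controllers: https://www.smartmontools.org/wiki/Supported_RAID-Controllers). I use an LSI MegaRAID SAS 9260-8i as the example here (there are quite a few rebranded LSI/MegaRAID cards, and they are used the same way). First, use the management tool (storcli64) to list every drive on a given controller (pick the controller that the drive you are interested in is attached to):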
$ sudo storcli64 /c1 /eall /sall show
Controller = 1
Status = Success
Description = Show Drive Information Succeeded.
Drive Information :
=================
--------------------------------------------------------------------------------
EID:Slt DID State DG Size Intf Med SED PI SeSz Model Sp Type
--------------------------------------------------------------------------------
30:0 6 Onln 0 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 U -
30:1 7 Onln 0 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 U -
30:2 8 Onln 0 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 U -
30:3 9 Onln 0 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 U -
30:4 10 Onln 0 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 U -
30:5 11 Onln 0 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 U -
30:6 12 Onln 0 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 U -
30:7 13 Onln 0 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 U -
30:8 14 Onln 0 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 U -
30:9 15 Onln 0 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 U -
30:10 16 Onln 0 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 U -
30:11 17 Onln 0 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 U -
30:12 18 Onln 0 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 U -
30:13 19 Onln 0 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 U -
30:14 20 Onln 0 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 U -
30:15 21 GHS - 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 D -
30:16 22 Onln 0 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 U -
30:17 23 Onln 0 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 U -
30:18 24 Rbld 0 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 U -
30:19 25 GHS - 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 D -
30:20 26 GHS - 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 D -
30:21 27 GHS - 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 D -
30:22 28 GHS - 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 D -
30:23 29 GHS - 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 D -
--------------------------------------------------------------------------------
EID-Enclosure Device ID|Slt-Slot No.|DID-Device ID|DG-DriveGroup
DHS-Dedicated Hot Spare|UGood-Unconfigured Good|GHS-Global Hotspare
UBad-Unconfigured Bad|Onln-Online|Offln-Offline|Intf-Interface
Med-Media Type|SED-Self Encryptive Drive|PI-Protection Info
SeSz-Sector Size|Sp-Spun|U-Up|D-Down|T-Transition|F-Foreign
UGUnsp-Unsupported|UGShld-UnConfigured shielded|HSPShld-Hotspare shielded
CFShld-Configured shielded|Cpybck-CopyBack|CBShld-Copyback Shielded
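Only the DID column is needed for smartctl, so the table can be trimmed down with awk. A minimal sketch, assuming every drive row starts with a numeric EID:Slt pair as in the listing above; storcli_drive_ids is a hypothetical helper name.

```shell
#!/bin/sh
# Print "EID:Slt DID State" for each drive row of `storcli64 ... show` output on stdin.
# Header, separator and legend lines do not start with digits:digits and are skipped.
storcli_drive_ids() {
    awk '$1 ~ /^[0-9]+:[0-9]+$/ { print $1, $2, $3 }'
}

# Demo on a few rows of the listing above. On a real machine, pipe in
# `sudo storcli64 /c1 /eall /sall show` instead of the here-document.
storcli_drive_ids <<'EOF'
EID:Slt DID State DG Size Intf Med SED PI SeSz Model Sp Type
30:0 6 Onln 0 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 U -
30:15 21 GHS - 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 D -
30:18 24 Rbld 0 3.637 TB SATA HDD N N 512B WDC WD40EFRX-68WT0N0 U -
EOF
```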
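Then pass the RAID card's type and the Device ID (DID) to smartctl with -d; for MegaRAID the type is megaraid. This example reads the drive with DID = 15 behind the device sdc; adjust the other options to suit your needs: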
$ sudo smartctl -H -d megaraid,15 /dev/sdc
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-4.4.0-70-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.
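With a couple of dozen drives behind one controller, the same command is usually run in a loop. The sketch below does that for a list of DIDs. The function name megaraid_health and the SMARTCTL override variable are inventions for this note, and it needs to run as root on a machine that actually has the controller.

```shell
#!/bin/sh
# Run a health check for each MegaRAID device ID given as an argument.
# $1 is the block device exposed by the controller; the rest are DIDs.
# SMARTCTL may be set to another command (e.g. a stub); it defaults to smartctl.
megaraid_health() {
    dev=$1
    shift
    for did in "$@"; do
        verdict=$("${SMARTCTL:-smartctl}" -H -d "megaraid,$did" "$dev" |
            sed -n 's/^SMART overall-health self-assessment test result: *//p')
        # An empty verdict means the expected line was not found.
        printf 'DID %s: %s\n' "$did" "${verdict:-unknown}"
    done
}

# Usage (as root), e.g. for three of the drives listed earlier:
#   megaraid_health /dev/sdc 15 16 17
```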
If you are not sure which devices exist, or which type name to use for the RAID card you have … just list them with --scan:
$ sudo smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device
/dev/bus/0 -d megaraid,2 # /dev/bus/0 [megaraid_disk_02], SCSI device
/dev/bus/1 -d megaraid,6 # /dev/bus/1 [megaraid_disk_06], SCSI device
/dev/bus/1 -d megaraid,7 # /dev/bus/1 [megaraid_disk_07], SCSI device
/dev/bus/1 -d megaraid,8 # /dev/bus/1 [megaraid_disk_08], SCSI device
/dev/bus/1 -d megaraid,9 # /dev/bus/1 [megaraid_disk_09], SCSI device
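Each line of this output is already a device plus a -d type, so it can be reduced to pairs that are easy to feed back into smartctl. A minimal sketch, assuming the "device -d type # comment" layout shown above; scan_targets is a made-up name.

```shell
#!/bin/sh
# Turn `smartctl --scan` output on stdin into "device type" pairs,
# dropping the trailing "# ..." comment on each line.
scan_targets() {
    sed 's/ *#.*$//' | awk '$2 == "-d" { print $1, $3 }'
}

# Demo on two lines of the output above. On a real machine, pipe in
# `sudo smartctl --scan` instead of the here-document.
scan_targets <<'EOF'
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/bus/1 -d megaraid,6 # /dev/bus/1 [megaraid_disk_06], SCSI device
EOF
```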
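Everything after the # is a comment. Judging from my example, the number after /dev/bus/ appears to be the RAID card's controller number. Besides the drive's device name, passing /dev/bus/ directly also works, for example: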
$ sudo smartctl /dev/bus/1 -d megaraid,6 -H
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-4.4.0-70-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.
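For scripting, the exit status can be used instead of the printed text. According to the smartctl man page the exit status is a bit mask, and bit 3 (value 8) is set when the SMART status check returned "DISK FAILING". The sketch below tests just that bit; smart_disk_failing is a hypothetical helper name, and the other bits are ignored here.

```shell
#!/bin/sh
# Succeed if a smartctl exit status ($1) has bit 3 (value 8) set,
# which the man page describes as: SMART status check returned "DISK FAILING".
smart_disk_failing() {
    [ $(( $1 & 8 )) -ne 0 ]
}

# Typical use on real hardware (not run here):
#   sudo smartctl -H -d megaraid,6 /dev/bus/1 >/dev/null
#   smart_disk_failing $? && echo "disk is failing"

# Demo with a few sample status values.
for status in 0 8 12; do
    if smart_disk_failing "$status"; then
        echo "$status: failing"
    else
        echo "$status: not failing"
    fi
done
```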
That seems to be about everything I use regularly; I'll add more when I think of it …