Reading hard drive S.M.A.R.T. information with Smartmontools / smartctl

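S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology) status is commonly used to judge a hard drive's health and to give early warning of failure. On Ubuntu / Debian you can read it with smartctl, a program from the smartmontools package. I keep forgetting the usage, so here are some notes …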

Install via apt-get:

$ sudo apt-get install -y smartmontools

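For drives that are not managed by a RAID controller, you can usually read information or run tests by giving the disk device name directly, such as /dev/sda, for example: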

Show health status (-H/--health)

$ sudo smartctl -H /dev/sda
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-4.4.0-70-lowlatency] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
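
If you want to use this result in a script, a minimal sketch like the one below could work. It only looks for the PASSED line shown above; the helper name smart_health_ok is made up for this note, and the SCSI/SAS wording ("SMART Health Status: OK") is an assumption about what those drives print.

```shell
# Sketch: decide pass/fail from `smartctl -H` output read on stdin.
# smart_health_ok is a hypothetical helper name, not part of smartmontools.
smart_health_ok() {
    # ATA drives report "... test result: PASSED"; SCSI/SAS drives are
    # assumed here to report "SMART Health Status: OK".
    grep -Eq 'test result: PASSED|SMART Health Status: OK'
}

# Usage (needs root and a real device):
#   sudo smartctl -H /dev/sda | smart_health_ok || echo "check /dev/sda" >&2
```

The function returns grep's exit status, so it can be dropped straight into an `if` or `||` chain.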

Show device information (-i/--info)

$ sudo smartctl -i /dev/sdb
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-4.4.0-70-lowlatency] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD4002FYYZ-01B7CB0
Serial Number:    K4HXYJVB
LU WWN Device Id: 5 000cca 25ddb40cc
Firmware Version: 01.01M02
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    7200 rpm
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Fri Apr 14 21:27:52 2017 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
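
The information section is a list of "Label: value" lines, so picking out a single field (the serial number, say) is easy to script. A rough sketch, assuming the layout shown above; smart_info_field is a name invented for this example:

```shell
# Sketch: print the value of one "Label: value" line from `smartctl -i`
# output read on stdin. smart_info_field is a hypothetical helper name.
smart_info_field() {
    # $1 is the label left of the colon, e.g. "Serial Number"
    awk -v key="$1" '
        index($0, key ":") == 1 {
            sub(/^[^:]*:[ \t]*/, "")   # drop the label and the padding after it
            print
            exit                       # only the first matching line is used
        }'
}

# Usage (needs root and a real device):
#   sudo smartctl -i /dev/sdb | smart_info_field "Serial Number"
```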

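Other commonly used options include -a / --all, which shows all SMART information; -x / --xall, which shows all information about the device; --scan, which scans for devices; and -t / --test, which runs a self-test.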
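
As a rough illustration of --test, the sketch below starts a self-test and then prints the self-test log with -l selftest. The wrapper name smart_selftest and the SMARTCTL override variable are made up for this note. The test itself runs on the drive in the background, so the log only shows the new result once it has finished.

```shell
# Sketch: start a self-test, then print the self-test log.
# smart_selftest and the SMARTCTL variable are hypothetical names;
# run as root, or set SMARTCTL to a stub to preview the commands.
SMARTCTL="${SMARTCTL:-smartctl}"

smart_selftest() {
    dev="$1"
    kind="${2:-short}"          # "short" or "long", passed to -t / --test
    "$SMARTCTL" -t "$kind" "$dev" || return 1
    # The drive runs the test in the background; re-run this later
    # to see the finished result.
    "$SMARTCTL" -l selftest "$dev"
}

# Usage:
#   smart_selftest /dev/sda          # short test
#   smart_selftest /dev/sda long     # extended test
```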

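Drives managed by a RAID controller can be checked with the RAID card's own management tool, but smartctl can read their SMART information as well (see the list of supported controllers: https://www.smartmontools.org/wiki/Supported_RAID-Controllers). The example here uses an LSI MegaRAID SAS 9260-8i (there are quite a few rebranded LSI/MegaRAID cards, and they are used the same way). First, use the management tool (storcli64) to list all drives on a specific controller (pick whichever controller holds the drive you want to query):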

$ sudo storcli64 /c1 /eall /sall show
Controller = 1
Status = Success
Description = Show Drive Information Succeeded.


Drive Information :
=================

--------------------------------------------------------------------------------
EID:Slt DID State DG     Size Intf Med SED PI SeSz Model                Sp Type
--------------------------------------------------------------------------------
30:0      6 Onln   0 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 U  -
30:1      7 Onln   0 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 U  -
30:2      8 Onln   0 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 U  -
30:3      9 Onln   0 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 U  -
30:4     10 Onln   0 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 U  -
30:5     11 Onln   0 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 U  -
30:6     12 Onln   0 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 U  -
30:7     13 Onln   0 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 U  -
30:8     14 Onln   0 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 U  -
30:9     15 Onln   0 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 U  -
30:10    16 Onln   0 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 U  -
30:11    17 Onln   0 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 U  -
30:12    18 Onln   0 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 U  -
30:13    19 Onln   0 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 U  -
30:14    20 Onln   0 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 U  -
30:15    21 GHS    - 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 D  -
30:16    22 Onln   0 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 U  -
30:17    23 Onln   0 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 U  -
30:18    24 Rbld   0 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 U  -
30:19    25 GHS    - 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 D  -
30:20    26 GHS    - 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 D  -
30:21    27 GHS    - 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 D  -
30:22    28 GHS    - 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 D  -
30:23    29 GHS    - 3.637 TB SATA HDD N   N  512B WDC WD40EFRX-68WT0N0 D  -
--------------------------------------------------------------------------------

EID-Enclosure Device ID|Slt-Slot No.|DID-Device ID|DG-DriveGroup
DHS-Dedicated Hot Spare|UGood-Unconfigured Good|GHS-Global Hotspare
UBad-Unconfigured Bad|Onln-Online|Offln-Offline|Intf-Interface
Med-Media Type|SED-Self Encryptive Drive|PI-Protection Info
SeSz-Sector Size|Sp-Spun|U-Up|D-Down|T-Transition|F-Foreign
UGUnsp-Unsupported|UGShld-UnConfigured shielded|HSPShld-Hotspare shielded
CFShld-Configured shielded|Cpybck-CopyBack|CBShld-Copyback Shielded

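Then pass the device type for your RAID card's vendor, together with the Device ID (DID), to smartctl via -d; for MegaRAID the type is megaraid. This example queries the drive with DID = 15 behind the device sdc; adjust the other options to suit your needs: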

$ sudo smartctl -H -d megaraid,15 /dev/sdc
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-4.4.0-70-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.
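
To check every drive behind the controller instead of one DID at a time, the DID column of the storcli64 listing can be fed into smartctl. A minimal sketch, assuming the table layout shown earlier (drive rows start with an EID:Slt value such as 30:15); storcli_dids is a name invented here:

```shell
# Sketch: pull the DID column out of `storcli64 /cX /eall /sall show`
# output read on stdin. storcli_dids is a hypothetical helper name.
storcli_dids() {
    # Drive rows have "EID:Slt" in column 1 and the DID in column 2;
    # header, separator and legend lines do not match this pattern.
    awk '$1 ~ /^[0-9]*:[0-9]+$/ { print $2 }'
}

# Usage (needs root, storcli64 and a MegaRAID controller; /dev/sdc is
# the device from the example above):
#   sudo storcli64 /c1 /eall /sall show | storcli_dids |
#   while read -r did; do
#       sudo smartctl -H -d "megaraid,$did" /dev/sdc
#   done
```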

If you are not sure which devices exist, or which type name to use for your RAID card, just list them with --scan:

$ sudo smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device
/dev/bus/0 -d megaraid,2 # /dev/bus/0 [megaraid_disk_02], SCSI device
/dev/bus/1 -d megaraid,6 # /dev/bus/1 [megaraid_disk_06], SCSI device
/dev/bus/1 -d megaraid,7 # /dev/bus/1 [megaraid_disk_07], SCSI device
/dev/bus/1 -d megaraid,8 # /dev/bus/1 [megaraid_disk_08], SCSI device
/dev/bus/1 -d megaraid,9 # /dev/bus/1 [megaraid_disk_09], SCSI device

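Everything after the hash sign (#) is a comment. Judging from my setup, the number after /dev/bus/ appears to be the RAID card's controller number. Besides the disk device name, passing the /dev/bus/ path directly also works, for example: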

$ sudo smartctl /dev/bus/1 -d megaraid,6 -H
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-4.4.0-70-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.
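
Since each --scan line is already "device -d type" followed by a comment, it can be turned into smartctl arguments with very little work. A rough sketch that strips the comment part; scan_targets is a made-up helper name:

```shell
# Sketch: reduce `smartctl --scan` output (stdin) to "device -d type"
# lines by dropping the trailing "# ..." comment.
# scan_targets is a hypothetical helper name.
scan_targets() {
    sed -e 's/[[:space:]]*#.*$//' -e '/^$/d'
}

# Usage (needs root):
#   sudo smartctl --scan | scan_targets |
#   while read -r dev flag type; do
#       sudo smartctl -H "$dev" "$flag" "$type"
#   done
```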

That seems to cover what I use most often; I'll add more as I think of it …