The Index
What do you want to check?
The Index is an alphabetically ordered list of keywords.
Each entry answers the question: How can I check <keyword>?
active-iq
Active IQ, formerly known as ASUP
see
ASUP
aggregate
Usage
checks the used space in volumes and aggregates. Thresholds can be set in GB or percent.
AggregateState
checks the aggregate-states. Alarms if they are not online (configurable).
check_netapp_scrub.pl
sends an alarm if the timestamp of an aggregate's last scrub is older than a certain age.
OvercommitAggr
returns a list of aggregates together with their overcommitment in percent. Overcommitment is the ratio between an aggregate's size and the total size of all its (thin-provisioned) volumes.
Raidstatus
alarms if one of the RAIDs is degraded.
UsageTrend
estimates how long it would take until an aggregate or volume is full if the trend of the last 48h (configurable) continues. Checks both bytes and inodes.
PerfAggregate
checks the 'latency', 'transfer-rate' and other performance counters per aggregate. Shows details for total, read, write and other. Averages and totals over all aggregates of the filer can also be measured and monitored, which allows monitoring of the aggregate-latency and aggregate-transfer-rate at the filer level.
SyncMirror
checks the mirror-status on Metro Cluster aggregates.
ASUP
Auto Support
check_netapp_asup.pl
monitors the ASUP-log and alarms if failed transmissions or collections were found.
autosize
AutosizeMode
checks whether all autosized volumes have their autosize-mode set to a given value (grow, grow_shrink, ...)
VolumeAutosize
checks a volume's total-size and alerts when the volume is close to being full relative to the autosize maximum.
check_netapp_ems
checks the ems-log for the number of specific events per time-unit (rate). Alarms if, for example, too many autogrow-events took place within the last hour or day.
[help]
backup
check_netapp_snapcenter
checks the SnapCenter database for failed or missing jobs. This way alarms are sent immediately if backups do not run as expected.
[help]
BYP
disk bypass events
ShelfBay
checks the shelf- and disk-port status. Can alarm on BYP-status disks.
certificate
Server and other certificates
check_netapp_certificate
checks SVM (and possibly also other) certificates for their expiration time. Will trigger an alarm if certificates expire soon (configurable thresholds).
[help]
cifs
Common Internet File System, SMB network protocol
PerfSys
checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops
PerfSysNode
checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops. The check evaluates these counters per Node and works only for DataONTAP 8.3 or later.
cluster
check_netapp_takeover.pl
sends an alarm if the storage failover facility is disabled or otherwise not active.
check_netapp_mc_config.pl
checks a MetroCluster's mode and configuration state.
ClusterPeerHealth
checks the health of cluster peer relationships by evaluating several ping- and health-status.
SyncMirror
checks the mirror-status on Metro Cluster aggregates.
consistency-points
Wafl
reads WAFL performance-counters like cp_count twice and calculates the rate of CPs per second. Different types of consistency-points (wafl-timer, back-to-back, ...) can be checked. The information gathered from this plugin corresponds to the CPty-column of 'sysstat -x 1'.
crc-error
PerfTcpIp
checks CRC errors and packets sent/received for both the IP and TCP layer.
PerfNic
checks various performance counters of a NetApp's *physical* network interface (NIC). Among them are crc/transmit-error-counters which can be used to detect errors on the physical network-layer.
debugging
check_netapp_process.pl
checks for runaway processes on a filer (as shown with the ps command).
check_netapp_scrub.pl
sends an alarm if the timestamp of an aggregate's last scrub is older than a certain age.
check_netapp_time.pl
checks the filer's NTP configuration (at least one NTP server must be configured) and measures the drift between the filer's system-time and the monitoring server. Can alarm if that drift gets too high.
check_netapp_unused_lun.pl
checks for luns which are online but do not have an initiator connected.
DiskPaths
checks if each disk has a given number and pattern of paths (A/B, B/A, ABAB, ABBA, ...).
Job
checks for failed jobs.
LunAlignment
searches for misaligned luns. Alarms if a certain number of misaligned luns is reached.
NetInterface
checks if a network interface's current-port is not equal to its home-port (output of the CLI command `network interface show -is-home false`). Can also check its operational mode (up/down).
ServiceProcessor
checks the status of the nodes' service-processors and whether they are correctly configured (autoupdate, IP-address).
BadlyPerformingDisks
checks all disks in a NetApp system or in a specific raid-group. If a certain number of them perform badly (= have a high utilization), an alarm is sent.
PerfTcpIp
checks CRC errors and packets sent/received for both the IP and TCP layer.
VolumeAge
searches for and flags volumes which were created a (configurable) long time ago. An old age may indicate a forgotten and unused volume-clone. The logic can also be inverted to search for volumes with an exceptionally short age (created within the last day or so).
PerfNic
checks various performance counters of a NetApp's *physical* network interface (NIC). Among them are crc/transmit-error-counters which can be used to detect errors on the physical network-layer.
deduplication
Sis
checks dedup-values (stale-fingerprint-percentage, run-time of last successful operation).
SisStatus
finds volumes whose compression or deduplication is not enabled.
disk
check_netapp_spare
counts the number of available spare-disks or -partitions and sends an alarm if this number falls below a given threshold. Also considers the type of the disk (or partition) and its location.
[help]
Disk
checks for failed, offline or unassigned disks on the filer.
DiskCount
counts the number of disks matching definable criteria (disk-type, container (spare, ...), storage-pool). Mostly used to monitor the number of spare-disks of a certain type.
DiskPathQuality
checks disk path qualities, reports i/o-error percentages and raises a CRITICAL error whenever an error percentage is above zero.
DiskPaths
checks if each disk has a given number and pattern of paths (A/B, B/A, ABAB, ABBA, ...).
ShelfBay
checks the shelf- and disk-port status. Can alarm on BYP-status disks.
BadlyPerformingDisks
checks all disks in a NetApp system or in a specific raid-group. If a certain number of them perform badly (= have a high utilization), an alarm is sent.
PerfDisk
checks all disks in a NetApp system for their utilization (percentage of time there was at least one outstanding request to the disk). Optionally, the check can be limited to the disks of a single aggregate.
PerfHostadapter
checks and counts rates per host adapter (Fibre Channel, Serial Attached SCSI, and parallel SCSI).
PerfSys
checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops
PerfSysNode
checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops. The check evaluates these counters per Node and works only for DataONTAP 8.3 or later.
EMS
Event Management System
check_netapp_ems
checks the ems-log for the number of specific events per time-unit (rate). Alarms if, for example, too many autogrow-events took place within the last hour or day.
[help]
fibre-channel
FCPAdapter
checks the operational status of all fcp adapters.
PerfHostadapter
checks and counts rates per host adapter (Fibre Channel, Serial Attached SCSI, and parallel SCSI).
head
Head
monitors the head's hardware objects (fans, NVRAM, power-supplies, health-state, temperature-sensors)
PerfCpu
checks one or all processors in a NetApp system for their utilization.
health
Health States
check_netapp_health
monitors the system and subsystem health states. Sends an alarm if the system health status does not match a given pattern like 'ok'.
[help]
check_netapp_health.pl
monitors the system health. Sends an alarm if the system health status is anything other than 'ok'.
HTTP
HyperText Transfer Protocol
PerfSys
checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops
PerfSysNode
checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops. The check evaluates these counters per Node and works only for DataONTAP 8.3 or later.
inode
Usage
checks the used space in volumes and aggregates. Thresholds can be set in GB or percent.
UsageTrend
estimates how long it would take until an aggregate or volume is full if the trend of the last 48h (configurable) continues. Checks both bytes and inodes.
iops
input / output operations
ReportIOPS
reports how many iops are consumed by a given tenant.
iSCSI
internet Small Computer System Interface
latency
PerfAggregate
checks the 'latency', 'transfer-rate' and other performance counters per aggregate. Shows details for total, read, write and other. Averages and totals over all aggregates of the filer can also be measured and monitored, which allows monitoring of the aggregate-latency and aggregate-transfer-rate at the filer level.
LunLatency
checks the 'latency' and 'operations per second' (ops) per LUN. Shows details for total, read, write and other. NetApp recommends monitoring latency as the primary performance indicator.
PerfVolume
checks the 'latency' and 'operations per second' (ops) per volume. Shows details for total, read, write and other. NetApp recommends monitoring latency as the primary performance indicator.
license
check_netapp_license.pl
checks the filer for expiring (demo-)licenses.
logfile
check_netapp_asup.pl
monitors the ASUP-log and alarms if failed transmissions or collections were found.
Job
checks for failed jobs.
check_netapp_ems
checks the ems-log for the number of specific events per time-unit (rate). Alarms if, for example, too many autogrow-events took place within the last hour or day.
[help]
LUN
Logical Unit Number
check_netapp_unused_lun.pl
checks for luns which are online but do not have an initiator connected.
LunAlignment
searches for misaligned luns. Alarms if a certain number of misaligned luns is reached.
LunSize
checks the unused but allocated blocks inside a LUN. Notifies the admin if they exceed a certain number (the admin may then run an unmap procedure on VMware).
LunState
checks the LUN-states. Alarms if they are offline or not mapped to an initiator.
LunLatency
checks the 'latency' and 'operations per second' (ops) per LUN. Shows details for total, read, write and other. NetApp recommends monitoring latency as the primary performance indicator.
memory
BufferCache
checks several metrics of the system buffer cache (=system memory) like Buffers being read, Buffers being written, Empty (unused) buffers, Buffers with modified data, Buffers associated with CP IO, ...
FlashCache
checks several metrics of the external FlashCache (PAM II) like External cache hit rate, Average latency of read I/Os, Number of wafl buffers served off the external cache, ...
multi-tenant
ReportIOPS
reports how many iops are consumed by a given tenant.
ReportSpace
reports how much space in bytes is consumed by a given tenant.
network
NetPort
checks if the network-interfaces are enabled or not
FCPAdapter
checks the operational status of all fcp adapters.
IfGrp
checks if an interface-group has enough links in up-state to still be redundant.
NetInterface
checks if a network interface's current-port is not equal to its home-port (output of the CLI command `network interface show -is-home false`). Can also check its operational mode (up/down).
PerfIf
checks and counts transfer-rates and errors per network-interface (ifnet). Especially useful for monitoring 10GbE-ports.
PerfLif
checks and counts transfer-rates and errors per network-interface (lif) for DataONTAP 8.2.x or higher.
PerfSys
checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops
PerfSysNode
checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops. The check evaluates these counters per Node and works only for DataONTAP 8.3 or later.
PerfTcpIp
checks CRC errors and packets sent/received for both the IP and TCP layer.
PerfNic
checks various performance counters of a NetApp's *physical* network interface (NIC). Among them are crc/transmit-error-counters which can be used to detect errors on the physical network-layer.
nfs
Network File System
PerfSys
checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops
PerfSysNode
checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops. The check evaluates these counters per Node and works only for DataONTAP 8.3 or later.
node
Job
checks for failed jobs.
ServiceProcessor
checks the status of the nodes' service-processors and whether they are correctly configured (autoupdate, IP-address).
PerfSysNode
checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops. The check evaluates these counters per Node and works only for DataONTAP 8.3 or later.
NVRAM
Non-Volatile RAM
NVRAM
checks data-rates and latency of the NVRAM.
Wafl
reads WAFL performance-counters like cp_count twice and calculates the rate of CPs per second. Different types of consistency-points (wafl-timer, back-to-back, ...) can be checked. The information gathered from this plugin corresponds to the CPty-column of 'sysstat -x 1'.
overcommit
OvercommitAggr
returns a list of aggregates together with their overcommitment in percent. Overcommitment is the ratio between an aggregate's size and the total size of all its (thin-provisioned) volumes.
check_netapp_process.pl
checks for runaway processes on a filer (as shown with the ps command).
PerfNic
checks various performance counters of a NetApp's *physical* network interface (NIC). Among them are crc/transmit-error-counters which can be used to detect errors on the physical network-layer.
process
processes running in the filer's OS, output of the ps-command
check_netapp_process.pl
checks for runaway processes on a filer (as shown with the ps command).
qtree
PerfQtree
checks some ops-counters per q-tree (nfs-ops, cifs-ops, ...).
quota
check_netapp_quotas.pl
monitors quotas on a NetApp-filer (cluster mode only).
RAID
Redundant Array of Independent Disks
Raidstatus
alarms if one of the RAIDs is degraded.
SAS
Serial Attached SCSI
PerfHostadapter
checks and counts rates per host adapter (Fibre Channel, Serial Attached SCSI, and parallel SCSI).
SCSI
Small Computer System Interface
see
SAS
shelf
ShelfEnvironment
checks the shelf-status, power-supplies, temperature, fans, voltage-sensors and current-sensors on the shelves.
ShelfBay
checks the shelf- and disk-port status. Can alarm on BYP-status disks.
SnapCenter
check_netapp_snapcenter
checks the SnapCenter database for failed or missing jobs. This way alarms are sent immediately if backups do not run as expected.
[help]
snapmirror
SnapMirror
SnapMirrorMetrics
checks and logs SnapMirrors (including type Vault): lag-time, last-transfer-duration, last-transfer-size
SnapMirrorState
checks and logs for SnapMirror (including type Vault): health, mirror-state
UnprotectedVolume
checks for volumes not protected by SnapMirror.
check_netapp_snapcenter
checks the SnapCenter database for failed or missing jobs. This way alarms are sent immediately if backups do not run as expected.
[help]
check_netapp_snapmirror
checks the health- and mirror-state of SnapMirror relations. Also monitors various metrics (lag-time, last-transfer-size, last-transfer-duration)
[help]
snapshot
Snapshots
checks if the snap-reserve is still sufficient. Thresholds are set in percent; performance-data can be either in percent or absolute (bytes). Additional criteria are the age or name of the snapshot. This can be used to monitor snapshot-backups and whether they are up to date. It can also be used to find snapshots related to a specific application like SNMV and to check all volumes for left-over snapshots.
SnapshotLessVolume
searches for volumes which do not have snapshots.
check_netapp_snapcenter
checks the SnapCenter database for failed or missing jobs. This way alarms are sent immediately if backups do not run as expected.
[help]
snapvault
SnapVault
see
snapmirror
check_netapp_snapcenter
checks the SnapCenter database for failed or missing jobs. This way alarms are sent immediately if backups do not run as expected.
[help]
spare-disk
check_netapp_spare
counts the number of available spare-disks or -partitions and sends an alarm if this number falls below a given threshold. Also considers the type of the disk (or partition) and its location.
[help]
DiskCount
counts the number of disks matching definable criteria (disk-type, container (spare, ...), storage-pool). Mostly used to monitor the number of spare-disks of a certain type.
SVM
Storage Virtual Machines, formerly known as Vserver
LunAlignment
searches for misaligned luns. Alarms if a certain number of misaligned luns is reached.
ReportIOPS
reports how many iops are consumed by a given tenant.
ReportSpace
reports how much space in bytes is consumed by a given tenant.
Vserver
monitors the admin-state or the operational-status of a Vserver (running, stopped, inconsistent or defunct)
MetroClusterVserver
sends an alarm if the configuration state of a MetroCluster vserver changes to unhealthy.
check_netapp_certificate
checks SVM (and possibly also other) certificates for their expiration time. Will trigger an alarm if certificates expire soon (configurable thresholds).
[help]
syncmirror
SyncMirror
SyncMirror
checks the mirror-status on Metro Cluster aggregates.
system
check_netapp_health
monitors the system and subsystem health states. Sends an alarm if the system health status does not match a given pattern like 'ok'.
[help]
Uptime
checks the seconds since last reboot.
check_netapp_process.pl
checks for runaway processes on a filer (as shown with the ps command).
check_netapp_time.pl
checks the filer's NTP configuration (at least one NTP server must be configured) and measures the drift between the filer's system-time and the monitoring server. Can alarm if that drift gets too high.
BufferCache
checks several metrics of the system buffer cache (=system memory) like Buffers being read, Buffers being written, Empty (unused) buffers, Buffers with modified data, Buffers associated with CP IO, ...
FlashCache
checks several metrics of the external FlashCache (PAM II) like External cache hit rate, Average latency of read I/Os, Number of wafl buffers served off the external cache, ...
NVRAM
checks data-rates and latency of the NVRAM.
PerfCpu
checks one or all processors in a NetApp system for their utilization.
PerfSys
checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops
Wafl
reads WAFL performance-counters like cp_count twice and calculates the rate of CPs per second. Different types of consistency-points (wafl-timer, back-to-back, ...) can be checked. The information gathered from this plugin corresponds to the CPty-column of 'sysstat -x 1'.
volume
Usage
checks the used space in volumes and aggregates. Thresholds can be set in GB or percent.
AutosizeMode
checks whether all autosized volumes have their autosize-mode set to a given value (grow, grow_shrink, ...)
ReportIOPS
reports how many iops are consumed by a given tenant.
ReportSpace
reports how much space in bytes is consumed by a given tenant.
Sis
checks dedup-values (stale-fingerprint-percentage, run-time of last successful operation).
SisStatus
finds volumes whose compression or deduplication is not enabled.
SnapshotLessVolume
searches for volumes which do not have snapshots.
UnprotectedVolume
checks for volumes not protected by SnapMirror.
UsageTrend
estimates how long it would take until an aggregate or volume is full if the trend of the last 48h (configurable) continues. Checks both bytes and inodes.
VolumeAutosize
checks a volume's total-size and alerts when the volume is close to being full relative to the autosize maximum.
VolumeState
checks the volume-states. Alarms if they are not online (configurable).
PerfVolume
checks the 'latency' and 'operations per second' (ops) per volume. Shows details for total, read, write and other. NetApp recommends monitoring latency as the primary performance indicator.
VolumeAge
searches for and flags volumes which were created a (configurable) long time ago. An old age may indicate a forgotten and unused volume-clone. The logic can also be inverted to search for volumes with an exceptionally short age (created within the last day or so).
WAFL
Write Anywhere File Layout
Wafl
reads WAFL performance-counters like cp_count twice and calculates the rate of CPs per second. Different types of consistency-points (wafl-timer, back-to-back, ...) can be checked. The information gathered from this plugin corresponds to the CPty-column of 'sysstat -x 1'.
S3
S3 Storage
check_netapp_s3bucket
checks the free or used size in an S3 bucket linked to a NetApp filer.
[help]
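
Several entries in this index (for example Wafl, which reads cp_count twice and calculates the rate of CPs per second) rely on sampling a counter twice and dividing the difference by the elapsed time. The following is only a minimal sketch of that arithmetic, not the plugin's actual implementation; the names CounterSample and rate_per_second are hypothetical, and the sample values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CounterSample:
    """One reading of an increasing counter (hypothetical helper type)."""
    timestamp: float  # seconds, measured on any consistent clock
    value: int        # raw counter value, e.g. a cp_count reading


def rate_per_second(first: CounterSample, second: CounterSample) -> float:
    """Return the counter increase per second between two samples."""
    elapsed = second.timestamp - first.timestamp
    if elapsed <= 0:
        raise ValueError("second sample must be taken after the first")
    delta = second.value - first.value
    if delta < 0:
        # Assumption: a decreasing value means the counter was reset,
        # so this pair of samples cannot be used.
        raise ValueError("counter decreased between samples")
    return delta / elapsed


if __name__ == "__main__":
    # Illustrative values only: 30 events counted over 60 seconds.
    earlier = CounterSample(timestamp=1000.0, value=5000)
    later = CounterSample(timestamp=1060.0, value=5030)
    print(rate_per_second(earlier, later))
```

Because the result depends on two readings, a check built this way needs either a stored previous sample or a short wait between the two reads.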