The Index

What do you want to check?

The Index is an alphabetically ordered list of keywords. Each entry should answer the question: How can I check <keyword>?

active-iq

Active IQ, formerly known as ASUP

see ASUP

aggregate

Usage checks the used space in volumes and aggregates. Thresholds can be set in GB or percent.
AggregateState checks the state of the aggregates. Alarms if they are not online (configurable).
check_netapp_scrub.pl sends an alarm if the timestamp of an aggregate's last scrub is older than a given age.
OvercommitAggr returns a list of aggregates together with their overcommitment in percent. Overcommitment is the relation between the aggregate's size and the total size of all its (thin-provisioned) volumes.
Raidstatus alarms if one of the RAIDs is degraded.
UsageTrend estimates how long it would take until an aggregate or volume is full if the trend of the last 48h (configurable) continued. Checks both bytes and inodes.
PerfAggregate checks the 'latency', 'transfer-rate' and other performance counters per aggregate. Shows details for total, read, write and other. Averages and totals over all aggregates of the filer can also be measured and monitored, which allows monitoring of the aggregate-latency and aggregate-transfer-rate at the filer level.
SyncMirror checks the mirror-status on Metro Cluster aggregates.

asis

see deduplication

ASUP

AutoSupport

check_netapp_asup.pl monitors the ASUP-log and alarms if failed transmissions or collections were found.

autosize

AutosizeMode checks whether all autosized volumes have their autosize-mode set to a given value (grow, grow_shrink, ...).
VolumeAutosize checks a volume's total-size and alerts when the volume is close to being full relative to the autosize maximum.
check_netapp_ems checks the ems-log for the number of specific events per time-unit (rate). Alarms if, e.g., too many autogrow-events took place within the last hour or day. [help]
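
As a rough illustration of the VolumeAutosize idea, the sketch below compares a volume's total size against its autosize maximum and maps the result to a check state. The function names and the 80/90 percent thresholds are assumptions made for this example and are not taken from the plugin.

```python
def autosize_fill_percent(size_bytes: int, autosize_max_bytes: int) -> float:
    """Return the volume's total size as a percentage of its autosize maximum."""
    if autosize_max_bytes <= 0:
        raise ValueError("autosize maximum must be positive")
    return 100.0 * size_bytes / autosize_max_bytes


def classify(percent: float, warning: float = 80.0, critical: float = 90.0) -> str:
    """Map a percentage to a Nagios-style state (thresholds are hypothetical)."""
    if percent >= critical:
        return "CRITICAL"
    if percent >= warning:
        return "WARNING"
    return "OK"


# A volume that has grown to 85 units out of a maximum of 100 units.
print(classify(autosize_fill_percent(85, 100)))  # WARNING
```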

backup

check_netapp_snapcenter checks the SnapCenter database for failed or missing jobs. This way alarms are sent immediately if backups do not run as expected. [help]

BYP

disk bypass events

ShelfBay checks the shelf- and disk-port status. Can alarm on BYP-status disks.

certificate

Server and other certificates

check_netapp_certificate checks SVM (and possibly also other) certificates for their expiration time. Will trigger an alarm if certificates expire soon (configurable thresholds). [help]
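
The expiry logic behind such a certificate check can be sketched as follows. This is a minimal illustration, not the plugin's implementation: the function names and the 30/7-day thresholds are hypothetical.

```python
from datetime import datetime, timezone


def days_until_expiry(expires_at: datetime, now: datetime) -> float:
    """Return the remaining lifetime in days (negative if already expired)."""
    return (expires_at - now).total_seconds() / 86400.0


def expiry_state(days_left: float, warning_days: float = 30.0,
                 critical_days: float = 7.0) -> str:
    """Map the remaining days to a check state (thresholds are assumptions)."""
    if days_left <= critical_days:
        return "CRITICAL"
    if days_left <= warning_days:
        return "WARNING"
    return "OK"


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
expires = datetime(2024, 1, 11, tzinfo=timezone.utc)
print(expiry_state(days_until_expiry(expires, now)))  # WARNING
```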

cifs

Common Internet File System, SMB network protocol

PerfSys checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops
PerfSysNode checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops. The check evaluates these counters per Node and works only for DataONTAP 8.3 or later.

cluster

check_netapp_takeover.pl sends an alarm if the storage failover facility is disabled or otherwise not active.
check_netapp_mc_config.pl checks a MetroCluster's mode and configuration state.
ClusterPeerHealth checks the health of cluster peer relationships by evaluating several ping and health statuses.
SyncMirror checks the mirror-status on Metro Cluster aggregates.

consistency-points

Wafl reads WAFL performance-counters like cp_count twice and calculates the rate of CPs per second. Different types of consistency-points (wafl-timer, back-to-back, ...) can be checked. The information gathered from this plugin corresponds to the CPty-column of 'sysstat -x 1'.
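
Reading a counter twice and deriving a rate, as described above, can be sketched like this. The function name and the handling of a counter reset are assumptions for illustration; only the idea of sampling a counter such as cp_count twice comes from the description.

```python
from typing import Optional


def counter_rate(first_value: int, second_value: int,
                 interval_seconds: float) -> Optional[float]:
    """Return events per second between two samples of a monotonic counter.

    Returns None if the counter went backwards (assumed here to mean a reset).
    """
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    delta = second_value - first_value
    if delta < 0:
        return None
    return delta / interval_seconds


# Two samples taken 60 seconds apart: 30 consistency points in between.
print(counter_rate(1000, 1030, 60))  # 0.5
```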

crc-error

PerfTcpIp checks CRC errors and packets sent/received for both the IP and TCP layer.
PerfNic checks various performance counters of a NetApp's *physical* network interface (NIC). Among them are crc/transmit-error-counters, which can be used to detect errors on the physical network-layer.

debugging

check_netapp_process.pl checks for runaway processes on a filer (as shown with the ps command).
check_netapp_scrub.pl sends an alarm if the timestamp of an aggregate's last scrub is older than a given age.
check_netapp_time.pl checks the filer's NTP configuration (at least one NTP server must be configured) and measures the drift between the filer's system-time and the monitoring server. Can alarm if that drift is getting too high.
check_netapp_unused_lun.pl checks for luns which are online but do not have an initiator connected.
DiskPaths checks if each disk has a given number and pattern of paths (A/B, B/A, ABAB, ABBA, ...).
Job checks for failed jobs.
LunAlignment searches for misaligned luns. Alarms if a certain number of misaligned luns is reached.
NetInterface checks if a network interface's current-port is not equal to its home-port (output of the CLI command `network interface show -is-home false`). Can also check its operational mode (up/down).
ServiceProcessor checks the status of the nodes' service-processors and whether they are correctly configured (autoupdate, IP-address).
BadlyPerformingDisks checks all disks in a NetApp system or in a specific raid-group. If a certain number of them perform badly (= have a high utilization), an alarm is sent.
PerfTcpIp checks CRC errors and packets sent/received for both the IP and TCP layer.
VolumeAge searches for and flags volumes which were created a (configurable) long time ago. An old age may be an indication of a forgotten and unused volume-clone. The logic can also be inverted to search for volumes with an exceptionally short age (which were created within the last day or so).
PerfNic checks various performance counters of a NetApp's *physical* network interface (NIC). Among them are crc/transmit-error-counters, which can be used to detect errors on the physical network-layer.

deduplication

Sis checks dedup-values (stale-fingerprint-percentage, run-time of last successful operation).
SisStatus finds volumes whose compression or deduplication is not enabled.

disk

check_netapp_spare counts the number of available spare-disks or -partitions and sends an alarm if this number falls below a given threshold. Also considers the type of the disk (or partition) and its location. [help]
Disk checks for failed, offline or unassigned disks on the filer.
DiskCount counts the number of disks matching definable criteria (disk-type, container (spare, ...), storage-pool). Mostly used to monitor the number of spare-disks of a certain type.
DiskPathQuality checks disk path qualities, reports I/O-error percentages and raises a CRITICAL error whenever an error percentage is above zero.
DiskPaths checks if each disk has a given number and pattern of paths (A/B, B/A, ABAB, ABBA, ...).
ShelfBay checks the shelf- and disk-port status. Can alarm on BYP-status disks.
BadlyPerformingDisks checks all disks in a NetApp system or in a specific raid-group. If a certain number of them perform badly (= have a high utilization), an alarm is sent.
PerfDisk checks all disks in a NetApp system for their utilization (percentage of time there was at least one outstanding request to the disk). Optionally, the check can be limited to the disks of a single aggregate.
PerfHostadapter checks and counts rates per host adapter (Fibre Channel, Serial Attached SCSI, and parallel SCSI).
PerfSys checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops
PerfSysNode checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops. The check evaluates these counters per Node and works only for DataONTAP 8.3 or later.

EMS

Event Management System

check_netapp_ems checks the ems-log for the number of specific events per time-unit (rate). Alarms if, e.g., too many autogrow-events took place within the last hour or day. [help]
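
Counting events per time unit, as this entry describes, amounts to counting log timestamps inside a sliding window. The following is a minimal sketch under that assumption; the function names and the threshold parameter are hypothetical and do not reflect the plugin's options.

```python
from datetime import datetime, timedelta
from typing import Iterable


def events_in_window(timestamps: Iterable[datetime], now: datetime,
                     window: timedelta) -> int:
    """Count events whose timestamp lies within the last `window` before `now`."""
    start = now - window
    return sum(1 for ts in timestamps if start <= ts <= now)


def rate_exceeded(count: int, max_events: int) -> bool:
    """Return True if more events occurred than the (hypothetical) limit allows."""
    return count > max_events


now = datetime(2024, 1, 1, 12, 0)
# Events that happened 5, 30, 59, 61 and 600 minutes ago.
stamps = [now - timedelta(minutes=m) for m in (5, 30, 59, 61, 600)]
last_hour = events_in_window(stamps, now, timedelta(hours=1))
print(last_hour, rate_exceeded(last_hour, max_events=2))  # 3 True
```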

fibre-channel

FCPAdapter checks the operational status of all fcp adapters.
PerfHostadapter checks and counts rates per host adapter (Fibre Channel, Serial Attached SCSI, and parallel SCSI).
Head monitors the head's hardware objects (fans, NVRAM, power-supplies, health-state, temperature-sensors).
PerfCpu checks one or all processors in a NetApp system for their utilization.

health

Health States

check_netapp_health monitors the health state of the system and its subsystems. Sends an alarm if the system health status does not match a given pattern like 'ok'. [help]
check_netapp_health.pl monitors the system health. Sends an alarm if the system health status is anything other than 'ok'.

HTTP

HyperText Transfer Protocol

PerfSys checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops
PerfSysNode checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops. The check evaluates these counters per Node and works only for DataONTAP 8.3 or later.

inode

Usage checks the used space in volumes and aggregates. Thresholds can be set in GB or percent.
UsageTrend estimates how long it would take until an aggregate or volume is full if the trend of the last 48h (configurable) continued. Checks both bytes and inodes.
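
A trend-based estimate of this kind can be illustrated with a simple linear extrapolation between the oldest and newest sample in the observation window. This is only a sketch: the function name is hypothetical and the plugin may well use a different fitting method.

```python
from typing import Optional, Sequence, Tuple


def seconds_until_full(samples: Sequence[Tuple[float, float]],
                       capacity: float) -> Optional[float]:
    """Extrapolate linearly from (timestamp, used) samples to the capacity.

    Works the same way for bytes and inodes. Returns None if usage is not
    growing, and 0.0 if the capacity has already been reached.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t_first, used_first), (t_last, used_last) = samples[0], samples[-1]
    if t_last <= t_first:
        raise ValueError("samples must be ordered by increasing time")
    growth_per_second = (used_last - used_first) / (t_last - t_first)
    if growth_per_second <= 0:
        return None
    return max(0.0, (capacity - used_last) / growth_per_second)


# Usage grew from 100 to 200 units in 100 seconds; 200 units remain free.
print(seconds_until_full([(0, 100), (100, 200)], capacity=400))  # 200.0
```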

iops

input/output operations per second

ReportIOPS reports how many iops are consumed by a given tenant.

iSCSI

Internet Small Computer Systems Interface

latency

PerfAggregate checks the 'latency', 'transfer-rate' and other performance counters per aggregate. Shows details for total, read, write and other. Averages and totals over all aggregates of the filer can also be measured and monitored, which allows monitoring of the aggregate-latency and aggregate-transfer-rate at the filer level.
LunLatency checks the 'latency' and 'operations per second' (ops) per LUN. Shows details for total, read, write and other. NetApp recommends monitoring latency as the primary performance indicator.
PerfVolume checks the 'latency' and 'operations per second' (ops) per volume. Shows details for total, read, write and other. NetApp recommends monitoring latency as the primary performance indicator.

license

check_netapp_license.pl checks the filer for expiring (demo-)licenses.

logfile

check_netapp_asup.pl monitors the ASUP-log and alarms if failed transmissions or collections were found.
Job checks for failed jobs.
check_netapp_ems checks the ems-log for the number of specific events per time-unit (rate). Alarms if, e.g., too many autogrow-events took place within the last hour or day. [help]

LUN

Logical Unit Number

check_netapp_unused_lun.pl checks for luns which are online but do not have an initiator connected.
LunAlignment searches for misaligned luns. Alarms if a certain number of misaligned luns is reached.
LunSize checks the unused but allocated blocks inside of a LUN. Notifies the admin if they exceed a certain number (the admin may then run an unmap procedure on VMware).
LunState checks the LUN-states. Alarms if they are offline or not mapped to an initiator.
LunLatency checks the 'latency' and 'operations per second' (ops) per LUN. Shows details for total, read, write and other. NetApp recommends monitoring latency as the primary performance indicator.

memory

BufferCache checks several metrics of the system buffer cache (=system memory) like Buffers being read, Buffers being written, Empty (unused) buffers, Buffers with modified data, Buffers associated with CP IO, ...
FlashCache checks several metrics of the external FlashCache (PAM II) like External cache hit rate, Average latency of read I/Os, Number of wafl buffers served off the external cache, ...

metro-cluster

see cluster

multi-tenant

ReportIOPS reports how many iops are consumed by a given tenant.
ReportSpace reports how much space in bytes is consumed by a given tenant.

network

NetPort checks if the network-interfaces are enabled or not
FCPAdapter checks the operational status of all fcp adapters.
IfGrp checks if an interface-group has enough links in up-state to still be redundant.
NetInterface checks if a network interface's current-port is not equal to its home-port (output of the CLI command `network interface show -is-home false`). Can also check its operational mode (up/down).
PerfIf checks and counts transfer-rates and errors per network-interface (ifnet). Especially useful for monitoring 10GbE-ports.
PerfLif checks and counts transfer-rates and errors per network-interface (lif) for DataONTAP 8.2.x or higher.
PerfSys checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops
PerfSysNode checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops. The check evaluates these counters per Node and works only for DataONTAP 8.3 or later.
PerfTcpIp checks CRC errors and packets sent/received for both the IP and TCP layer.
PerfNic checks various performance counters of a NetApp's *physical* network interface (NIC). Among them are crc/transmit-error-counters, which can be used to detect errors on the physical network-layer.

nfs

Network File System

PerfSys checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops
PerfSysNode checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops. The check evaluates these counters per Node and works only for DataONTAP 8.3 or later.

node

Job checks for failed jobs.
ServiceProcessor checks the status of the nodes' service-processors and whether they are correctly configured (autoupdate, IP-address).
PerfSysNode checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops. The check evaluates these counters per Node and works only for DataONTAP 8.3 or later.

NVRAM

Non-Volatile RAM

NVRAM checks data-rates and latency of the NVRAM.
Wafl reads WAFL performance-counters like cp_count twice and calculates the rate of CPs per second. Different types of consistency-points (wafl-timer, back-to-back, ...) can be checked. The information gathered from this plugin corresponds to the CPty-column of 'sysstat -x 1'.

overcommit

OvercommitAggr returns a list of aggregates together with their overcommitment in percent. Overcommitment is the relation between the aggregate's size and the total size of all its (thin-provisioned) volumes.
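
Reading that relation as total volume size divided by aggregate size (an assumption; the plugin may define or scale it differently), the overcommitment percentage could be computed as in this sketch. The function name is hypothetical.

```python
from typing import Iterable


def overcommit_percent(aggregate_size: float, volume_sizes: Iterable[float]) -> float:
    """Return the summed volume sizes as a percentage of the aggregate size.

    Values above 100 mean that more space was promised than physically exists.
    """
    if aggregate_size <= 0:
        raise ValueError("aggregate size must be positive")
    return 100.0 * sum(volume_sizes) / aggregate_size


# Three thin-provisioned volumes of 60, 80 and 60 units on a 100-unit aggregate.
print(overcommit_percent(100, [60, 80, 60]))  # 200.0
```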

performance

check_netapp_process.pl checks for runaway processes on a filer (as shown with the ps command).
PerfNic checks various performance counters of a NetApp's *physical* network interface (NIC). Among them are crc/transmit-error-counters, which can be used to detect errors on the physical network-layer.

process

processes running in the filer's OS, output of the ps-command

check_netapp_process.pl checks for runaway processes on a filer (as shown with the ps command).

qtree

PerfQtree checks some ops-counters per q-tree (nfs-ops, cifs-ops, ...).

quota

check_netapp_quotas.pl monitors quotas on a NetApp-filer (cluster mode only).

RAID

Redundant Array of Independent Disks

Raidstatus alarms if one of the RAIDs is degraded.

SAS

Serial Attached SCSI

PerfHostadapter checks and counts rates per host adapter (Fibre Channel, Serial Attached SCSI, and parallel SCSI).

SCSI

Small Computer System Interface

see SAS

shelf

ShelfEnvironment checks the shelf-status, power-supplies, temperature, fans, voltage-sensors and current-sensors on the shelves.
ShelfBay checks the shelf- and disk-port status. Can alarm on BYP-status disks.

sis

Single-Instance Storage

see deduplication

snap-reserve

see snapshot

SnapCenter

check_netapp_snapcenter checks the SnapCenter database for failed or missing jobs. This way alarms are sent immediately if backups do not run as expected. [help]

snapmirror

SnapMirror

SnapMirrorMetrics checks and logs SnapMirrors (including type Vault): lag-time, last-transfer-duration, last-transfer-size
SnapMirrorState checks and logs SnapMirror relations (including type Vault): health, mirror-state
UnprotectedVolume checks for volumes not protected by SnapMirror.
check_netapp_snapcenter checks the SnapCenter database for failed or missing jobs. This way alarms are sent immediately if backups do not run as expected. [help]
check_netapp_snapmirror checks the health- and mirror-state of SnapMirror relations. Also monitors various metrics (lag-time, last-transfer-size, last-transfer-duration). [help]

snapshot

Snapshots checks whether the snap-reserve is still sufficient. Thresholds are set in percent; performance-data can be either in percent or absolute (bytes). Additional criteria are the age or name of the snapshot. This can be used to monitor snapshot-backups and whether they are up to date. It can also be used to find snapshots related to a specific application like SNMV and to check all volumes for left-over snapshots.
SnapshotLessVolume searches for volumes which do not have snapshots.
check_netapp_snapcenter checks the SnapCenter database for failed or missing jobs. This way alarms are sent immediately if backups do not run as expected. [help]

snapvault

SnapVault

see snapmirror
check_netapp_snapcenter checks the SnapCenter database for failed or missing jobs. This way alarms are sent immediately if backups do not run as expected. [help]

spare-disk

check_netapp_spare counts the number of available spare-disks or -partitions and sends an alarm if this number falls below a given threshold. Also considers the type of the disk (or partition) and its location. [help]
DiskCount counts the number of disks matching definable criteria (disk-type, container (spare, ...), storage-pool). Mostly used to monitor the number of spare-disks of a certain type.

SVM

Storage Virtual Machines, formerly known as Vserver

LunAlignment searches for misaligned luns. Alarms if a certain number of misaligned luns is reached.
ReportIOPS reports how many iops are consumed by a given tenant.
ReportSpace reports how much space in bytes is consumed by a given tenant.
Vserver monitors the admin-state or the operational-status of a Vserver (running, stopped, inconsistent or defunct)
MetroClusterVserver sends an alarm if the configuration state of a MetroCluster vserver changes to unhealthy.
check_netapp_certificate checks SVM (and possibly also other) certificates for their expiration time. Will trigger an alarm if certificates expire soon (configurable thresholds). [help]

syncmirror

SyncMirror

SyncMirror checks the mirror-status on Metro Cluster aggregates.

system

check_netapp_health monitors the health state of the system and its subsystems. Sends an alarm if the system health status does not match a given pattern like 'ok'. [help]
Uptime checks the seconds since last reboot.
check_netapp_process.pl checks for runaway processes on a filer (as shown with the ps command).
check_netapp_time.pl checks the filer's NTP configuration (at least one NTP server must be configured) and measures the drift between the filer's system-time and the monitoring server. Can alarm if that drift is getting too high.
BufferCache checks several metrics of the system buffer cache (=system memory) like Buffers being read, Buffers being written, Empty (unused) buffers, Buffers with modified data, Buffers associated with CP IO, ...
FlashCache checks several metrics of the external FlashCache (PAM II) like External cache hit rate, Average latency of read I/Os, Number of wafl buffers served off the external cache, ...
NVRAM checks data-rates and latency of the NVRAM.
PerfCpu checks one or all processors in a NetApp system for their utilization.
PerfSys checks various performance counters of the NetApp-system (mostly operations/second and transfer-rates). Counters supported: net_data_sent, dafs_ops, total_ops, disk_data_written, net_data_recv, cifs_ops, streaming_pkts, http_ops, nfs_ops, fcp_ops, disk_data_read, iscsi_ops
Wafl reads WAFL performance-counters like cp_count twice and calculates the rate of CPs per second. Different types of consistency-points (wafl-timer, back-to-back, ...) can be checked. The information gathered from this plugin corresponds to the CPty-column of 'sysstat -x 1'.

thin-provisioning

see overcommit

vfiler

see SVM

volume

Usage checks the used space in volumes and aggregates. Thresholds can be set in GB or percent.
AutosizeMode checks whether all autosized volumes have their autosize-mode set to a given value (grow, grow_shrink, ...).
ReportIOPS reports how many iops are consumed by a given tenant.
ReportSpace reports how much space in bytes is consumed by a given tenant.
Sis checks dedup-values (stale-fingerprint-percentage, run-time of last successful operation).
SisStatus finds volumes whose compression or deduplication is not enabled.
SnapshotLessVolume searches for volumes which do not have snapshots.
UnprotectedVolume checks for volumes not protected by SnapMirror.
UsageTrend estimates how long it would take until an aggregate or volume is full if the trend of the last 48h (configurable) continued. Checks both bytes and inodes.
VolumeAutosize checks a volume's total-size and alerts when the volume is close to being full relative to the autosize maximum.
VolumeState checks the volume-states. Alarms if they are not online (configurable).
PerfVolume checks the 'latency' and 'operations per second' (ops) per volume. Shows details for total, read, write and other. NetApp recommends monitoring latency as the primary performance indicator.
VolumeAge searches for and flags volumes which were created a (configurable) long time ago. An old age may be an indication of a forgotten and unused volume-clone. The logic can also be inverted to search for volumes with an exceptionally short age (which were created within the last day or so).

vserver

see SVM

WAFL

Write Anywhere File Layout

Wafl reads WAFL performance-counters like cp_count twice and calculates the rate of CPs per second. Different types of consistency-points (wafl-timer, back-to-back, ...) can be checked. The information gathered from this plugin corresponds to the CPty-column of 'sysstat -x 1'.

S3

S3 Storage

check_netapp_s3bucket checks the free or used size in an S3 bucket linked to a NetApp filer. [help]