SNMP Exporter Guide: Mapping ER605 OIDs to Prometheus Metrics

SNMP Exporter transforms router SNMP data into Prometheus metrics, enabling detailed traffic, interface, CPU, and memory observability for devices like the Omada ER605.

The Omada ER605 may ship as a simple “it works” router, but with SNMP Exporter you can translate the device’s SNMP OIDs into Prometheus metrics and build real observability around traffic, interface health, CPU and memory, and actionable alerts with Grafana dashboards. In this article I walk through the rationale for using SNMP Exporter with an ER605, explain which OIDs to collect and why, show how to configure a Prometheus scrape job and SNMP v3 credentials, outline recommended Prometheus expressions (including byte-to-bit conversion and 64-bit counters), and describe the operational and security trade-offs you should consider when deploying this stack in production.

Why use SNMP Exporter for ER605 monitoring

The ER605 does not expose an HTTP metrics endpoint, so it cannot be scraped directly by Prometheus. SNMP Exporter acts as a translator: it queries the router over SNMP, maps OIDs to Prometheus metric names and labels, and exposes those metrics on an HTTP /metrics endpoint that Prometheus can scrape. This lets you turn the ER605 from a black-box “it’s alive” device into a transparent, measurable component of your network telemetry—showing real throughput per interface, link state, error counts, storage and CPU usage, and more.

How SNMP collection differs: walk vs get

SNMP Exporter supports two primary collection styles: walk and get. A walk traverses an entire OID tree and is ideal for tables—interfaces, storage entries, and per-core processor loads. Walking can yield many rows and is the default for tables. A get retrieves single OIDs and is lighter-weight; use it for scalar values such as system uptime. For the ER605, a hybrid approach is best: walk the standard interface, storage and processor trees and use get for sysUpTime and similar scalars.
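The hybrid approach can be sketched as a fragment of the exporter's snmp.yml. This is a minimal illustration, assuming the layout used by recent SNMP Exporter releases (a top-level modules key) and reusing the module name omada_full from the scrape job later in this article. In practice this file is usually produced by the exporter's generator rather than written by hand, and each walked OID also needs a matching entry under metrics before it appears in the output.

```yaml
# snmp.yml (fragment): which OIDs to walk and which to fetch with a single get
modules:
  omada_full:
    walk:
      - 1.3.6.1.2.1.2          # interfaces (ifTable: status, errors, discards)
      - 1.3.6.1.2.1.31.1.1     # ifXTable (ifName, ifHC* counters, ifHighSpeed)
      - 1.3.6.1.2.1.25.2.3     # hrStorageTable
      - 1.3.6.1.2.1.25.3.3     # hrProcessorTable (hrProcessorLoad per core)
    get:
      - 1.3.6.1.2.1.1.3.0      # sysUpTime, a scalar, so the instance suffix .0 is included
```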

Core OIDs to collect and why they matter

A monitoring profile aimed at the ER605 should include a handful of standard SNMP MIB trees and specific OIDs that provide the insight you need:

System (1.3.6.1.2.1.1): device uptime and basic system information—useful for correlating restarts with incidents.
Interfaces (1.3.6.1.2.1.2 and 1.3.6.1.2.1.31): interface descriptions, names, types, admin/operational status, MTU, speed and high-capacity counters. These let you detect link downs, identify which physical port corresponds to a Prometheus label, and compute utilization.
Processor load (HOST-RESOURCES-MIB hrProcessorLoad, 1.3.6.1.2.1.25.3.3.1.2): per-core CPU load. Aggregating or averaging cores gives meaningful CPU utilization metrics.
Storage (HOST-RESOURCES-MIB hrStorageTable): allocation units, size and used values so you can compute percent-used for flash and other storage components. Percent used is (hrStorageUsed / hrStorageSize) * 100; both values are counted in allocation units, so the units cancel. Multiply by hrStorageAllocationUnits only when you need absolute sizes in bytes.
Traffic counters (ifHCInOctets / ifHCOutOctets, 1.3.6.1.2.1.31.1.1.1.6 and …1.10): the 64-bit high-capacity octet counters are essential for accurate throughput on modern links; always prefer ifHC* over legacy 32-bit counters like ifInOctets.
Errors and discards (ifInErrors, ifOutErrors, ifInDiscards, ifOutDiscards): useful for detecting physical problems, duplex mismatches or congestion.

Collecting these OIDs gives you a practical set of signals for both network engineers and platform teams: link state and real throughput, device health, and resource exhaustion indicators.

Prometheus expressions you’ll use frequently

Raw counters from SNMP require small Prometheus calculations to become useful metrics:

Convert octets to bits-per-second for a 1-minute rate:
rate(ifHCInOctets[1m]) * 8
Compute outgoing bps similarly:
rate(ifHCOutOctets[1m]) * 8
Average per-core CPU:
avg(hrProcessorLoad) by (instance)
Storage percent used (allocation units written out for clarity; they cancel, so hrStorageUsed / hrStorageSize * 100 gives the same result):
(hrStorageUsed * hrStorageAllocationUnits) / (hrStorageSize * hrStorageAllocationUnits) * 100
These simple formulas let you populate dashboards and set thresholds for alerts.
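If you reuse these expressions across several dashboards, one option is to precompute them with Prometheus recording rules. The sketch below is an illustration only: the group name and the record names are hypothetical, and the metric names assume your exporter module exposes them exactly as written above.

```yaml
# rules/omada_recording.yml (hypothetical file name)
groups:
  - name: omada_er605_recording
    rules:
      # Inbound and outbound throughput in bits per second.
      # A 1m rate window needs at least two samples, so scrape every 30s or faster.
      - record: interface:in_bits_per_second:rate1m
        expr: rate(ifHCInOctets[1m]) * 8
      - record: interface:out_bits_per_second:rate1m
        expr: rate(ifHCOutOctets[1m]) * 8
      # Average CPU load across cores, per device.
      - record: instance:cpu_load_percent:avg
        expr: avg by (instance) (hrProcessorLoad)
      # Storage percent used; allocation units cancel out of the ratio.
      - record: storage:used_percent:ratio
        expr: hrStorageUsed / hrStorageSize * 100
```

The file would then be listed under rule_files in prometheus.yml so Prometheus evaluates it on its normal rule interval.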

SNMP v3 configuration and security considerations

SNMP v3 is strongly recommended over v1/v2c because it supports authentication and privacy. On the ER605, however, there are device-specific quirks: SHA-based authentication may be unstable for some firmware/hardware combinations, while MD5 tends to work reliably. In practice that often means configuring the SNMP Exporter auth block with version: 3, security_level: authNoPriv and auth_protocol: MD5, plus the corresponding username and password. authNoPriv authenticates requests but does not encrypt SNMP payloads; it is better than community strings, yet the data still crosses the network in clear text. If your environment requires confidentiality, evaluate SNMP v3 with privacy (authPriv) or place SNMP traffic on an isolated management network or VPN.

Example auth block (conceptual):
version: 3
security_level: authNoPriv
username:
auth_protocol: MD5
password: ""

Be explicit with version and security_level in the exporter's auth block, and reference that block by name in the Prometheus job's auth param, so the exporter talks to the router using the right profile.
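In context, the auth block sits under a top-level auths key in the exporter's snmp.yml. The following is a sketch, assuming a recent SNMP Exporter release that supports named auths; the name omada_v3 matches the scrape job in this article, while the username and passwords are placeholders you would replace with the credentials configured on the router.

```yaml
# snmp.yml (fragment): a named SNMP v3 credential profile
auths:
  omada_v3:
    version: 3
    security_level: authNoPriv
    username: monitoring-user        # placeholder, not a real account
    auth_protocol: MD5               # chosen because SHA may be unreliable on the ER605
    password: CHANGE_ME              # placeholder; inject from a secrets manager
    # If the device and firmware handle privacy reliably, an authPriv
    # profile would additionally set (untested on the ER605 here):
    # security_level: authPriv
    # priv_protocol: AES
    # priv_password: CHANGE_ME_TOO
```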

A practical Prometheus scrape job for SNMP Exporter

Your Prometheus scrape config needs to call the SNMP Exporter endpoint and pass the module and auth identifiers that match your exporter’s snmp.yml module. A typical job sets metrics_path to /snmp and includes params for both module and auth, along with a static target that points at the router’s IP.

Key settings (conceptual):
job_name: "omada-router"
metrics_path: /snmp
params:
  auth: [omada_v3]
  module: [omada_full]
static_configs:
  - targets: [""]

This pattern keeps your Prometheus config clean: the exporter maps OIDs into names and labels, and Prometheus only needs to know which exporter module and credentials to use.
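A fuller version of the job usually also needs relabeling, because Prometheus must send the HTTP request to the exporter while passing the router's address as the target parameter. The sketch below assumes the exporter runs on the same host as Prometheus on its default port 9116, and uses 192.0.2.1 as a stand-in for the router's management IP; both values are illustrative.

```yaml
# prometheus.yml (fragment)
scrape_configs:
  - job_name: "omada-router"
    metrics_path: /snmp
    params:
      auth: [omada_v3]        # name of the auth block in snmp.yml
      module: [omada_full]    # name of the module in snmp.yml
    static_configs:
      - targets: ["192.0.2.1"]   # placeholder router IP
    relabel_configs:
      # Pass the router address to the exporter as ?target=...
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the router address as the instance label on all series
      - source_labels: [__param_target]
        target_label: instance
      # Send the actual scrape request to the exporter itself
      - target_label: __address__
        replacement: "127.0.0.1:9116"   # assumed exporter address
```

With this in place, the instance label on every series is the router rather than the exporter, which keeps dashboards and alerts readable.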

A recommended SNMP Exporter module profile (what to include)

A focused module for the ER605 should walk both device and table trees and explicitly map the most useful OIDs to Prometheus-friendly metric names and index labels. The critical elements are:

walk: the system tree, interface tree, high-capacity interface counters, processor table, and storage table.
get: sysUpTime as a scalar.
metrics: map hrProcessorLoad, hrStorage* members, ifDescr/ifName/ifAlias, ifAdminStatus/ifOperStatus, ifInErrors/ifOutErrors, ifInDiscards/ifOutDiscards, ifHCInOctets/ifHCOutOctets, and ifHighSpeed. Use sensible metric types (gauge vs counter) and include index label mappings (ifIndex or hrStorageIndex) so you can join per-interface series to human-friendly labels.

Using a complete module like this ensures your SNMP Exporter exposes structured, machine-readable metrics that downstream systems and dashboards can use.
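To make the metric mapping concrete, here is a small excerpt of what the metrics section of such a module might look like in snmp.yml. It is a sketch in the style of generator output, not a complete ER605 profile: only a few metrics are shown, the help strings are paraphrased, and the result should be checked against what your router actually returns.

```yaml
# snmp.yml (fragment): metric definitions inside modules.omada_full
    metrics:
      - name: sysUpTime
        oid: 1.3.6.1.2.1.1.3
        type: gauge
        help: Time since the network management portion of the system was last re-initialized
      - name: ifOperStatus
        oid: 1.3.6.1.2.1.2.2.1.8
        type: gauge
        help: Operational state of the interface (1 = up, 2 = down)
        indexes:
          - labelname: ifIndex
            type: gauge
      - name: ifHCInOctets
        oid: 1.3.6.1.2.1.31.1.1.1.6
        type: counter
        help: Total octets received on the interface (64-bit counter)
        indexes:
          - labelname: ifIndex
            type: gauge
        lookups:
          # Attach a human-friendly ifName label looked up by ifIndex
          - labels: [ifIndex]
            labelname: ifName
            oid: 1.3.6.1.2.1.31.1.1.1.1
            type: DisplayString
```

Applying the same lookup to the other interface metrics gives every per-interface series an ifName label alongside ifIndex.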

Why always use 64-bit counters for traffic

Modern network interfaces, even on home/SMB routers, can overflow 32-bit counters quickly: at a sustained 1 Gbps, a 32-bit octet counter wraps roughly every 34 seconds (2^32 bytes * 8 bits / 10^9 bps). The ifHC* (High Capacity) counters are 64-bit and are the correct source for throughput calculations. If you accidentally monitor ifInOctets/ifOutOctets (32-bit), a counter can wrap between scrapes, producing artifacts and misleading reports on busy links. Always prefer ifHCInOctets and ifHCOutOctets in your SNMP module.

Alerts that turn observability into action

Once metrics flow into Prometheus and Grafana, you can create targeted alerts. Useful examples:

Interface down: fire when ifOperStatus == 2 (down) for more than a short window. This detects physical link failures.
Rapid error rates: rate(ifInErrors[5m]) > 10 to catch spikes that may indicate a failing NIC or cabling issues.
Saturation: detect when outgoing bps exceeds a chosen fraction of the link capacity, e.g., rate(ifHCOutOctets[1m]) * 8 > (link_speed * threshold). Substitute the actual interface capacity; for example, > 800,000,000 bps is 80% of a 1 Gbps link.
Storage or CPU exhaustion: alert when storage percent used or averaged CPU crosses business-defined limits.

Avoid alert fatigue: tune durations, thresholds and suppression rules so alerts fire only when action is required.
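The examples above could be expressed as a Prometheus alerting rule file along these lines. Treat it as a starting sketch: the group name, alert names, severities, and for durations are assumptions to tune, the 80% saturation factor is illustrative, and the saturation rule assumes ifHighSpeed (reported in Mbps) is exported with the same labels as the octet counters.

```yaml
# rules/omada_alerts.yml (hypothetical file name)
groups:
  - name: omada_er605_alerts
    rules:
      - alert: InterfaceDown
        expr: ifOperStatus == 2
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Interface index {{ $labels.ifIndex }} on {{ $labels.instance }} is down"
      - alert: InterfaceInputErrors
        expr: rate(ifInErrors[5m]) > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High input error rate on {{ $labels.instance }} (ifIndex {{ $labels.ifIndex }})"
      - alert: InterfaceOutboundSaturation
        # ifHighSpeed is in Mbps, so scale to bps and compare against 80% of capacity
        expr: rate(ifHCOutOctets[1m]) * 8 > ifHighSpeed * 1e6 * 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Outbound traffic above 80% of link speed on {{ $labels.instance }}"
```

The for clause is what keeps short blips from paging anyone; unused ports that are administratively up will also report down, so you may want to restrict InterfaceDown to the interfaces you care about.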

How this setup changes operations and diagnosis

With SNMP Exporter, the router stops being a black box. Instead of asking “is the router working?” you can answer “how is the router behaving?” and “which interface is saturated?” This enables:

Network troubleshooting based on real throughput and error signals, not guesswork.
Capacity planning using historical interface utilization.
Faster incident diagnosis: correlate sysUpTime resets (which indicate reboots) with configuration changes or network events.
More useful runbooks: when an alert triggers, responders can quickly check per-interface bps, errors, and link state to decide whether to escalate to hardware replacement, vendor support, or configuration rollback.

Integration with Grafana and related ecosystems

SNMP Exporter + Prometheus plays well with the broader observability stack. Build Grafana dashboards that include:

Per-interface utilization panels showing inbound/outbound bps alongside ifHighSpeed capacity.
Error and discard charts to reveal intermittent physical problems.
Device health tiles: averaged CPU, storage percent, and uptime.
Alert annotation overlays that show when rules fired.

This is also an entry point for integrating with automation platforms and ticketing systems: Prometheus alerts can trigger webhook receivers, automation playbooks, or updates in a CRM or incident management platform to coordinate operator response.
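As one way to wire this up, Alertmanager can forward fired alerts to a webhook that an automation or ticketing system listens on. The fragment below is a minimal sketch; the receiver name and the URL are hypothetical and stand in for whatever endpoint your tooling exposes.

```yaml
# alertmanager.yml (fragment)
route:
  receiver: network-oncall          # default receiver for all alerts
  group_by: [alertname, instance]   # one notification per alert type per device
receivers:
  - name: network-oncall
    webhook_configs:
      - url: "http://automation.example.internal/hooks/prometheus"  # placeholder endpoint
        send_resolved: true         # also notify when the alert clears
```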

Developer and operator implications

From a development and operations perspective:

Automation: treat your SNMP Exporter module and Prometheus job configuration as code—store it in version control, test changes in staging routers or lab gear, and deploy via CI/CD.
Observability hygiene: tag metrics with consistent labels (instance, ifIndex, ifName) so dashboards and alerting rules can be reused across multiple devices and vendors.
Extension: the same approach scales to switches and access points that support SNMP; you can reuse module ideas and build vendor-specific overrides.
Documentation and runbooks: include steps to rotate SNMP credentials, manage firmware compatibility, and fallback procedures if SNMP becomes unreliable.

Security trade-offs and operational best practices

SNMP v3 with authNoPriv is better than SNMP v1/v2c, but without encryption SNMP payloads could be observed by an attacker if the management network is exposed. Best practices include:

Put SNMP traffic on an isolated management VLAN or VPN.
Use firewall rules to restrict which hosts can talk SNMP to devices.
Rotate credentials and keep passwords in a secrets manager, not plain text in repos.
Test SNMP auth protocols against the specific device model and firmware—some devices like the ER605 may require MD5 for reliable authentication, while SHA may fail intermittently. Monitor for compatibility changes after firmware updates.

Vendor and ecosystem context

The approach described here is vendor-agnostic: SNMP Exporter is commonly used to expose metrics from routers, switches and firewalls to Prometheus. However, vendor quirks—MIB naming, available OIDs, and SNMP v3 behavior—vary. When adopting this pattern across multiple device families, maintain vendor-specific module files and a discovery process to map interface indices to names consistently. Consider complementary telemetry approaches where available—e.g., native Prometheus endpoints on newer appliances, sFlow/NetFlow for sampled traffic, or vendor APIs that provide richer context.

Real-world deployment checklist

Before you flip the switch in production, validate these items:

Confirm SNMP access from the SNMP Exporter host to the ER605 and test the authentication profile (try MD5 if SHA fails).
Validate that the exporter’s module maps names and index labels the way you expect; check for correct ifIndex → ifName mapping.
Ensure Prometheus scrape intervals and relabeling rules produce a sensible series volume, and monitor cardinality so it does not grow unchecked.
Build baseline Grafana dashboards and test alert thresholds during a quiet window to reduce false positives.
Lock down network and secrets for SNMP credentials and record recovery/rotation procedures.

Broader implications for monitoring and network observability

Using SNMP Exporter to instrument edge devices like the ER605 illustrates a broader shift: network devices are increasingly treated as first-class telemetry producers, not opaque infrastructure. This has downstream effects for platform teams and vendors. Platform teams must invest in tooling that maps network-level signals to application-level impacts—connecting interface saturation to degraded user experience, for example. Vendors may begin to expose richer telemetry (native Prometheus exporters, metrics APIs or streaming telemetry) that reduces dependency on translation layers, but until then SNMP remains a pragmatic, widely-supported bridge between networking hardware and cloud-native monitoring stacks.

Future-proofing your monitoring strategy means designing for both: supporting SNMP-based collection now while keeping an architecture flexible enough to ingest native metrics, streaming telemetry (gNMI/Telemetry), or cloud-managed API feeds later.

Observability on the network edge is no longer optional—it’s a competitive advantage for operations teams that must diagnose outages, optimize capacity, and automate responses. Putting a small translator like SNMP Exporter in front of a router multiplies the value of existing Prometheus and Grafana investment and brings network telemetry into the same toolkit used for applications and infrastructure.

Looking ahead, expect device telemetry to unify further with application observability: vendors will increasingly offer richer, structured metrics and event streams, while open-source collectors will evolve to normalize those sources. For now, SNMP Exporter provides a pragmatic, maintainable path to meaningful metrics on commodity routers like the Omada ER605—if you standardize on 64-bit counters, manage SNMP v3 credentials carefully, and integrate Prometheus rules and Grafana dashboards into your operational playbooks, you’ll gain visibility that changes the way your team detects, diagnoses, and prevents network problems.