Monitoring

Monitor SSH-KLM health, performance, and security metrics.

# Basic health check
curl https://ssh-klm.example.com/health
# Detailed health
curl https://ssh-klm.example.com/health/detailed

Response:

{
  "status": "healthy",
  "components": {
    "database": "healthy",
    "redis": "healthy",
    "vault": "healthy"
  },
  "version": "2.0.0"
}
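
A monitoring script can reduce this response to a list of failing components. The sketch below assumes the payload has exactly the shape shown above and that any value other than "healthy" signals a problem; the function name unhealthy_components and the "overall" marker are illustrative and not part of SSH-KLM.

```python
import json


def unhealthy_components(payload: str) -> list[str]:
    """Return names of components whose status is not "healthy".

    Assumes the /health/detailed response shape shown above.
    """
    data = json.loads(payload)
    failing = [
        name
        for name, status in data.get("components", {}).items()
        if status != "healthy"
    ]
    # Report the top-level status too, in case it degrades independently.
    if data.get("status") != "healthy":
        failing.append("overall")
    return sorted(failing)


sample = (
    '{"status": "healthy", "components": {"database": "healthy", '
    '"redis": "healthy", "vault": "healthy"}, "version": "2.0.0"}'
)
print(unhealthy_components(sample))  # prints []
```

An empty list means every component reported healthy; anything else can be passed to whatever paging or notification mechanism is already in place.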

Metrics available at /metrics:

Metric                               Type       Description
sshklm_hosts_total                   Gauge      Total managed hosts
sshklm_keys_total                    Gauge      Total SSH keys
sshklm_rotations_total               Counter    Key rotations performed
sshklm_rotations_failed_total        Counter    Failed rotations
sshklm_discoveries_duration_seconds  Histogram  Discovery scan duration
sshklm_api_requests_total            Counter    API requests by endpoint
sshklm_agents_connected              Gauge      Connected agents

Prometheus scrape configuration:

scrape_configs:
  - job_name: 'ssh-klm'
    static_configs:
      - targets: ['ssh-klm.example.com:9090']
    metrics_path: /metrics
    scheme: https
    bearer_token: 'YOUR_METRICS_TOKEN'
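
Before wiring up Prometheus, it can help to spot-check what /metrics returns. The sketch below sums sample values per metric name from text in the Prometheus exposition format. It is a deliberately simplified parser (labels containing a closing brace are not handled), and the sample text, including its label values, is made up for illustration rather than captured from SSH-KLM.

```python
import re
from collections import defaultdict

# Matches `name{labels} value [timestamp]`. This is a simplification of the
# Prometheus text exposition format: label values containing "}" will not match.
SAMPLE_RE = re.compile(
    r"^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{[^}]*\})?\s+(?P<value>\S+)(?:\s+-?\d+)?$"
)


def sum_by_metric(text: str, prefix: str = "sshklm_") -> dict[str, float]:
    """Sum sample values per metric name, ignoring labels and comment lines."""
    totals: dict[str, float] = defaultdict(float)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match and match["name"].startswith(prefix):
            totals[match["name"]] += float(match["value"])
    return dict(totals)


# Hypothetical scrape output; the label values are assumptions.
sample = """\
# TYPE sshklm_hosts_total gauge
sshklm_hosts_total{status="active"} 40
sshklm_hosts_total{status="unreachable"} 2
sshklm_agents_connected 38
"""
print(sum_by_metric(sample))
# prints {'sshklm_hosts_total': 42.0, 'sshklm_agents_connected': 38.0}
```

For anything beyond a quick check, a full exposition-format parser is the safer choice.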

Import the official Grafana dashboard (ID: 18456).

Or create custom panels:

# Total hosts by status
sum by (status) (sshklm_hosts_total)
# Keys using outdated algorithms (DSA, RSA-1024)
sum(sshklm_keys_total{algorithm=~"dsa|rsa-1024"})
# Rotation success rate (last 24h)
sum(rate(sshklm_rotations_total[24h])) /
(sum(rate(sshklm_rotations_total[24h])) + sum(rate(sshklm_rotations_failed_total[24h])))
# Rotations per hour
sum(rate(sshklm_rotations_total[1h])) * 3600
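
The success-rate query divides successes by successes plus failures, which implies that sshklm_rotations_total counts only successful rotations; that reading is an assumption drawn from the query itself. The sketch below reproduces the arithmetic on plain counts so a panel value can be sanity-checked by hand. The function name is illustrative.

```python
from typing import Optional


def rotation_success_rate(succeeded: float, failed: float) -> Optional[float]:
    """Mirror the PromQL ratio: succeeded / (succeeded + failed).

    Returns None when there were no rotations at all, the case where the
    PromQL expression has no meaningful value (0 / 0).
    """
    total = succeeded + failed
    if total == 0:
        return None
    return succeeded / total


print(rotation_success_rate(95, 5))  # prints 0.95
```

A quiet 24-hour window with no rotations is therefore worth handling explicitly in the panel, for example with a "no data" display option.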

Alerting rules:

groups:
  - name: ssh-klm
    rules:
      - alert: SSHKLMHighFailedRotations
        expr: rate(sshklm_rotations_failed_total[5m]) > 0.1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: High rotation failure rate
      - alert: SSHKLMAgentDisconnected
        expr: sshklm_agents_connected < sshklm_agents_registered
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: Agents disconnected
      - alert: SSHKLMDatabaseUnhealthy
        expr: sshklm_health_database != 1
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: Database unhealthy
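
Because rate() returns a per-second value, the 0.1 threshold in SSHKLMHighFailedRotations corresponds to 6 failed rotations per minute, and for: 10m means the condition has to hold at every evaluation for ten minutes before the alert fires. The sketch below is a simplified model of that pending-then-firing behaviour, not Prometheus's implementation; the function name and the 60-second evaluation interval are assumptions.

```python
from typing import Optional, Sequence


def alert_fires(
    values: Sequence[float],
    threshold: float,
    for_seconds: int,
    interval_seconds: int = 60,
) -> bool:
    """Return True if `value > threshold` holds continuously for `for_seconds`.

    `values` are the expression results at successive rule evaluations,
    spaced `interval_seconds` apart.
    """
    pending_since: Optional[int] = None
    for i, value in enumerate(values):
        if value > threshold:
            if pending_since is None:
                pending_since = i  # alert enters the pending state
            if (i - pending_since) * interval_seconds >= for_seconds:
                return True  # pending long enough: alert fires
        else:
            pending_since = None  # condition cleared: reset the timer
    return False


# A single dip below the threshold resets the 10-minute timer.
print(alert_fires([0.2] * 11, threshold=0.1, for_seconds=600))  # prints True
print(alert_fires([0.2] * 5 + [0.0] + [0.2] * 10, threshold=0.1, for_seconds=600))  # prints False
```

If rotations are infrequent in a given deployment, a lower threshold or a longer rate window may be a better fit than the values shown in the rules.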

Example log entry:

{
  "level": "info",
  "timestamp": "2026-01-06T10:00:00Z",
  "message": "Key rotation completed",
  "keyId": "key_abc123",
  "hostId": "host_xyz789",
  "duration_ms": 1234,
  "traceId": "abc123"
}
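
Structured entries like this one are straightforward to filter without a full log pipeline. The sketch below assumes the log file holds one JSON object per line (the entry above is shown pretty-printed) and uses the field names from that entry; the function name slow_rotations and the second sample line are illustrative.

```python
import json
from typing import Iterable, Iterator


def slow_rotations(lines: Iterable[str], min_duration_ms: int) -> Iterator[dict]:
    """Yield "Key rotation completed" entries at or above min_duration_ms."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict):
            continue
        if (
            entry.get("message") == "Key rotation completed"
            and entry.get("duration_ms", 0) >= min_duration_ms
        ):
            yield entry


# The first line matches the example entry; the second is made up.
sample_lines = [
    '{"level": "info", "message": "Key rotation completed", "keyId": "key_abc123", "duration_ms": 1234}',
    '{"level": "info", "message": "Key rotation completed", "keyId": "key_def456", "duration_ms": 200}',
]
for entry in slow_rotations(sample_lines, min_duration_ms=1000):
    print(entry["keyId"], entry["duration_ms"])  # prints key_abc123 1234
```

The traceId field can be used the same way to collect every entry belonging to a single operation.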

Datadog:

logs:
  - type: file
    path: /var/log/ssh-klm/*.log
    service: ssh-klm
    source: ssh-klm

ELK Stack:

filebeat.inputs:
  - type: log
    paths:
      - /var/log/ssh-klm/*.log
    json.keys_under_root: true

SSH-KLM logs all security-relevant events:

  • User authentication
  • Key rotations
  • Policy changes
  • Agent registrations
  • API key creation/deletion

Query audit logs:

ssh-klm admin audit:query \
  --start "2026-01-01" \
  --end "2026-01-07" \
  --action "key.rotate"