Planned

Monitoring Scale

Full observability. Zero bolt-ons.

The Monitoring Scale provides complete observability for your Tophan cluster. Host metrics, VM metrics, application logs, alerting, and dashboards — all integrated, all in one place.

Features

Feature	Description	Status
Host Metrics	CPU, memory, disk, network, and temperature for every physical node.	Planned
VM Metrics	Per-VM resource consumption, I/O patterns, and performance counters.	Planned
Alerting	Configurable alert rules with multiple notification channels: email, webhook, chat. Escalation policies and silencing.	Planned
Dashboards	Built-in dashboards in Dragon’s Eye. Custom dashboards with drag-and-drop layout.	Planned
Log Aggregation	Centralised log collection from all nodes, VMs, and Scales. Structured logging, full-text search.	Planned
SNMP	SNMP polling and trap reception for physical network equipment. Integrate everything into one view.	Planned
Green Metrics	Power consumption tracking and efficiency scoring. Know your actual energy cost per workload.	Planned
SLA Tracking	Define SLA targets, track compliance, and generate reports. Uptime, response time, availability.	Planned
Metric Retention	Configurable retention with automatic downsampling. Full resolution for recent data, aggregated for historical data.	Planned
API Access	Full metrics API for custom integrations and external tools.	Planned
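
Every feature above is still planned, so no interface has been published yet. To make the SLA Tracking row concrete, here is a minimal sketch of the arithmetic behind an availability target. The function names `availability` and `meets_sla` and the sample outage figures are illustrative assumptions, not part of any Tophan API.

```python
from datetime import timedelta

# Hypothetical helpers: these names are assumptions for illustration only.
def availability(period: timedelta, outages: list[timedelta]) -> float:
    """Return the fraction of the period during which the service was up."""
    downtime = sum(outages, timedelta())
    return 1.0 - downtime / period

def meets_sla(period: timedelta, outages: list[timedelta], target: float) -> bool:
    """Compare measured availability against an SLA target such as 0.999."""
    return availability(period, outages) >= target

# Made-up example data: a 30-day window with two outages totalling 30 minutes.
period = timedelta(days=30)
outages = [timedelta(minutes=20), timedelta(minutes=10)]

print(f"availability: {availability(period, outages):.5f}")
print("meets 99.9%: ", meets_sla(period, outages, 0.999))
print("meets 99.95%:", meets_sla(period, outages, 0.9995))
```

A 30-day window allows about 43 minutes of downtime at 99.9% and about 22 minutes at 99.95%, so the same 30 minutes of outage passes the first target and fails the second.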

Why Not Use an Existing Stack?

You can. Tophan doesn’t prevent you from running your preferred observability tools. But the Monitoring Scale offers something external tools can’t: deep integration with every Scale.

The Monitoring Scale doesn’t scrape endpoints and hope for the best. It receives structured telemetry directly from the Hypervisor, Storage, Networking, and Security Scales. It knows the difference between “VM CPU is high” and “VM CPU is high because the underlying storage is degraded” — because it has context from the Storage Scale, not just a metric from the hypervisor.
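
As a rough illustration of what cross-Scale context could look like, the sketch below annotates a high-CPU reading with the health of the volume backing the VM. The `VmSample` type, the `explain_high_cpu` function, the 90% threshold, and the idea of receiving a set of degraded volume names from the Storage Scale are all assumptions made for this example. The actual telemetry format has not been published.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class VmSample:
    """Hypothetical hypervisor reading for one VM."""
    vm: str
    cpu_pct: float
    volume: str  # name of the storage volume backing this VM

def explain_high_cpu(
    sample: VmSample,
    degraded_volumes: set[str],
    threshold: float = 90.0,  # assumed alert threshold, in percent
) -> Optional[str]:
    """Return an alert message, enriched with storage context when available."""
    if sample.cpu_pct < threshold:
        return None  # nothing to report
    if sample.volume in degraded_volumes:
        return (f"{sample.vm}: CPU is high; backing volume "
                f"{sample.volume} is degraded")
    return f"{sample.vm}: CPU is high"

# Made-up data: one volume reported as degraded by the storage layer.
degraded = {"vol-7"}
print(explain_high_cpu(VmSample("vm-a", 97.0, "vol-7"), degraded))
print(explain_high_cpu(VmSample("vm-b", 95.0, "vol-2"), degraded))
```

A tool that only scrapes the hypervisor can produce the second message. The first requires the storage state to arrive alongside the metric.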

Green Metrics

Energy cost per workload is becoming a compliance requirement, not a nice-to-have. The Monitoring Scale tracks power consumption at the node level and attributes it to individual workloads based on resource utilisation.

This lets you answer: “How much does it cost to run this application?” — not in infrastructure terms, but in actual energy consumption and carbon impact.
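
One simple way to attribute node-level power to workloads is a proportional split: each workload receives a share of the node's measured energy equal to its share of resource usage. The sketch below assumes a single usage figure per workload (for example, CPU core-hours) and divides idle power along the same proportions. The function name `attribute_energy` and the sample numbers are hypothetical, and the attribution model Tophan will actually use may differ.

```python
def attribute_energy(node_kwh: float, usage: dict[str, float]) -> dict[str, float]:
    """Split a node's measured energy across workloads by usage share.

    node_kwh: energy the node consumed over the window, in kWh.
    usage:    per-workload resource usage over the same window
              (assumed here to be one number, e.g. CPU core-hours).
    """
    total = sum(usage.values())
    if total == 0:
        # No recorded usage: nothing to attribute to any workload.
        return {name: 0.0 for name in usage}
    return {name: node_kwh * used / total for name, used in usage.items()}

# Made-up example: a node drew 12 kWh while two workloads ran on it.
shares = attribute_energy(12.0, {"web": 3.0, "db": 1.0})
print(shares)  # web gets three quarters of the energy, db one quarter
```

Multiplying each workload's kWh by your electricity tariff gives a cost figure. Multiplying by your grid's carbon intensity gives an emissions estimate.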

Architecture

┌──────────────────────────────┐
│       Dragon's Eye           │  Dashboards, alerts
├──────────────────────────────┤
│    Monitoring Scale API      │  Query, config
├──────────────────────────────┤
│   Time-Series Storage        │  Metrics + logs
├──────────┬──────┬────────────┤
│  Node    │  VM  │  Scale     │  Telemetry sources
│  Agents  │ Agt  │  Agents    │
└──────────┴──────┴────────────┘

Lightweight agents on every node collect and forward telemetry. The central store handles retention, downsampling, and query serving. Dragon’s Eye renders it. The AIOps Scale reasons about it.
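
Downsampling in the central store can be pictured as grouping raw samples into fixed time buckets and keeping one aggregate per bucket. The sketch below averages values per bucket. The function name `downsample`, the choice of mean as the aggregate, and the 60-second bucket are assumptions for illustration. The store's real aggregation policy has not been specified.

```python
from collections import defaultdict
from statistics import mean

def downsample(points: list[tuple[int, float]], bucket_seconds: int) -> list[tuple[int, float]]:
    """Average (unix_timestamp, value) samples into fixed-width buckets.

    Returns one (bucket_start, mean_value) pair per non-empty bucket,
    ordered by time.
    """
    buckets: dict[int, list[float]] = defaultdict(list)
    for ts, value in points:
        bucket_start = ts - ts % bucket_seconds  # align to bucket boundary
        buckets[bucket_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]

# Made-up example: 30-second samples reduced to 60-second resolution.
raw = [(0, 1.0), (30, 3.0), (60, 5.0), (90, 7.0)]
print(downsample(raw, 60))
```

Recent data would stay at full resolution, and a retention policy would apply a step like this only to samples older than a configured age.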