Full observability. Zero bolt-ons.
The Monitoring Scale provides complete observability for your Tophan cluster. Host metrics, VM metrics, application logs, alerting, and dashboards — all integrated, all in one place.
| Feature | Description | Status |
|---|---|---|
| Host Metrics | CPU, memory, disk, network, and temperature for every physical node. | Planned |
| VM Metrics | Per-VM resource consumption, I/O patterns, and performance counters. | Planned |
| Alerting | Configurable alert rules with multiple notification channels: email, webhook, chat. Escalation policies and silencing. | Planned |
| Dashboards | Built-in dashboards in Dragon’s Eye. Custom dashboards with drag-and-drop layout. | Planned |
| Log Aggregation | Centralised log collection from all nodes, VMs, and Scales. Structured logging, full-text search. | Planned |
| SNMP | SNMP polling and trap reception for physical network equipment. Integrate everything into one view. | Planned |
| Green Metrics | Power consumption tracking and efficiency scoring. Know your actual energy cost per workload. | Planned |
| SLA Tracking | Define SLA targets, track compliance, and generate reports. Uptime, response time, availability. | Planned |
| Metric Retention | Configurable retention with automatic downsampling. Full resolution for recent data, aggregated for historical. | Planned |
| API Access | Full metrics API for custom integrations and external tools. | Planned |
You can run your preferred observability tools; Tophan doesn’t prevent it. But the Monitoring Scale offers something external tools can’t: deep integration with every Scale.
The Monitoring Scale doesn’t scrape endpoints and hope for the best. It receives structured telemetry directly from the Hypervisor, Storage, Networking, and Security Scales. It knows the difference between “VM CPU is high” and “VM CPU is high because the underlying storage is degraded” — because it has context from the Storage Scale, not just a metric from the hypervisor.
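As a rough illustration of that difference, the sketch below annotates a high-CPU alert with a probable cause when the storage pool behind the VM reports a degraded state. The record shapes, the field names (`vm_id`, `pool_id`, `state`), and the 90% threshold are all hypothetical; the Monitoring Scale's actual telemetry format is not described here.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical telemetry records; the real schema is not documented here.
@dataclass
class VmSample:
    vm_id: str
    cpu_percent: float
    pool_id: str  # storage pool backing this VM's disks (assumed field)

@dataclass
class PoolStatus:
    pool_id: str
    state: str  # assumed values: "healthy" or "degraded"

CPU_ALERT_THRESHOLD = 90.0  # illustrative threshold, not a product default

def explain_cpu_alert(sample: VmSample, pools: dict[str, PoolStatus]) -> Optional[str]:
    """Return an alert message, or None if CPU is below the threshold."""
    if sample.cpu_percent < CPU_ALERT_THRESHOLD:
        return None
    message = f"VM {sample.vm_id}: CPU is high ({sample.cpu_percent:.0f}%)"
    # Cross-Scale context: look up the health of the pool this VM depends on.
    pool = pools.get(sample.pool_id)
    if pool is not None and pool.state == "degraded":
        message += f"; probable cause: storage pool {pool.pool_id} is degraded"
    return message

pools = {"pool-a": PoolStatus("pool-a", "degraded")}
print(explain_cpu_alert(VmSample("vm-42", 97.0, "pool-a"), pools))
```

A tool that only scrapes the hypervisor would stop at the first half of that message. The second half depends on knowing which pool the VM sits on and what state that pool is in.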
Energy cost per workload is becoming a compliance requirement, not a nice-to-have. The Monitoring Scale tracks power consumption at the node level and attributes it to individual workloads based on resource utilisation.
This lets you answer: “How much does it cost to run this application?” — not in infrastructure terms, but in actual energy consumption and carbon impact.
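One simple way to picture the attribution is a proportional split: take the energy a node drew over a period and divide it among workloads by their share of resource utilisation. The sketch below assumes exactly that. The function names, the utilisation weights, and the grid carbon intensity figure are illustrative assumptions, not the Monitoring Scale's actual model.

```python
def attribute_energy(node_kwh: float, utilisation: dict[str, float]) -> dict[str, float]:
    """Split a node's measured energy across workloads by utilisation share."""
    total = sum(utilisation.values())
    if total == 0:
        # No workload activity recorded: nothing to attribute.
        return {name: 0.0 for name in utilisation}
    return {name: node_kwh * weight / total for name, weight in utilisation.items()}

def carbon_kg(kwh: float, grid_kg_per_kwh: float) -> float:
    """Convert energy to carbon using an assumed grid intensity (kg CO2e per kWh)."""
    return kwh * grid_kg_per_kwh

# Hypothetical node that drew 12 kWh, with three workloads and made-up weights.
energy = attribute_energy(12.0, {"web": 50, "db": 30, "batch": 20})
for name, kwh in energy.items():
    # 0.25 kg/kWh is a placeholder intensity; substitute your grid's figure.
    print(f"{name}: {kwh:.2f} kWh, {carbon_kg(kwh, 0.25):.2f} kg CO2e")
```

This version also spreads idle power across workloads in proportion to their activity. A real model might treat the idle baseline separately.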
┌──────────────────────────────┐
│         Dragon's Eye         │  Dashboards, alerts
├──────────────────────────────┤
│     Monitoring Scale API     │  Query, config
├──────────────────────────────┤
│     Time-Series Storage      │  Metrics + logs
├──────────┬──────┬────────────┤
│   Node   │  VM  │   Scale    │  Telemetry sources
│  Agents  │ Agt  │   Agents   │
└──────────┴──────┴────────────┘
Lightweight agents on every node collect and forward telemetry. The central store handles retention, downsampling, and query serving. Dragon’s Eye renders it. The AIOps Scale reasons about it.
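To make retention and downsampling concrete, here is a minimal sketch that keeps recent points at full resolution and averages older points into fixed-width time buckets. The function names, the choice of averaging as the aggregate, and the window sizes are assumptions for illustration; the central store's actual retention policy and aggregation functions are not specified here.

```python
from collections import defaultdict
from statistics import mean

Point = tuple[int, float]  # (unix timestamp in seconds, value)

def downsample(points: list[Point], bucket_seconds: int) -> list[Point]:
    """Average points into fixed-width buckets keyed by bucket start time."""
    buckets: dict[int, list[float]] = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % bucket_seconds].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]

def apply_retention(points: list[Point], now: int,
                    raw_window_seconds: int, bucket_seconds: int) -> list[Point]:
    """Keep points inside the raw window as-is; downsample everything older."""
    cutoff = now - raw_window_seconds
    old = [p for p in points if p[0] < cutoff]
    recent = [p for p in points if p[0] >= cutoff]
    return downsample(old, bucket_seconds) + recent

# Six samples at 15-second intervals; keep the last 30 s raw, bucket the rest per minute.
raw = [(0, 10.0), (15, 20.0), (30, 30.0), (45, 40.0), (60, 50.0), (75, 70.0)]
print(apply_retention(raw, now=90, raw_window_seconds=30, bucket_seconds=60))
# prints [(0, 25.0), (60, 50.0), (75, 70.0)]
```

Averaging suits gauges such as CPU utilisation. Counters and latency percentiles generally need different aggregates (sum, max, or pre-computed quantiles).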