Planned

Monitoring Scale

Full observability. Zero bolt-ons.

The Monitoring Scale provides complete observability for your Tophan cluster. Host metrics, VM metrics, application logs, alerting, and dashboards — all integrated, all in one place.

Features

| Feature | Description | Status |
|---|---|---|
| Host Metrics | CPU, memory, disk, network, and temperature for every physical node. | Planned |
| VM Metrics | Per-VM resource consumption, I/O patterns, and performance counters. | Planned |
| Alerting | Configurable alert rules with multiple notification channels: email, webhook, chat. Escalation policies and silencing. | Planned |
| Dashboards | Built-in dashboards in Dragon’s Eye. Custom dashboards with drag-and-drop layout. | Planned |
| Log Aggregation | Centralised log collection from all nodes, VMs, and Scales. Structured logging, full-text search. | Planned |
| SNMP | SNMP polling and trap reception for physical network equipment. Integrate everything into one view. | Planned |
| Green Metrics | Power consumption tracking and efficiency scoring. Know your actual energy cost per workload. | Planned |
| SLA Tracking | Define SLA targets, track compliance, and generate reports. Uptime, response time, availability. | Planned |
| Metric Retention | Configurable retention with automatic downsampling. Full resolution for recent data, aggregated for historical. | Planned |
| API Access | Full metrics API for custom integrations and external tools. | Planned |
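
To make the SLA Tracking row concrete, here is a minimal sketch of how an availability target could be checked against measured downtime. The Monitoring Scale is still planned, so none of this reflects its actual interface: `SlaTarget`, `availability`, `allowed_downtime`, and `is_compliant` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaTarget:
    """Hypothetical SLA definition; not a Tophan data type."""
    name: str
    availability_target: float  # fraction of time, e.g. 0.999 for 99.9%


def availability(total_seconds: float, downtime_seconds: float) -> float:
    """Fraction of the reporting window during which the service was up."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    if not 0 <= downtime_seconds <= total_seconds:
        raise ValueError("downtime_seconds must lie within the window")
    return (total_seconds - downtime_seconds) / total_seconds


def allowed_downtime(target: SlaTarget, total_seconds: float) -> float:
    """Downtime budget, in seconds, that the target permits for the window."""
    return total_seconds * (1.0 - target.availability_target)


def is_compliant(target: SlaTarget, total_seconds: float,
                 downtime_seconds: float) -> bool:
    """True when measured availability meets or exceeds the target."""
    return availability(total_seconds, downtime_seconds) >= target.availability_target


# Example: a 99.9% target over a 30-day window.
THIRTY_DAYS = 30 * 24 * 3600
three_nines = SlaTarget(name="web-frontend", availability_target=0.999)
```

With these definitions, a 99.9% target over 30 days leaves a budget of roughly 43 minutes of downtime, so 30 minutes of measured downtime is compliant and 60 minutes is not.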

Why Not Use an Existing Stack?

You can. Tophan doesn’t prevent you from running your preferred observability tools. But the Monitoring Scale offers something external tools can’t: deep integration with every Scale.

The Monitoring Scale doesn’t scrape endpoints and hope for the best. It receives structured telemetry directly from the Hypervisor, Storage, Networking, and Security Scales. It knows the difference between “VM CPU is high” and “VM CPU is high because the underlying storage is degraded” — because it has context from the Storage Scale, not just a metric from the hypervisor.
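
The idea of attaching storage context to a CPU alert can be sketched in a few lines. This is an illustration under assumptions, not the Monitoring Scale's implementation: the record types `VmCpuSample` and `StorageHealth`, the `explain_high_cpu` function, and the 90% threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class VmCpuSample:
    """Hypothetical telemetry record from the Hypervisor Scale."""
    vm_id: str
    cpu_percent: float
    volume_id: str  # volume backing this VM


@dataclass(frozen=True)
class StorageHealth:
    """Hypothetical telemetry record from the Storage Scale."""
    volume_id: str
    degraded: bool


def explain_high_cpu(sample: VmCpuSample,
                     storage_by_volume: Dict[str, StorageHealth],
                     threshold: float = 90.0) -> Optional[str]:
    """Return an alert message, enriched with storage context when available.

    Returns None when CPU usage is below the (assumed) threshold.
    """
    if sample.cpu_percent < threshold:
        return None
    health = storage_by_volume.get(sample.volume_id)
    if health is not None and health.degraded:
        return (f"VM {sample.vm_id} CPU is high; "
                f"backing volume {sample.volume_id} is degraded")
    return f"VM {sample.vm_id} CPU is high"


storage = {"vol-1": StorageHealth("vol-1", degraded=True),
           "vol-2": StorageHealth("vol-2", degraded=False)}
```

The point of the sketch is the join: a scraper that only sees the hypervisor metric can produce the second message at best, while a system holding both records can produce the first.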

Green Metrics

Energy cost per workload is becoming a compliance requirement, not a nice-to-have. The Monitoring Scale tracks power consumption at the node level and attributes it to individual workloads based on resource utilisation.

This lets you answer: “How much does it cost to run this application?” — not in infrastructure terms, but in actual energy consumption and carbon impact.
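
One simple way to attribute node-level energy to workloads is a proportional split by utilisation share. The sketch below assumes exactly that; the actual attribution model is not specified here, and the function names, the utilisation unit (for example CPU-seconds), and the grid carbon intensity figure are illustrative assumptions.

```python
from typing import Dict


def attribute_energy(node_energy_kwh: float,
                     utilisation: Dict[str, float]) -> Dict[str, float]:
    """Split a node's measured energy across workloads.

    Each workload receives a share proportional to its utilisation
    (any consistent unit, e.g. CPU-seconds). Assumption: idle power is
    spread across workloads in the same proportion.
    """
    total = sum(utilisation.values())
    if total <= 0:
        # Nothing ran, so nothing can be attributed.
        return {name: 0.0 for name in utilisation}
    return {name: node_energy_kwh * used / total
            for name, used in utilisation.items()}


def carbon_grams(energy_kwh: float, grid_intensity_g_per_kwh: float) -> float:
    """Convert energy to CO2-equivalent grams using an assumed grid intensity."""
    return energy_kwh * grid_intensity_g_per_kwh


# Example: a node drew 12 kWh; "web" used three times the resources of "db".
shares = attribute_energy(12.0, {"web": 3.0, "db": 1.0})
web_carbon = carbon_grams(shares["web"], 400.0)  # 400 g/kWh is a made-up figure
```

In this example "web" is attributed 9 kWh and "db" 3 kWh. A real model would likely weigh several resources (CPU, memory, I/O) and treat idle power separately, which is why the split rule is flagged as an assumption.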

Architecture

┌──────────────────────────────┐
│       Dragon's Eye           │  Dashboards, alerts
├──────────────────────────────┤
│    Monitoring Scale API      │  Query, config
├──────────────────────────────┤
│   Time-Series Storage        │  Metrics + logs
├──────────┬──────┬────────────┤
│  Node    │  VM  │  Scale     │  Telemetry sources
│  Agents  │ Agt  │  Agents    │
└──────────┴──────┴────────────┘

Lightweight agents on every node collect and forward telemetry. The central store handles retention, downsampling, and query serving. Dragon’s Eye renders it. The AIOps Scale reasons about it.
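
The downsampling step handled by the central store can be illustrated with fixed-width time buckets and a mean aggregate. This is a sketch of the general technique only: the `downsample` function, the `(timestamp, value)` point format, and the choice of mean as the aggregate are assumptions, not the store's actual behaviour.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple

Point = Tuple[int, float]  # (unix timestamp in seconds, metric value)


def downsample(points: List[Point], bucket_seconds: int) -> List[Point]:
    """Aggregate raw points into fixed-width buckets using the mean.

    Each output point is stamped with the start of its bucket.
    """
    if bucket_seconds <= 0:
        raise ValueError("bucket_seconds must be positive")
    buckets: DefaultDict[int, List[float]] = defaultdict(list)
    for ts, value in points:
        bucket_start = ts - (ts % bucket_seconds)
        buckets[bucket_start].append(value)
    return [(start, sum(values) / len(values))
            for start, values in sorted(buckets.items())]


# Example: 10-second samples rolled up into 60-second buckets.
raw = [(0, 1.0), (10, 3.0), (60, 5.0), (70, 7.0), (130, 9.0)]
rolled_up = downsample(raw, 60)
```

Here `rolled_up` holds three points, one per minute. A retention policy could keep `raw` for recent data and only `rolled_up` for older ranges, which matches the "full resolution for recent data, aggregated for historical" behaviour described in the feature table.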