# Prometheus + Grafana + node_exporter

Install Prometheus, node_exporter, and Grafana on a HolyCloud VPS to monitor CPU, RAM, disk, and network with dashboards.

In this open source stack, node_exporter exposes host metrics, Prometheus scrapes and stores them, and Grafana visualizes them. It fits well on a HolyCloud Performance VPS dedicated to monitoring, or on the monitored VPS itself for small fleets.

## Prerequisites

- Linux VPS with 2 GB+ RAM (4 GB if it runs Grafana and Prometheus with multiple targets)
- Ports: 9090 (Prometheus), 3000 (Grafana), 9100 (node_exporter); restrict them with a firewall
- Domain name or VPN access, recommended for the Grafana UI
- Backups of Docker volumes or data directories

## Architecture

```
[VPS target] node_exporter:9100
      ↓ scrape
[Prometheus] :9090 → [Grafana] :3000 (datasource Prometheus)
```

## Install node_exporter (on each VPS)

```shell
NODE_EXPORTER_VERSION=1.8.2
cd /tmp
# Download and unpack the release archive, then install the binary
curl -LO "https://github.com/prometheus/node_exporter/releases/download/v${NODE_EXPORTER_VERSION}/node_exporter-${NODE_EXPORTER_VERSION}.linux-amd64.tar.gz"
tar xzf "node_exporter-${NODE_EXPORTER_VERSION}.linux-amd64.tar.gz"
sudo cp "node_exporter-${NODE_EXPORTER_VERSION}.linux-amd64/node_exporter" /usr/local/bin/
```

systemd service:

```shell
sudo tee /etc/systemd/system/node_exporter.service << 'EOF'
[Unit]
Description=Prometheus Node Exporter
After=network.target

[Service]
User=nobody
ExecStart=/usr/local/bin/node_exporter --web.listen-address=127.0.0.1:9100
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now node_exporter
# Quick check: the first metrics lines should be printed
curl -s http://127.0.0.1:9100/metrics | head
```

The service listens only on 127.0.0.1. To let a remote Prometheus reach it, expose it through an SSH tunnel or an authenticated reverse proxy.
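The `/metrics` check above prints raw text in the Prometheus exposition format. As a minimal sketch, the helper below pulls the value of a single metric such as node_load1 out of that text. `metric_value` is a hypothetical name, not part of node_exporter, and the sketch assumes the metric carries no labels and that awk is installed.

```shell
# metric_value NAME
# Reads Prometheus text-format metrics on stdin and prints the value of the
# first sample whose name is exactly NAME. Comment lines (# HELP, # TYPE)
# never match because their first field is "#".
# Hypothetical helper for illustration; handles unlabeled metrics only.
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Example (needs a running node_exporter, so it is left commented out):
# curl -s http://127.0.0.1:9100/metrics | metric_value node_load1
```

Labeled metrics such as node_cpu_seconds_total{cpu="0",mode="idle"} are deliberately out of scope here; querying Prometheus is the better tool for those.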
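## Prometheus (Docker recommended)

```shell
sudo mkdir -p /opt/prometheus/{data,config}
# The prom/prometheus image runs as user nobody (UID 65534),
# which must be able to write to the data directory
sudo chown -R 65534:65534 /opt/prometheus/data

sudo tee /opt/prometheus/config/prometheus.yml << 'EOF'
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: node
    static_configs:
      - targets:
          - 127.0.0.1:9100
        labels:
          instance: vps-performance-01
EOF

docker run -d --name prometheus --restart unless-stopped \
  -p 127.0.0.1:9090:9090 \
  -v /opt/prometheus/config/prometheus.yml:/etc/prometheus/prometheus.yml:ro \
  -v /opt/prometheus/data:/prometheus \
  prom/prometheus:latest \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/prometheus \
  --storage.tsdb.retention.time=15d
```

Inside a container, 127.0.0.1 is the container's own loopback interface, not the host's. With the bridge networking used above, replace the 127.0.0.1:9100 target with an address that node_exporter listens on and that the container can reach (for example the host's private or VPN IP, with --web.listen-address adjusted to match); otherwise the target shows as DOWN.

Add other VPS instances:

```yaml
- targets: ['10.0.0.5:9100', '10.0.0.6:9100']
```

## Grafana

```shell
# The grafana/grafana image runs as UID 472, which must own the data directory
sudo mkdir -p /opt/grafana
sudo chown -R 472:472 /opt/grafana

docker run -d --name grafana --restart unless-stopped \
  -p 127.0.0.1:3000:3000 \
  -v /opt/grafana:/var/lib/grafana \
  grafana/grafana:latest
```

1. First login: http://127.0.0.1:3000 → admin / admin (change the password).
2. Connections → Data sources → Prometheus. URL: http://prometheus:9090 if the containers share a Docker network; otherwise http://host.docker.internal:9090 or the host IP, which only work if Prometheus is published on an address the Grafana container can reach (not just 127.0.0.1).
3. Dashboards → Import → ID 1860 (Node Exporter Full).

Connect containers:

```shell
docker network create monitoring
docker network connect monitoring prometheus
docker network connect monitoring grafana
```

Internal datasource URL: http://prometheus:9090.

## Nginx reverse proxy (HTTPS)

```nginx
server {
    listen 443 ssl;
    server_name grafana.yourdomain.tld;

    # Placeholder certificate paths; point these at your own certificate and key
    ssl_certificate     /etc/ssl/certs/grafana.yourdomain.tld.crt;
    ssl_certificate_key /etc/ssl/private/grafana.yourdomain.tld.key;

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

Enable Grafana authentication (OAuth, LDAP) in production.

## Alerts (introduction)

rules.yml file:

```yaml
groups:
  - name: host
    rules:
      - alert: HostHighCpu
        expr: 100 - (avg by(instance)(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 90
        for: 10m
        labels:
          severity: warning
```

Mount the file into the Prometheus container and list it under rule_files in prometheus.yml. To receive email or Slack notifications, connect Alertmanager (beyond the scope of this guide).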
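To make the HostHighCpu expression concrete: rate(node_cpu_seconds_total{mode="idle"}[5m]) yields, for each core, the fraction of every second spent idle. The rule averages those fractions per instance, converts the result to a percentage, and subtracts it from 100. The sketch below redoes that arithmetic in shell for made-up per-core idle fractions; `cpu_busy_percent` is a hypothetical helper for illustration, not a Prometheus tool.

```shell
# cpu_busy_percent IDLE_FRACTION...
# Mirrors 100 - (avg(idle rate) * 100) from the HostHighCpu rule.
# Each argument is one core's idle fraction (0 to 1); values are illustrative.
cpu_busy_percent() {
  printf '%s\n' "$@" |
    awk '{ sum += $1 } END { printf "%.1f\n", 100 - (sum / NR) * 100 }'
}

# Four cores idle 5%, 10%, 3% and 6% of the time: average idle is 6%.
cpu_busy_percent 0.05 0.10 0.03 0.06   # prints 94.0
```

A result of 94.0 is above the threshold of 90, so the alert would fire once the condition has held for the 10m set in for.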
## Security

- Do not expose ports 9100 and 9090 to the Internet without TLS and authentication
- Prefer a separate monitoring VPS that scrapes over a WireGuard VPN
- Limit disk usage through retention settings (--storage.tsdb.retention.time, --storage.tsdb.retention.size)

## Verification

```shell
docker ps
curl -s http://127.0.0.1:9090/api/v1/targets | jq '.data.activeTargets[].health'
```

Each target should report "up". In Grafana, check the CPU graphs, node_memory_MemAvailable_bytes, and node_filesystem_avail_bytes.

## Troubleshooting

| Problem | What to check |
|---------|---------------|
| Target DOWN | Firewall, wrong IP, node_exporter stopped |
| Grafana no data | Datasource URL, Docker network |
| Disk full | Reduce Prometheus retention |

Need help? HolyCloud can confirm bandwidth quotas if you centralize scraping from many VPS instances to a single Prometheus node.
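When choosing a retention value, the Prometheus storage documentation gives a rough sizing formula: needed disk space = retention time in seconds × ingested samples per second × bytes per sample, at roughly 1 to 2 bytes per sample. The sketch below applies that formula with integer shell arithmetic. `estimate_tsdb_bytes` is a hypothetical helper, and the 1000 series per host is an assumed round number, not a measurement.

```shell
# estimate_tsdb_bytes SERIES SCRAPE_INTERVAL_S RETENTION_DAYS [BYTES_PER_SAMPLE]
# Prints an approximate TSDB size in bytes. BYTES_PER_SAMPLE defaults to 2,
# the upper end of the 1-2 byte range quoted in the Prometheus documentation.
estimate_tsdb_bytes() {
  series=$1
  interval_s=$2
  retention_days=$3
  bytes_per_sample=${4:-2}
  retention_s=$((retention_days * 86400))
  # samples per second = series / interval; multiply first to keep integer precision
  echo $((series * retention_s * bytes_per_sample / interval_s))
}

# Assumed 1000 series, with the 15s scrape interval and 15d retention from this guide:
estimate_tsdb_bytes 1000 15 15   # prints 172800000 (about 173 MB)
```

The real series count can be read from Prometheus itself, so treat this figure as an order-of-magnitude check before lowering retention.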