
Documentation
❤️System Health Monitoring
Real-time monitoring, diagnostics, and health checks to ensure your AI system runs at peak performance
← Back to DocumentationAlways-On System Health
Monitor every aspect of your Hive AI system in real-time. Track API health, model availability, performance metrics, and get instant alerts when issues arise. Keep your AI infrastructure running smoothly.
System Status
🔌 API Health
Provider connections, response times, error rates
🤖 Model Status
Availability, performance, rate limits
💾 System Resources
Database, storage, memory usage
💻 Terminal/CLI Commands
Quick Health Check
Detailed Diagnostics
Monitor & Watch
Troubleshooting
🔧 IDE Integration (Claude Code, Cursor, Windsurf)
💡 MCP Tool Name: hive_health
- Monitor system health directly from your IDE
Basic Health Check
Quick system status check
Detailed Report
Comprehensive diagnostics
Component Check
Check specific components
Performance Stats
Performance metrics
📊 Health Metrics Explained
🔌 API Health Metrics
- Connection Status: Active/Failed/Retry
- Response Time: Average latency in ms
- Error Rate: Failed requests percentage
- Rate Limits: Usage vs available quota
🤖 Model Health Metrics
- Availability: Online/Offline status
- Performance: Token generation speed
- Queue Time: Wait time for requests
- Success Rate: Completed vs failed
💾 System Resource Metrics
- Database: Connection pool status
- Storage: Used vs available space
- Memory: RAM usage patterns
- Cache: Hit rate and efficiency
🔐 Circuit Breakers & Protection
Hive AI includes automatic circuit breakers that protect your system from cascading failures:
Provider Circuit Breakers
- Auto-disable failing providers
- Retry with exponential backoff
- Automatic recovery testing
- Fallback to alternate providers
Cost Protection
- Budget limit enforcement
- Spike detection alerts
- Automatic pause on anomalies
- Usage trend monitoring
🚨 Alerts & Notifications
Configure Health Alerts
🔴 Critical Alerts
- API connection lost
- License expired
- Database failure
🟡 Warning Alerts
- High error rate
- Slow response times
- Budget threshold
🔵 Info Alerts
- New models available
- System updates
- Usage reports
📈 Health Dashboard
Real-Time Monitoring View
The dashboard updates in real-time, showing system vitals, performance graphs, and active alerts. Use arrow keys to navigate between sections, press 'q' to quit.
✅ Health Monitoring Best Practices
Regular Checks
- Run daily health checks
- Monitor after deployments
- Check before critical tasks
- Review weekly reports
- Test recovery procedures
Proactive Monitoring
- Set up automated alerts
- Monitor usage trends
- Track performance baselines
- Document incident responses
- Plan capacity ahead
🛡️ Enterprise-Grade Reliability
With comprehensive health monitoring, you'll know your AI system status at all times. Get alerts before issues impact your work, and maintain peak performance with proactive diagnostics.