Monitoring and Alerting
This document describes the monitoring and alerting setup for APIFromAnything.
Overview
APIFromAnything includes a comprehensive monitoring and alerting system that helps you track the performance and health of your API. The system is built on the following components:
Prometheus: For metrics collection and storage
Grafana: For visualization and dashboards
AlertManager: For alert management and notifications
Metrics
APIFromAnything exposes various metrics that can be collected by Prometheus:
Request Metrics
apifrom_request_count: Count of requests received, labeled by method, endpoint, and status code
apifrom_request_latency_seconds: Histogram of request latency in seconds, labeled by method and endpoint
apifrom_requests_in_progress: Gauge of requests currently being processed, labeled by method and endpoint
apifrom_error_count: Count of errors raised, labeled by method, endpoint, and exception type
Database Metrics
apifrom_db_query_latency_seconds: Histogram of database query latency in seconds, labeled by operation and table
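A histogram stores cumulative bucket counts rather than individual timings, so a percentile such as the p95 used by the latency alerts has to be estimated from those buckets. The following is a minimal sketch of that estimation using linear interpolation inside the matching bucket, which is the approach Prometheus's histogram_quantile function takes. The function name estimate_quantile and the bucket values are illustrative assumptions, not part of APIFromAnything.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted by
    upper bound, ending with the +Inf bucket. Hypothetical helper for
    illustration only.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations, so no meaningful quantile

    rank = q * total
    prev_bound, prev_count = 0.0, 0.0  # the lowest bucket is assumed to start at 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound) or count == prev_count:
                # Falls in the +Inf bucket (or an empty one): report the last
                # finite bound instead of interpolating.
                return prev_bound
            # Linear interpolation within the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Made-up cumulative counts for a query latency histogram (seconds).
example_buckets = [(0.1, 50), (0.25, 80), (0.5, 94), (1.0, 99), (math.inf, 100)]
p95 = estimate_quantile(0.95, example_buckets)  # approximately 0.6
```

With these made-up counts the estimated p95 is about 0.6 s, which is above the 0.5 s threshold of the HighDatabaseLatency alert. The estimate is only as precise as the bucket boundaries allow.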
Cache Metrics
apifrom_cache_hit_count: Count of cache hits, labeled by cache name
apifrom_cache_miss_count: Count of cache misses, labeled by cache name
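The two cache counters are usually combined into a single ratio. As a rough sketch, assuming the miss rate is defined as misses divided by total lookups (the source does not spell out the formula), the calculation looks like this. The function name cache_miss_rate and the sample counts are hypothetical.

```python
def cache_miss_rate(hits, misses):
    """Return the fraction of cache lookups that missed (0.0 when idle)."""
    total = hits + misses
    if total == 0:
        return 0.0  # avoid dividing by zero when the cache has not been used
    return misses / total


# Made-up counter increases over some time window for one cache.
rate = cache_miss_rate(hits=150, misses=850)  # 0.85
```

A value of 0.85 would exceed the 80% threshold used by the HighCacheMissRate alert.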
System Metrics
apifrom_system_info: System information, labeled by version and Python version
Monitoring Setup
Docker Compose
The monitoring stack is included in the docker-compose.yml file and consists of the following services:
prometheus: Collects and stores metrics
grafana: Visualizes metrics and provides dashboards
alertmanager: Manages alerts and notifications
Configuration
Prometheus
Prometheus is configured in monitoring/prometheus/prometheus.yml. The configuration includes:
Scrape configurations for various services
Alert rules
AlertManager configuration
Grafana
Grafana is configured with:
Datasources in monitoring/grafana/provisioning/datasources/prometheus.yml
Dashboards in monitoring/grafana/provisioning/dashboards/
AlertManager
AlertManager is configured in monitoring/alertmanager/alertmanager.yml. The configuration includes:
Notification receivers (email, Slack, PagerDuty)
Routing configuration
Inhibition rules
Dashboards
APIFromAnything comes with a pre-configured Grafana dashboard that provides insights into the performance and health of your API. The dashboard includes the following panels:
Request Rate
Request Latency
Error Rate
Requests In Progress
Database Query Latency
Cache Hit/Miss Rate
Alerts
APIFromAnything includes pre-configured alerts that notify you when certain conditions are met. The alerts include:
HighRequestLatency: Triggered when the 95th percentile of request latency is above 1s for an endpoint
HighErrorRate: Triggered when the error rate is above 5% for an endpoint
CriticalErrorRate: Triggered when the error rate is above 20% for an endpoint
HighRequestRate: Triggered when the request rate is above 100 requests per second for an endpoint
HighDatabaseLatency: Triggered when the 95th percentile of database query latency is above 0.5s
HighCacheMissRate: Triggered when the cache miss rate is above 80%
InstanceDown: Triggered when an instance is down for more than 1 minute
HighMemoryUsage: Triggered when memory usage is above 1GB
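To make the threshold semantics concrete, here is a small sketch that checks a set of observed values against the numeric limits listed above. It is not how Prometheus evaluates rules (that happens through PromQL expressions in the rules file); the Threshold class, the metric keys, and the sample values are all hypothetical. InstanceDown and HighMemoryUsage are left out because they depend on a duration and on a byte unit that this list does not define precisely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    alert: str    # alert name from the list above
    metric: str   # hypothetical key for the observed value
    limit: float  # the alert fires when the value is strictly above this


THRESHOLDS = [
    Threshold("HighRequestLatency", "request_latency_p95_seconds", 1.0),
    Threshold("HighErrorRate", "error_rate", 0.05),
    Threshold("CriticalErrorRate", "error_rate", 0.20),
    Threshold("HighRequestRate", "requests_per_second", 100.0),
    Threshold("HighDatabaseLatency", "db_latency_p95_seconds", 0.5),
    Threshold("HighCacheMissRate", "cache_miss_rate", 0.80),
]


def firing_alerts(observed):
    """Return the names of alerts whose limit is exceeded by `observed`."""
    return [
        t.alert
        for t in THRESHOLDS
        if observed.get(t.metric, 0.0) > t.limit
    ]


# Made-up measurements for a single endpoint.
alerts = firing_alerts({"error_rate": 0.25, "request_latency_p95_seconds": 0.4})
```

With a 25% error rate both HighErrorRate and CriticalErrorRate match, since the critical condition implies the high one. Inhibition rules in AlertManager are a common way to suppress the lower-severity notification in that situation.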
Integration with Application Code
Middleware
APIFromAnything includes a Prometheus middleware that automatically collects request metrics. The middleware is configured in apifrom/monitoring.py.
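For readers who want a sense of what such a middleware does, the following is a minimal sketch written against the generic ASGI interface with the prometheus_client package. It is not the code in apifrom/monitoring.py: the class name PrometheusMiddleware, the label names, and the demo application are assumptions made for illustration, and only the metric names are taken from this document.

```python
import asyncio
import time

from prometheus_client import CollectorRegistry, Counter, Gauge, Histogram

# Private registry so the sketch does not touch the global default registry.
registry = CollectorRegistry()

# Metric names follow this document; the label names are assumptions.
REQUEST_COUNT = Counter(
    "apifrom_request_count", "Count of requests received",
    ["method", "endpoint", "status_code"], registry=registry,
)
REQUEST_LATENCY = Histogram(
    "apifrom_request_latency_seconds", "Request latency in seconds",
    ["method", "endpoint"], registry=registry,
)
REQUESTS_IN_PROGRESS = Gauge(
    "apifrom_requests_in_progress", "Requests currently being processed",
    ["method", "endpoint"], registry=registry,
)
ERROR_COUNT = Counter(
    "apifrom_error_count", "Count of errors raised",
    ["method", "endpoint", "exception_type"], registry=registry,
)


class PrometheusMiddleware:
    """Hypothetical ASGI middleware that records the request metrics."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)  # pass through non-HTTP traffic
            return

        method, endpoint = scope["method"], scope["path"]
        status = {"code": "500"}  # reported if the app fails before responding

        async def send_wrapper(message):
            if message["type"] == "http.response.start":
                status["code"] = str(message["status"])
            await send(message)

        in_progress = REQUESTS_IN_PROGRESS.labels(method=method, endpoint=endpoint)
        in_progress.inc()
        start = time.perf_counter()
        try:
            await self.app(scope, receive, send_wrapper)
        except Exception as exc:
            ERROR_COUNT.labels(
                method=method, endpoint=endpoint, exception_type=type(exc).__name__
            ).inc()
            raise
        finally:
            REQUEST_LATENCY.labels(method=method, endpoint=endpoint).observe(
                time.perf_counter() - start
            )
            REQUEST_COUNT.labels(
                method=method, endpoint=endpoint, status_code=status["code"]
            ).inc()
            in_progress.dec()


async def demo_app(scope, receive, send):
    """Tiny stand-in application that always answers 200."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


async def _receive():
    return {"type": "http.request", "body": b"", "more_body": False}


async def _send(message):
    pass  # a real server would write the message to the client


demo_scope = {"type": "http", "method": "GET", "path": "/users"}
asyncio.run(PrometheusMiddleware(demo_app)(demo_scope, _receive, _send))
```

Note that the Python client exposes counters with a _total suffix, so the request counter in this sketch appears as apifrom_request_count_total in the scrape output. Using the raw path as the endpoint label is a simplification; paths with embedded IDs would need to be normalized to a route template to keep label cardinality bounded.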
Custom Metrics
You can add custom metrics to your application by using the apifrom.monitoring module. For example:
from apifrom.monitoring import DatabaseMetrics, CacheMetrics

# Track database query time
async with DatabaseMetrics.track_query_time("select", "users"):
    result = await db.fetch_all("SELECT * FROM users")

# Record cache hit/miss
if result_from_cache:
    CacheMetrics.record_hit("user_cache")
else:
    CacheMetrics.record_miss("user_cache")
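A timing helper like track_query_time can be built as an async context manager. The sketch below shows the general pattern with the standard library only; it is not the actual DatabaseMetrics implementation, and the recorded list is a stand-in for a Prometheus histogram so the example runs without extra dependencies.

```python
import asyncio
import time
from contextlib import asynccontextmanager

# Stand-in for a histogram: (operation, table, seconds) tuples.
recorded = []


@asynccontextmanager
async def track_query_time(operation, table):
    """Hypothetical timer that measures the body of an `async with` block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs on success and on exceptions, so failed queries are timed too.
        recorded.append((operation, table, time.perf_counter() - start))


async def main():
    async with track_query_time("select", "users"):
        await asyncio.sleep(0.01)  # placeholder for a real database call


asyncio.run(main())
```

Recording in the finally clause means slow queries that end in an error still show up in the latency data.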
Accessing Monitoring Tools
When running with Docker Compose, the monitoring tools are available at the following URLs:
Prometheus: http://localhost:9090
Grafana: http://localhost:3000 (default credentials: admin/admin)
AlertManager: http://localhost:9093
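A quick way to confirm that the stack is up is to query each tool's health endpoint. The sketch below assumes the default ports listed above and uses the /-/healthy endpoints of Prometheus and AlertManager and the /api/health endpoint of Grafana; the helper names is_healthy and check_all are hypothetical.

```python
from urllib.error import URLError
from urllib.request import urlopen

# Health endpoints on the default local ports (an assumption about your setup).
HEALTH_URLS = {
    "prometheus": "http://localhost:9090/-/healthy",
    "grafana": "http://localhost:3000/api/health",
    "alertmanager": "http://localhost:9093/-/healthy",
}


def is_healthy(url, timeout=2.0):
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # Connection refused, timeout, or a non-2xx response.
        return False


def check_all(urls=HEALTH_URLS):
    """Map each service name to its current health status."""
    return {name: is_healthy(url) for name, url in urls.items()}
```

Calling check_all() while the Docker Compose stack is running should report True for each service; a False value points at a container that is stopped or still starting.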
Customizing Monitoring
Adding Custom Metrics
You can add custom metrics by modifying the apifrom/monitoring.py file. For example, to add a new counter:
from prometheus_client import Counter

MY_COUNTER = Counter(
    'apifrom_my_counter',
    'Description of my counter',
    ['label1', 'label2'],
    registry=registry
)

# Increment the counter
MY_COUNTER.labels(label1='value1', label2='value2').inc()
Adding Custom Dashboards
You can add custom Grafana dashboards by adding JSON files to the monitoring/grafana/provisioning/dashboards/ directory.
Customizing Alerts
You can customize alerts by modifying the monitoring/prometheus/rules/alerts.yml file.
Best Practices
Monitor both application-level metrics (request rate, latency) and system-level metrics (CPU, memory)
Set up alerts for critical conditions that require immediate attention
Use dashboards to visualize trends and identify potential issues before they become critical
Regularly review and adjust alert thresholds based on your application's performance characteristics
Implement proper logging alongside metrics for better debugging capabilities