# Monitoring and Alerting

This document provides information about the monitoring and alerting setup for APIFromAnything.

## Overview

APIFromAnything includes a comprehensive monitoring and alerting system that helps you track the performance and health of your API. The system is built on the following components:

- **Prometheus**: For metrics collection and storage
- **Grafana**: For visualization and dashboards
- **AlertManager**: For alert management and notifications

## Metrics

APIFromAnything exposes various metrics that can be collected by Prometheus:

### Request Metrics

- `apifrom_request_count`: Count of requests received, labeled by method, endpoint, and status code
- `apifrom_request_latency_seconds`: Histogram of request latency in seconds, labeled by method and endpoint
- `apifrom_requests_in_progress`: Gauge of requests currently being processed, labeled by method and endpoint
- `apifrom_error_count`: Count of errors occurred, labeled by method, endpoint, and exception type

### Database Metrics

- `apifrom_db_query_latency_seconds`: Histogram of database query latency in seconds, labeled by operation and table

### Cache Metrics

- `apifrom_cache_hit_count`: Count of cache hits, labeled by cache name
- `apifrom_cache_miss_count`: Count of cache misses, labeled by cache name

### System Metrics

- `apifrom_system_info`: System information, labeled by version and Python version

## Monitoring Setup

### Docker Compose

The monitoring stack is included in the `docker-compose.yml` file and consists of the following services:

- `prometheus`: Collects and stores metrics
- `grafana`: Visualizes metrics and provides dashboards
- `alertmanager`: Manages alerts and notifications

### Configuration

#### Prometheus

Prometheus is configured in `monitoring/prometheus/prometheus.yml`. The configuration includes:

- Scrape configurations for various services
- Alert rules
- AlertManager configuration

#### Grafana

Grafana is configured with:

- Datasources in `monitoring/grafana/provisioning/datasources/prometheus.yml`
- Dashboards in `monitoring/grafana/provisioning/dashboards/`

#### AlertManager

AlertManager is configured in `monitoring/alertmanager/alertmanager.yml`. The configuration includes:

- Notification receivers (email, Slack, PagerDuty)
- Routing configuration
- Inhibition rules

## Dashboards

APIFromAnything comes with a pre-configured Grafana dashboard that provides insights into the performance and health of your API. The dashboard includes the following panels:

- Request Rate
- Request Latency
- Error Rate
- Requests In Progress
- Database Query Latency
- Cache Hit/Miss Rate

## Alerts

APIFromAnything includes pre-configured alerts that notify you when certain conditions are met. The alerts include:

- **HighRequestLatency**: Triggered when the 95th percentile of request latency is above 1s for an endpoint
- **HighErrorRate**: Triggered when the error rate is above 5% for an endpoint
- **CriticalErrorRate**: Triggered when the error rate is above 20% for an endpoint
- **HighRequestRate**: Triggered when the request rate is above 100 requests per second for an endpoint
- **HighDatabaseLatency**: Triggered when the 95th percentile of database query latency is above 0.5s
- **HighCacheMissRate**: Triggered when the cache miss rate is above 80%
- **InstanceDown**: Triggered when an instance is down for more than 1 minute
- **HighMemoryUsage**: Triggered when memory usage is above 1GB

## Integration with Application Code

### Middleware

APIFromAnything includes a Prometheus middleware that automatically collects request metrics. The middleware is configured in `apifrom/monitoring.py`.

### Custom Metrics

You can add custom metrics to your application by using the `apifrom.monitoring` module. For example:

```python
from apifrom.monitoring import DatabaseMetrics, CacheMetrics

# Track database query time
async with DatabaseMetrics.track_query_time("select", "users"):
    result = await db.fetch_all("SELECT * FROM users")

# Record cache hit/miss
if result_from_cache:
    CacheMetrics.record_hit("user_cache")
else:
    CacheMetrics.record_miss("user_cache")
```

## Accessing Monitoring Tools

When running with Docker Compose, the monitoring tools are available at the following URLs:

- Prometheus: http://localhost:9090
- Grafana: http://localhost:3000 (default credentials: admin/admin)
- AlertManager: http://localhost:9093

## Customizing Monitoring

### Adding Custom Metrics

You can add custom metrics by modifying the `apifrom/monitoring.py` file. For example, to add a new counter:

```python
from prometheus_client import Counter

MY_COUNTER = Counter(
    'apifrom_my_counter',
    'Description of my counter',
    ['label1', 'label2'],
    registry=registry
)

# Increment the counter
MY_COUNTER.labels(label1='value1', label2='value2').inc()
```

### Adding Custom Dashboards

You can add custom Grafana dashboards by adding JSON files to the `monitoring/grafana/provisioning/dashboards/` directory.

### Customizing Alerts

You can customize alerts by modifying the `monitoring/prometheus/rules/alerts.yml` file.

## Best Practices

- Monitor both application-level metrics (request rate, latency) and system-level metrics (CPU, memory)
- Set up alerts for critical conditions that require immediate attention
- Use dashboards to visualize trends and identify potential issues before they become critical
- Regularly review and adjust alert thresholds based on your application's performance characteristics
- Implement proper logging alongside metrics for better debugging capabilities