
Logs & Monitoring

A complete guide to logging, monitoring, and observability for a Chatty AI deployment.

Log Locations

Docker Container Logs

All container logs are managed by Docker:

# View log location
docker inspect chattyai | grep LogPath

# Typical location
/var/lib/docker/containers/<container-id>/<container-id>-json.log

Application Logs

Logs are written to:

  • stdout/stderr: Captured by Docker
  • Container filesystem: /app/logs/ (if configured)
  • Volumes: Persistent logs in mounted volumes

Viewing Logs

Real-time Monitoring

# All services
docker compose logs -f

# Specific service
docker compose logs -f chattyai
docker compose logs -f nginx
docker compose logs -f db
docker compose logs -f qdrant

Historical Logs

# Last 100 lines
docker compose logs --tail=100 chattyai

# Last hour
docker compose logs --since=1h chattyai

# Specific time range
docker compose logs --since="2024-04-09T10:00:00" --until="2024-04-09T11:00:00" chattyai

# All logs
docker compose logs chattyai

Filter Logs

# Search for errors
docker compose logs chattyai | grep -i error

# Search for specific pattern
docker compose logs chattyai | grep "database connection"

# Count occurrences
docker compose logs chattyai | grep -c "ERROR"

Log Levels

Chatty AI Application

Log levels (from most to least verbose):

  • DEBUG: Detailed debugging information
  • INFO: General informational messages
  • WARNING: Warning messages
  • ERROR: Error messages
  • CRITICAL: Critical errors

Configure via environment variable:

LOG_LEVEL=INFO  # Default
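A changed LOG_LEVEL only takes effect once the container is recreated. As a sketch, the value can be set idempotently in a .env file; the ENV_FILE path is an assumption here, so point it at the .env your docker-compose.yml actually reads (requires GNU sed for in-place editing):

```shell
#!/bin/sh
# Sketch: set LOG_LEVEL in a .env file, updating the line if it already
# exists and appending it otherwise. ENV_FILE is an assumed path.
ENV_FILE="${ENV_FILE:-.env}"

set_log_level() {
    level="$1"
    touch "$ENV_FILE"
    if grep -q '^LOG_LEVEL=' "$ENV_FILE"; then
        # Replace the existing assignment in place (GNU sed)
        sed -i "s/^LOG_LEVEL=.*/LOG_LEVEL=${level}/" "$ENV_FILE"
    else
        echo "LOG_LEVEL=${level}" >> "$ENV_FILE"
    fi
}

set_log_level DEBUG
grep '^LOG_LEVEL=' "$ENV_FILE"   # prints LOG_LEVEL=DEBUG
```

After editing, recreate the service so the new value is picked up, e.g. docker compose up -d chattyai.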

Nginx

Nginx logs:

  • Access logs: All HTTP requests
  • Error logs: Nginx errors

# Access logs
docker compose logs nginx | grep "GET\|POST"

# Error logs
docker compose logs nginx | grep error

PostgreSQL

Database logs:

# All database logs
docker compose logs db

# Connection logs
docker compose logs db | grep "connection"

# Query logs (if enabled)
docker compose logs db | grep "LOG:"

Log Management

Log Rotation

Docker automatically rotates logs. Configure in /etc/docker/daemon.json:

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}

Restart Docker after changes:

sudo systemctl restart docker
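A malformed daemon.json can prevent the Docker daemon from starting again, so it is worth validating the syntax before restarting. A sketch using python3's built-in JSON checker, demoed here against a temp copy of the rotation config (on a real host, point it at /etc/docker/daemon.json instead):

```shell
#!/bin/sh
# Sketch: validate daemon.json syntax before restarting Docker.
# python3 -m json.tool exits non-zero on invalid JSON, so it can
# gate the restart.
validate_daemon_json() {
    python3 -m json.tool "$1" > /dev/null
}

# Demo against a temp copy of the rotation config. On a real host:
#   validate_daemon_json /etc/docker/daemon.json && sudo systemctl restart docker
SAMPLE=$(mktemp)
cat > "$SAMPLE" <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
EOF
validate_daemon_json "$SAMPLE" && echo "daemon.json syntax OK"
```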

Export Logs

# Export to file
docker compose logs chattyai > chattyai-$(date +%Y%m%d).log

# Export with timestamps
docker compose logs --timestamps chattyai > chattyai-$(date +%Y%m%d).log

# Compress logs
docker compose logs chattyai | gzip > chattyai-$(date +%Y%m%d).log.gz

Clean Old Logs

# Truncate logs (CAREFUL!)
truncate -s 0 $(docker inspect --format='{{.LogPath}}' chattyai)

# Or recreate the container (a plain restart keeps the same log file)
docker compose up -d --force-recreate chattyai

Monitoring Metrics

Container Resource Usage

# Real-time stats
docker stats

# Formatted output
docker stats --format "table {{.Container}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}\t{{.BlockIO}}"

# Specific containers
docker stats chattyai qdrant db

System Metrics

# CPU usage
top
htop

# Memory usage
free -h
vmstat 1

# Disk usage
df -h
du -sh /var/lib/docker/volumes/*

# Disk I/O
iostat -x 1

# Network
iftop
nethogs

Health Checks

Container Health

# Check health status
docker ps --format "table {{.Names}}\t{{.Status}}"

# Detailed health
docker inspect chattyai | grep -A 10 Health

# Health check logs
docker inspect chattyai | jq '.[0].State.Health'

Application Health Endpoints

# Chatty AI health
curl -k https://YOUR_DOMAIN/health

# Qdrant health
docker exec chattyai curl http://qdrant:6333/health

# Database health
docker exec db pg_isready -U chattyAdmin

Service Availability

# Check all services responding
curl -I https://YOUR_CHATTYAI_DOMAIN
curl -I https://YOUR_N8N_DOMAIN
curl -I https://YOUR_DATABASES_DOMAIN

Performance Monitoring

Response Times

Monitor application response times:

# Test response time
time curl -k https://YOUR_DOMAIN

# Detailed timing
curl -w "@curl-format.txt" -o /dev/null -s https://YOUR_DOMAIN

Create curl-format.txt:

time_namelookup:  %{time_namelookup}\n
time_connect: %{time_connect}\n
time_appconnect: %{time_appconnect}\n
time_pretransfer: %{time_pretransfer}\n
time_redirect: %{time_redirect}\n
time_starttransfer: %{time_starttransfer}\n
----------\n
time_total: %{time_total}\n
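Rather than pasting the file line by line, it can be created in one step with a heredoc; quoting the EOF delimiter stops the shell from expanding anything, so curl receives the %{...} variables verbatim:

```shell
# Write curl-format.txt in one step. The quoted 'EOF' delimiter keeps
# the %{...} write-out variables literal.
cat > curl-format.txt <<'EOF'
time_namelookup:  %{time_namelookup}\n
time_connect: %{time_connect}\n
time_appconnect: %{time_appconnect}\n
time_pretransfer: %{time_pretransfer}\n
time_redirect: %{time_redirect}\n
time_starttransfer: %{time_starttransfer}\n
----------\n
time_total: %{time_total}\n
EOF
```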

Database Performance

# Active connections
docker exec db psql -U chattyAdmin -d chattydb -c "SELECT count(*) FROM pg_stat_activity;"

# Long-running queries
docker exec db psql -U chattyAdmin -d chattydb -c "SELECT pid, now() - query_start as duration, query FROM pg_stat_activity WHERE state = 'active' ORDER BY duration DESC;"

# Database size
docker exec db psql -U chattyAdmin -d chattydb -c "SELECT pg_size_pretty(pg_database_size('chattydb'));"

Qdrant Performance

# Collection stats
docker exec chattyai curl http://qdrant:6333/collections

# Cluster info
docker exec chattyai curl http://qdrant:6333/cluster

Alerting

Basic Monitoring Script

Create /usr/local/bin/chatty-monitor.sh:

#!/bin/bash

# Check if containers are running
if [ "$(docker ps | grep -c chattyai)" -eq 0 ]; then
    echo "ALERT: Chatty AI container is down!"
    # Send alert (email, slack, etc.)
fi

# Check disk space
DISK_USAGE=$(df -h / | awk 'NR==2 {print $5}' | sed 's/%//')
if [ "$DISK_USAGE" -gt 80 ]; then
    echo "ALERT: Disk usage is ${DISK_USAGE}%"
fi

# Check memory
MEM_USAGE=$(free | grep Mem | awk '{print ($3/$2) * 100.0}' | cut -d. -f1)
if [ "$MEM_USAGE" -gt 90 ]; then
    echo "ALERT: Memory usage is ${MEM_USAGE}%"
fi

Make it executable, then run it via cron:

chmod +x /usr/local/bin/chatty-monitor.sh

# Add to crontab (crontab -e)
*/5 * * * * /usr/local/bin/chatty-monitor.sh

Log Analysis

Common Patterns

Find errors:

docker compose logs chattyai | grep -i "error\|exception\|failed"

Find slow queries:

docker compose logs db | grep "duration:"

Find authentication failures:

docker compose logs chattyai | grep "authentication failed"

Find certificate issues:

docker compose logs nginx | grep -i "ssl\|certificate"

Log Statistics

# Count log entries by level
docker compose logs chattyai | grep -oP '(DEBUG|INFO|WARNING|ERROR|CRITICAL)' | sort | uniq -c

# Requests per minute (nginx)
docker compose logs nginx | grep -oP '\d{2}:\d{2}' | uniq -c

# Top error messages
docker compose logs chattyai | grep ERROR | sort | uniq -c | sort -rn | head -10

Monitoring Tools (Optional)

Prometheus + Grafana

For advanced monitoring, consider:

  1. Prometheus: Metrics collection
  2. Grafana: Visualization dashboards
  3. cAdvisor: Container metrics
  4. Node Exporter: System metrics

ELK Stack

For centralized logging:

  1. Elasticsearch: Log storage
  2. Logstash: Log processing
  3. Kibana: Log visualization

Simple Alternatives

  • Portainer: Built-in monitoring (if using Portainer)
  • Docker stats: Basic resource monitoring
  • Netdata: Real-time system monitoring
  • Glances: Terminal-based monitoring

Troubleshooting with Logs

Container Won't Start

# Check why container exited
docker compose logs --tail=50 chattyai

# Check for common issues
docker compose logs chattyai | grep -i "error\|failed\|cannot"

Application Errors

# Find recent errors
docker compose logs --since=10m chattyai | grep ERROR

# Check for stack traces
docker compose logs chattyai | grep -A 20 "Traceback"

Performance Issues

# Check for resource warnings
docker compose logs chattyai | grep -i "memory\|cpu\|timeout"

# Check database connection pool
docker compose logs chattyai | grep "connection pool"

Network Issues

# Check for connection errors
docker compose logs chattyai | grep -i "connection refused\|timeout\|unreachable"

# Check nginx errors
docker compose logs nginx | grep error

Best Practices

Regular Monitoring

  • Check logs daily for errors
  • Monitor resource usage weekly
  • Review performance trends monthly
  • Test alerting quarterly

Log Retention

  • Keep recent logs (7-30 days) on server
  • Archive old logs to backup storage
  • Compress archived logs
  • Document retention policy
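The retention points above can be sketched as a small archival script. The paths, the *.log glob, and the 7-day cutoff are all assumptions to adjust to your own policy (the demo uses GNU touch -d to backdate a file):

```shell
#!/bin/sh
# Sketch: compress exported log files older than a cutoff and move them
# into an archive directory. Paths and cutoff are assumed values.
archive_logs() {
    log_dir="$1"; archive_dir="$2"; retain_days="$3"
    mkdir -p "$archive_dir"
    [ -d "$log_dir" ] || return 0
    # -mtime +N matches files last modified more than N*24h ago
    find "$log_dir" -name '*.log' -type f -mtime +"$retain_days" |
    while read -r f; do
        gzip "$f"
        mv "$f.gz" "$archive_dir/"
    done
}

# Demo with temp dirs and a backdated file (GNU touch -d)
src=$(mktemp -d); dst=$(mktemp -d)
touch -d '10 days ago' "$src/chattyai-old.log"
touch "$src/chattyai-today.log"
archive_logs "$src" "$dst" 7
ls "$dst"   # chattyai-old.log.gz
```

Point log_dir at wherever the "Export Logs" step above writes its files, and run it from the same cron job as the monitoring script.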

Security

  • Sanitize logs (no passwords/tokens)
  • Restrict log access to admins only
  • Monitor for security events
  • Audit log access
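A quick check for the first point is to scan exported logs for strings that look like credentials. This is a sketch; the patterns are illustrative examples, not a complete secret-detection ruleset:

```shell
#!/bin/sh
# Sketch: grep a log file for strings that look like leaked credentials.
# The regex alternatives are illustrative, not exhaustive.
scan_for_secrets() {
    grep -nEi 'password=|passwd=|api[_-]?key|secret=|bearer [a-z0-9._/-]+|authorization:' "$@"
}

# Demo against a hypothetical log excerpt
sample=$(mktemp)
printf 'INFO user login ok\nDEBUG calling api_key=abc123\n' > "$sample"
scan_for_secrets "$sample"   # -> 2:DEBUG calling api_key=abc123
```

It also works on a live stream, e.g. docker compose logs chattyai | scan_for_secrets - (grep reads stdin when given -).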

Monitoring Checklist

Daily:

  • Check container status
  • Review error logs
  • Verify application accessible
  • Check resource usage

Weekly:

  • Review performance metrics
  • Check disk space trends
  • Analyze error patterns
  • Verify backups successful

Monthly:

  • Review capacity planning
  • Update monitoring scripts
  • Test alerting
  • Document incidents

Next Steps