Loki Log Aggregation
Deploy Grafana Loki for centralized log collection and querying
What is Loki?
Loki is a log aggregation system designed by Grafana Labs that works like Prometheus but for logs. Key features:
- Label-based indexing: Index logs by labels instead of full text (reducing storage costs)
- LogQL: Powerful query language similar to Prometheus PromQL
- Scalability: Processes multi-terabyte log volumes efficiently
- Grafana integration: Native datasource support for visualization
- Multiple log shippers: Works with Promtail, Filebeat, and Fluentd
- Cost-effective: Lower resource usage compared to traditional log stacks
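To make the label-based indexing idea concrete, here is a toy model in Python. It is only an illustration under simplifying assumptions, not how Loki is implemented: the class name `ToyLogStore` and its methods are invented for this sketch. Streams are keyed by their label set, and the log text is scanned only after the label match has narrowed down the streams.

```python
from collections import defaultdict


class ToyLogStore:
    """Toy model: only label sets are indexed, never the log text itself."""

    def __init__(self):
        # One entry ("stream") per unique label set.
        self.streams = defaultdict(list)

    def push(self, labels, line):
        # The frozenset of label pairs is the index key for the stream.
        self.streams[frozenset(labels.items())].append(line)

    def select(self, matchers, contains=""):
        # Step 1: pick streams whose labels include every matcher.
        # Step 2: brute-force scan the lines of those streams only.
        wanted = set(matchers.items())
        hits = []
        for key, lines in self.streams.items():
            if wanted <= key:
                hits.extend(line for line in lines if contains in line)
        return hits


store = ToyLogStore()
store.push({"job": "nginx", "status": "500"}, "GET /api upstream error")
store.push({"job": "nginx", "status": "200"}, "GET /index.html")
store.push({"job": "app"}, "error: database timeout")

# Roughly what {job="nginx"} |= "error" does: label match, then line filter.
print(store.select({"job": "nginx"}, contains="error"))
```

The index grows with the number of distinct label sets rather than with log volume, which is why labels should stay low-cardinality.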
Docker Compose Installation
Create a complete Loki stack with Promtail and Grafana:
version: '3.8'
services:
  loki:
    image: grafana/loki:latest
    container_name: loki
    ports:
      - "3100:3100"
    volumes:
      - ./loki-config.yaml:/etc/loki/local-config.yaml
      - loki_data:/loki
    command: -config.file=/etc/loki/local-config.yaml
    networks:
      - logging
  promtail:
    image: grafana/promtail:latest
    container_name: promtail
    volumes:
      - ./promtail-config.yaml:/etc/promtail/config.yml
      - /var/log:/var/log:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/run/docker.sock:/var/run/docker.sock
    command: -config.file=/etc/promtail/config.yml
    depends_on:
      - loki
    networks:
      - logging
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    ports:
      - "3000:3000"
    environment:
      GF_SECURITY_ADMIN_PASSWORD: admin
    volumes:
      - grafana_data:/var/lib/grafana
    depends_on:
      - loki
    networks:
      - logging
volumes:
  loki_data:
  grafana_data:
networks:
  logging:
Deploy the stack:
docker-compose up -d
Loki Configuration
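Create loki-config.yaml. This example uses the boltdb-shipper index with schema v11, which targets Loki 2.x. Loki 3.x removed the shared_store option and expects TSDB with schema v13 by default, so pin the image to a 2.x tag or adapt the schema before running it against grafana/loki:latest: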
auth_enabled: false
ingester:
  chunk_idle_period: 3m
  max_chunk_age: 1h
  lifecycler:
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h
storage_config:
  filesystem:
    directory: /loki/chunks
  boltdb_shipper:
    active_index_directory: /loki/boltdb-shipper-active
    shared_store: filesystem
limits_config:
  ingestion_rate_mb: 128
  ingestion_burst_size_mb: 256
  max_line_size: 256KB
  max_streams_per_user: 10000
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  # 30 days; only enforced when the compactor runs with retention_enabled: true
  retention_period: 720h
server:
  http_listen_port: 3100
  log_level: info
Promtail Configuration
Create promtail-config.yaml:
server:
  http_listen_port: 9080
  grpc_listen_port: 0
positions:
  filename: /tmp/positions.yaml
clients:
  - url: http://loki:3100/loki/api/v1/push
scrape_configs:
  # Scrape system logs
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: system
          __path__: /var/log/*.log
    pipeline_stages:
      # Join continuation lines onto the line that starts with a date
      - multiline:
          firstline: '^\d{4}-\d{2}-\d{2}'
      - regex:
          expression: '(?P<timestamp>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2})\s+(?P<level>\w+)\s+(?P<message>.*)'
      - timestamp:
          source: timestamp
          format: "2006-01-02 15:04:05"
      - labels:
          level:
  # Scrape nginx access logs
  - job_name: nginx
    static_configs:
      - targets:
          - localhost
        labels:
          job: nginx
          __path__: /var/log/nginx/access.log
    pipeline_stages:
      - regex:
          expression: '(?P<ip>\S+) - (?P<user>\S+) \[(?P<timestamp>[^\]]+)\] "(?P<method>\w+) (?P<path>\S+) (?P<protocol>\S+)" (?P<status>\d+) (?P<size>\d+)'
      - timestamp:
          source: timestamp
          format: "02/Jan/2006:15:04:05 -0700"
      # The job label is already set by static_configs above
      - labels:
          status:
          method:
  # Docker container logs, discovered through the Docker socket
  - job_name: docker
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
        refresh_interval: 5s
    relabel_configs:
      # Container names are reported with a leading slash
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: container_name
      - target_label: job
        replacement: docker
    pipeline_stages:
      - json:
          expressions:
            level: level
            message: msg
      - labels:
          level:
Add Loki Datasource to Grafana
- Open Grafana: http://localhost:3000 (default: admin/admin)
- Navigate to Connections → Add new connection
- Search for Loki
- Click Create a Loki data source
- Configure:
  - Name: Loki
  - URL: http://loki:3100
  - Skip TLS Verify: true (for local/testing)
- Click Save & test
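The same data source can also be created without the UI through Grafana's HTTP API (`POST /api/datasources`). The following is a minimal sketch, assuming the admin/admin credentials and URLs from the Compose file above; the helper names `loki_datasource_payload`, `build_request`, and `create_datasource` are invented for this example, and the `tlsSkipVerify` flag only mirrors the UI option and is not needed for a plain http URL.

```python
import base64
import json
import urllib.request


def loki_datasource_payload(name="Loki", url="http://loki:3100"):
    """Body for POST /api/datasources; mirrors the UI fields listed above."""
    return {
        "name": name,
        "type": "loki",
        "access": "proxy",  # Grafana's backend talks to Loki, not the browser
        "url": url,
        "jsonData": {"tlsSkipVerify": True},  # local/testing only
    }


def build_request(grafana_url, user, password, payload):
    """Build (but do not send) the authenticated HTTP request."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{grafana_url}/api/datasources",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def create_datasource(grafana_url="http://localhost:3000",
                      user="admin", password="admin"):
    """Send the request; raises urllib.error.HTTPError on a non-2xx reply."""
    req = build_request(grafana_url, user, password, loki_datasource_payload())
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)

# Usage, once the stack is running:
#   create_datasource()
```

Splitting request construction from sending keeps the payload easy to inspect before anything touches a live Grafana instance.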
LogQL Query Language
Basic Queries
# Select all logs from a job
{job="nginx"}
# Filter by label value
{job="nginx", status="500"}
# Keep lines containing a string
{job="nginx"} |= "error"
# Regex matching (backticks avoid having to escape the backslash in \d)
{job="nginx"} |~ `5\d\d`
# Exclude lines containing a string
{job="nginx"} != "200"
Metric Queries
Convert logs to metrics:
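# Log lines per second
rate({job="nginx"}[1m])
# Per-second rate of 5xx responses, averaged over 5 minutes, by method
sum by (method) (rate({job="nginx", status=~"5.."}[5m]))
# Bytes per second (parses the size field from combined-format access log lines)
sum by (job) (rate({job="nginx"} | pattern `<_> - <_> [<_>] "<_> <_> <_>" <_> <size> <_>` | unwrap size [1m]))
Log Parsing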
Extract fields from logs:
# JSON parsing
{job="app"} | json | level="error"
# Regex parsing with extraction (backticks keep \d intact)
{job="nginx"} | regexp `status=(?P<status>\d+)` | status="500"
# Pattern parsing
{job="app"} | pattern "<_> - <_> [<_>] \"<method> <path> <_>\" <status> <size>"
Advanced Queries
# Top 10 status codes (status is a label set by Promtail, so no parser is needed)
topk(10, sum by (status) (count_over_time({job="nginx"}[5m])))
# Error rate percentage
(
sum(rate({job="app"} |= "error" [5m]))
/
sum(rate({job="app"} [5m]))
) * 100
# Filter on a parsed field, then reformat the line from several fields
{job="app"}
  | json
  | level="ERROR"
  | line_format "{{.timestamp}} [{{.level}}] {{.message}}"
Docker Logging Driver
Send container logs straight to Loki with the Loki Docker logging driver. The driver is a plugin and must be installed on each Docker host first:
# Install the plugin
docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions
Configure in docker-compose.yml:
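services:
  myapp:
    image: myapp:latest
    logging:
      driver: loki
      options:
        # The plugin runs on the Docker host, so use the published port
        # instead of the Compose service name
        loki-url: "http://localhost:3100/loki/api/v1/push"
        loki-batch-size: "400"
        loki-external-labels: "job=myapp,env=production"
        mode: "non-blocking"
        max-buffer-size: "4m"
Or configure in /etc/docker/daemon.json for all containers: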
{
  "log-driver": "loki",
  "log-opts": {
    "loki-url": "http://localhost:3100/loki/api/v1/push",
    "loki-external-labels": "job=docker"
  }
}
Nginx Access Log Pipeline
Advanced Promtail pipeline for parsing Nginx logs:
- job_name: nginx
  static_configs:
    - targets:
        - localhost
      labels:
        job: nginx
        __path__: /var/log/nginx/access.log
  pipeline_stages:
    # Parse nginx combined log format (the expression must stay on one line)
    - regex:
        expression: '^(?P<remote_addr>[\w\.\-]+) - (?P<remote_user>[\w\.\-]+|-) \[(?P<time_local>[^\]]+)\] "(?P<method>\w+) (?P<uri>[^ ]+) (?P<protocol>[^ ]+)" (?P<status>\d+) (?P<bytes_sent>\d+|-) "(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"'
    # Extract timestamp
    - timestamp:
        source: time_local
        format: "02/Jan/2006:15:04:05 -0700"
    # Add bytes_sent to a counter exposed on Promtail's /metrics endpoint
    - metrics:
        bytes_sent:
          type: Counter
          description: "Total bytes sent"
          source: bytes_sent
          config:
            action: add
    # Add labels. remote_addr stays an extracted field only: as a label it
    # would create one stream per client IP
    - labels:
        status:
        method:
Retention Policy Configuration
Automatically clean up old logs:
In loki-config.yaml:
limits_config:
  retention_period: 720h # 30 days
compactor:
  working_directory: /loki/boltdb-shipper-compactor
  shared_store: filesystem
  compaction_interval: 10m
  retention_enabled: true
  retention_delete_delay: 10m
  retention_delete_worker_count: 10
Scaling Considerations
For production deployments:
# Increase ingestion limits
limits_config:
  ingestion_rate_mb: 512
  ingestion_burst_size_mb: 1024
  max_streams_per_user: 50000
# Tune chunk settings
ingester:
  chunk_idle_period: 1h
  max_chunk_age: 2h
  chunk_encoding: snappy
# Configure persistent object storage (also set object_store: s3 in schema_config)
storage_config:
  aws:
    bucketnames: bucket-name
    endpoint: s3.amazonaws.com
    region: us-east-1
Alloy is the modern replacement for Promtail. It's a vendor-neutral distribution of the OpenTelemetry Collector:
- Better performance and resource efficiency
- Support for metrics, traces, and logs (not just logs)
- More flexible pipeline configuration
- Can take over the collection roles of Promtail, Fluentd, and Telegraf
Install: docker run grafana/alloy:latest
Troubleshooting
Check Loki Health
# From inside a container on the logging network (use http://localhost:3100 from the host)
curl http://loki:3100/ready
# Check metrics
curl http://loki:3100/metrics
Verify Promtail Connection
# Check Promtail logs
docker logs promtail
# Verify position file is updating
docker exec promtail cat /tmp/positions.yaml
Query Empty Results
- Verify labels are set correctly in scrape config
- Check Promtail is scraping files: look at /tmp/positions.yaml
- Test LogQL query with simpler pattern: {job="nginx"}
- Check retention hasn't deleted logs: sum(rate({job="nginx"}[5m]))
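When queries stay empty, it can help to take Promtail out of the picture and talk to Loki's HTTP API directly. The sketch below pushes one test line to `/loki/api/v1/push` and reads it back through `/loki/api/v1/query_range`. It assumes Loki is reachable from the host at `http://localhost:3100` (the published port above); the `smoke-test` job label and the function names are made up for this example.

```python
import json
import time
import urllib.parse
import urllib.request

LOKI = "http://localhost:3100"  # assumption: port 3100 published on the host


def push_payload(labels, lines, ts_ns=None):
    """Body for POST /loki/api/v1/push: one stream, nanosecond timestamps as strings."""
    ts = str(ts_ns if ts_ns is not None else time.time_ns())
    return {"streams": [{"stream": labels, "values": [[ts, line] for line in lines]}]}


def query_range_url(base, logql, start_ns, end_ns, limit=10):
    """URL for GET /loki/api/v1/query_range with a URL-encoded LogQL query."""
    params = urllib.parse.urlencode(
        {"query": logql, "start": start_ns, "end": end_ns, "limit": limit}
    )
    return f"{base}/loki/api/v1/query_range?{params}"


def smoke_test(base=LOKI):
    """Push one line, wait briefly, and return the streams Loki sends back."""
    now = time.time_ns()
    body = json.dumps(push_payload({"job": "smoke-test"}, ["hello loki"], now)).encode()
    req = urllib.request.Request(
        f"{base}/loki/api/v1/push",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req, timeout=10).close()
    time.sleep(1)  # give the ingester a moment before querying
    minute = 60 * 10**9
    url = query_range_url(base, '{job="smoke-test"}', now - minute, now + minute)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)["data"]["result"]

# Usage, once the stack is running:
#   print(smoke_test())
```

If the test line comes back but Promtail's logs do not, the problem is likely on the Promtail side (paths, labels, or positions) rather than in Loki or its retention settings.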
Next Steps
- Set up alerting rules in Grafana for error logs
- Create dashboards for application-specific metrics
- Implement log sampling for high-volume applications
- Configure S3-compatible storage for long-term retention
- Migrate from ELK stack to Loki for cost savings