Log Explainer

A platform engineering observability tool that tails a live log file and uses a local LLM to explain each line in plain English, in real time. Built for SREs and on-call engineers who need to understand what is happening fast, without context-switching to docs or pasting logs into an external service.

Optionally ships LLM-enriched log explanations to Elasticsearch via Filebeat — so when you open a Kibana incident, the plain-English context is already there alongside the raw log.

Features

  • Plain-English explanations of log lines, generated by a local LLM via Ollama
  • Automatic severity classification: INFO, WARN, ERROR, CRITICAL
  • Pattern spike detection: alerts when the same error repeats 5 or more times in 60 seconds
  • Incident summarization: generates a 2-3 sentence summary with a suggested action when 10 or more errors occur in 120 seconds
  • ELK integration (--elk-output): writes enriched JSON documents to a file that Filebeat ships to Elasticsearch — query raw logs and LLM explanations together in Kibana
  • Fully local: no API keys, no data sent externally, runs entirely on your machine
  • Installable as a system package: .deb for Debian/Ubuntu, .rpm for RHEL/Fedora
  • Docker and docker-compose support
  • Load generator for testing (loadgen.py)
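The two time-window rules in the list above (5 repeats in 60 seconds, 10 errors in 120 seconds) can both be expressed with one sliding-window counter. The sketch below is only an illustration of that idea: the class name SlidingWindowCounter and its methods are hypothetical and are not taken from log_parser.py, which may implement detection differently.

```python
from collections import defaultdict, deque


class SlidingWindowCounter:
    """Counts events per key inside a rolling time window (hypothetical helper)."""

    def __init__(self, threshold: int, window_seconds: float) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._events: dict[str, deque[float]] = defaultdict(deque)

    def record(self, key: str, now: float) -> bool:
        """Record one event and return True when the threshold is reached."""
        timestamps = self._events[key]
        timestamps.append(now)
        # Drop timestamps that have fallen out of the window.
        while timestamps and now - timestamps[0] > self.window_seconds:
            timestamps.popleft()
        return len(timestamps) >= self.threshold


# Thresholds copied from the feature list; the variable names are illustrative.
pattern_spikes = SlidingWindowCounter(threshold=5, window_seconds=60)
incident_errors = SlidingWindowCounter(threshold=10, window_seconds=120)
```

For pattern spikes the key would be the repeated message text; for incident detection a single shared key such as "error" is enough, since every error counts toward the same total.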

Architecture

Standalone mode

Live log file
    ↓
log-explainer (tails log, sends to Ollama)
    ↓
Plain-English explanation printed to stdout
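The "sends to Ollama" step could look roughly like the sketch below. It assumes Ollama is listening on its default local endpoint (http://localhost:11434/api/generate) and uses the standard library rather than requests so that the example is self-contained. The function names build_payload and explain, and the prompt wording, are illustrative and not taken from log_parser.py.

```python
import json
import urllib.request

# Ollama's default local endpoint; adjust if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(line: str, model: str, context: str = "") -> dict:
    """Build a non-streaming /api/generate request body for one log line."""
    # The prompt wording is an assumption, not the project's actual prompt.
    prompt = "Explain this log line in one or two plain-English sentences.\n"
    if context:
        prompt += f"Application context: {context}\n"
    prompt += f"Log line: {line}"
    return {"model": model, "prompt": prompt, "stream": False}


def explain(line: str, model: str = "qwen2.5-coder:1.5b", context: str = "") -> str:
    """Send one log line to a local Ollama server and return its explanation."""
    body = json.dumps(build_payload(line, model, context)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        # With "stream": False the server returns one JSON object
        # whose "response" field holds the generated text.
        return json.loads(response.read())["response"].strip()
```

Calling explain("Connection refused to postgres:5432") requires ollama serve to be running and the model to be pulled.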

ELK pipeline mode

Live log file
    ↓
log-explainer --elk-output (tails log, sends to Ollama)
    ↓
Structured JSON written to explained-logs.jsonl
    ↓
Filebeat ships to Elasticsearch
    ↓
Kibana — query raw_log + explanation + error_type together

Each document in Elasticsearch contains:

| Field | Description |
| --- | --- |
| @timestamp | When the log line appeared |
| raw_log | Original unmodified log line |
| explanation | LLM-generated plain-English explanation |
| level | Severity: INFO / WARN / ERROR / CRITICAL |
| app_service | Service name parsed from the log line |
| error_type | system, business, or none |
| model | Ollama model used |
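As a rough illustration of the fields listed above, the sketch below assembles one document and appends it to a JSON Lines file of the kind Filebeat can read. The helper names build_document and append_document are hypothetical, and falling back to the current UTC time when no timestamp is supplied is an assumption; the real tool may derive @timestamp differently.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def build_document(
    raw_log: str,
    explanation: str,
    level: str,
    app_service: str,
    error_type: str,
    model: str,
    timestamp: Optional[datetime] = None,
) -> dict:
    """Assemble one enriched document using the documented field names."""
    # Assumption: use the current UTC time when the caller gives no timestamp.
    when = timestamp or datetime.now(timezone.utc)
    return {
        "@timestamp": when.isoformat(),
        "raw_log": raw_log,
        "explanation": explanation,
        "level": level,
        "app_service": app_service,
        "error_type": error_type,
        "model": model,
    }


def append_document(path: str, document: dict) -> None:
    """Append the document as a single JSON line."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(document, ensure_ascii=False) + "\n")
```

Writing one JSON object per line keeps the file append-only, so a shipper can pick up new documents without re-reading earlier ones.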

Installation

Option 1: System package (recommended)

Download the latest .deb or .rpm from the Releases page.

Debian / Ubuntu:

sudo dpkg -i log-explainer_1.0.0_all.deb

RHEL / Fedora / CentOS:

sudo rpm -i log-explainer-1.0.0-1.noarch.rpm

The package installs log-explainer to /usr/bin and installs the requests dependency via pip in the post-install hook. Ollama must be installed separately.

After install, verify:

log-explainer --help

Option 2: Run from source

Prerequisites: Python 3.12+, Ollama installed and running.

git clone https://github.com/sharanch/log-explainer.git
cd log-explainer
pip install -r requirements.txt
ollama pull qwen2.5-coder:1.5b
python3 log_parser.py /var/log/myapp.log --model qwen2.5-coder:1.5b

Option 3: Docker Compose

cp .env.example .env
# Edit .env and set LOG_FILE_PATH to an absolute path
docker compose up

With inline overrides:

LOG_FILE_PATH=/var/log/myapp.log \
APP_CONTEXT="Django REST API with Postgres and Redis" \
OLLAMA_MODEL=qwen2.5-coder:1.5b \
MIN_SEVERITY=WARN \
docker compose up

Option 4: Pull from GHCR

docker pull ghcr.io/sharanch/log-explainer:latest

docker run --rm -it \
  -v /var/log/myapp.log:/logs/app.log:ro \
  --network host \
  ghcr.io/sharanch/log-explainer:latest \
  /logs/app.log --model qwen2.5-coder:1.5b

Prerequisites

Ollama must be installed and a model must be pulled before running log-explainer.

# Install Ollama: https://ollama.com/download

# Start Ollama (keep this running in a dedicated terminal)
ollama serve

# Pull a model
ollama pull qwen2.5-coder:1.5b

# Confirm it is available
ollama list

Usage

log-explainer <logfile> [options]

# or from source:
python3 log_parser.py <logfile> [options]

Arguments:
  logfile               Path to the log file to tail

Options:
  --model MODEL         Ollama model to use (default: qwen2.5-coder:1.5b)
  --context CONTEXT     App description to improve explanation quality
                        e.g. "FastAPI service with PostgreSQL and Redis"
  --severity LEVEL      Minimum severity to display: INFO, WARN, ERROR, CRITICAL
                        (default: INFO)
  --elk-output          Write explained logs as JSON for Filebeat/Elasticsearch ingestion
  --elk-file PATH       Output file for ELK JSON documents (default: explained-logs.jsonl)

The tool seeks to EOF on start. Only lines written after launch are processed.
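A minimal sketch of this tail-from-EOF behaviour is shown below. It is an illustration under stated assumptions, not the code in log_parser.py: the names read_new_lines and follow are hypothetical, the file is polled on a fixed interval, and log rotation or truncation is not handled.

```python
import os
import time
from typing import Iterator, TextIO


def read_new_lines(fh: TextIO) -> list[str]:
    """Return the complete lines appended since the previous call."""
    lines = []
    while True:
        position = fh.tell()
        line = fh.readline()
        if not line:
            break  # nothing new yet
        if not line.endswith("\n"):
            # The writer is mid-line: rewind and retry on the next poll.
            fh.seek(position)
            break
        lines.append(line.rstrip("\n"))
    return lines


def follow(path: str, poll_interval: float = 0.5) -> Iterator[str]:
    """Yield lines appended to path once iteration begins."""
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        fh.seek(0, os.SEEK_END)  # skip everything already in the file
        while True:
            new_lines = read_new_lines(fh)
            if not new_lines:
                time.sleep(poll_interval)
            yield from new_lines
```

Because follow is a generator, the seek to the end of the file happens on the first iteration, so lines written before that point are never yielded.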

Examples

# Basic usage
log-explainer /var/log/myapp.log --model qwen2.5-coder:1.5b

# Reduce noise during an incident — show warnings and above only
log-explainer /var/log/myapp.log \
  --model qwen2.5-coder:1.5b \
  --severity WARN

# Provide app context for better explanations
log-explainer /var/log/myapp.log \
  --model qwen2.5-coder:1.5b \
  --context "FastAPI service with PostgreSQL and Redis" \
  --severity WARN

# Tail a remote log over SSH — Ollama runs locally, logs stay on the server
ssh user@server "tail -f /var/log/app.log" | log-explainer /dev/stdin \
  --model qwen2.5-coder:1.5b

# ELK pipeline — explain errors and ship to Elasticsearch
nohup python3 scripts/loadgen.py --output /tmp/output.log --duration 300 --rate 2 & \
tail -f /tmp/output.log | python3 log_parser.py /dev/stdin \
  --severity ERROR \
  --elk-output \
  --elk-file explained-logs.jsonl

ELK integration

Quick start

# 1 — create the output file
touch explained-logs.jsonl

# 2 — start the ELK stack
docker compose up elasticsearch kibana filebeat -d

# 3 — run log-explainer with ELK output enabled
tail -f /var/log/myapp.log | python3 log_parser.py /dev/stdin \
  --severity ERROR \
  --elk-output \
  --elk-file explained-logs.jsonl

# 4 — open Kibana at http://localhost:5601
#     Stack Management → Data Views → Create data view
#     Pattern: log-explainer-*   Timestamp: @timestamp

Kibana queries

# All system errors with explanations
error_type: system

# Business errors
error_type: business

# Critical events
level: CRITICAL

# OOM events
raw_log: *OutOfMemoryError*

# Slow queries
raw_log: *slow query*

Production recommendation

Run with --severity ERROR in production so that only ERROR and CRITICAL lines are sent to Ollama, which keeps token usage and latency low. INFO and WARN lines can still be shipped raw through a standard Filebeat input if you need them.
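The severity threshold amounts to an ordered comparison over the four levels. The sketch below shows one way to express it. SEVERITY_ORDER and meets_threshold are hypothetical names, and the actual filtering logic in log_parser.py may differ.

```python
# Levels in ascending order of severity, as listed for --severity.
SEVERITY_ORDER = {"INFO": 0, "WARN": 1, "ERROR": 2, "CRITICAL": 3}


def meets_threshold(level: str, minimum: str) -> bool:
    """Return True when level is at or above the minimum severity."""
    return SEVERITY_ORDER[level.upper()] >= SEVERITY_ORDER[minimum.upper()]


# With a minimum of ERROR, only ERROR and CRITICAL lines would reach the LLM.
levels_sent = [lvl for lvl in SEVERITY_ORDER if meets_threshold(lvl, "ERROR")]
```

Checking the threshold before calling the model is what saves inference time, because lines below the threshold never generate a request.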


Example output

17:30:04  [WARN    ] Retrying database connection attempt 1 of 5
          -> The app is having trouble reaching the database and is attempting to reconnect.

17:30:07  [ERROR   ] Connection refused to postgres:5432
          -> The app cannot reach PostgreSQL. The database may be down or the host/port is incorrect.

17:30:08  [ERROR   ] Connection refused to postgres:5432
17:30:09  [ERROR   ] Connection refused to postgres:5432
17:30:10  [ERROR   ] Connection refused to postgres:5432
17:30:11  [ERROR   ] Connection refused to postgres:5432
          Pattern repeated 5x in 60s — possible recurring issue

INCIDENT SPIKE DETECTED — generating summary...
The application has lost connectivity to PostgreSQL after exhausting all retry
attempts. This is likely a database crash or network partition. Immediate action:
check PostgreSQL service health and verify network connectivity from the app host.

Testing with the load generator

scripts/loadgen.py appends realistic log lines to a file at a configurable rate. Use it to test log-explainer locally without a live application.

# Default: append to sample.log for 5 seconds at 2 lines/sec
python3 scripts/loadgen.py

# Custom output file and duration
python3 scripts/loadgen.py --output /tmp/test.log --duration 30

# Higher rate to trigger pattern spike and incident summary alerts
python3 scripts/loadgen.py --rate 10 --duration 60

The load generator draws from a pool of 100 log lines with the following weighted distribution:

| Severity | Weight | Description |
| --- | --- | --- |
| INFO | 40% | Normal operation: startup, requests, cache, connections |
| WARN | 30% | Retry attempts, slow queries, resource pressure, cert expiry |
| ERROR | 20% | Connection failures, exceptions, failed jobs, 500s |
| CRITICAL | 10% | OOM, FATAL, panic, segfault, data loss |
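A weighted draw like the one described above can be sketched with random.choices from the standard library. This is an illustration only: SEVERITY_WEIGHTS and pick_severities are hypothetical names, and scripts/loadgen.py may select lines in a different way.

```python
import random

# Weights copied from the distribution listed above (percentages).
SEVERITY_WEIGHTS = {"INFO": 40, "WARN": 30, "ERROR": 20, "CRITICAL": 10}


def pick_severities(count: int, rng: random.Random) -> list[str]:
    """Draw count severities according to the documented weights."""
    levels = list(SEVERITY_WEIGHTS)
    weights = list(SEVERITY_WEIGHTS.values())
    return rng.choices(levels, weights=weights, k=count)
```

At these weights roughly 30% of generated lines are ERROR or CRITICAL, which is why a higher --rate reaches the incident threshold sooner.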

To run a full demo, use two terminals:

# Terminal 1: start log-explainer
log-explainer sample.log --model qwen2.5-coder:1.5b --severity WARN

# Terminal 2: run the load generator
python3 scripts/loadgen.py --duration 30 --rate 3

To trigger the incident summary (10 errors in 120 seconds), use a higher rate:

python3 scripts/loadgen.py --rate 10 --duration 30

Recommended models

| Model | Size | Best for |
| --- | --- | --- |
| qwen2.5-coder:1.5b | 1GB | Default: fast, low memory |
| qwen2.5-coder | 4.7GB | Better quality for code and stack traces |
| mistral | 4.1GB | Fast, solid general quality |
| llama3 | 4.7GB | Best general explanations |
| phi3 | 2.3GB | Low-resource machines |

Pull any model with: ollama pull <model-name>


Development

# Install dev dependencies
pip install -r requirements.txt ruff pytest pytest-cov

# Run tests with coverage
PYTHONPATH=. pytest tests/ -v --cov=log_parser --cov-report=term-missing

# Lint
ruff check log_parser.py tests/

CI/CD

| Workflow | Trigger | What it does |
| --- | --- | --- |
| CI | Every push and pull request | Lint with ruff, run pytest with 80% coverage gate |
| Docker | Push to main or version tag | Build image, push to GHCR |
| Package | Version tag only (v*.*.*) | Build .deb and .rpm, create GitHub Release, attach packages |

To cut a release:

git tag v1.0.0
git push origin v1.0.0

Pushing the tag triggers the Docker and Package workflows at the same time. The resulting GitHub Release has the .deb and .rpm attached, and the Docker image is tagged v1.0.0 on GHCR.

Docker images are tagged automatically: latest, branch name, sha-<short>, and semver on version tags.


Demo: https://github.com/sharanch/log-explainer/assets/Demo.mp4

License

MIT
