
RSigma


The RSigma project is a complete Rust toolkit for the Sigma detection standard, including a parser, evaluation engine, rule conversion, streaming runtime, linter, CLI, and LSP.

RSigma parses Sigma YAML rules into a strongly-typed AST, compiles them into optimized matchers, and evaluates them against log events in real time. It handles stateful correlation logic in-process with memory-efficient compressed event storage. Or, as Zack Allen put it in Detection Engineering Weekly #154, RSigma "is not a SIEM, but it's an impressive feat to build a self-contained Rust binary that operates much like one."

You can send events in many formats, including JSON, syslog (RFC 3164/5424), logfmt, CEF, plain text, and OTLP (OpenTelemetry Protocol), with auto-detection by default. pySigma-compatible processing pipelines handle field mapping and backend configuration. OTLP support lets any OpenTelemetry-compatible agent (Grafana Alloy, Vector, Fluent Bit, OTel Collector) forward logs to RSigma via HTTP or gRPC for detection.

For rule quality and editor integration, a built-in linter validates rules against 66 checks derived from the Sigma v2.1.0 specification, and an LSP server provides real-time diagnostics, completions, hover documentation, and quick-fix code actions in any editor.

Supported Features

  • Parse Sigma YAML into a strongly-typed AST with support for detection, correlation, and filter rules
  • Compile and evaluate rules against JSON events in real time with stateless detection and stateful correlation (sliding windows, group-by, chaining, suppression)
  • Accept JSON, syslog (RFC 3164/5424), logfmt, CEF, plain text, and OTLP logs with format auto-detection
  • pySigma-compatible processing pipelines for field mapping, transformations, conditions, and finalizers
  • Convert rules into backend-native query strings via a pluggable backend trait (PostgreSQL/TimescaleDB SQL, LynxDB)
  • Run as a streaming detection daemon with hot-reload, Prometheus metrics, and HTTP/NATS/OTLP input
  • NATS JetStream support with authentication (credentials, mTLS), replay, consumer groups, and dead-letter queues
  • OTLP support for any OpenTelemetry-compatible agent (Grafana Alloy, Vector, Fluent Bit, OTel Collector) via HTTP or gRPC
  • Built-in linter with 66 checks, four severity levels, a full suppression system, and auto-fix (--fix) for 13 safe rules
  • LSP server with real-time diagnostics, completions, hover documentation, document symbols, and quick-fix code actions
  • Multi-arch Docker images (linux/amd64, linux/arm64) with cosign signatures, SBOM, and SLSA Build L3 provenance
  • Cross-platform binaries for Linux, macOS, and Windows on amd64 and arm64

Crates

| Crate | Description |
|-------|-------------|
| rsigma-parser | Parse Sigma YAML into a strongly-typed AST |
| rsigma-eval | Compile and evaluate rules against JSON events |
| rsigma-convert | Transform rules into backend-native query strings |
| rsigma-runtime | Streaming runtime with input adapters, log processor, and hot-reload |
| rsigma | CLI for parsing, validating, linting, evaluating, and converting rules, inspecting the field catalog, and running a detection daemon |
| rsigma-lsp | Language Server Protocol (LSP) server for IDE support |

Note

RSigma has been featured in:

  • Detection Engineering Weekly #149 (March 2026) "Building a tool like RSigma is challenging because the Sigma specification has evolved into a robust domain-specific language over the years."
  • tl;dr sec #320 (March 2026) "Accurately evaluating the full spectrum of what Sigma rules can express is quite complex, it's pretty neat to read about how RSigma handles all of these conditional expressions, correlating across rules, etc."
  • Detection Engineering Weekly #154 (April 2026) "RSigma is not a SIEM, but it's an impressive feat to build a self-contained Rust binary that operates much like one. For teams doing pre-SIEM rule validation or forensics, it's a solid plug-and-play option."

Installation

# Build all crates
cargo build --release

# Install the CLI
cargo install rsigma

# Install the LSP server
cargo install --path crates/rsigma-lsp

Docker

Multi-arch images (linux/amd64, linux/arm64) are published to GHCR on every release.

docker pull ghcr.io/timescale/rsigma:latest
docker run --rm ghcr.io/timescale/rsigma:latest --help

Run with full runtime hardening:

docker run --rm \
  --read-only \
  --cap-drop=ALL \
  --security-opt=no-new-privileges:true \
  -v /path/to/rules:/rules:ro \
  ghcr.io/timescale/rsigma:latest validate /rules/

Verify the image signature:

cosign verify \
  --certificate-identity-regexp 'github.com/timescale/rsigma' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  ghcr.io/timescale/rsigma:latest

Quick Start

# Evaluate a single event against Sigma rules
rsigma eval -r rules/ -e '{"CommandLine": "cmd /c whoami"}'

# Stream NDJSON from stdin
cat events.ndjson | rsigma eval -r rules/

# Run as a daemon with hot-reload and Prometheus metrics
rsigma daemon -r rules/ -p ecs.yml --api-addr 0.0.0.0:9090

# Accept events via HTTP POST
rsigma daemon -r rules/ --input http

# Convert rules to PostgreSQL SQL
rsigma convert rules/ -t postgres

See the CLI README for complete documentation of all subcommands and flags.

Daemon Input Modes

The daemon accepts events from multiple sources. The --input flag selects the primary source, and OTLP is always available as an additional ingestion path when the daemon-otlp feature is enabled.

# stdin (default): pipe events from any source
hel run | rsigma daemon -r rules/ -p ecs.yml

# HTTP: POST NDJSON events to /api/v1/events
rsigma daemon -r rules/ --input http
curl -X POST http://localhost:9090/api/v1/events -d '{"CommandLine":"whoami"}'

# NATS JetStream (requires daemon-nats feature)
rsigma daemon -r rules/ --input nats://localhost:4222/events.> --output nats://localhost:4222/detections

# OTLP (requires daemon-otlp feature): always active alongside any --input mode
# Agents (Grafana Alloy, Vector, Fluent Bit, OTel Collector) send logs to /v1/logs (HTTP) or gRPC
rsigma daemon -r rules/ --input http
curl -X POST http://localhost:9090/v1/logs -H 'Content-Type: application/json' -d '{"resourceLogs":[...]}'

NATS Streaming

Production-grade NATS JetStream support with authentication, at-least-once delivery, replay, and horizontal scaling via consumer groups.

# Credentials file authentication
rsigma daemon -r rules/ --input nats://nats.example.com:4222/events.> --nats-creds /etc/rsigma/nats.creds

# Mutual TLS
rsigma daemon -r rules/ --input nats://localhost:4222/events.> \
  --nats-tls-cert client.pem --nats-tls-key client-key.pem --nats-require-tls

# Replay from a point in time
rsigma daemon -r rules/ --input nats://localhost:4222/events.> --replay-from-time 2026-04-30T00:00:00Z

# Replay with automatic state restore (forward catch-up)
rsigma daemon -r rules/ --input nats://localhost:4222/events.> --replay-from-sequence 1001 --state-db state.db

# Consumer groups for horizontal scaling
rsigma daemon -r rules/ --input nats://localhost:4222/events.> --consumer-group detection-workers

# Dead-letter queue for events that fail processing
rsigma daemon -r rules/ --input nats://localhost:4222/events.> --dlq file:///var/log/rsigma-dlq.ndjson

Input Formats and Pipelines

Events are parsed with auto-detection by default (JSON, syslog, plain text). Feature-gated formats: logfmt, cef, evtx. Processing pipelines handle field mapping between source schemas and Sigma field names.

# With a processing pipeline for field mapping
rsigma eval -r rules/ -p pipelines/ecs.yml -e '{"process.command_line": "whoami"}'

# Explicit syslog with timezone offset
tail -f /var/log/syslog | rsigma eval -r rules/ --input-format syslog --syslog-tz +0530

# logfmt (requires logfmt feature)
rsigma eval -r rules/ --input-format logfmt < app.log

# CEF / ArcSight (requires cef feature)
rsigma eval -r rules/ --input-format cef < arcsight.log

Rule Conversion

Convert Sigma rules into backend-native queries for historical threat hunting.

# PostgreSQL SQL
rsigma convert rules/ -t postgres

# PostgreSQL with OCSF field mapping
rsigma convert rules/ -t postgres -p pipelines/ocsf_postgres.yml

# PostgreSQL views, TimescaleDB continuous aggregates, or sliding window correlation
rsigma convert rules/ -t postgres -f view
rsigma convert rules/ -t postgres -f continuous_aggregate
rsigma convert rules/ -t postgres -f sliding_window

# JSONB mode: access fields inside a JSONB column
rsigma convert rules/ -t postgres -O table=okta_events -O json_field=data -O timestamp_field=time

# LynxDB search queries
rsigma convert rules/ -t lynxdb

# List all fields referenced by a ruleset
rsigma fields -r rules/

# Show fields after pipeline mapping
rsigma fields -r rules/ -p ecs.yml --json

# List available backends and formats
rsigma list-targets
rsigma list-formats postgres

Library Usage

Instead of the CLI, you can use the library crates directly:

use rsigma_parser::parse_sigma_yaml;
use rsigma_eval::Engine;
use rsigma_eval::event::JsonEvent;
use serde_json::json;

let yaml = r#"
title: Detect Whoami
logsource:
    product: windows
    category: process_creation
detection:
    selection:
        CommandLine|contains: 'whoami'
    condition: selection
level: medium
"#;

let collection = parse_sigma_yaml(yaml).unwrap();
let mut engine = Engine::new();
engine.add_collection(&collection).unwrap();

let event = JsonEvent::borrow(&json!({"CommandLine": "cmd /c whoami"}));
let matches = engine.evaluate(&event);
assert_eq!(matches[0].rule_title, "Detect Whoami");

Streaming Runtime

rsigma-runtime provides a reusable pipeline for streaming log detection. It handles input parsing (JSON, syslog, logfmt, CEF, plain text, auto-detect), batch evaluation with parallel detection + sequential correlation, atomic hot-reload via ArcSwap, and pluggable metrics.

use std::sync::Arc;
use rsigma_eval::CorrelationConfig;
use rsigma_runtime::{InputFormat, LogProcessor, NoopMetrics, RuntimeEngine};

// Load rules
let mut engine = RuntimeEngine::new(
    "rules/".into(),
    vec![],
    CorrelationConfig::default(),
    false,
);
engine.load_rules().unwrap();

let processor = LogProcessor::new(engine, Arc::new(NoopMetrics));

// Process a batch of raw log lines (any format)
let batch = vec![
    r#"{"CommandLine": "cmd /c whoami", "EventID": 1}"#.to_string(),
];
let results = processor.process_batch_with_format(
    &batch,
    &InputFormat::Json,
    None,
);

for result in &results {
    for det in &result.detections {
        println!("Detection: {}", det.rule_title);
    }
}

Input formats are selected via --input-format on the CLI or InputFormat in the library. Auto-detect (the default) tries JSON → syslog → plain text. Feature-gated formats: logfmt, cef.
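The auto-detect order described above (JSON, then syslog, then plain text) can be sketched as a simple first-match heuristic. This is a conceptual illustration in plain Rust, not rsigma-runtime's actual detection logic:

```rust
/// Conceptual sketch of format auto-detection order: JSON -> syslog -> plain.
/// (Illustration only; rsigma-runtime's real adapters do full parsing.)
#[derive(Debug, PartialEq)]
enum Detected {
    Json,
    Syslog,
    Plain,
}

fn detect(line: &str) -> Detected {
    let t = line.trim();
    // JSON: a line wrapped in braces is tried as a JSON object first.
    if t.starts_with('{') && t.ends_with('}') {
        return Detected::Json;
    }
    // Syslog (RFC 3164/5424): both start with a "<PRI>" priority field,
    // e.g. "<34>1 2026-04-30T00:00:00Z host app - - msg".
    if t.starts_with('<') {
        if let Some(end) = t.find('>') {
            if end > 1 && t[1..end].chars().all(|c| c.is_ascii_digit()) {
                return Detected::Syslog;
            }
        }
    }
    // Anything else falls through to plain text.
    Detected::Plain
}

fn main() {
    assert_eq!(detect(r#"{"CommandLine": "whoami"}"#), Detected::Json);
    assert_eq!(detect("<34>1 2026-04-30T00:00:00Z host app - - msg"), Detected::Syslog);
    assert_eq!(detect("some unstructured log line"), Detected::Plain);
    println!("all formats detected as expected");
}
```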

See examples/jsonl_stdin.rs and examples/tail_syslog.rs for complete working examples.

Architecture

Everything starts with a Sigma rule in YAML format:

  • Parsing: serde_yaml deserializes the YAML into a raw value, then rsigma-parser turns it into a strongly-typed AST. A PEG grammar (sigma.pest) handles the document structure while a Pratt parser (condition.rs) handles condition expressions. Supporting modules define value types (value.rs: SigmaStr, wildcards, timespans) and AST nodes (ast.rs: modifiers, enums). The result is a SigmaRule, CorrelationRule, FilterRule, or SigmaCollection.

From there, the AST can go in three directions depending on what you need:

  • Evaluation: rsigma-eval compiles rules into optimized matchers (compiler.rs), runs stateless detection through Engine, and tracks stateful correlation (correlation.rs: sliding windows, group-by, chaining, suppression) across events. Processing pipelines handle field mapping, transformations, conditions, and finalizers before compilation. Events are accessed through a trait with implementations for JSON, key-value, and plain text.

  • Conversion: rsigma-convert transforms rules into backend-native query strings through a pluggable Backend trait. A condition walker traverses the AST and delegates to the backend for each node. TextQueryConfig exposes ~90 configuration fields for text-based backends. Concrete implementations include PostgreSQL/TimescaleDB (SQL for historical threat hunting) and LynxDB (SPL2-compatible search queries for log analytics).

  • Editor support: rsigma-lsp provides an LSP server over stdio (via tower-lsp) with real-time diagnostics (lint + parse + compile errors), completions, hover documentation, document symbols, and code actions. Works with VSCode, Neovim, Helix, Zed, and any LSP-capable editor.
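The stateful correlation described under Evaluation (sliding windows with group-by) can be illustrated with a small self-contained sketch: count matching events per group key and fire once a threshold is reached inside the window. This is a conceptual model only, not rsigma-eval's implementation:

```rust
use std::collections::{HashMap, VecDeque};

/// Conceptual sketch of an event_count correlation with a sliding window.
/// (Illustration only; not rsigma-eval's actual correlation engine.)
struct SlidingWindowCounter {
    window_secs: u64,
    threshold: usize,
    // group-by key (e.g. a User field value) -> timestamps of matching events
    buckets: HashMap<String, VecDeque<u64>>,
}

impl SlidingWindowCounter {
    fn new(window_secs: u64, threshold: usize) -> Self {
        Self { window_secs, threshold, buckets: HashMap::new() }
    }

    /// Record a matching event; returns true when the correlation fires.
    fn observe(&mut self, group_key: &str, ts: u64) -> bool {
        let bucket = self.buckets.entry(group_key.to_string()).or_default();
        bucket.push_back(ts);
        // Evict events that have fallen out of the sliding window.
        while let Some(&front) = bucket.front() {
            if ts.saturating_sub(front) > self.window_secs {
                bucket.pop_front();
            } else {
                break;
            }
        }
        bucket.len() >= self.threshold
    }
}

fn main() {
    // e.g. 3 failed logins by the same user within 60 seconds
    let mut corr = SlidingWindowCounter::new(60, 3);
    assert!(!corr.observe("alice", 0));
    assert!(!corr.observe("alice", 10));
    assert!(corr.observe("alice", 20)); // third event inside the window: fire
    assert!(!corr.observe("bob", 20));  // separate group key, separate count
    println!("correlation fired for alice");
}
```

Chaining, suppression, and value-count variants build on the same windowed state, keyed by the rule's group-by fields.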

When running as a streaming detection engine, rsigma-eval feeds into rsigma-runtime:

  • Input: Format adapters parse raw log lines (JSON, syslog, logfmt*, CEF*, plain text, with auto-detection) into EventInputDecoded. Sources include stdin, HTTP POST, NATS JetStream, and OTLP* (HTTP protobuf/JSON and gRPC).
  • Processing: LogProcessor runs batch evaluation with parallel detection and sequential correlation. RuntimeEngine wraps Engine and CorrelationEngine with rule loading and ArcSwap hot-reload.
  • Output: Sinks write detection results to stdout, files, or NATS. Multiple sinks can run in fan-out. The output is MatchResult and CorrelationResult, containing rule title, id, level, tags, matched selections, field matches, aggregated values, and optionally the triggering events.

Feature-gated items are marked with * in the diagram.

Architecture diagram
                    ┌──────────────────┐
   YAML input ───>  │   serde_yaml     │──> Raw YAML Value
                    └──────────────────┘
                             │
                             ▼
                    ┌──────────────────┐
                    │   parser.rs      │──> Typed AST
                    │  (YAML → AST)    │   (SigmaRule, CorrelationRule,
                    └──────────────────┘    FilterRule, SigmaCollection)
                             │
            ┌────────────────┼──────────────┐
            ▼                ▼              ▼
     ┌────────────┐  ┌────────────┐  ┌────────────┐
     │ sigma.pest │  │  value.rs  │  │   ast.rs   │
     │  (PEG      │  │ (SigmaStr, │  │ (AST types │
     │  grammar)  │  │  wildcards,│  │  modifiers,│
     │     +      │  │  timespan) │  │  enums)    │
     │condition.rs│  └────────────┘  └────────────┘
     │  (Pratt    │
     │  parser)   │
     └────────────┘
           │
     ┌─────┴───────────────────────────────────────────────────────────┐
     │                                   │                             │
     ▼                                   ▼                             ▼
    ┌─────────────────────────┐   ┌─────────────────────┐   ┌────────────────────┐
    │      rsigma-eval        │   │   rsigma-convert    │   │    rsigma-lsp      │
    │                         │   │                     │   │                    │
    │  Event trait ──>        │   │  Backend trait ──>  │   │  LSP server over   │
    │    JsonEvent, KvEvent,  │   │    pluggable query  │   │  stdio (tower-lsp) │
    │    PlainEvent           │   │    generation       │   │                    │
    │                         │   │                     │   │  • diagnostics     │
    │  pipeline/ ──>          │   │  TextQueryConfig    │   │    (lint + parse   │
    │    Pipeline, conditions,│   │    ──> ~90 config   │   │     + compile)     │
    │    transformations,     │   │    fields for text  │   │  • completions     │
    │    state, finalizers    │   │    query backends   │   │  • hover           │
    │                         │   │                     │   │  • document        │
    │  compiler.rs ──>        │   │  Condition walker,  │   │    symbols         │
    │    CompiledRule         │   │    deferred exprs,  │   │                    │
    │  engine.rs ──>          │   │    conversion state │   │  Editors:          │
    │    Engine (stateless)   │   │                     │   │  VSCode, Neovim,   │
    │                         │   │  backends/ ──>      │   │  Helix, Zed, ...   │
    │  correlation.rs ──>     │   │    TextQueryTest,   │   └────────────────────┘
    │    sliding windows,     │   │    PostgreSQL/      │
    │    group-by, chaining,  │   │    TimescaleDB,     │
    │    suppression, events  │   │    LynxDB           │
    │                         │   └─────────────────────┘
    │                         │
    │  rsigma.* custom        │
    │    attributes           │
    └─────────────────────────┘
              │
              ▼
    ┌──────────────────────────────────────────┐
    │            rsigma-runtime                │
    │                                          │
    │  input/ ──> format adapters:             │
    │    JSON, syslog, logfmt*, CEF*,          │
    │    plain text, auto-detect               │
    │    ↓ raw line → EventInputDecoded        │
    │                                          │
    │  LogProcessor ──> batch evaluation       │
    │    ArcSwap hot-reload, MetricsHook,      │
    │    EventFilter (JSON payload extraction) │
    │                                          │
    │  RuntimeEngine ──> wraps Engine +        │
    │    CorrelationEngine with rule loading   │
    │                                          │
    │  io/ ──> EventSource (stdin, HTTP, NATS) │
    │          OTLP* (HTTP + gRPC)             │
    │          Sink (stdout, file, NATS)       │
    └──────────────────────────────────────────┘
              │                (* = feature-gated)
              ▼
     ┌────────────────────┐
     │  MatchResult       │──> rule title, id, level, tags,
     │  CorrelationResult │   matched selections, field matches,
     └────────────────────┘   aggregated values, optional events

A Mermaid version of this diagram is also available.

Reference

Releasing

All crates share a single version (set in the workspace Cargo.toml) and are published together.
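A single shared version is the standard cargo workspace-inheritance pattern; a sketch of what the manifests would look like (the project's actual manifests may differ):

```toml
# Root Cargo.toml: one version for every crate in the workspace
[workspace.package]
version = "0.2.0"

# Each member crate's Cargo.toml then inherits it:
# [package]
# version.workspace = true
```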

Publishing a new version

  1. Bump the version in the root Cargo.toml.
  2. Commit, push to main.
  3. Create a GitHub Release (e.g. tag v0.2.0). The publish.yml workflow triggers automatically and publishes all crates in dependency order.

Dry run

Trigger the workflow manually via Actions → Publish to crates.io → Run workflow. Manual runs automatically pass --dry-run to every cargo publish invocation.

Recovering from a partial failure

If the workflow fails midway (e.g. rsigma-parser was published but rsigma-eval failed), re-running the workflow will fail at the already-published crate. To recover, publish the remaining crates manually in order:

# Skip crates that were already published successfully
cargo publish -p rsigma-eval && sleep 30
cargo publish -p rsigma-convert && sleep 30
cargo publish -p rsigma-runtime && sleep 30
cargo publish -p rsigma && sleep 30
cargo publish -p rsigma-lsp

License

MIT
