diff --git a/README.md b/README.md index 9f4a572ba..0273a7d48 100644 --- a/README.md +++ b/README.md @@ -56,9 +56,9 @@ rshell --allow-all-commands --timeout 5s -c 'echo "hello from rshell"' Every access path is default-deny: -| Resource | Default | Opt-in | +| Resource | Default | Opt-in / Opt-out | |----------------------|-------------------------------------|----------------------------------------------| -| Command execution | All commands blocked (exit code 127)| `AllowedCommands` with namespaced command list (e.g. `rshell:cat`) | +| Command execution | All commands blocked (exit code 127)| `AllowedCommands` (namespaced names), `AllowedCommandPatterns` (argv-prefix patterns), and `DeniedCommandPatterns` (deny-first carve-outs from the allow rules) | | External commands | Blocked (exit code 127) | Provide an `ExecHandler` | | Filesystem access | Blocked | Configure `AllowedPaths` with directory list | | Environment variables| Empty (no host env inherited) | Pass variables via the `Env` option | @@ -66,6 +66,21 @@ Every access path is default-deny: **AllowedCommands** restricts which commands (builtins or external) the interpreter may execute. Commands must be specified with the `rshell:` namespace prefix (e.g. `rshell:cat`, `rshell:echo`). If not set, no commands are allowed. +**AllowedCommandPatterns** restricts execution to argv sequences shaped like `(command [, subcommand_path...])`. Each pattern is a non-empty token list; an invocation is admitted when: + +1. `argv[0]` equals `pattern[0]` exactly (the command name). +2. The leading **structural** tokens of `argv[1..]` equal `pattern[1..]`, where structural tokens are derived by skipping flag tokens according to the [`CommandSpec`](#commandspec) registered for the command. + +For example, with the built-in `ip` spec, pattern `["ip", "route"]` admits all of `ip route show`, `ip -4 route show`, `ip --brief route show` — the spec recognises `-4` and `--brief` as boolean global flags, so they're skipped during structural extraction. The same pattern blocks `ip addr show` and `ip link list` because the leading structural token after `ip` is `addr` / `link`, not `route`. + +Patterns are matched **after shell expansion**, so command-substitution-derived values (`$(...)`) cannot bypass the check — the matcher sees the resolved argv that would be handed to exec. `AllowedCommands` and `AllowedCommandPatterns` are independent permit axes joined by union: a command is allowed if its name appears in `AllowedCommands` OR its argv satisfies any pattern. + +**CommandSpec** describes the flag conventions of a single command so the matcher can distinguish flag tokens from structural tokens. The `ip` builtin is shipped with a default spec; integrators register additional specs (kubectl, git, docker, etc.) via the `CommandSpecs(map[string]CommandSpec{...})` option. Multi-token patterns whose command lacks a spec are rejected at `New()` — single-token patterns (matching only `argv[0]`) require no spec. + +> Without a spec, the matcher would have to guess which argv tokens are flag values vs. positional arguments, and that guess is the only way `kubectl delete pod get` could ever masquerade as a `kubectl get` invocation. The spec-driven matcher closes that bypass by inspecting only the leading structural position for the subcommand, so positional arguments at later positions cannot satisfy pattern slots. + +**DeniedCommandPatterns** blocks invocations whose argv satisfies any of the given patterns, regardless of whether `AllowedCommands` or `AllowedCommandPatterns` would otherwise admit the call. Denies are evaluated **first**: a deny match short-circuits the gate to a refusal even if every other axis would permit the invocation. Same shape and matching semantics as `AllowedCommandPatterns` (pattern is a token list, multi-token patterns require a registered `CommandSpec`, etc.). Used to express "allow X but carve out Y" policies — for example, `AllowedCommands={rshell:ip}` plus `DeniedCommandPatterns=[["ip","route"]]` permits `ip addr` and `ip link` but forbids `ip route`. The architectural property that allow patterns survive shell substitution applies here equally: a substitution that resolves at runtime to a denied argv is blocked. + **AllowedPaths** restricts all file operations to specified directories using Go's `os.Root` API (`openat` syscalls), making it immune to symlink traversal, TOCTOU races, and `..` escape attacks. Configured directories that cannot be opened (missing, not a directory, no permission) are skipped with a diagnostic message; by default these messages are flushed once to the runner's stderr at construction time. Callers that need to keep stderr clean of sandbox diagnostics can route them to a dedicated sink with `WarningsWriter(io.Writer)` or retrieve them programmatically via `Runner.Warnings()`. > **Note:** The `ss` and `ip route` builtins bypass `AllowedPaths` for their `/proc/net/*` reads. Both builtins open kernel pseudo-filesystem paths (e.g. `/proc/net/tcp`, `/proc/net/route`) directly with `os.Open` rather than going through the sandboxed opener. These paths are hardcoded in the implementation and are never derived from user input, so there is no sandbox-escape risk. However, operators cannot use `AllowedPaths` to block `ss` from enumerating local sockets or `ip route` from reading the routing table — these reads succeed regardless of the configured path policy. diff --git a/SHELL_FEATURES.md b/SHELL_FEATURES.md index 65be70e31..fd6a6aa1d 100644 --- a/SHELL_FEATURES.md +++ b/SHELL_FEATURES.md @@ -102,6 +102,9 @@ The in-shell `help` command mirrors these feature categories: run `help` for a c ## Execution - ✅ AllowedCommands — restricts which commands (builtins or external) may be executed; commands require the `rshell:` namespace prefix (e.g. `rshell:cat`); if not set, no commands are allowed +- ✅ AllowedCommandPatterns — restricts execution to `(command [, subcommand_path...])` shaped invocations (e.g. `["ip","route"]` permits `ip route show` AND `ip -4 route show` but not `ip addr show`); matched after shell expansion so command-substitution cannot bypass; argv[0] must equal pattern[0] exactly; remaining pattern tokens must equal the LEADING STRUCTURAL tokens of argv[1..] (where flags are skipped per the registered CommandSpec); combined with AllowedCommands by union (allow if name OR pattern admits); multi-token patterns require a CommandSpec for their command (single-token patterns do not) +- ✅ DeniedCommandPatterns — blocks invocations whose argv satisfies any of the given patterns, evaluated FIRST so a deny match overrides every allow rule (AllowAllCommands, AllowedCommands, AllowedCommandPatterns); same pattern shape and matching semantics as AllowedCommandPatterns; useful for "allow X but carve out Y" policies (e.g. allow `ip` wholesale by name, deny `ip route` specifically) +- ✅ CommandSpecs — registers per-command flag conventions (BooleanFlags, ValueFlags) used by AllowedCommandPatterns to distinguish flag tokens from structural tokens; `ip` ships with a built-in spec; integrators add their own via `interp.CommandSpecs(map[string]interp.CommandSpec{...})` - ✅ AllowedPaths filesystem sandboxing — restricts all file access to specified directories - ✅ Whole-run execution timeout — callers can bound a `Run()` call via `context.Context`, `interp.MaxExecutionTime`, or the CLI `--timeout` flag; the deadline applies to the entire script, not each individual command - ✅ ProcPath — overrides the proc filesystem path used by `ps` (default `/proc`; Linux-only; useful for testing/container environments) diff --git a/analysis/symbols_interp.go b/analysis/symbols_interp.go index 3da6c1a28..2501bc13f 100644 --- a/analysis/symbols_interp.go +++ b/analysis/symbols_interp.go @@ -56,6 +56,7 @@ var interpAllowedSymbols = []string{ "runtime.GOOS", // 🟢 current OS name constant; pure constant, no I/O. "strconv.Itoa", // 🟢 int-to-string conversion; pure function, no I/O. "strings.Builder", // 🟢 efficient string concatenation; pure in-memory buffer, no I/O. + "strings.Contains", // 🟢 checks if a substring is in a string; pure function, no I/O. "strings.ContainsRune", // 🟢 checks if a rune is in a string; pure function, no I/O. "strings.NewReader", // 🟢 wraps a string as an io.Reader; pure function, no I/O; used by ParseScript. "strings.Index", // 🟢 finds substring index; pure function, no I/O. @@ -65,6 +66,7 @@ var interpAllowedSymbols = []string{ "strings.Split", // 🟢 splits a string by separator; pure function, no I/O. "strings.ToUpper", // 🟢 converts string to uppercase; pure function, no I/O. "strings.TrimLeft", // 🟢 trims leading characters; pure function, no I/O. + "strings.TrimRight", // 🟢 trims trailing characters; pure function, no I/O. "sync.Mutex", // 🟢 mutual exclusion lock; concurrency primitive, no I/O. "sync.Once", // 🟢 ensures a function runs exactly once; concurrency primitive, no I/O. "sync.WaitGroup", // 🟢 waits for goroutines to finish; concurrency primitive, no I/O. diff --git a/builtins/builtins.go b/builtins/builtins.go index 38063ddaf..60e4ee9e9 100644 --- a/builtins/builtins.go +++ b/builtins/builtins.go @@ -171,10 +171,15 @@ type CallContext struct { // where FileInfo.Sys() lacks identity fields; Unix ignores it. FileIdentity func(path string, info fs.FileInfo) (FileID, bool) - // CommandAllowed reports whether a command name is permitted under the - // current shell policy. Used by the help builtin to list only executable - // commands. - CommandAllowed func(name string) bool + // CommandAllowed reports whether the given invocation is permitted under + // the current shell policy. name is the command name (args[0]) and args + // is the full argv. Both are provided so callers may consult either a + // name-based allowlist or an argv-prefix pattern allowlist; pass a + // single-element argv (e.g. []string{name}) when only the name is known. + // + // Used by the help builtin to filter the list of allowed commands and + // by find -exec/-execdir to validate child commands before invocation. + CommandAllowed func(name string, args []string) bool // WorkDir returns the shell's current working directory (absolute path). // Used by builtins that need to compute absolute paths for sub-operations. diff --git a/builtins/find/builtin_find_pentest_test.go b/builtins/find/builtin_find_pentest_test.go index cec3c8841..22d2f68b2 100644 --- a/builtins/find/builtin_find_pentest_test.go +++ b/builtins/find/builtin_find_pentest_test.go @@ -50,7 +50,7 @@ func TestFindExecDirCommandNotAllowed(t *testing.T) { runCalled = true return 0, nil } - callCtx.CommandAllowed = func(name string) bool { + callCtx.CommandAllowed = func(name string, _ []string) bool { return false // block everything } @@ -120,7 +120,7 @@ func testExecDirFilename(t *testing.T, filename string) { copy(capturedArgs, args) return 0, nil } - callCtx.CommandAllowed = func(_ string) bool { return true } + callCtx.CommandAllowed = func(_ string, _ []string) bool { return true } ec := &evalContext{ callCtx: callCtx, @@ -202,7 +202,7 @@ func TestFindExecDirRootPath(t *testing.T) { copy(capturedArgs, args) return 0, nil } - callCtx.CommandAllowed = func(_ string) bool { return true } + callCtx.CommandAllowed = func(_ string, _ []string) bool { return true } ec := &evalContext{ callCtx: callCtx, @@ -296,7 +296,7 @@ func TestFindExecDirParentDir(t *testing.T) { capturedDir = dir return 0, nil } - callCtx.CommandAllowed = func(_ string) bool { return true } + callCtx.CommandAllowed = func(_ string, _ []string) bool { return true } ec := &evalContext{ callCtx: callCtx, @@ -332,7 +332,7 @@ func TestFindExecDirEmbeddedBracesReplacement(t *testing.T) { copy(capturedArgs, args) return 0, nil } - callCtx.CommandAllowed = func(_ string) bool { return true } + callCtx.CommandAllowed = func(_ string, _ []string) bool { return true } ec := &evalContext{ callCtx: callCtx, @@ -366,7 +366,7 @@ func TestFindExecDirTrailingSlashPreserved(t *testing.T) { copy(capturedArgs, args) return 0, nil } - callCtx.CommandAllowed = func(_ string) bool { return true } + callCtx.CommandAllowed = func(_ string, _ []string) bool { return true } ec := &evalContext{ callCtx: callCtx, @@ -400,7 +400,7 @@ func TestFindExecDirWindowsDriveRoot(t *testing.T) { capturedDir = dir return 0, nil } - callCtx.CommandAllowed = func(_ string) bool { return true } + callCtx.CommandAllowed = func(_ string, _ []string) bool { return true } ec := &evalContext{ callCtx: callCtx, @@ -435,7 +435,7 @@ func TestFindExecDirCommandTokenSubstitution(t *testing.T) { capturedCmd = cmd return 0, nil } - callCtx.CommandAllowed = func(_ string) bool { return true } + callCtx.CommandAllowed = func(_ string, _ []string) bool { return true } ec := &evalContext{ callCtx: callCtx, @@ -467,7 +467,7 @@ func TestFindExecCommandTokenSubstitution(t *testing.T) { capturedCmd = cmd return 0, nil } - callCtx.CommandAllowed = func(_ string) bool { return true } + callCtx.CommandAllowed = func(_ string, _ []string) bool { return true } ec := &evalContext{ callCtx: callCtx, @@ -514,7 +514,7 @@ func TestFindExecCommandNotAllowed(t *testing.T) { runCalled = true return 0, nil } - callCtx.CommandAllowed = func(name string) bool { + callCtx.CommandAllowed = func(name string, _ []string) bool { return false // block everything } @@ -587,7 +587,7 @@ func testExecFilename(t *testing.T, relPath, printPath string) { copy(capturedArgs, args) return 0, nil } - callCtx.CommandAllowed = func(_ string) bool { return true } + callCtx.CommandAllowed = func(_ string, _ []string) bool { return true } ec := &evalContext{ callCtx: callCtx, @@ -698,7 +698,7 @@ func TestFindExecWorkDir(t *testing.T) { capturedDir = dir return 0, nil } - callCtx.CommandAllowed = func(_ string) bool { return true } + callCtx.CommandAllowed = func(_ string, _ []string) bool { return true } ec := &evalContext{ callCtx: callCtx, @@ -734,7 +734,7 @@ func TestFindExecEmbeddedBracesReplacement(t *testing.T) { copy(capturedArgs, args) return 0, nil } - callCtx.CommandAllowed = func(_ string) bool { return true } + callCtx.CommandAllowed = func(_ string, _ []string) bool { return true } ec := &evalContext{ callCtx: callCtx, diff --git a/builtins/find/eval.go b/builtins/find/eval.go index e6bae733a..fbf766d97 100644 --- a/builtins/find/eval.go +++ b/builtins/find/eval.go @@ -304,15 +304,22 @@ func evalExecLike(ec *evalContext, e *expr, name, replacement, dir string) evalR return evalResult{} } cmd := strings.ReplaceAll(e.execCmd, "{}", replacement) - if ec.callCtx.CommandAllowed != nil && !ec.callCtx.CommandAllowed(cmd) { - ec.callCtx.Errf("find: %s: '%s': command not allowed\n", name, cmd) - ec.failed = true - return evalResult{} - } args := make([]string, len(e.execArgs)) for i, a := range e.execArgs { args[i] = strings.ReplaceAll(a, "{}", replacement) } + // Construct the full argv (cmd + args) so the policy callback can + // consult both a name allowlist and an argv-prefix pattern allowlist. + // This is the authoritative check; the parse-time check at find.go is + // a name-only fast-fail. + fullArgv := make([]string, 0, len(args)+1) + fullArgv = append(fullArgv, cmd) + fullArgv = append(fullArgv, args...) + if ec.callCtx.CommandAllowed != nil && !ec.callCtx.CommandAllowed(cmd, fullArgv) { + ec.callCtx.Errf("find: %s: '%s': command not allowed\n", name, cmd) + ec.failed = true + return evalResult{} + } exitCode, err := ec.callCtx.RunCommand(ec.ctx, dir, cmd, args) if err != nil { ec.callCtx.Errf("find: '%s': %s\n", cmd, err) diff --git a/builtins/find/find.go b/builtins/find/find.go index c8676aeb9..ad69084b5 100644 --- a/builtins/find/find.go +++ b/builtins/find/find.go @@ -182,14 +182,25 @@ optLoop: } // Post-parse validation: check -exec/-execdir commands are allowed. - // Commands containing {} are skipped here — the substituted name is - // validated at eval-time when the replacement is known. - for _, cmd := range collectExecCmds(expression) { - if strings.Contains(cmd, "{}") { + // Commands whose name contains {} are skipped here — the substituted + // name is validated at eval-time when the replacement is known. + // + // We pass the unsubstituted argv (cmd + args, possibly including {} + // placeholders in trailing positions) to CommandAllowed so that + // argv-prefix patterns like ["echo","hello"] can match this fast-fail + // gate. The leading tokens of an unsubstituted argv are stable across + // {} substitution, so no false-positives slip through here that would + // be rejected at eval-time. evalExecLike still re-checks with the + // fully substituted argv before invocation. + for _, inv := range collectExecInvocations(expression) { + if strings.Contains(inv.cmd, "{}") { continue } - if callCtx.CommandAllowed != nil && !callCtx.CommandAllowed(cmd) { - callCtx.Errf("find: '%s': command not allowed\n", cmd) + fullArgv := make([]string, 0, len(inv.args)+1) + fullArgv = append(fullArgv, inv.cmd) + fullArgv = append(fullArgv, inv.args...) + if callCtx.CommandAllowed != nil && !callCtx.CommandAllowed(inv.cmd, fullArgv) { + callCtx.Errf("find: '%s': command not allowed\n", inv.cmd) return builtins.Result{Code: 1} } } @@ -612,6 +623,37 @@ func collectExecCmds(e *expr) []string { return cmds } +// execInvocation captures one -exec/-execdir clause as its command name plus +// the (still unsubstituted) argument template. Both fields are taken from the +// expression node verbatim, so {} placeholders are preserved in args. +type execInvocation struct { + cmd string + args []string +} + +// collectExecInvocations walks the expression tree and returns one entry per +// -exec/-execdir clause. Used by the parse-time policy check to construct +// the unsubstituted argv (cmd + args) for pattern matching, so that a +// multi-token pattern such as ["echo","hello"] can match an invocation +// whose argv is ["echo","hello","{}"]. +func collectExecInvocations(e *expr) []execInvocation { + var inv []execInvocation + collectExecInvocationsInto(e, &inv) + return inv +} + +func collectExecInvocationsInto(e *expr, inv *[]execInvocation) { + if e == nil { + return + } + if e.kind == exprExecDir || e.kind == exprExec { + *inv = append(*inv, execInvocation{cmd: e.execCmd, args: e.execArgs}) + } + collectExecInvocationsInto(e.left, inv) + collectExecInvocationsInto(e.right, inv) + collectExecInvocationsInto(e.operand, inv) +} + func collectExecCmdsInto(e *expr, cmds *[]string) { if e == nil { return diff --git a/builtins/help/help.go b/builtins/help/help.go index 8a559a875..6d97de028 100644 --- a/builtins/help/help.go +++ b/builtins/help/help.go @@ -67,7 +67,14 @@ func registerFlags(fs *builtins.FlagSet) builtins.HandlerFunc { printFeatureDetails(callCtx, feature) return builtins.Result{} } - if callCtx.CommandAllowed != nil && !callCtx.CommandAllowed(name) { + // help shows information about a single command name; we don't + // have a full argv to consult, so we pass []string{name}. As a + // consequence, commands that are only authorised by a + // multi-token argv-prefix pattern (e.g. "kubectl get") will + // not appear here even if their name does match a pattern's + // first token. That's a documented cosmetic limitation, not a + // security one. + if callCtx.CommandAllowed != nil && !callCtx.CommandAllowed(name, []string{name}) { callCtx.Errf("help: no help topics match '%s'\n", name) return builtins.Result{Code: 1} } @@ -105,7 +112,11 @@ func registerFlags(fs *builtins.FlagSet) builtins.HandlerFunc { allNames := builtins.Names() var allowed, notAllowed []string for _, name := range allNames { - if callCtx.CommandAllowed != nil && !callCtx.CommandAllowed(name) { + // Same caveat as the single-name lookup above: argv-prefix + // pattern authorisations whose patterns have more than one + // token are not surfaced here, because the bare name is the + // only argv we have. + if callCtx.CommandAllowed != nil && !callCtx.CommandAllowed(name, []string{name}) { notAllowed = append(notAllowed, name) continue } diff --git a/cmd/rshell/main.go b/cmd/rshell/main.go index beeacee47..200bd6b0f 100644 --- a/cmd/rshell/main.go +++ b/cmd/rshell/main.go @@ -34,12 +34,14 @@ func main() { func run(ctx context.Context, args []string, stdin io.Reader, stdout, stderr io.Writer) int { var ( - command string - allowedPaths string - allowedCommands string - allowAllCmds bool - timeout time.Duration - procPath string + command string + allowedPaths string + allowedCommands string + allowedCommandPatterns string + deniedCommandPatterns string + allowAllCmds bool + timeout time.Duration + procPath string ) cmd := &cobra.Command{ @@ -80,11 +82,35 @@ func run(ctx context.Context, args []string, stdin io.Reader, stdout, stderr io. cmds = strings.Split(allowedCommands, ",") } + // Parse argv-prefix patterns: comma-separates patterns, + // whitespace-separates tokens within a pattern. Empty entries + // (e.g. trailing comma) are skipped so `"a,b,"` is two + // patterns, not three. Same parser for both allow and deny + // inputs. + parsePatterns := func(raw string) [][]string { + if raw == "" { + return nil + } + var out [][]string + for _, entry := range strings.Split(raw, ",") { + tokens := strings.Fields(entry) + if len(tokens) == 0 { + continue + } + out = append(out, tokens) + } + return out + } + patterns := parsePatterns(allowedCommandPatterns) + deniedPatterns := parsePatterns(deniedCommandPatterns) + execOpts := executeOpts{ - allowedPaths: paths, - allowedCommands: cmds, - allowAllCommands: allowAllCmds, - procPath: procPath, + allowedPaths: paths, + allowedCommands: cmds, + allowedCommandPatterns: patterns, + deniedCommandPatterns: deniedPatterns, + allowAllCommands: allowAllCmds, + procPath: procPath, } if commandSet { @@ -134,6 +160,8 @@ func run(ctx context.Context, args []string, stdin io.Reader, stdout, stderr io. cmd.Flags().MarkHidden("command") //nolint:errcheck // flag is guaranteed to exist cmd.Flags().StringVarP(&allowedPaths, "allowed-paths", "p", "", "comma-separated list of directories the shell is allowed to access") cmd.Flags().StringVar(&allowedCommands, "allowed-commands", "", "comma-separated list of namespaced commands (e.g. rshell:cat,rshell:find)") + cmd.Flags().StringVar(&allowedCommandPatterns, "allowed-command-patterns", "", "comma-separated argv-prefix patterns; tokens within a pattern are space-separated (e.g. \"kubectl get,ls,echo hello\")") + cmd.Flags().StringVar(&deniedCommandPatterns, "denied-command-patterns", "", "comma-separated argv-prefix patterns to BLOCK regardless of allow rules; same syntax as --allowed-command-patterns (e.g. \"ip route,kubectl delete\")") cmd.Flags().BoolVar(&allowAllCmds, "allow-all-commands", false, "allow execution of all commands (builtins and external)") cmd.Flags().DurationVar(&timeout, "timeout", 0, "maximum execution time for the entire shell run (e.g. 100ms, 5s, 1m)") cmd.Flags().StringVar(&procPath, "proc-path", "", "path to the proc filesystem used by ps (default \"/proc\")") @@ -202,10 +230,12 @@ func rejectLongCommand(rawArgs []string) error { // executeOpts holds options for the execute function. type executeOpts struct { - allowedPaths []string - allowedCommands []string - allowAllCommands bool - procPath string + allowedPaths []string + allowedCommands []string + allowedCommandPatterns [][]string + deniedCommandPatterns [][]string + allowAllCommands bool + procPath string } func execute(ctx context.Context, script, name string, opts executeOpts, stdin io.Reader, stdout, stderr io.Writer) error { @@ -226,8 +256,22 @@ func execute(ctx context.Context, script, name string, opts executeOpts, stdin i } if opts.allowAllCommands { runOpts = append(runOpts, interpoption.AllowAllCommands().(interp.RunnerOption)) - } else if len(opts.allowedCommands) > 0 { - runOpts = append(runOpts, interp.AllowedCommands(opts.allowedCommands)) + } else { + // AllowedCommands and AllowedCommandPatterns are independent + // permit axes joined by union. Apply each when set; either alone + // is enough to authorise matching invocations. + if len(opts.allowedCommands) > 0 { + runOpts = append(runOpts, interp.AllowedCommands(opts.allowedCommands)) + } + if len(opts.allowedCommandPatterns) > 0 { + runOpts = append(runOpts, interp.AllowedCommandPatterns(opts.allowedCommandPatterns)) + } + } + // Deny patterns are always applied — they're a refusal axis, not an + // allow axis. If allow-all-commands is set, deny patterns still + // override individual invocations. + if len(opts.deniedCommandPatterns) > 0 { + runOpts = append(runOpts, interp.DeniedCommandPatterns(opts.deniedCommandPatterns)) } if opts.procPath != "" { runOpts = append(runOpts, interp.ProcPath(opts.procPath)) diff --git a/cmd/rshell/main_test.go b/cmd/rshell/main_test.go index 3a6bf1e9a..c8c9bc40a 100644 --- a/cmd/rshell/main_test.go +++ b/cmd/rshell/main_test.go @@ -226,7 +226,7 @@ func TestFileArgWithAllowedPath(t *testing.T) { func TestDefaultNoCommandsAllowed(t *testing.T) { code, _, stderr := runCLI(t, "-c", `echo hello`) assert.Equal(t, 127, code) - assert.Contains(t, stderr, "command not allowed") + assert.Contains(t, stderr, "not permitted by policy") } func TestAllowedCommandsFlag(t *testing.T) { @@ -238,7 +238,7 @@ func TestAllowedCommandsFlag(t *testing.T) { func TestAllowedCommandsBlocksOther(t *testing.T) { code, _, stderr := runCLI(t, "--allowed-commands", "rshell:echo", "-c", `cat /dev/null`) assert.Equal(t, 127, code) - assert.Contains(t, stderr, "command not allowed") + assert.Contains(t, stderr, "not permitted by policy") } func TestAllowedCommandsMissingNamespace(t *testing.T) { diff --git a/examples/command_patterns.sh b/examples/command_patterns.sh new file mode 100755 index 000000000..0d2ad343c --- /dev/null +++ b/examples/command_patterns.sh @@ -0,0 +1,221 @@ +#!/usr/bin/env bash +# Curated demonstrations of the AllowedCommandPatterns feature. +# +# Run from the repository root: +# +# bash examples/command_patterns.sh +# +# Each section runs the rshell CLI with a specific allowlist + pattern +# configuration and prints the exit code so you can see allow/deny in +# action. Edit the SCRIPTS or PATTERNS to play with your own cases. +# +# The script builds rshell into a temp directory at start (so the exit +# code from a blocked command — 127 — propagates back to the shell; +# `go run` rewrites non-zero exits to 1, which would obscure whether a +# case was blocked by policy or failed for some other reason). +# +# ─── Mental model ───────────────────────────────────────────────────── +# A pattern is shaped like (command [, subcommand_path...]). The matcher +# accepts an argv when: +# - argv[0] equals pattern[0] exactly (the command name), AND +# - each remaining pattern token appears as an exact-match token +# somewhere in argv[1..] (position and order don't matter, so flags +# can interleave between argv[0] and the subcommand). +# +# Example: pattern (kubectl, get) admits BOTH "kubectl get pods" and +# "kubectl -n ns get pods", but not "kubectl delete pod foo". +# +# rshell's CLI doesn't ship kubectl as a builtin, so this script uses +# `ip` (which has real subcommands: route, addr, link) as the kubectl +# analog where it matters. echo is used for cases that only need a +# bare command name with arbitrary args. + +set -u + +# Colors for readability when run in a terminal. +if [[ -t 1 ]]; then + BOLD=$'\033[1m'; DIM=$'\033[2m'; GREEN=$'\033[32m'; RED=$'\033[31m'; RESET=$'\033[0m' +else + BOLD=""; DIM=""; GREEN=""; RED=""; RESET="" +fi + +cd "$(dirname "$0")/.." + +# Build once so exit codes propagate accurately. go run remaps non-zero +# exits to 1, which would mask the 127 we use to detect "blocked by +# policy". +BIN_DIR=$(mktemp -d) +trap 'rm -rf "$BIN_DIR"' EXIT +printf "${BOLD:-}Building rshell once into %s …${RESET:-}\n" "$BIN_DIR" +go build -o "$BIN_DIR/rshell" ./cmd/rshell +RSHELL=("$BIN_DIR/rshell") +printf "Done. Running cases:\n\n" + +run_case() { + local title="$1"; shift + local script="$1"; shift + printf "${BOLD}%s${RESET}\n" "$title" + printf "${DIM} rshell %s${RESET}\n" "$*" + printf "${DIM} script: %s${RESET}\n" "$script" + set +e + "${RSHELL[@]}" "$@" -c "$script" + local code=$? + set -e + if [[ $code -eq 127 ]]; then + printf " ${RED}exit=%d (BLOCKED by policy)${RESET}\n\n" "$code" + elif [[ $code -eq 0 ]]; then + printf " ${GREEN}exit=%d (allowed, command succeeded)${RESET}\n\n" "$code" + else + printf " ${GREEN}exit=%d (allowed by policy; command itself returned non-zero)${RESET}\n\n" "$code" + fi +} + +# ----------------------------------------------------------------------- +# 1. Single-token pattern — admits any args. Use this shape for commands +# without subcommands (echo, ls, cat). Pattern (echo) allows every +# echo invocation. +# ----------------------------------------------------------------------- +run_case "1. Single-token pattern (echo) — admits any echo invocation" \ + "echo hello world" \ + --allowed-command-patterns "echo" -p /tmp + +# ----------------------------------------------------------------------- +# 2. Two-token pattern — command + subcommand. Pattern (ip, route) +# admits "ip route show". This is the canonical use case. +# ----------------------------------------------------------------------- +run_case "2. Pattern (ip route) — subcommand match (allowed)" \ + "ip route show" \ + --allowed-command-patterns "ip route" -p /tmp + +# ----------------------------------------------------------------------- +# 3. Sibling subcommand is blocked. Pattern (ip, route) does NOT admit +# "ip addr show" because the token "route" is absent from argv. +# ----------------------------------------------------------------------- +run_case "3. Pattern (ip route) blocks sibling subcommand (ip addr)" \ + "ip addr show" \ + --allowed-command-patterns "ip route" -p /tmp + +# ----------------------------------------------------------------------- +# 4. The flag-interleaving case — the whole point of order-insensitive +# matching. Pattern (ip, route) admits "ip -4 route show" because +# argv[0]="ip" matches and the token "route" appears in argv[1..] +# regardless of the -4 flag inserted before it. This is the kubectl +# -n ns get scenario in disguise. +# ----------------------------------------------------------------------- +run_case "4. Pattern (ip route) tolerates flags between command and subcommand" \ + "ip -4 route show" \ + --allowed-command-patterns "ip route" -p /tmp + +# ----------------------------------------------------------------------- +# 5. The architectural test — substitution-defeat. The literal command +# text $(printf ip) addr is opaque to any static caller. rshell +# expands it to ["ip","addr"] at runtime, the matcher applies +# pattern (ip, route) against that argv, and the absence of "route" +# causes a refusal at execve time. printf is allowed by name so the +# substitution itself can run. +# ----------------------------------------------------------------------- +run_case "5. Substitution-defeated escape (blocked)" \ + '$(printf ip) addr' \ + --allowed-commands rshell:printf \ + --allowed-command-patterns "ip route" \ + -p /tmp + +# ----------------------------------------------------------------------- +# 6. Substitution that resolves to a matching argv → allowed. Confirms +# the matcher inspects the post-expansion argv rather than blanket- +# rejecting interpolation. +# ----------------------------------------------------------------------- +run_case "6. Substitution that matches the pattern (allowed)" \ + '$(printf ip) route show' \ + --allowed-commands rshell:printf \ + --allowed-command-patterns "ip route" \ + -p /tmp + +# ----------------------------------------------------------------------- +# 7. Union of name allowlist + pattern allowlist. printf is allowed by +# name (any args). ip is only allowed when the argv contains "route". +# Both invocations succeed — each finds its own permit. +# ----------------------------------------------------------------------- +run_case "7. Union of name allowlist + pattern allowlist" \ + 'printf "from-printf\n"; ip route show' \ + --allowed-commands rshell:printf \ + --allowed-command-patterns "ip route" \ + -p /tmp + +# ----------------------------------------------------------------------- +# 8. Pattern enforced through find -exec. find substitutes {} at +# runtime; the eval-time gate sees the full resolved argv and +# applies the pattern. Same security model, deeper call site. +# ----------------------------------------------------------------------- +EXAMPLE_DIR=$(mktemp -d) +touch "$EXAMPLE_DIR/probe.txt" +run_case "8. find -exec respects the same patterns" \ + "find $EXAMPLE_DIR -name 'probe.txt' -exec echo hello {} \\;" \ + --allowed-commands rshell:find \ + --allowed-command-patterns "echo" \ + -p "/tmp,$EXAMPLE_DIR" +rm -rf "$EXAMPLE_DIR" + +# ----------------------------------------------------------------------- +# 9. Spec-driven matching closes the positional-arg bypass. Before +# spec-aware classification, a pattern (echo, secret) would have +# matched argv ["echo","public","secret"] because "secret" appears +# anywhere in argv. With the spec-driven structural matcher (and +# here echo treated as a positional-only command via an empty spec), +# the matcher checks pattern[1] against the LEADING structural +# token — which is "public", not "secret". Block. +# +# Note: this case requires a spec for echo (provided via the rshell +# CLI is not yet possible — operators using the library API call +# interp.CommandSpecs(...) instead). The CLI uses the built-in +# registry only, which today contains just `ip`. So this CLI +# invocation actually fails at runner construction with a "no +# registered CommandSpec" error, which is the right behaviour: the +# operator is told their pattern is unsafe without a spec. +# ----------------------------------------------------------------------- +run_case "9. Multi-token pattern without a registered spec → config error" \ + "echo public secret" \ + --allowed-command-patterns "echo secret" -p /tmp + +# ----------------------------------------------------------------------- +# 10. Name allowlist trumps pattern restriction. When a command is in +# --allowed-commands, ALL its argv forms are admitted regardless of +# patterns. To use prefix scoping, keep the command OUT of the name +# allowlist. Here ip is allowed by name so the pattern (ip route) +# doesn't restrict it; "ip addr show" runs even though argv lacks +# the "route" token. +# ----------------------------------------------------------------------- +run_case "10. Name allowlist overrides patterns" \ + "ip addr show" \ + --allowed-commands rshell:ip \ + --allowed-command-patterns "ip route" \ + -p /tmp + +# ----------------------------------------------------------------------- +# 11. Deny pattern overrides a name allowlist. Allow ip wholesale, but +# carve out ip route specifically. ip addr admits; ip route refused. +# ----------------------------------------------------------------------- +run_case "11. Deny pattern carves out a subcommand from a name allowlist" \ + "ip addr show" \ + --allowed-commands rshell:ip \ + --denied-command-patterns "ip route" \ + -p /tmp + +run_case "12. Deny fires for the carved-out subcommand" \ + "ip route show" \ + --allowed-commands rshell:ip \ + --denied-command-patterns "ip route" \ + -p /tmp + +# ----------------------------------------------------------------------- +# 13. Deny is evaluated post-expansion (architectural test, deny axis). +# A substitution that resolves to a denied argv at runtime is blocked +# regardless of how opaque the literal command text was. +# ----------------------------------------------------------------------- +run_case "13. Substitution can't bypass a deny pattern" \ + '$(printf ip) route show' \ + --allowed-commands rshell:ip,rshell:printf \ + --denied-command-patterns "ip route" \ + -p /tmp + +printf "${BOLD}Done.${RESET} Edit examples/command_patterns.sh to add your own cases.\n" diff --git a/interp/allowed_command_patterns_test.go b/interp/allowed_command_patterns_test.go new file mode 100644 index 000000000..d983d729d --- /dev/null +++ b/interp/allowed_command_patterns_test.go @@ -0,0 +1,574 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +package interp_test + +import ( + "bytes" + "context" + "errors" + "strings" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + "mvdan.cc/sh/v3/syntax" + + "github.com/DataDog/rshell/interp" +) + +// --- Option validation --- + +func TestAllowedCommandPatternsEmptySliceIsValid(t *testing.T) { + // Zero patterns is a valid configuration: it just contributes no + // authorisations. + _, err := interp.New(interp.AllowedCommandPatterns(nil)) + require.NoError(t, err) + + _, err = interp.New(interp.AllowedCommandPatterns([][]string{})) + require.NoError(t, err) +} + +func TestAllowedCommandPatternsRejectsEmptyPattern(t *testing.T) { + _, err := interp.New(interp.AllowedCommandPatterns([][]string{{}})) + require.Error(t, err) + assert.Contains(t, err.Error(), "pattern 0 is empty") +} + +func TestAllowedCommandPatternsRejectsEmptyToken(t *testing.T) { + _, err := interp.New(interp.AllowedCommandPatterns([][]string{{"ip", ""}})) + require.Error(t, err) + assert.Contains(t, err.Error(), "pattern 0 token 1 is empty") +} + +func TestAllowedCommandPatternsRejectsLeadingEmptyToken(t *testing.T) { + _, err := interp.New(interp.AllowedCommandPatterns([][]string{{"", "route"}})) + require.Error(t, err) + assert.Contains(t, err.Error(), "pattern 0 token 0 is empty") +} + +func TestAllowedCommandPatternsAcceptsSingleTokenWithoutSpec(t *testing.T) { + // Single-token patterns don't need a spec — they only check argv[0]. + _, err := interp.New(interp.AllowedCommandPatterns([][]string{{"echo"}})) + require.NoError(t, err) +} + +func TestAllowedCommandPatternsAcceptsMultiTokenWithBuiltinSpec(t *testing.T) { + // "ip" is in the built-in spec registry, so a (ip, route) pattern is + // accepted out of the box. + _, err := interp.New(interp.AllowedCommandPatterns([][]string{{"ip", "route"}})) + require.NoError(t, err) +} + +func TestAllowedCommandPatternsRejectsMultiTokenWithoutSpec(t *testing.T) { + // "echo" has no spec; a multi-token pattern referencing it is rejected + // at runner construction time so the misconfiguration is surfaced + // early. + _, err := interp.New(interp.AllowedCommandPatterns([][]string{{"echo", "hello"}})) + require.Error(t, err) + assert.Contains(t, err.Error(), "no registered CommandSpec") + assert.Contains(t, err.Error(), `"echo"`) +} + +func TestAllowedCommandPatternsAcceptsMultiTokenWithUserSpec(t *testing.T) { + // Operator-supplied specs unlock multi-token patterns for any command. + _, err := interp.New( + interp.CommandSpecs(map[string]interp.CommandSpec{ + "echo": {}, // empty spec = no flags; positional-only command + }), + interp.AllowedCommandPatterns([][]string{{"echo", "hello"}}), + ) + require.NoError(t, err) +} + +func TestAllowedCommandPatternsValidationRunsRegardlessOfOptionOrder(t *testing.T) { + // Patterns first, specs second — validation runs at the end of New(), + // so the order doesn't matter. + _, err := interp.New( + interp.AllowedCommandPatterns([][]string{{"echo", "hello"}}), + interp.CommandSpecs(map[string]interp.CommandSpec{ + "echo": {}, + }), + ) + require.NoError(t, err) +} + +// --- End-to-end pattern matching against the structural matcher --- + +// runWithPatterns runs a script with the given AllowedCommands, +// AllowedCommandPatterns, and CommandSpecs. AllowedPaths is set to the +// working directory so builtins that touch the filesystem don't fail for +// unrelated reasons. +func runWithPatterns(t *testing.T, script string, allowedCommands []string, patterns [][]string, extraSpecs map[string]interp.CommandSpec) (stdout, stderr string, code int) { + t.Helper() + + prog, err := syntax.NewParser().Parse(strings.NewReader(script), "") + require.NoError(t, err) + + var outBuf, errBuf bytes.Buffer + opts := []interp.RunnerOption{ + interp.StdIO(nil, &outBuf, &errBuf), + interp.AllowedPaths([]string{t.TempDir()}), + } + if extraSpecs != nil { + opts = append(opts, interp.CommandSpecs(extraSpecs)) + } + if allowedCommands != nil { + opts = append(opts, interp.AllowedCommands(allowedCommands)) + } + if patterns != nil { + opts = append(opts, interp.AllowedCommandPatterns(patterns)) + } + + runner, err := interp.New(opts...) + require.NoError(t, err) + defer runner.Close() + + runErr := runner.Run(context.Background(), prog) + exitCode := 0 + if runErr != nil { + var es interp.ExitStatus + if rerrAs(runErr, &es) { + exitCode = int(es) + } else { + t.Fatalf("unexpected non-ExitStatus error: %v", runErr) + } + } + return outBuf.String(), errBuf.String(), exitCode +} + +// rerrAs is a tiny wrapper over errors.As to keep test bodies tidy. +func rerrAs(err error, target *interp.ExitStatus) bool { + type aser interface{ As(any) bool } + for cur := err; cur != nil; { + if es, ok := cur.(interp.ExitStatus); ok { + *target = es + return true + } + if a, ok := cur.(aser); ok && a.As(target) { + return true + } + // Fall through to a single-level Unwrap. + type unwrapper interface{ Unwrap() error } + if u, ok := cur.(unwrapper); ok { + cur = u.Unwrap() + continue + } + break + } + return false +} + +func TestPatternsAdmitMatchingSubcommand(t *testing.T) { + // Pattern (ip, route) admits "ip route show". The builtin reports its + // own platform-specific error (route table reading not supported on + // macOS), but the policy gate did its job — exit code is whatever the + // builtin returns, NOT 127. + _, _, code := runWithPatterns(t, + `ip route show`, + nil, + [][]string{{"ip", "route"}}, + nil, + ) + assert.NotEqual(t, 127, code, "expected policy to ALLOW the call (non-127 exit)") +} + +func TestPatternsBlockSiblingSubcommand(t *testing.T) { + // Pattern (ip, route) does NOT admit "ip addr show": the structural + // position 0 is "addr", not "route". + _, stderr, code := runWithPatterns(t, + `ip addr show`, + nil, + [][]string{{"ip", "route"}}, + nil, + ) + assert.Equal(t, 127, code) + assert.Contains(t, stderr, "not permitted by policy") +} + +func TestPatternsTolerateBooleanGlobalFlagBeforeSubcommand(t *testing.T) { + // "-4" is a boolean global flag in the ip spec; it appears between + // argv[0] and the subcommand. Pattern (ip, route) should still match + // because the structural extractor skips -4. + _, _, code := runWithPatterns(t, + `ip -4 route show`, + nil, + [][]string{{"ip", "route"}}, + nil, + ) + assert.NotEqual(t, 127, code, "expected boolean flag interleaving to be tolerated") +} + +func TestPatternsTolerateMultipleBooleanFlags(t *testing.T) { + // Two boolean flags both interleaved between argv[0] and the + // subcommand. + _, _, code := runWithPatterns(t, + `ip -o -4 route show`, + nil, + [][]string{{"ip", "route"}}, + nil, + ) + assert.NotEqual(t, 127, code) +} + +func TestPatternsBlockedWhenSubcommandIsPositionalArgValue(t *testing.T) { + // The structural matcher checks pattern[1..] against the LEADING + // structural tokens, not against any structural token. A positional + // arg whose value happens to equal a pattern token does NOT satisfy + // the slot — this is the bypass that the spec-driven matcher closes. + // + // Here we register an "echo" spec (no flags, all tokens structural) + // and pattern (echo, hello). argv ["echo","goodbye","hello"] has + // structural tokens ["goodbye","hello"]; pattern[1]="hello" must + // match structural[0]="goodbye" and does not. Block. + _, stderr, code := runWithPatterns(t, + `echo goodbye hello`, + nil, + [][]string{{"echo", "hello"}}, + map[string]interp.CommandSpec{"echo": {}}, + ) + assert.Equal(t, 127, code) + assert.Contains(t, stderr, "not permitted by policy") +} + +func TestPatternsAndAllowedCommandsAreUnion(t *testing.T) { + // "cat" is allowed by name (any args). + // "ip" is only allowed when its argv satisfies (ip, route). + stdout, _, code := runWithPatterns(t, + `cat /dev/null && ip route show`, + []string{"rshell:cat"}, + [][]string{{"ip", "route"}}, + nil, + ) + // cat /dev/null produces no output; the && short-circuits to ip route + // show, which the policy admits. The eventual exit code is whatever + // the ip builtin returns (1 on macOS), not 127. + _ = stdout + assert.NotEqual(t, 127, code) +} + +func TestPatternsDoNotShadowAllowedCommands(t *testing.T) { + // "echo" allowed by name; even though no pattern matches, the name + // allowlist alone authorises the call. + stdout, _, code := runWithPatterns(t, + `echo whatever`, + []string{"rshell:echo"}, + [][]string{{"ip", "route"}}, + nil, + ) + assert.Equal(t, 0, code) + assert.Equal(t, "whatever\n", stdout) +} + +// --- The architectural test: substitution-defeat --- + +// TestPatternsBlockSubstitutionEscape proves that argv-prefix pattern +// matching enforces post-expansion. The script forms the command name via +// $(printf ip) — opaque to any static caller — and then attempts an addr +// invocation. The matcher sees the resolved argv ["ip","addr"] at execve +// time and refuses against pattern (ip, route). +func TestPatternsBlockSubstitutionEscape(t *testing.T) { + _, stderr, code := runWithPatterns(t, + `$(printf ip) addr`, + []string{"rshell:printf"}, // printf must be allowed for $(...) to succeed + [][]string{{"ip", "route"}}, + nil, + ) + assert.Equal(t, 127, code) + assert.Contains(t, stderr, "not permitted by policy") +} + +// TestPatternsAllowSubstitutionWhenArgvMatches is the partner case: a +// substitution that produces a matching argv is allowed. Confirms the +// matcher inspects the expanded argv rather than blanket-rejecting +// interpolation. +func TestPatternsAllowSubstitutionWhenArgvMatches(t *testing.T) { + _, _, code := runWithPatterns(t, + `$(printf ip) route show`, + []string{"rshell:printf"}, + [][]string{{"ip", "route"}}, + nil, + ) + assert.NotEqual(t, 127, code, "expected post-expansion matcher to admit the call") +} + +func TestPatternsSingleTokenPatternAdmitsAnyArgs(t *testing.T) { + // Single-token pattern (echo) — no spec required, no structural + // extraction, just argv[0] equality. + stdout, _, code := runWithPatterns(t, + `echo whatever args you like`, + nil, + [][]string{{"echo"}}, + nil, + ) + assert.Equal(t, 0, code) + assert.Equal(t, "whatever args you like\n", stdout) +} + +// --- Spec-driven flag classification (unit-level) --- + +// TestSpecValueFlagSkipsTwoTokens checks that a registered ValueFlag +// consumes the next argv token (its value), so the structural-token +// stream begins after both. We use echo as the test binary with a +// synthetic spec that pretends -n is a value flag (the real echo's -n +// is boolean, but the policy gate doesn't care — the spec dictates +// classification). +// +// argv: ["echo","-n","ns","route","show"]. With ValueFlags={"-n"}, +// the matcher consumes "-n" and "ns" as a flag/value pair, leaving +// structural ["route","show"]. Pattern (echo, route) matches. +func TestSpecValueFlagSkipsTwoTokens(t *testing.T) { + _, _, code := runWithPatterns(t, + `echo -n ns route show`, + nil, + [][]string{{"echo", "route"}}, + map[string]interp.CommandSpec{ + "echo": {ValueFlags: map[string]bool{"-n": true}}, + }, + ) + assert.Equal(t, 0, code, "value-flag pair should be skipped, leaving 'route' as the leading structural token") +} + +// TestSpecValueFlagDoesNotMatchWhenValueIsExpectedSubcommand confirms +// the partner case: when the value-flag's value happens to equal what +// would have been a matching subcommand, the matcher correctly skips it +// rather than counting it as structural. argv ["echo","-n","route","show"] +// with -n as a value flag → structural ["show"] → pattern (echo, route) +// blocked because "route" was consumed as -n's value, not as the +// subcommand. +func TestSpecValueFlagDoesNotMatchWhenValueIsExpectedSubcommand(t *testing.T) { + _, stderr, code := runWithPatterns(t, + `echo -n route show`, + nil, + [][]string{{"echo", "route"}}, + map[string]interp.CommandSpec{ + "echo": {ValueFlags: map[string]bool{"-n": true}}, + }, + ) + assert.Equal(t, 127, code) + assert.Contains(t, stderr, "not permitted by policy") +} + +// TestSpecLongFlagWithEqualsIsSelfContained checks that "--flag=value" is +// always treated as a single skip token regardless of the spec's +// classification of "--flag". +func TestSpecLongFlagWithEqualsIsSelfContained(t *testing.T) { + // Spec doesn't even know about --output. The matcher treats any + // "--key=value" token as a single skip and continues. + _, _, code := runWithPatterns(t, + `echo --output=json route show`, + nil, + [][]string{{"echo", "route"}}, + map[string]interp.CommandSpec{"echo": {}}, + ) + assert.Equal(t, 0, code) +} + +// TestSpecUnknownFlagTreatedAsBoolean checks the conservative default: +// a flag-shaped token not in the spec is treated as boolean (skipped +// alone). This is more permissive than strictly correct but avoids +// breaking matches when the spec is incomplete. +func TestSpecUnknownFlagTreatedAsBoolean(t *testing.T) { + // Spec has no flags. argv: echo --debug route show. --debug skipped + // as boolean → structural ["route","show"] → pattern (echo, route) + // matches. + _, _, code := runWithPatterns(t, + `echo --debug route show`, + nil, + [][]string{{"echo", "route"}}, + map[string]interp.CommandSpec{"echo": {}}, + ) + assert.Equal(t, 0, code) +} + +// --- DeniedCommandPatterns --- + +// runWithPolicy is a richer variant of runWithPatterns that also accepts +// denied patterns. Used exclusively by the deny-pattern tests so the +// existing helper signature stays stable for the allow-only suite. +func runWithPolicy(t *testing.T, script string, allowedCommands []string, allowedPatterns, deniedPatterns [][]string, extraSpecs map[string]interp.CommandSpec) (stdout, stderr string, code int) { + t.Helper() + + prog, err := syntax.NewParser().Parse(strings.NewReader(script), "") + require.NoError(t, err) + + var outBuf, errBuf bytes.Buffer + opts := []interp.RunnerOption{ + interp.StdIO(nil, &outBuf, &errBuf), + interp.AllowedPaths([]string{t.TempDir()}), + } + if extraSpecs != nil { + opts = append(opts, interp.CommandSpecs(extraSpecs)) + } + if allowedCommands != nil { + opts = append(opts, interp.AllowedCommands(allowedCommands)) + } + if allowedPatterns != nil { + opts = append(opts, interp.AllowedCommandPatterns(allowedPatterns)) + } + if deniedPatterns != nil { + opts = append(opts, interp.DeniedCommandPatterns(deniedPatterns)) + } + + runner, err := interp.New(opts...) + require.NoError(t, err) + defer runner.Close() + + runErr := runner.Run(context.Background(), prog) + exitCode := 0 + if runErr != nil { + var es interp.ExitStatus + if rerrAs(runErr, &es) { + exitCode = int(es) + } else { + t.Fatalf("unexpected non-ExitStatus error: %v", runErr) + } + } + return outBuf.String(), errBuf.String(), exitCode +} + +func TestDeniedPatternsValidationRejectsEmpty(t *testing.T) { + _, err := interp.New(interp.DeniedCommandPatterns([][]string{{}})) + require.Error(t, err) + assert.Contains(t, err.Error(), "DeniedCommandPatterns: pattern 0 is empty") +} + +func TestDeniedPatternsValidationRejectsEmptyToken(t *testing.T) { + _, err := interp.New(interp.DeniedCommandPatterns([][]string{{"ip", ""}})) + require.Error(t, err) + assert.Contains(t, err.Error(), "DeniedCommandPatterns: pattern 0 token 1 is empty") +} + +func TestDeniedPatternsRequiresSpecForMultiToken(t *testing.T) { + // Same rule as allow patterns: multi-token deny needs a spec for + // the command name. Surfaces misconfiguration at New() time. + _, err := interp.New(interp.DeniedCommandPatterns([][]string{{"echo", "secret"}})) + require.Error(t, err) + assert.Contains(t, err.Error(), "DeniedCommandPatterns") + assert.Contains(t, err.Error(), "no registered CommandSpec") +} + +func TestDeniedPatternsAcceptsSingleTokenWithoutSpec(t *testing.T) { + // Single-token denies don't need a spec — they only check argv[0]. + _, err := interp.New(interp.DeniedCommandPatterns([][]string{{"echo"}})) + require.NoError(t, err) +} + +// TestDenyOverridesNameAllowlist is the headline use case: allow ip in +// general, but carve out ip route. ip addr admits; ip route is refused +// at the gate. +func TestDenyOverridesNameAllowlist(t *testing.T) { + // ip addr show — allow rule (name) admits, no deny matches → run. + _, _, code := runWithPolicy(t, + `ip addr show`, + []string{"rshell:ip"}, // allow ip wholesale by name + nil, + [][]string{{"ip", "route"}}, // but block ip route + nil, + ) + assert.NotEqual(t, 127, code, "ip addr should not match the deny pattern") + + // ip route show — deny matches → block, regardless of name allowlist. + _, stderr, code := runWithPolicy(t, + `ip route show`, + []string{"rshell:ip"}, + nil, + [][]string{{"ip", "route"}}, + nil, + ) + assert.Equal(t, 127, code) + assert.Contains(t, stderr, "blocked by deny pattern") + assert.Contains(t, stderr, `"ip route"`) +} + +// TestDenyOverridesAllowPattern covers the dual case: allow patterns +// admit, but a more specific deny carves out a sub-subcommand. Pattern +// (ip, route) admits ip route show; deny pattern (ip, route, get) +// blocks just ip route get. +func TestDenyOverridesAllowPattern(t *testing.T) { + // ip route show — allow admits, deny doesn't match. + _, _, code := runWithPolicy(t, + `ip route show`, + nil, + [][]string{{"ip", "route"}}, + [][]string{{"ip", "route", "get"}}, + nil, + ) + assert.NotEqual(t, 127, code) + + // ip route get 8.8.8.8 — deny matches → block. + _, stderr, code := runWithPolicy(t, + `ip route get 8.8.8.8`, + nil, + [][]string{{"ip", "route"}}, + [][]string{{"ip", "route", "get"}}, + nil, + ) + assert.Equal(t, 127, code) + assert.Contains(t, stderr, "blocked by deny pattern") +} + +// TestDenySurvivesShellSubstitution confirms the architectural property +// extends to the deny axis: a substitution that resolves to a denied +// argv at execve time is blocked even though the literal text was +// opaque. +func TestDenySurvivesShellSubstitution(t *testing.T) { + _, stderr, code := runWithPolicy(t, + `$(printf ip) route show`, + []string{"rshell:ip", "rshell:printf"}, + nil, + [][]string{{"ip", "route"}}, + nil, + ) + assert.Equal(t, 127, code) + assert.Contains(t, stderr, "blocked by deny pattern") +} + +// TestDenyDoesNotMatchWhenSubcommandAtPositionalSlot confirms the deny +// matcher uses the same structural rules as the allow matcher: a +// positional argument value at a non-leading structural position does +// not satisfy a deny pattern. +// +// Pattern deny (ip, route) should NOT block "ip addr show route" — +// "route" appears as a positional value at structural[2], not at the +// subcommand slot. Without this property, the deny axis would be even +// more permissive than positional-presence matching, which would be a +// regression. +func TestDenyDoesNotMatchWhenSubcommandAtPositionalSlot(t *testing.T) { + _, _, code := runWithPolicy(t, + `ip addr show route`, + []string{"rshell:ip"}, + nil, + [][]string{{"ip", "route"}}, + nil, + ) + assert.NotEqual(t, 127, code, "deny should not match when 'route' is a positional value, not the subcommand") +} + +// TestDenyAppliesUnderAllowAllCommands — denies override even the +// permissive allow-all-commands escape hatch. Useful for "let me run +// anything except this dangerous thing" policies in dev environments. +func TestDenyAppliesUnderAllowAllCommands(t *testing.T) { + prog, err := syntax.NewParser().Parse(strings.NewReader(`ip route show`), "") + require.NoError(t, err) + + var outBuf, errBuf bytes.Buffer + runner, err := interp.New( + interp.StdIO(nil, &outBuf, &errBuf), + interp.AllowedPaths([]string{t.TempDir()}), + // Bypass-everything switch on… + interp.AllowedCommands([]string{"rshell:ip"}), + // …but the deny still fires. + interp.DeniedCommandPatterns([][]string{{"ip", "route"}}), + ) + require.NoError(t, err) + defer runner.Close() + + runErr := runner.Run(context.Background(), prog) + var es interp.ExitStatus + require.True(t, errors.As(runErr, &es)) + assert.Equal(t, 127, int(es)) + assert.Contains(t, errBuf.String(), "blocked by deny pattern") +} diff --git a/interp/api.go b/interp/api.go index 691cea130..6a50d3690 100644 --- a/interp/api.go +++ b/interp/api.go @@ -33,6 +33,80 @@ import ( ) // runnerConfig holds the immutable configuration of a [Runner]. +// CommandSpec describes the flag conventions of a single command, used by +// [AllowedCommandPatterns] to distinguish flag tokens (which the matcher +// skips when locating the subcommand path) from structural tokens +// (subcommand path and positional arguments). +// +// A token in argv is classified as follows during structural extraction: +// +// - A token containing '=' that starts with '-' (e.g. "--output=json") is +// skipped as a single token, regardless of whether it appears in +// BooleanFlags or ValueFlags. +// - A token in BooleanFlags is skipped by itself; the next argv token is +// considered a candidate structural token. +// - A token in ValueFlags is skipped together with the next argv token, +// which is treated as the flag's value. +// - A token starting with '-' that does NOT appear in either set is +// conservatively treated as a boolean flag (skipped alone). This keeps +// unknown future flags from causing matcher failures, at the cost of +// potential under-skipping if the unknown flag actually takes a value. +// - Any other token is structural. +// +// AllowedCommandPatterns matches against the leading structural tokens +// only. Positional arguments at later structural positions cannot satisfy +// pattern slots, so a value that happens to equal a subcommand token (the +// "kubectl delete pod get" bypass) does not match a pattern targeting that +// subcommand. +type CommandSpec struct { + // BooleanFlags is the set of flag tokens that do not take a value. + // Keys must include the leading '-' or '--', e.g. "-h", "--help". + BooleanFlags map[string]bool + + // ValueFlags is the set of flag tokens that take the next argv token as + // their value. Keys must include the leading '-' or '--', e.g. "-n", + // "--namespace". Flags written in "--key=value" form are recognised + // regardless of whether the key appears here. + ValueFlags map[string]bool +} + +// builtinCommandSpecs is the default CommandSpec registry, seeded with the +// rshell builtins that have multi-token subcommand structure. Today that's +// only `ip` — its global flags are all boolean (no -X TAKES-VALUE forms), +// so the spec is small and unambiguous. Integrators that wire rshell into +// a larger system (e.g. a runner that supports external commands like +// kubectl, git, docker) should provide their own specs via the +// [CommandSpecs] option. +// +// Built-in entries are merged with operator-supplied specs; duplicate keys +// are overridden by the operator-supplied value. +var builtinCommandSpecs = map[string]CommandSpec{ + "ip": { + BooleanFlags: map[string]bool{ + "-o": true, + "--oneline": true, + "--brief": true, + "-4": true, + "-6": true, + "-h": true, + "--help": true, + }, + // ip has no global flags that take a value. + ValueFlags: map[string]bool{}, + }, +} + +// cloneCommandSpecs returns a shallow copy of in. Map values (the inner +// flag sets) are aliased; we treat them as immutable after option +// processing completes. +func cloneCommandSpecs(in map[string]CommandSpec) map[string]CommandSpec { + out := make(map[string]CommandSpec, len(in)) + for k, v := range in { + out[k] = v + } + return out +} + // These fields are set during construction ([New]) and first [Runner.Reset], // and never change afterwards. type runnerConfig struct { @@ -78,6 +152,31 @@ type runnerConfig struct { // false, no commands are allowed. allowedCommands map[string]bool + // allowedCommandPatterns is a list of (command [, subcommand_path...]) + // patterns that admit commands by their structural argv rather than only + // by name. Independent of allowedCommands — a command is allowed if its + // name appears in allowedCommands OR its argv satisfies any of these + // patterns under the structural matcher (see argvMatchesAllowedPattern). + allowedCommandPatterns [][]string + + // deniedCommandPatterns is a list of (command [, subcommand_path...]) + // patterns that BLOCK commands regardless of any allow rule. Evaluated + // before allowedCommands and allowedCommandPatterns; a deny match + // short-circuits the gate to a refusal. Useful for "allow X but carve + // out Y" policies — e.g. allowedCommands={ip} plus + // deniedCommandPatterns=[(ip, route)] permits `ip addr` and `ip link` + // but forbids `ip route`. Same spec-driven structural matcher as + // allowedCommandPatterns. + deniedCommandPatterns [][]string + + // commandSpecs maps a command name to the CommandSpec that describes its + // flag conventions. Used by argvMatchesAllowedPattern to distinguish flag + // tokens (skipped during matching) from structural tokens (subcommand + // path and positional arguments). Initialised at New() to a copy of + // builtinCommandSpecs; the CommandSpecs option merges additional or + // overriding entries on top. + commandSpecs map[string]CommandSpec + // allowAllCommands bypasses the allowedCommands check and permits any // command. Intended for testing convenience. allowAllCommands bool @@ -266,7 +365,15 @@ func (e *exitStatus) fromHandlerError(err error) { func New(opts ...RunnerOption) (*Runner, error) { registerBuiltins() r := &Runner{ - runnerConfig: runnerConfig{usedNew: true}, + runnerConfig: runnerConfig{ + usedNew: true, + // Seed the spec registry with the built-in defaults before + // any option runs so that CommandSpecs() merges over them + // rather than replacing them. Operators that need to drop a + // built-in entry can pass an explicit empty CommandSpec for + // that key. + commandSpecs: cloneCommandSpecs(builtinCommandSpecs), + }, } for _, opt := range opts { if err := opt(r); err != nil { @@ -274,6 +381,12 @@ func New(opts ...RunnerOption) (*Runner, error) { return nil, err } } + // Validate AllowedCommandPatterns against the assembled spec registry + // AFTER all options have been applied, so option order doesn't matter. + if err := validateAllowedCommandPatterns(r); err != nil { + _ = r.Close() + return nil, err + } // Default to an empty environment to avoid propagating parent env vars. if r.Env == nil { @@ -765,6 +878,143 @@ func AllowedCommands(names []string) RunnerOption { } } +// CommandSpecs registers command specs used by [AllowedCommandPatterns] to +// distinguish flag tokens from structural tokens during pattern matching. +// Specs are merged on top of the built-in registry (which seeds entries for +// rshell builtins with multi-token subcommand structure); keys present in +// `specs` override built-in entries for the same command. +// +// A CommandSpec is required for every command name that appears as the +// leading token of a multi-token pattern. Single-token patterns (which +// only match argv[0]) do not require a spec; multi-token patterns +// referencing a command without a spec cause [New] to return an error so +// the misconfiguration is surfaced at runner construction rather than at +// dispatch. +// +// Operators wiring rshell into a larger system should call this option to +// register their CLIs of interest (kubectl, git, docker, internal tools). +// The matcher's behaviour for any given argv is fully determined by the +// spec: missing flag entries cause that flag's positional neighbours to be +// misclassified as structural tokens, which in turn causes pattern +// matches to fail. Failure mode is therefore false-negative (a +// well-formed invocation is rejected), not false-positive — keep specs up +// to date with the CLIs they describe. +func CommandSpecs(specs map[string]CommandSpec) RunnerOption { + return func(r *Runner) error { + for name, spec := range specs { + if name == "" { + return fmt.Errorf("CommandSpecs: empty command name") + } + r.commandSpecs[name] = spec + } + return nil + } +} + +// validateAllowedCommandPatterns ensures every multi-token pattern in +// r.allowedCommandPatterns and r.deniedCommandPatterns has a registered +// CommandSpec for its command name. Single-token patterns do not require +// a spec. Called from [New] after all options have been applied. +func validateAllowedCommandPatterns(r *Runner) error { + if err := validatePatternList(r, "AllowedCommandPatterns", r.allowedCommandPatterns); err != nil { + return err + } + if err := validatePatternList(r, "DeniedCommandPatterns", r.deniedCommandPatterns); err != nil { + return err + } + return nil +} + +// validatePatternList runs the spec-presence check on a single pattern +// list. Used for both allowed and denied patterns since they have the +// same structural shape and the same need for a spec. +func validatePatternList(r *Runner, optionName string, patterns [][]string) error { + for i, p := range patterns { + if len(p) <= 1 { + continue + } + if _, ok := r.commandSpecs[p[0]]; !ok { + return fmt.Errorf("%s: pattern %d references command %q which has no registered CommandSpec; pass interp.CommandSpecs(map[string]interp.CommandSpec{%q: {...}}) to register one (multi-token patterns require a spec so the matcher can distinguish flags from subcommands)", optionName, i, p[0], p[0]) + } + } + return nil +} + +// AllowedCommandPatterns restricts command execution to argv sequences whose +// leading tokens prefix-match one of the configured patterns. Each pattern is +// a non-empty list of tokens; an invocation whose argv begins with the same +// tokens (in order, by exact string equality) is allowed. +// +// Patterns are matched against argv after shell expansion, so command +// substitution and other runtime-resolved values cannot bypass the check — +// the matcher sees the resolved argv that would be handed to exec. +// +// Example: a pattern of []string{"kubectl", "get"} permits "kubectl get pods" +// but not "kubectl delete pod foo". +// +// Patterns are an additional permit axis, independent of [AllowedCommands]. +// A command is allowed if its name appears in AllowedCommands OR if its argv +// prefix-matches one of the patterns. Empty pattern slices and patterns +// containing empty tokens are rejected. +// +// When not set (default), no command is allowed via pattern matching; +// only AllowedCommands and allowAllCommands govern execution. +func AllowedCommandPatterns(patterns [][]string) RunnerOption { + return func(r *Runner) error { + if err := validatePatternSlice("AllowedCommandPatterns", patterns); err != nil { + return err + } + r.allowedCommandPatterns = patterns + return nil + } +} + +// DeniedCommandPatterns blocks command execution for argv sequences that +// satisfy any of the given patterns, regardless of whether AllowedCommands +// or AllowedCommandPatterns would otherwise admit the call. Denies are +// evaluated FIRST: a deny match short-circuits the gate to a refusal even +// if every other axis would permit the invocation. +// +// Pattern shape and matching are identical to [AllowedCommandPatterns]: +// each pattern is a non-empty token list (command [, subcommand_path...]), +// matched against the argv's leading structural tokens using the same +// CommandSpec-driven flag classification. Multi-token deny patterns +// require a registered CommandSpec; [New] returns an error otherwise. +// +// Use case: "allow ip in general but forbid ip route specifically" can be +// expressed as AllowedCommands=[rshell:ip] plus +// DeniedCommandPatterns=[["ip","route"]]. ip addr and ip link still +// admit; ip route is refused at the gate. +// +// When not set (default), no commands are denied by patterns; only the +// usual allow-rule absence (default deny) applies. +func DeniedCommandPatterns(patterns [][]string) RunnerOption { + return func(r *Runner) error { + if err := validatePatternSlice("DeniedCommandPatterns", patterns); err != nil { + return err + } + r.deniedCommandPatterns = patterns + return nil + } +} + +// validatePatternSlice runs the shared option-time validation (non-empty +// patterns, non-empty tokens) used by both AllowedCommandPatterns and +// DeniedCommandPatterns. +func validatePatternSlice(optionName string, patterns [][]string) error { + for i, p := range patterns { + if len(p) == 0 { + return fmt.Errorf("%s: pattern %d is empty", optionName, i) + } + for j, tok := range p { + if tok == "" { + return fmt.Errorf("%s: pattern %d token %d is empty", optionName, i, j) + } + } + } + return nil +} + // allowAllCommandsOpt is a convenience for tests within the interp package. func allowAllCommandsOpt() RunnerOption { return func(r *Runner) error { diff --git a/interp/runner_exec.go b/interp/runner_exec.go index 57a44b508..07ac7d7e3 100644 --- a/interp/runner_exec.go +++ b/interp/runner_exec.go @@ -12,6 +12,7 @@ import ( "io/fs" "os" "path/filepath" + "strings" "sync" "mvdan.cc/sh/v3/expand" @@ -284,14 +285,214 @@ func (r *Runner) loopStmtsBroken(ctx context.Context, stmts []*syntax.Stmt) bool return false } +// argvMatchesAllowedPattern reports whether args satisfies any of the +// configured AllowedCommandPatterns. A pattern is shaped like +// (command [, subcommand_path...]) and matches when: +// +// 1. args[0] equals pattern[0] exactly (the command name). +// 2. The leading structural tokens of args[1..] equal pattern[1..], +// where "structural tokens" are extracted by skipping flag tokens +// according to the CommandSpec registered for args[0]. See +// [CommandSpec] for the classification rules. +// +// Single-token patterns trivially match on argv[0] alone — no spec is +// consulted. Multi-token patterns require a spec for args[0]; New() +// rejects multi-token patterns whose command lacks a spec, so by the time +// this method runs we expect the lookup to succeed. +// +// args is expected to be the full argv with the command name at args[0] +// (the same shape passed to call()). Callers that hold the command name and +// arguments separately must reconstruct the full argv before invoking this +// matcher. +// +// The matcher is called after shell expansion, so command-substitution- +// derived argv elements are already resolved — this is the architectural +// guarantee of the feature. +// +// Why structural matching matters: a naive presence-only matcher would +// admit "ip addr show" against a pattern of (ip, route) if the literal +// token "route" appeared anywhere in argv (e.g. as a positional value +// like "ip addr show route"). The structural matcher checks pattern[1..] +// against the leading subcommand-path tokens only, so positional values +// at later positions cannot satisfy pattern slots. +func (r *Runner) argvMatchesAllowedPattern(args []string) bool { + _, ok := r.firstMatchingPattern(args, r.allowedCommandPatterns) + return ok +} + +// argvMatchesDeniedPattern reports whether args satisfies any of the +// configured DeniedCommandPatterns. Used by the gate to short-circuit +// dispatch with a refusal even when an allow rule would otherwise admit +// the call. Same matching algorithm as argvMatchesAllowedPattern. +// +// Returns the matching pattern (for use in error messages) so the caller +// can tell the operator exactly which deny rule fired. +func (r *Runner) firstMatchingDeniedPattern(args []string) ([]string, bool) { + return r.firstMatchingPattern(args, r.deniedCommandPatterns) +} + +// firstMatchingPattern returns the first pattern in patterns that args +// satisfies under the spec-driven structural matcher. The boolean second +// return is true iff a match was found. Patterns is iterated in +// configuration order so the returned pattern is deterministic for a +// given input. +// +// Both AllowedCommandPatterns and DeniedCommandPatterns share this matcher +// — only the precedence at the gate distinguishes them. +func (r *Runner) firstMatchingPattern(args []string, patterns [][]string) ([]string, bool) { + if len(args) == 0 { + return nil, false + } + for _, pattern := range patterns { + if len(pattern) == 0 { + // Defensive: option validator already rejects empty + // patterns, so we never expect to see one here. + continue + } + // First token must match args[0] exactly. + if args[0] != pattern[0] { + continue + } + // Single-token pattern: argv[0] match is sufficient. + if len(pattern) == 1 { + return pattern, true + } + // Multi-token pattern: walk argv[1..] and extract structural + // tokens using the spec for args[0]. validateAllowedCommandPatterns + // guarantees the spec exists; defensive check skips this pattern + // if it somehow doesn't (e.g. spec was unregistered between + // option processing and dispatch). + spec, ok := r.commandSpecs[args[0]] + if !ok { + continue + } + structural := extractStructuralTokens(args[1:], spec) + if len(pattern)-1 > len(structural) { + continue + } + matched := true + for i, ptok := range pattern[1:] { + if structural[i] != ptok { + matched = false + break + } + } + if matched { + return pattern, true + } + } + return nil, false +} + +// formatPolicyDenial returns a human-readable message explaining why +// args was rejected. Includes the full attempted invocation and, when +// patterns target the command name, the patterns the operator could +// have matched. +// +// If matchedDeny is non-nil, the message identifies the deny pattern +// that fired (the highest-precedence reason). Otherwise the message +// says the call wasn't permitted by any allow rule and lists the +// configured allow patterns for the command name as a hint. +// +// The message is intentionally short: at most two lines, one for the +// rejection itself and one optional hint. Designed for stderr where a +// script may produce many such errors and verbose multi-line output +// gets in the way. +func (r *Runner) formatPolicyDenial(args []string, matchedDeny []string) string { + if len(args) == 0 { + return "rshell: command not allowed\n" + } + name := args[0] + invocation := strings.Join(args, " ") + + var msg strings.Builder + if matchedDeny != nil { + fmt.Fprintf(&msg, "rshell: %s: blocked by deny pattern %q\n", + invocation, strings.Join(matchedDeny, " ")) + return msg.String() + } + + // Allow-side denial: list the patterns the operator could have + // intended to match for this command name. + var matchingPatterns []string + for _, p := range r.allowedCommandPatterns { + if len(p) > 0 && p[0] == name { + matchingPatterns = append(matchingPatterns, "'"+strings.Join(p, " ")+"'") + } + } + + if invocation == name { + // Bare command (no args); the prior format is still the + // clearest thing to print. + fmt.Fprintf(&msg, "rshell: %s: command not allowed\n", name) + } else { + fmt.Fprintf(&msg, "rshell: %s: invocation not permitted by policy (command name: %s)\n", invocation, name) + } + if len(matchingPatterns) > 0 { + fmt.Fprintf(&msg, " hint: allowed patterns for %q: %s\n", name, strings.Join(matchingPatterns, ", ")) + } else if r.allowedCommands[name] { + // Reachable only as a defence-in-depth message: name IS in the + // allowlist but isAllowed is false, which shouldn't happen + // under current logic. Keep the branch so the message stays + // honest if the gate composition changes. + fmt.Fprintf(&msg, " hint: %q is in AllowedCommands but the call was still refused\n", name) + } + return msg.String() +} + +// extractStructuralTokens returns the structural-token sequence (subcommand +// path followed by positional arguments) derived from the trailing-args +// portion of argv (i.e. args[1:] in the caller's view), using spec to +// classify flags. See [CommandSpec] for the classification rules. +// +// Tokens are returned in the order they appear in input, with classified +// flag tokens (and the next-token values of recognised value flags) +// elided. The matcher consumes only the leading prefix of the result. +func extractStructuralTokens(args []string, spec CommandSpec) []string { + out := make([]string, 0, len(args)) + for i := 0; i < len(args); i++ { + tok := args[i] + if !strings.HasPrefix(tok, "-") || tok == "-" || tok == "--" { + // Plain positional / subcommand token. (A bare "-" or "--" + // is a positional separator in shell convention, not a + // flag.) + out = append(out, tok) + continue + } + // Flag of some kind. + if strings.Contains(tok, "=") { + // "--flag=value" or "-f=value" form: the value is bundled + // into this single token, so we don't need to consume the + // next argv token regardless of spec classification. + continue + } + if spec.ValueFlags[tok] { + // Skip the flag and its value (next argv token, if any). + if i+1 < len(args) { + i++ + } + continue + } + // BooleanFlags or unknown flag: skip just the flag token. We + // treat unknown flags as boolean to avoid false negatives if + // the spec is incomplete; the trade-off is a possible false + // negative if the flag actually takes a value (its assumed + // "value" would then be misclassified as a structural token). + _ = spec.BooleanFlags // documented behaviour, no branch needed + } + return out +} + func (r *Runner) call(ctx context.Context, pos syntax.Pos, args []string) { name := args[0] r.totalCount++ - // Evaluate both policy checks upfront so the span tags reflect the - // independent facts about the command name regardless of which gate - // short-circuits dispatch. - isAllowed := r.allowAllCommands || r.allowedCommands[name] + // Evaluate the deny axis first: a deny-pattern match overrides every + // allow rule. Then evaluate the allow axes in their usual order. The + // boolean isAllowed is the final gate decision; matchedDeny is held + // so the policy-denial error can identify the rule that fired. + matchedDeny, deniedByPattern := r.firstMatchingDeniedPattern(args) + isAllowed := !deniedByPattern && (r.allowAllCommands || r.allowedCommands[name] || r.argvMatchesAllowedPattern(args)) fn, isKnown := builtins.Lookup(name) span, ctx := telemetry.StartSpanFromContext(ctx, "command") @@ -326,8 +527,11 @@ func (r *Runner) call(ctx context.Context, pos syntax.Pos, args []string) { } if !isAllowed { - r.errf("rshell: %s: command not allowed\n", name) - if r.allowedCommands["help"] { + r.errf("%s", r.formatPolicyDenial(args, matchedDeny)) + if r.allowedCommands["help"] && matchedDeny == nil { + // Don't suggest 'help' when the call was specifically + // denied by a deny pattern — the help listing won't show + // why the deny fired and the suggestion is misleading. r.errf("Run 'help' to see allowed commands.\n") } r.exit.code = 127 @@ -338,8 +542,20 @@ func (r *Runner) call(ctx context.Context, pos syntax.Pos, args []string) { r.dispatchedCount++ var runCmd func(context.Context, string, string, []string) (uint8, error) runCmd = func(ctx context.Context, dir string, cmdName string, cmdArgs []string) (uint8, error) { - if !r.allowAllCommands && !r.allowedCommands[cmdName] { - return 127, fmt.Errorf("rshell: %s: command not allowed", cmdName) + // Pattern matching expects full argv with the command name at + // args[0]. cmdArgs by convention excludes cmdName, so we + // reconstruct the canonical argv before consulting patterns. + fullArgv := make([]string, 0, len(cmdArgs)+1) + fullArgv = append(fullArgv, cmdName) + fullArgv = append(fullArgv, cmdArgs...) + matchedDeny, deniedByPattern := r.firstMatchingDeniedPattern(fullArgv) + allowed := !deniedByPattern && (r.allowAllCommands || r.allowedCommands[cmdName] || r.argvMatchesAllowedPattern(fullArgv)) + if !allowed { + // Strip the trailing newline because callers (notably + // find -exec) wrap the error in their own "find: '%s': + // %s\n" template; keeping the embedded newline would + // produce a stray blank line. + return 127, fmt.Errorf("%s", strings.TrimRight(r.formatPolicyDenial(fullArgv, matchedDeny), "\n")) } cmdFn, ok := builtins.Lookup(cmdName) if !ok { @@ -411,8 +627,11 @@ func (r *Runner) call(ctx context.Context, pos syntax.Pos, args []string) { } return builtins.FileID{Dev: dev, Ino: ino}, true }, - CommandAllowed: func(n string) bool { - return r.allowAllCommands || r.allowedCommands[n] + CommandAllowed: func(n string, args []string) bool { + if _, denied := r.firstMatchingDeniedPattern(args); denied { + return false + } + return r.allowAllCommands || r.allowedCommands[n] || r.argvMatchesAllowedPattern(args) }, } if r.stdin != nil { @@ -490,8 +709,11 @@ func (r *Runner) call(ctx context.Context, pos syntax.Pos, args []string) { } return builtins.FileID{Dev: dev, Ino: ino}, true }, - CommandAllowed: func(cmdName string) bool { - return r.allowAllCommands || r.allowedCommands[cmdName] + CommandAllowed: func(cmdName string, args []string) bool { + if _, denied := r.firstMatchingDeniedPattern(args); denied { + return false + } + return r.allowAllCommands || r.allowedCommands[cmdName] || r.argvMatchesAllowedPattern(args) }, RunCommand: runCmd, Proc: r.proc, diff --git a/tests/scenarios/shell/allowed_command_patterns/pattern_allows_substitution_when_argv_matches.yaml b/tests/scenarios/shell/allowed_command_patterns/pattern_allows_substitution_when_argv_matches.yaml new file mode 100644 index 000000000..7165e1007 --- /dev/null +++ b/tests/scenarios/shell/allowed_command_patterns/pattern_allows_substitution_when_argv_matches.yaml @@ -0,0 +1,21 @@ +skip_assert_against_bash: true +description: | + Partner case to pattern_blocks_substitution_escape: a substitution that + expands to an argv satisfying the pattern is allowed. Confirms the + matcher inspects the resolved argv rather than blanket-rejecting + interpolation. +input: + allowed_commands: + - rshell:printf + allowed_command_patterns: + - ["ip", "route"] + script: |+ + $(printf ip) route show +expect: + # ip route reports its own platform-specific error on macOS; the + # important thing is that the policy gate ALLOWED the call (exit + # code is whatever ip returns, NOT 127). + stdout: "" + stderr_contains: + - "route table reading is not supported" + exit_code: 1 diff --git a/tests/scenarios/shell/allowed_command_patterns/pattern_and_allowed_commands_union.yaml b/tests/scenarios/shell/allowed_command_patterns/pattern_and_allowed_commands_union.yaml new file mode 100644 index 000000000..f1536f258 --- /dev/null +++ b/tests/scenarios/shell/allowed_command_patterns/pattern_and_allowed_commands_union.yaml @@ -0,0 +1,24 @@ +skip_assert_against_bash: true +description: | + AllowedCommands and AllowedCommandPatterns are independent permit axes + joined by union: a command is allowed if its name is on the + AllowedCommands list OR its argv satisfies a pattern. Here printf is + allowed by name and admits all argv shapes, while ip is only + authorised when its argv satisfies (ip, route). +input: + allowed_commands: + - rshell:printf + allowed_command_patterns: + - ["ip", "route"] + script: |+ + printf "from-printf\n" + ip route show +expect: + # printf prints "from-printf"; ip route show reports its own error. + # Both calls admitted by the policy. The script exits with the last + # statement's exit code (ip's 1 on macOS). + stdout: |+ + from-printf + stderr_contains: + - "route table reading is not supported" + exit_code: 1 diff --git a/tests/scenarios/shell/allowed_command_patterns/pattern_blocks_other_subcommand.yaml b/tests/scenarios/shell/allowed_command_patterns/pattern_blocks_other_subcommand.yaml new file mode 100644 index 000000000..d422c60e5 --- /dev/null +++ b/tests/scenarios/shell/allowed_command_patterns/pattern_blocks_other_subcommand.yaml @@ -0,0 +1,17 @@ +skip_assert_against_bash: true +description: | + Pattern targets a specific subcommand; invocations of a sibling + subcommand are blocked. Pattern [ip, route] admits "ip route show" + but rejects "ip addr show" because the token "route" does not appear + anywhere in the argv. +input: + allowed_command_patterns: + - ["ip", "route"] + script: |+ + ip addr +expect: + stdout: "" + stderr_contains: + - "ip addr: invocation not permitted by policy" + - "allowed patterns for \"ip\": 'ip route'" + exit_code: 127 diff --git a/tests/scenarios/shell/allowed_command_patterns/pattern_blocks_short_argv.yaml b/tests/scenarios/shell/allowed_command_patterns/pattern_blocks_short_argv.yaml new file mode 100644 index 000000000..71ac76104 --- /dev/null +++ b/tests/scenarios/shell/allowed_command_patterns/pattern_blocks_short_argv.yaml @@ -0,0 +1,16 @@ +skip_assert_against_bash: true +description: | + An argv shorter than the pattern cannot match (length check). Pattern + (ip, route) requires a structural token after "ip"; bare "ip" with no + subcommand does not satisfy it. +input: + allowed_command_patterns: + - ["ip", "route"] + script: |+ + ip +expect: + stdout: "" + stderr_contains: + - "rshell: ip: command not allowed" + - "allowed patterns for \"ip\": 'ip route'" + exit_code: 127 diff --git a/tests/scenarios/shell/allowed_command_patterns/pattern_blocks_substitution_escape.yaml b/tests/scenarios/shell/allowed_command_patterns/pattern_blocks_substitution_escape.yaml new file mode 100644 index 000000000..0fe152acc --- /dev/null +++ b/tests/scenarios/shell/allowed_command_patterns/pattern_blocks_substitution_escape.yaml @@ -0,0 +1,21 @@ +skip_assert_against_bash: true +description: | + The architectural test: command substitution cannot bypass argv-prefix + patterns. The literal command text "$(printf ip) addr" is opaque to + any static caller; rshell expands it at runtime to ["ip","addr"], + applies pattern (ip, route), and refuses because the leading + structural token is "addr", not "route". printf is allowed by name so + the substitution itself can run. +input: + allowed_commands: + - rshell:printf + allowed_command_patterns: + - ["ip", "route"] + script: |+ + $(printf ip) addr +expect: + stdout: "" + stderr_contains: + - "ip addr: invocation not permitted by policy" + - "allowed patterns for \"ip\": 'ip route'" + exit_code: 127 diff --git a/tests/scenarios/shell/allowed_command_patterns/pattern_does_not_match_argv.yaml b/tests/scenarios/shell/allowed_command_patterns/pattern_does_not_match_argv.yaml new file mode 100644 index 000000000..31d2bd10a --- /dev/null +++ b/tests/scenarios/shell/allowed_command_patterns/pattern_does_not_match_argv.yaml @@ -0,0 +1,16 @@ +skip_assert_against_bash: true +description: | + An invocation whose subcommand does not match the pattern is blocked. + Pattern (ip, route) does NOT admit "ip link show" because the leading + structural token is "link", not "route". +input: + allowed_command_patterns: + - ["ip", "route"] + script: |+ + ip link show +expect: + stdout: "" + stderr_contains: + - "ip link show: invocation not permitted by policy" + - "allowed patterns for \"ip\": 'ip route'" + exit_code: 127 diff --git a/tests/scenarios/shell/allowed_command_patterns/pattern_matches_argv.yaml b/tests/scenarios/shell/allowed_command_patterns/pattern_matches_argv.yaml new file mode 100644 index 000000000..3c5c3d1ce --- /dev/null +++ b/tests/scenarios/shell/allowed_command_patterns/pattern_matches_argv.yaml @@ -0,0 +1,19 @@ +skip_assert_against_bash: true +description: | + An invocation whose argv satisfies a (command, subcommand) pattern is + allowed. Pattern (ip, route) admits "ip route show". The ip builtin + has a registered CommandSpec, so multi-token patterns referencing it + are accepted at runner construction time. +input: + allowed_command_patterns: + - ["ip", "route"] + script: |+ + ip route show +expect: + # ip route reports its own platform-specific error on macOS; the + # important thing is that the policy gate ALLOWED the call (exit + # code is whatever ip returns, NOT 127). + stdout: "" + stderr_contains: + - "route table reading is not supported" + exit_code: 1 diff --git a/tests/scenarios/shell/allowed_command_patterns/pattern_tolerates_flag_before_subcommand.yaml b/tests/scenarios/shell/allowed_command_patterns/pattern_tolerates_flag_before_subcommand.yaml new file mode 100644 index 000000000..31fb13b64 --- /dev/null +++ b/tests/scenarios/shell/allowed_command_patterns/pattern_tolerates_flag_before_subcommand.yaml @@ -0,0 +1,21 @@ +skip_assert_against_bash: true +description: | + An argv with a flag (and value) interleaved between the command name + and the subcommand still matches a (command, subcommand) pattern. This + is the canonical kubectl-style usage: a pattern of [ip, route] admits + "ip -4 route show" because argv[0] equals pattern[0] and the + subcommand token "route" appears somewhere in argv[1..]. +input: + allowed_command_patterns: + - ["ip", "route"] + script: |+ + ip -4 route show +expect: + # ip route is a real builtin, but route table reading isn't supported + # on every platform; the gate ALLOWS the call (exit != 127) and the + # builtin reports its own error. The point of this scenario is to + # confirm the policy gate admitted the invocation. + stdout: "" + stderr_contains: + - "route table reading is not supported" + exit_code: 1 diff --git a/tests/scenarios/shell/allowed_commands/allowed_after_blocked.yaml b/tests/scenarios/shell/allowed_commands/allowed_after_blocked.yaml index 5221b1514..41e125736 100644 --- a/tests/scenarios/shell/allowed_commands/allowed_after_blocked.yaml +++ b/tests/scenarios/shell/allowed_commands/allowed_after_blocked.yaml @@ -9,6 +9,6 @@ input: expect: stdout: |+ after - stderr: |+ - rshell: cat: command not allowed + stderr_contains: + - "cat /dev/null: invocation not permitted by policy" exit_code: 0 diff --git a/tests/scenarios/shell/allowed_commands/disallowed_command_blocked.yaml b/tests/scenarios/shell/allowed_commands/disallowed_command_blocked.yaml index 88fe08c06..ca4a9d9e0 100644 --- a/tests/scenarios/shell/allowed_commands/disallowed_command_blocked.yaml +++ b/tests/scenarios/shell/allowed_commands/disallowed_command_blocked.yaml @@ -7,6 +7,6 @@ input: cat /dev/null expect: stdout: "" - stderr: |+ - rshell: cat: command not allowed + stderr_contains: + - "cat /dev/null: invocation not permitted by policy" exit_code: 127 diff --git a/tests/scenarios/shell/allowed_commands/no_commands_allowed.yaml b/tests/scenarios/shell/allowed_commands/no_commands_allowed.yaml index 84ea932f4..a04c2f9e7 100644 --- a/tests/scenarios/shell/allowed_commands/no_commands_allowed.yaml +++ b/tests/scenarios/shell/allowed_commands/no_commands_allowed.yaml @@ -6,6 +6,6 @@ input: echo hello expect: stdout: "" - stderr: |+ - rshell: echo: command not allowed + stderr_contains: + - "echo hello: invocation not permitted by policy" exit_code: 127 diff --git a/tests/scenarios/shell/denied_command_patterns/deny_does_not_block_other_subcommand.yaml b/tests/scenarios/shell/denied_command_patterns/deny_does_not_block_other_subcommand.yaml new file mode 100644 index 000000000..6a57f49c7 --- /dev/null +++ b/tests/scenarios/shell/denied_command_patterns/deny_does_not_block_other_subcommand.yaml @@ -0,0 +1,20 @@ +skip_assert_against_bash: true +description: | + Partner case to deny_overrides_name_allowlist: with the same allow + rule and same deny pattern, a sibling subcommand still admits. The + deny is targeted, not blanket. +input: + allowed_commands: + - rshell:ip + denied_command_patterns: + - ["ip", "route"] + script: |+ + ip addr show +expect: + # The loopback interface exists on every platform we run on, and + # `ip addr show` always reports it with the LOOPBACK flag in the + # output. Using stdout_contains lets us assert "the call ran" on + # both macOS and Linux without coupling to interface lists. + stdout_contains: + - "LOOPBACK" + exit_code: 0 diff --git a/tests/scenarios/shell/denied_command_patterns/deny_overrides_allow_pattern.yaml b/tests/scenarios/shell/denied_command_patterns/deny_overrides_allow_pattern.yaml new file mode 100644 index 000000000..1d031c314 --- /dev/null +++ b/tests/scenarios/shell/denied_command_patterns/deny_overrides_allow_pattern.yaml @@ -0,0 +1,18 @@ +skip_assert_against_bash: true +description: | + Allow pattern (ip, route) admits the route subcommand in general; a + more specific deny pattern (ip, route, get) carves out the get + sub-subcommand. ip route show admits; ip route get is refused. +input: + allowed_command_patterns: + - ["ip", "route"] + denied_command_patterns: + - ["ip", "route", "get"] + script: |+ + ip route get 8.8.8.8 +expect: + stdout: "" + stderr_contains: + - "ip route get 8.8.8.8: blocked by deny pattern" + - '"ip route get"' + exit_code: 127 diff --git a/tests/scenarios/shell/denied_command_patterns/deny_overrides_name_allowlist.yaml b/tests/scenarios/shell/denied_command_patterns/deny_overrides_name_allowlist.yaml new file mode 100644 index 000000000..5e4bf70b0 --- /dev/null +++ b/tests/scenarios/shell/denied_command_patterns/deny_overrides_name_allowlist.yaml @@ -0,0 +1,19 @@ +skip_assert_against_bash: true +description: | + The headline use case for DeniedCommandPatterns: an operator allows + `ip` wholesale by name but carves out `ip route` specifically. ip + addr admits; ip route is refused even though the name allowlist + would otherwise permit it. +input: + allowed_commands: + - rshell:ip + denied_command_patterns: + - ["ip", "route"] + script: |+ + ip route show +expect: + stdout: "" + stderr_contains: + - "ip route show: blocked by deny pattern" + - '"ip route"' + exit_code: 127 diff --git a/tests/scenarios/shell/denied_command_patterns/deny_survives_substitution.yaml b/tests/scenarios/shell/denied_command_patterns/deny_survives_substitution.yaml new file mode 100644 index 000000000..abf618fef --- /dev/null +++ b/tests/scenarios/shell/denied_command_patterns/deny_survives_substitution.yaml @@ -0,0 +1,20 @@ +skip_assert_against_bash: true +description: | + The architectural property — post-expansion enforcement — applies + to denies as well as allows. A substitution that resolves at runtime + to a denied argv is blocked, even though the literal command text + was opaque to any static caller. +input: + allowed_commands: + - rshell:ip + - rshell:printf + denied_command_patterns: + - ["ip", "route"] + script: |+ + $(printf ip) route show +expect: + stdout: "" + stderr_contains: + - "ip route show: blocked by deny pattern" + - '"ip route"' + exit_code: 127 diff --git a/tests/scenarios_test.go b/tests/scenarios_test.go index d53e2dfa6..883a3e91c 100644 --- a/tests/scenarios_test.go +++ b/tests/scenarios_test.go @@ -80,6 +80,15 @@ type input struct { // scenario, no commands are allowed. When omitted, the test harness // defaults to allowing all commands for backward compatibility. AllowAllCommands *bool `yaml:"allow_all_commands"` + // AllowedCommandPatterns lists argv-prefix patterns that authorise + // invocations whose argv begins with the same tokens. Each pattern is + // a non-empty token list. When set, the harness treats this as an + // explicit configuration and does NOT fall back to allow-all. + AllowedCommandPatterns [][]string `yaml:"allowed_command_patterns"` + // DeniedCommandPatterns lists argv-prefix patterns that BLOCK the + // invocation regardless of any allow rule. Same shape as + // AllowedCommandPatterns; deny-first precedence at the gate. + DeniedCommandPatterns [][]string `yaml:"denied_command_patterns"` } // expected holds the expected output for a scenario. @@ -207,18 +216,28 @@ func runScenario(t *testing.T, sc scenario) { // runner unrestricted. opts = append(opts, interp.AllowedPaths(resolved)) } + hasExplicitAllowConfig := (sc.Input.AllowAllCommands != nil) || + len(sc.Input.AllowedCommands) > 0 || + len(sc.Input.AllowedCommandPatterns) > 0 if sc.Input.AllowAllCommands != nil && *sc.Input.AllowAllCommands { opts = append(opts, interpoption.AllowAllCommands().(interp.RunnerOption)) - } else if len(sc.Input.AllowedCommands) > 0 { - opts = append(opts, interp.AllowedCommands(sc.Input.AllowedCommands)) - } else if sc.Input.AllowAllCommands == nil { + } else if !hasExplicitAllowConfig { // Default: allow all commands for backward compatibility with // existing scenarios that predate the allowedCommands feature. opts = append(opts, interpoption.AllowAllCommands().(interp.RunnerOption)) } - // When allow_all_commands is explicitly false and allowed_commands is - // empty, no AllowedCommands/AllowAllCommands option is added, so the - // interpreter defaults to blocking all commands. + if len(sc.Input.AllowedCommands) > 0 { + opts = append(opts, interp.AllowedCommands(sc.Input.AllowedCommands)) + } + if len(sc.Input.AllowedCommandPatterns) > 0 { + opts = append(opts, interp.AllowedCommandPatterns(sc.Input.AllowedCommandPatterns)) + } + if len(sc.Input.DeniedCommandPatterns) > 0 { + opts = append(opts, interp.DeniedCommandPatterns(sc.Input.DeniedCommandPatterns)) + } + // When no allow_all_commands / allowed_commands / allowed_command_patterns + // are configured (explicit closed config), no allow option is added, so + // the interpreter defaults to blocking all commands. if sc.Containerized { opts = append(opts, interp.HostPrefix(filepath.Join(dir, "host"))) }