diff --git a/README.md b/README.md index e339ee9c2..7d0c0a3bb 100644 --- a/README.md +++ b/README.md @@ -66,6 +66,12 @@ Every access path is default-deny: **AllowedCommands** restricts which commands (builtins or external) the interpreter may execute. Commands must be specified with the `rshell:` namespace prefix (e.g. `rshell:cat`, `rshell:echo`). If not set, no commands are allowed. +> ### ⚠️ Demo only — `host:` namespace (not for production) +> +> A `host:=` entry (e.g. `host:logrotate=/usr/sbin/logrotate`) allowlists a single host binary at the given absolute path. When dispatch sees a non-builtin command name matching ``, the binary at `` is exec'd directly (no PATH lookup, no shell, no argv filtering). Stdin/stdout/stderr are plumbed through, the binary's exit code is propagated, context cancellation kills the process via `SIGKILL`, and the env passed to the binary is filtered to `PATH`, `HOME`, `LANG`. Linux-only; on darwin/windows the dispatch returns 127 with "host execution not supported on this platform". +> +> **This entry-point fundamentally changes rshell's threat model — the entire reason rshell exists is to *not* execute host binaries.** It exists for the host-remediation demo (step 6: `logrotate -f /etc/logrotate.d/app`) and would need a real design pass before becoming a product feature. The current implementation has no audit logging, no argv filtering, no multi-binary scaffolding, and no operator-friendly configuration story. + **AllowedPaths** restricts all file operations to specified directories using Go's `os.Root` API (`openat` syscalls), making it immune to symlink traversal, TOCTOU races, and `..` escape attacks. Both reads and writes are sandboxed by the same mechanism — files outside the allowlist cannot be opened, created, truncated, or appended to. The cross-root symlink fallback is read-only: a symlink that points outside its `os.Root` is followed for reads but never for writes (avoids a TOCTOU window where a malicious link target could be swapped between resolution and open). File-target output redirections (`>`, `>>`, `2>`, `&>`, `&>>`) open through the same sandbox: writes inside `AllowedPaths` succeed, anything else fails with `permission denied` and exit 1. The literal target `/dev/null` is short-circuited to a discarded sink without going through the sandbox. Configured directories that cannot be opened (missing, not a directory, no permission) are skipped with a diagnostic message; by default these messages are flushed once to the runner's stderr at construction time. Callers that need to keep stderr clean of sandbox diagnostics can route them to a dedicated sink with `WarningsWriter(io.Writer)` or retrieve them programmatically via `Runner.Warnings()`. > **Note:** The `ss`, `ip route`, and `df` builtins bypass `AllowedPaths` for their kernel-state reads. `ss` and `ip route` open `/proc/net/*` paths directly; `df` reads `/proc/self/mountinfo` (Linux) or calls `getfsstat(2)` (macOS), then issues `unix.Statfs(2)` against every kernel-reported mount point. These paths are hardcoded — never derived from user input — and `Statfs` returns metadata only (block / inode counts, filesystem type, block size). There is no sandbox-escape risk, but operators cannot use `AllowedPaths` to block `ss` from enumerating local sockets, `ip route` from reading the routing table, or `df` from reporting mount-table capacity — these reads succeed regardless of the configured path policy. diff --git a/SHELL_FEATURES.md b/SHELL_FEATURES.md index 466e6c0bb..d517705f6 100644 --- a/SHELL_FEATURES.md +++ b/SHELL_FEATURES.md @@ -104,6 +104,7 @@ The in-shell `help` command mirrors these feature categories: run `help` for a c ## Execution - ✅ AllowedCommands — restricts which commands (builtins or external) may be executed; commands require the `rshell:` namespace prefix (e.g. `rshell:cat`); if not set, no commands are allowed +- ⚠️ **Demo only:** `host:=` AllowedCommands entries (e.g. `host:logrotate=/usr/sbin/logrotate`) allowlist a single host binary that rshell will exec directly when invoked. Linux-only. Plumbs stdin/stdout/stderr, propagates exit code, kills via `SIGKILL` on context cancel, forwards only `PATH`/`HOME`/`LANG`. This fundamentally changes rshell's threat model and is intended for the host-remediation demo, not production — see README "demo only" section. - ✅ AllowedPaths filesystem sandboxing — restricts all file access (read and write) to specified directories; cross-root symlink fallback is read-only to avoid TOCTOU on writes - ✅ Whole-run execution timeout — callers can bound a `Run()` call via `context.Context`, `interp.MaxExecutionTime`, or the CLI `--timeout` flag; the deadline applies to the entire script, not each individual command - ✅ ProcPath — overrides the proc filesystem path used by `ps` (default `/proc`; Linux-only; useful for testing/container environments) diff --git a/cmd/rshell/main.go b/cmd/rshell/main.go index beeacee47..bebb9c70a 100644 --- a/cmd/rshell/main.go +++ b/cmd/rshell/main.go @@ -133,7 +133,7 @@ func run(ctx context.Context, args []string, stdin io.Reader, stdout, stderr io. cmd.Flags().StringVarP(&command, "command", "c", "", "shell command string to execute") cmd.Flags().MarkHidden("command") //nolint:errcheck // flag is guaranteed to exist cmd.Flags().StringVarP(&allowedPaths, "allowed-paths", "p", "", "comma-separated list of directories the shell is allowed to access") - cmd.Flags().StringVar(&allowedCommands, "allowed-commands", "", "comma-separated list of namespaced commands (e.g. rshell:cat,rshell:find)") + cmd.Flags().StringVar(&allowedCommands, "allowed-commands", "", "comma-separated list of namespaced commands (e.g. rshell:cat,rshell:find,host:logrotate=/usr/sbin/logrotate)") cmd.Flags().BoolVar(&allowAllCmds, "allow-all-commands", false, "allow execution of all commands (builtins and external)") cmd.Flags().DurationVar(&timeout, "timeout", 0, "maximum execution time for the entire shell run (e.g. 100ms, 5s, 1m)") cmd.Flags().StringVar(&procPath, "proc-path", "", "path to the proc filesystem used by ps (default \"/proc\")") diff --git a/cmd/rshell/main_test.go b/cmd/rshell/main_test.go index 0ae52aca5..402f5ed42 100644 --- a/cmd/rshell/main_test.go +++ b/cmd/rshell/main_test.go @@ -276,7 +276,7 @@ func TestAllowedCommandsMissingNamespace(t *testing.T) { } func TestAllowedCommandsUnknownNamespace(t *testing.T) { - code, _, stderr := runCLI(t, "--allowed-commands", "host:echo", "-c", `echo hello`) + code, _, stderr := runCLI(t, "--allowed-commands", "bogus:echo", "-c", `echo hello`) assert.Equal(t, 1, code) assert.Contains(t, stderr, "unknown namespace") } diff --git a/interp/allowed_commands_test.go b/interp/allowed_commands_test.go index 38a115597..798f7d2fe 100644 --- a/interp/allowed_commands_test.go +++ b/interp/allowed_commands_test.go @@ -6,10 +6,16 @@ package interp_test import ( + "bytes" + "context" + "errors" + "path/filepath" + "strings" "testing" "github.com/stretchr/testify/assert" "github.com/stretchr/testify/require" + "mvdan.cc/sh/v3/syntax" "github.com/DataDog/rshell/interp" ) @@ -21,7 +27,7 @@ func TestAllowedCommandsNamespaceRequired(t *testing.T) { } func TestAllowedCommandsUnknownNamespace(t *testing.T) { - _, err := interp.New(interp.AllowedCommands([]string{"host:echo"})) + _, err := interp.New(interp.AllowedCommands([]string{"bogus:echo"})) require.Error(t, err) assert.Contains(t, err.Error(), "unknown namespace") } @@ -62,3 +68,64 @@ func TestAllowedCommandsEmpty(t *testing.T) { require.Error(t, err) assert.Contains(t, err.Error(), "empty command name") } + +// TestHostEntryDoesNotAuthorizeBuiltin is a regression test: a host: entry +// whose name collides with a builtin must NOT silently authorize the +// builtin. Without the !isKnown gate in call(), an entry like +// "host:cat=" would flip isAllowed=true and the builtin cat would +// run with stdin/AllowedPaths access, never executing the host path. +// Cross-platform: this exercises the dispatch gate, not the actual host +// exec, so it runs on darwin/windows too. The host path uses +// t.TempDir() rather than a hardcoded /bin/... so AllowedCommands' +// filepath.IsAbs check passes on every OS (Windows requires drive +// letter or UNC). +func TestHostEntryDoesNotAuthorizeBuiltin(t *testing.T) { + hostPath := filepath.Join(t.TempDir(), "fake-binary") + stdout := &bytes.Buffer{} + stderr := &bytes.Buffer{} + r, err := interp.New( + interp.AllowedCommands([]string{"host:cat=" + hostPath}), + interp.StdIO(strings.NewReader(""), stdout, stderr), + ) + require.NoError(t, err) + t.Cleanup(func() { _ = r.Close() }) + + prog, err := syntax.NewParser().Parse(strings.NewReader("cat"), "") + require.NoError(t, err) + + runErr := r.Run(context.Background(), prog) + var status interp.ExitStatus + require.True(t, errors.As(runErr, &status), "expected ExitStatus error, got %v", runErr) + assert.Equal(t, interp.ExitStatus(127), status) + assert.Contains(t, stderr.String(), "command not allowed") + assert.Empty(t, stdout.String(), "cat builtin must not have run") +} + +// TestHostEntryAuthorizesNonBuiltin verifies the positive case for the +// dispatch gate — a host: entry whose name does NOT collide with a +// builtin still passes the allowlist check (it would fail with +// "command not allowed" if the gate were too strict). The actual exec +// path is platform-specific and tested in host_exec_test.go (linux); +// here we only assert that dispatch reaches it (on darwin/windows the +// host-exec stub returns 127 with a different message). t.TempDir() +// produces an absolute path on every OS so AllowedCommands accepts it +// on Windows too. +func TestHostEntryAuthorizesNonBuiltin(t *testing.T) { + hostPath := filepath.Join(t.TempDir(), "fake-binary") + stderr := &bytes.Buffer{} + r, err := interp.New( + interp.AllowedCommands([]string{"host:somenonsensename=" + hostPath}), + interp.StdIO(strings.NewReader(""), &bytes.Buffer{}, stderr), + ) + require.NoError(t, err) + t.Cleanup(func() { _ = r.Close() }) + + prog, err := syntax.NewParser().Parse(strings.NewReader("somenonsensename"), "") + require.NoError(t, err) + + _ = r.Run(context.Background(), prog) + // Whatever exit code we got, the rejection path ("command not + // allowed") must NOT have been taken — that's the only thing the + // dispatch gate is responsible for here. + assert.NotContains(t, stderr.String(), "command not allowed") +} diff --git a/interp/api.go b/interp/api.go index b3dda78f2..bb0b111ea 100644 --- a/interp/api.go +++ b/interp/api.go @@ -82,6 +82,15 @@ type runnerConfig struct { // command. Intended for testing convenience. allowAllCommands bool + // hostCommands maps allowlisted host-binary names (e.g. "logrotate") to + // the absolute path of the binary on disk (e.g. "/usr/sbin/logrotate"). + // Populated by AllowedCommands entries of the form "host:=". + // + // DEMO ONLY: running host binaries fundamentally changes rshell's threat + // model. This entry-point exists for the host-remediation demo and is not + // intended to ship as a product feature. + hostCommands map[string]string + // maxExecutionTime bounds the duration of each Run call. Zero disables // the limit. When non-zero, Run derives a child context with this timeout. maxExecutionTime time.Duration @@ -734,10 +743,19 @@ func HostPrefix(prefix string) RunnerOption { } // AllowedCommands restricts command execution to the specified command names. -// Names must use the "rshell:" namespace prefix (e.g. "rshell:cat", -// "rshell:find"). Names without a colon separator or with an unknown namespace -// are rejected. The bare command name (after the prefix) is stored internally -// and matched exactly against the command name (args[0]) at execution time. +// Names must use a namespace prefix. +// +// Two namespaces are accepted: +// - "rshell:" — allowlist a builtin (e.g. "rshell:cat", "rshell:find"). +// The bare command name (after the prefix) is matched exactly against +// args[0] at execution time. +// - "host:=" — DEMO ONLY: allowlist a host binary at +// the given absolute path (e.g. "host:logrotate=/usr/sbin/logrotate"). +// When dispatch sees a non-builtin command name matching , the +// binary at is exec'd. See [hostCommands] for the threat- +// model caveat. Linux-only. +// +// Names without a colon separator or with an unknown namespace are rejected. // // Only commands whose name appears in the list may be executed; all others are // rejected with ": command not allowed". @@ -750,28 +768,52 @@ func HostPrefix(prefix string) RunnerOption { func AllowedCommands(names []string) RunnerOption { return func(r *Runner) error { m := make(map[string]bool, len(names)) + var hostMap map[string]string for _, n := range names { if n == "" { return fmt.Errorf("AllowedCommands: empty command name") } idx := strings.Index(n, ":") if idx < 0 { - return fmt.Errorf("AllowedCommands: %q missing namespace prefix (expected \"rshell:\")", n) + return fmt.Errorf("AllowedCommands: %q missing namespace prefix (expected \"rshell:\" or \"host:=\")", n) } ns := n[:idx] - cmd := n[idx+1:] - if strings.Index(cmd, ":") >= 0 { - return fmt.Errorf("AllowedCommands: %q contains multiple colons; expected format \"rshell:\"", n) - } - if ns != "rshell" { - return fmt.Errorf("AllowedCommands: %q has unknown namespace %q (only \"rshell\" is supported)", n, ns) - } - if cmd == "" { - return fmt.Errorf("AllowedCommands: %q has empty command name", n) + rest := n[idx+1:] + switch ns { + case "rshell": + if strings.Index(rest, ":") >= 0 { //nolint:gosimple // strings.Contains is not on the interp allowlist + return fmt.Errorf("AllowedCommands: %q contains multiple colons; expected format \"rshell:\"", n) + } + if rest == "" { + return fmt.Errorf("AllowedCommands: %q has empty command name", n) + } + m[rest] = true + case "host": + eq := strings.Index(rest, "=") + if eq < 0 { + return fmt.Errorf("AllowedCommands: %q missing \"=\" (expected format \"host:=\")", n) + } + name := rest[:eq] + path := rest[eq+1:] + if name == "" { + return fmt.Errorf("AllowedCommands: %q has empty host command name", n) + } + if path == "" { + return fmt.Errorf("AllowedCommands: %q has empty host binary path", n) + } + if !filepath.IsAbs(path) { + return fmt.Errorf("AllowedCommands: %q host binary path must be absolute, got %q", n, path) + } + if hostMap == nil { + hostMap = make(map[string]string) + } + hostMap[name] = path + default: + return fmt.Errorf("AllowedCommands: %q has unknown namespace %q (only \"rshell\" and \"host\" are supported)", n, ns) } - m[cmd] = true } r.allowedCommands = m + r.hostCommands = hostMap return nil } } diff --git a/interp/host_exec_linux.go b/interp/host_exec_linux.go new file mode 100644 index 000000000..ed2864655 --- /dev/null +++ b/interp/host_exec_linux.go @@ -0,0 +1,111 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +//go:build linux + +// DEMO ONLY: this file enables rshell to execute a small allowlist of host +// binaries. Running host binaries fundamentally changes rshell's threat +// model — the entire reason rshell exists is to *not* execute host binaries. +// This entry-point exists for the host-remediation demo (see +// docs/RULES.md / SHELL_FEATURES.md "demo only" section) and would need a +// real design pass before becoming a product feature. + +package interp + +import ( + "context" + "errors" + "os/exec" +) + +// hostEnvAllowlist is the set of environment variable names forwarded to +// host binaries. Anything else in the runner environment is stripped so +// that host invocations do not leak ambient configuration that the rest +// of the shell deliberately keeps out. +var hostEnvAllowlist = []string{"PATH", "HOME", "LANG"} + +// runHostCommand executes the host binary at path with args[1:] as its argv, +// plumbing the runner's stdin/stdout/stderr through. It returns the binary's +// exit code as a uint8 (so $? works) and propagates context cancellation by +// killing the process with SIGKILL — exec.CommandContext's default Cancel +// uses os.Kill (SIGKILL on Unix), which matches the timeout behaviour the +// rest of the runner applies to builtins. +// +// args[0] is the user-visible command name; args[1:] is the binary's argv. +// The binary path itself comes from the hostCommands allowlist entry, not +// from args, so PATH lookup is intentionally not performed. +func (r *Runner) runHostCommand(ctx context.Context, path string, args []string) uint8 { + cmd := exec.CommandContext(ctx, path, args[1:]...) + cmd.Dir = r.Dir + cmd.Env = r.filterHostEnv() + if r.stdin != nil { + cmd.Stdin = r.stdin + } + cmd.Stdout = r.stdout + cmd.Stderr = r.stderr + + err := cmd.Run() + // If the runner's context expired (MaxExecutionTime, parent cancel, + // or builtin-style cooperative cancel), exec.CommandContext kills + // the child with SIGKILL and cmd.Run returns *exec.ExitError with + // ExitCode() == -1. We must surface ctx.Err() back through Run() + // rather than mapping the signal to a numeric exit code, otherwise + // Run() returns ExitStatus(130) and the CLI's timeout path + // (context.DeadlineExceeded → "execution timed out", exit 124) + // never fires and run-span telemetry is misclassified as success. + // Use exit.fatal so the err is recorded; the returned uint8 is only + // observed when r.exit.err is nil, so the value is symbolic here. + if ctxErr := ctx.Err(); ctxErr != nil { + r.exit.fatal(ctxErr) + return 130 + } + if err == nil { + return 0 + } + var exitErr *exec.ExitError + if errors.As(err, &exitErr) { + // ExitCode returns -1 if the process was terminated by a signal + // without a context cancel having fired (e.g. an external + // SIGKILL from outside this process). Map that to 130 — the + // shell-conventional "terminated" code — so the caller can still + // observe a non-zero exit. + code := exitErr.ExitCode() + if code < 0 { + return 130 + } + return uint8(code) + } + // Failure to start or some other I/O error: surface to stderr and + // return 127 (the shell convention for "command not found / not + // executable"). + r.errf("rshell: %s: %v\n", args[0], err) + return 127 +} + +// filterHostEnv builds a minimal env slice for host binaries from the +// runner's environment overlay (r.writeEnv) — NOT the ambient Go +// process env — forwarding only the names in hostEnvAllowlist that +// are also marked Exported. Matches bash semantics: a script-level +// assignment like `PATH=/tmp; hostcmd` does not propagate to the +// child unless PATH was previously exported (via interp.Env or an +// `export` statement). Inline command assignments +// (`PATH=/safe hostcmd`) propagate because call() forces +// vr.Exported = true before dispatch. +// +// Reading from r.writeEnv is what makes the runner's documented +// "empty by default, no host env inherited" guarantee hold for host +// binaries: an unset PATH/HOME/LANG in the runner is simply omitted, +// regardless of what the surrounding Go process exports. +func (r *Runner) filterHostEnv() []string { + out := make([]string, 0, len(hostEnvAllowlist)) + for _, name := range hostEnvAllowlist { + vr := r.writeEnv.Get(name) + if !vr.Declared() || !vr.Exported { + continue + } + out = append(out, name+"="+vr.String()) + } + return out +} diff --git a/interp/host_exec_other.go b/interp/host_exec_other.go new file mode 100644 index 000000000..5516601f6 --- /dev/null +++ b/interp/host_exec_other.go @@ -0,0 +1,21 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +//go:build !linux + +// DEMO ONLY: see host_exec_linux.go. The host-binary entry-point is a +// Linux-only demo; on darwin/windows we simply refuse to dispatch. + +package interp + +import "context" + +// runHostCommand is a stub for non-Linux platforms. It writes an explanatory +// error to the runner's stderr and returns 127 (shell-conventional "command +// not found / not executable") so callers can rely on a non-zero exit. +func (r *Runner) runHostCommand(_ context.Context, _ string, args []string) uint8 { + r.errf("rshell: %s: host execution not supported on this platform\n", args[0]) + return 127 +} diff --git a/interp/host_exec_test.go b/interp/host_exec_test.go new file mode 100644 index 000000000..b473ca5a7 --- /dev/null +++ b/interp/host_exec_test.go @@ -0,0 +1,291 @@ +// Unless explicitly stated otherwise all files in this repository are licensed +// under the Apache License Version 2.0. +// This product includes software developed at Datadog (https://www.datadoghq.com/). +// Copyright 2026-present Datadog, Inc. + +//go:build linux + +// DEMO ONLY: tests for the host-binary entry-point. See host_exec_linux.go. + +package interp_test + +import ( + "bytes" + "context" + "errors" + "io" + "os" + "strings" + "testing" + "time" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + "mvdan.cc/sh/v3/syntax" + + "github.com/DataDog/rshell/interp" +) + +// runExitCode runs node and returns the resulting exit code. A nil error +// from Run means the script exited 0; a non-nil error wraps an +// interp.ExitStatus that we surface as the code. Any other error is +// reported via t.Fatalf. +func runExitCode(t *testing.T, r *interp.Runner, node syntax.Node) uint8 { + t.Helper() + err := r.Run(context.Background(), node) + if err == nil { + return 0 + } + var es interp.ExitStatus + if errors.As(err, &es) { + return uint8(es) + } + t.Fatalf("Run returned unexpected error: %v", err) + return 0 +} + +func parseHostExecScript(t *testing.T, src string) *syntax.File { + t.Helper() + prog, err := syntax.NewParser().Parse(strings.NewReader(src), "") + require.NoError(t, err) + return prog +} + +// newHostExecRunner builds a Runner that allows just the host: entries the +// caller passes in. stdin/stdout/stderr are wired up to the supplied buffers +// so tests can assert on captured output. stdin is provided as a *os.File +// (via os.Pipe) when nonempty, since StdIO expects a Reader and the runner +// converts it into a pipe internally — providing a real file when we need +// real stdin keeps the host-exec path identical to production. +func newHostExecRunner(t *testing.T, hostEntries []string, stdin io.Reader) (*interp.Runner, *bytes.Buffer, *bytes.Buffer) { + t.Helper() + return newHostExecRunnerWithEnv(t, hostEntries, stdin, nil) +} + +// newHostExecRunnerWithEnv is like newHostExecRunner but also seeds the +// runner environment via interp.Env. The host-exec env-filtering tests +// must provide env this way (NOT via t.Setenv) because the +// implementation reads from r.writeEnv, not the ambient Go process env. +func newHostExecRunnerWithEnv(t *testing.T, hostEntries []string, stdin io.Reader, envPairs []string) (*interp.Runner, *bytes.Buffer, *bytes.Buffer) { + t.Helper() + stdout := &bytes.Buffer{} + stderr := &bytes.Buffer{} + if stdin == nil { + stdin = strings.NewReader("") + } + opts := []interp.RunnerOption{ + interp.AllowedCommands(hostEntries), + interp.StdIO(stdin, stdout, stderr), + } + if envPairs != nil { + opts = append(opts, interp.Env(envPairs...)) + } + r, err := interp.New(opts...) + require.NoError(t, err) + t.Cleanup(func() { _ = r.Close() }) + return r, stdout, stderr +} + +// Tests below intentionally use command names that do NOT collide with +// rshell builtins (echo/cat/false/true are all registered builtins). The +// !isKnown gate in call() means a host: entry is honored only when the +// name does not resolve to a builtin, so binding host:echo=/bin/echo and +// then running `echo` would dispatch the rshell echo builtin and never +// touch the host path under test. + +func TestHostExecStdoutPlumbed(t *testing.T) { + r, stdout, _ := newHostExecRunner(t, []string{"host:hostecho=/bin/echo"}, nil) + err := r.Run(context.Background(), parseHostExecScript(t, `hostecho hello world`)) + require.NoError(t, err) + assert.Equal(t, "hello world\n", stdout.String()) +} + +func TestHostExecExitCodePropagated(t *testing.T) { + r, _, _ := newHostExecRunner(t, []string{"host:hostfalse=/usr/bin/false"}, nil) + code := runExitCode(t, r, parseHostExecScript(t, `hostfalse`)) + assert.Equal(t, uint8(1), code) +} + +func TestHostExecStdinPlumbed(t *testing.T) { + stdin := strings.NewReader("piped-input\n") + r, stdout, _ := newHostExecRunner(t, []string{"host:hostcat=/bin/cat"}, stdin) + err := r.Run(context.Background(), parseHostExecScript(t, `hostcat`)) + require.NoError(t, err) + assert.Equal(t, "piped-input\n", stdout.String()) +} + +func TestHostExecRejectsNonAllowlistedName(t *testing.T) { + r, _, stderr := newHostExecRunner(t, []string{"host:hostecho=/bin/echo"}, nil) + code := runExitCode(t, r, parseHostExecScript(t, `notallowlisted`)) + assert.Equal(t, uint8(127), code) + assert.Contains(t, stderr.String(), "command not allowed") +} + +func TestHostExecContextCancellationKillsProcess(t *testing.T) { + r, _, _ := newHostExecRunner(t, []string{"host:sleep=/bin/sleep"}, nil) + + ctx, cancel := context.WithCancel(context.Background()) + defer cancel() + + done := make(chan error, 1) + go func() { + // 30s sleep — deliberately long so we know the cancel was what + // stopped it, not natural completion. + done <- r.Run(ctx, parseHostExecScript(t, `sleep 30`)) + }() + + // Give the process a chance to actually start before cancelling. + time.Sleep(100 * time.Millisecond) + cancel() + + select { + case err := <-done: + // The cancellation must propagate as context.Canceled, not as a + // numeric exit status. Otherwise the CLI's timeout/cancel path + // would be skipped and run telemetry would be misclassified. + require.ErrorIs(t, err, context.Canceled) + case <-time.After(2 * time.Second): + t.Fatal("host process was not killed within 2s of context cancel") + } +} + +// TestHostExecDeadlineSurfacesAsDeadlineExceeded verifies that a parent +// context deadline kills the host binary AND surfaces as +// context.DeadlineExceeded back through Run(), so that the CLI's +// --timeout path ("execution timed out after Xms" / exit 124) and +// run-span outcome classification both work for host-binary invocations +// the same way they do for builtins. +func TestHostExecDeadlineSurfacesAsDeadlineExceeded(t *testing.T) { + r, _, _ := newHostExecRunner(t, []string{"host:sleep=/bin/sleep"}, nil) + + ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond) + defer cancel() + + start := time.Now() + err := r.Run(ctx, parseHostExecScript(t, `sleep 30`)) + elapsed := time.Since(start) + + require.ErrorIs(t, err, context.DeadlineExceeded, + "host-exec context deadline must surface as context.DeadlineExceeded, "+ + "not ExitStatus(130). Got %v", err) + assert.Less(t, elapsed, 2*time.Second, + "deadline should fire within ~100ms, but Run() took %s", elapsed) +} + +func TestHostExecEnvFilteredToAllowlist(t *testing.T) { + // Env values come from the runner's environment overlay (interp.Env), + // not the ambient Go process env. Use t.Setenv to plant a sentinel in + // the process env that must NOT leak through, and provide a + // runner-scoped PATH that the host binary should see. + t.Setenv("RSHELL_HOST_EXEC_LEAK_CHECK", "should-not-appear") + + r, stdout, _ := newHostExecRunnerWithEnv(t, + []string{"host:env=/usr/bin/env"}, nil, + []string{"PATH=/usr/bin:/bin"}, + ) + err := r.Run(context.Background(), parseHostExecScript(t, `env`)) + require.NoError(t, err) + out := stdout.String() + assert.Contains(t, out, "PATH=/usr/bin:/bin", "PATH must be forwarded from runner env") + assert.NotContains(t, out, "RSHELL_HOST_EXEC_LEAK_CHECK", + "ambient Go process env must NOT leak — host binaries see only the runner env") +} + +func TestHostExecForwardsHomeAndLang(t *testing.T) { + r, stdout, _ := newHostExecRunnerWithEnv(t, + []string{"host:env=/usr/bin/env"}, nil, + []string{"HOME=/tmp/host-exec-home", "LANG=C.UTF-8"}, + ) + err := r.Run(context.Background(), parseHostExecScript(t, `env`)) + require.NoError(t, err) + out := stdout.String() + assert.Contains(t, out, "HOME=/tmp/host-exec-home") + assert.Contains(t, out, "LANG=C.UTF-8") +} + +// TestHostExecDefaultEnvIsEmpty verifies the runner's "no host env +// inherited by default" guarantee: with no interp.Env option, even +// allowlisted names (PATH/HOME/LANG) must not be forwarded to the host +// binary, regardless of the ambient Go process env. Regression for the +// previous os.LookupEnv-based filter that ignored runner env. +func TestHostExecDefaultEnvIsEmpty(t *testing.T) { + // Plant ambient env that must be ignored. + t.Setenv("PATH", "/should/not/leak") + t.Setenv("HOME", "/should/not/leak/home") + t.Setenv("LANG", "should-not-leak") + + r, stdout, _ := newHostExecRunner(t, []string{"host:env=/usr/bin/env"}, nil) + err := r.Run(context.Background(), parseHostExecScript(t, `env`)) + require.NoError(t, err) + out := stdout.String() + assert.NotContains(t, out, "/should/not/leak", + "runner default env is empty; host binary must not see ambient PATH/HOME/LANG") +} + +// TestHostExecInlineAssignmentTakesEffect verifies that inline command +// assignments (PATH=/safe hostcmd) are visible to host binaries. They +// flow through r.writeEnv before call() dispatches, which is exactly +// where filterHostEnv now reads from. +func TestHostExecInlineAssignmentTakesEffect(t *testing.T) { + r, stdout, _ := newHostExecRunnerWithEnv(t, + []string{"host:env=/usr/bin/env"}, nil, + []string{"PATH=/initial"}, + ) + err := r.Run(context.Background(), parseHostExecScript(t, `PATH=/inline-override env`)) + require.NoError(t, err) + out := stdout.String() + assert.Contains(t, out, "PATH=/inline-override", + "inline assignment must override the runner env for that command") + assert.NotContains(t, out, "PATH=/initial", + "the original PATH must not also appear") +} + +// TestHostExecUnexportedAssignmentNotForwarded mirrors bash semantics: +// a standalone assignment without a prior export (e.g. +// `LANG=fresh; hostcmd`) is shell-local and must NOT propagate to +// child processes. Verified against +// `bash -c 'CUSTOM_VAR=/tmp; env | grep CUSTOM_VAR'` which prints +// nothing. filterHostEnv now requires vr.Exported in addition to +// vr.Declared(). +func TestHostExecUnexportedAssignmentNotForwarded(t *testing.T) { + // LANG is not in the runner's initial env, so a script-level + // assignment to it creates a fresh, unexported variable. + r, stdout, _ := newHostExecRunner(t, []string{"host:env=/usr/bin/env"}, nil) + err := r.Run(context.Background(), + parseHostExecScript(t, `LANG=should-not-leak +env`)) + require.NoError(t, err) + out := stdout.String() + assert.NotContains(t, out, "should-not-leak", + "unexported script-level assignment must not propagate to host binaries (matches bash)") +} + +// TestHostExecBuiltinTakesPrecedence verifies that when a name is both an +// allowlisted builtin (rshell:echo) AND a configured host: entry, the +// builtin runs and the host binary is never exec'd. The host: entry +// points at /usr/bin/false on purpose: if the host path were +// erroneously preferred, the script would exit 1; because the builtin +// wins it exits 0 with the expected stdout. +func TestHostExecBuiltinTakesPrecedence(t *testing.T) { + r, stdout, _ := newHostExecRunner(t, + []string{"rshell:echo", "host:echo=/usr/bin/false"}, nil) + err := r.Run(context.Background(), parseHostExecScript(t, `echo from-builtin`)) + require.NoError(t, err, "builtin echo must win — non-zero exit means /usr/bin/false ran") + assert.Equal(t, "from-builtin\n", stdout.String()) +} + +// Sanity: make sure t.Setenv actually plants the var in the ambient Go +// process env. Used as a baseline for tests that assert host binaries +// do NOT see ambient process env (so we can be sure the negative +// assertion isn't trivially satisfied by t.Setenv being a no-op). +func TestHostExecOSEnvironHasUnrelatedVar(t *testing.T) { + t.Setenv("RSHELL_HOST_EXEC_LEAK_CHECK", "yes") + found := false + for _, kv := range os.Environ() { + if strings.HasPrefix(kv, "RSHELL_HOST_EXEC_LEAK_CHECK=") { + found = true + break + } + } + require.True(t, found, "test setup: t.Setenv should make var visible via os.Environ") +} diff --git a/interp/runner_exec.go b/interp/runner_exec.go index 97a21e11a..1aaa179c4 100644 --- a/interp/runner_exec.go +++ b/interp/runner_exec.go @@ -502,8 +502,19 @@ func (r *Runner) call(ctx context.Context, pos syntax.Pos, args []string) { // Evaluate both policy checks upfront so the span tags reflect the // independent facts about the command name regardless of which gate // short-circuits dispatch. - isAllowed := r.allowAllCommands || r.allowedCommands[name] fn, isKnown := builtins.Lookup(name) + hostBinaryPath, isHostAllowed := r.hostCommands[name] + // A host: entry is honoured for dispatch only when the name does not + // resolve to a builtin. Otherwise an entry like + // "host:cat=/bin/false" would silently authorize the cat builtin + // (because isAllowed would flip true) and the isKnown branch below + // would run cat with stdin/AllowedPaths access — never executing the + // host binary the entry actually pointed at. Builtins still take + // precedence when the name is also explicitly allowlisted via + // rshell:; this gate just prevents a host: entry from + // laundering builtin access. + hostDispatch := isHostAllowed && !isKnown + isAllowed := r.allowAllCommands || r.allowedCommands[name] || hostDispatch span, ctx := telemetry.StartSpanFromContext(ctx, "command") span.SetResourceName(name) @@ -511,6 +522,15 @@ func (r *Runner) call(ctx context.Context, pos syntax.Pos, args []string) { span.SetTag("rshell.command.argc", len(args)-1) span.SetTag("rshell.command.is_allowed", isAllowed) span.SetTag("rshell.command.is_known", isKnown) + if hostDispatch { + // DEMO ONLY: tag invocations that reach for a host binary so the + // demo narrative ("look, we can see exactly when the agent + // reached for the host") is visible in trace data. Tagged on + // hostDispatch (not isHostAllowed) so the tag reflects whether + // the host path was actually exec'd, not just whether an entry + // was configured. + span.SetTag("rshell.command.host_exec", true) + } // has_stdin_pipe / has_output_redirect reflect whether the command's // stdin/stdout were reassigned from the Runner's originals — true for // both pipeline stages and file redirects. @@ -785,6 +805,15 @@ func (r *Runner) call(ctx context.Context, pos syntax.Pos, args []string) { } return } + // DEMO ONLY: a host-binary entry takes precedence over the default + // execHandler so that an allowlisted host:= entry actually + // runs the binary instead of being rejected as "command not allowed". + // hostDispatch is already gated on !isKnown above, so builtins always win. + if hostDispatch { + r.dispatchedCount++ + r.exit.code = r.runHostCommand(ctx, hostBinaryPath, args) + return + } // Allowed but not known: the default execHandler (noExecHandler) will // reject with exit 127. unknownCount was already incremented above. r.exec(ctx, pos, args)