MLflow tracing with async Stop hook (opt-in), resolves #9 #11

@dgokeeffe

Summary

Fix for #9 (MLflow tracing silently disabled). Migrating from datasciencemonkey PR #139.

  • Opt-in tracing via MLFLOW_CLAUDE_TRACING_ENABLED=true in app.yaml: keeps default behaviour unchanged for existing deployments, but gives users a single env-var to flip.
  • Stop hook delegates to mlflow-trace-stop.sh, which backgrounds the handler via nohup timeout 30 … & disown. Returns in <1s so the rest of the Stop hook chain isn't blocked.
  • Hook-event JSON is captured to a temp file synchronously before backgrounding: naive nohup would redirect stdin to /dev/null and the handler would lose the transcript path.
  • Hard 30s ceiling on the backgrounded flush prevents a wedged handler from leaking memory/CPU.
  • Pins mlflow-skinny and mlflow-tracing to 3.11.1 to match the Apps runtime: version mismatches caused silent import failures.
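
For reviewers who want the shape of the wrapper without opening the diff, here is a minimal sketch of the capture-then-background pattern described above. It is not the contents of mlflow-trace-stop.sh: the function name `stop_hook_async` and the `TRACE_HANDLER` variable are hypothetical, and it assumes the handler reads the hook-event JSON on stdin and that GNU `timeout` is available.

```shell
#!/usr/bin/env bash
# Sketch only: illustrates the opt-in gate, synchronous stdin capture,
# and time-boxed background handler. Names here are illustrative assumptions.

stop_hook_async() {
  # Opt-in gate: do nothing unless tracing is explicitly enabled.
  [ "${MLFLOW_CLAUDE_TRACING_ENABLED:-false}" = "true" ] || return 0

  # Read the hook-event JSON from stdin now, while stdin is still attached.
  local event_file
  event_file="$(mktemp)" || return 0
  cat > "$event_file"

  # Background the handler (TRACE_HANDLER is a hypothetical command path).
  # timeout enforces the 30s ceiling on the handler itself, so the temp
  # file is removed whether the handler finishes or is killed.
  nohup sh -c 'timeout 30 "$1" < "$2"; rm -f "$2"' sh "$TRACE_HANDLER" "$event_file" \
    >/dev/null 2>&1 &

  # Detach the job from the shell (bash builtin; skipped where unavailable).
  disown 2>/dev/null || true
}
```

A Stop hook entry point could then end with `stop_hook_async` reading the hook's stdin. The call returns as soon as the event is saved, and the flush continues in the background. Wrapping `timeout` inside `sh -c` is a choice made for this sketch so that temp-file cleanup still runs after a timeout.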

Why this resolves #9

#9 documents that MLFLOW_CLAUDE_TRACING_ENABLED="false" is hardcoded today; the upstream Stop hook short-circuits and no traces are written despite the README claiming auto-tracing. This PR introduces the env-var override so users can flip it on without modifying source, and adds the async wrapper so enabling it doesn't slow session teardown.

Branch

feat/mlflow-tracing (about to be pushed).

Diff scope

+128 / -32, 3 files. Tests in tests/test_mlflow_tracing.py.
