
fix: enforce policy-based access control on artifact downloads #7009

Open
ycombinator wants to merge 8 commits into elastic:main from ycombinator:fix/artifact-access-control

Conversation

Contributor

@ycombinator ycombinator commented May 11, 2026

What is the problem this PR solves?

The artifact download endpoint (/api/fleet/artifacts/{id}/{sha256}) only validates the agent's API key but never checks whether the requested artifact belongs to the agent's assigned policy. This means an agent enrolled under one policy can download artifacts belonging to a different policy if it knows the artifact ID and SHA256 hash. For example, an agent enrolled under a policy with no integrations can retrieve Elastic Defend trust lists, exception lists, and other security artifacts from another policy.

How does this PR solve the problem?

Implements the authorizeArtifact() function (previously a no-op that returned nil) to enforce policy-based access control:

  1. Adds a GetPolicy(ctx, policyID) method to the policy.Monitor interface that returns the cached policy for a given ID (reloads from ES on cache miss).
  2. In authorizeArtifact, fetches the agent's policy via the monitor using agent.AgentPolicyID and verifies that the requested artifact (identifier + decoded_sha256) appears in the policy's inputs[].artifact_manifest.artifacts.
  3. Returns 403 Forbidden (ErrUnauthorizedArtifact) if the artifact is not listed in the agent's assigned policy.

How to test this PR locally

  1. Set up Fleet Server with Elasticsearch and Kibana
  2. Create two agent policies: Victim-Policy with Elastic Defend integration (add a trusted application), and Attacker-Policy with no integrations
  3. Create an enrollment token for Attacker-Policy and enroll an agent
  4. Attempt to download an artifact belonging to Victim-Policy using the attacker agent's API key — should now receive 403 Forbidden instead of the artifact contents
  5. Verify that an agent enrolled under Victim-Policy can still download its own artifacts normally (200 OK)

Design Checklist

  • I have ensured my design is stateless and will work when multiple fleet-server instances are behind a load balancer.
  • I have or intend to scale test my changes, ensuring it will work reliably with 100K+ agents connected.
  • I have included fail safe mechanisms to limit the load on fleet-server: rate limiting, circuit breakers, caching, load shedding, etc.

Checklist

  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool

The artifact download endpoint (/api/fleet/artifacts/{id}/{sha256})
previously only validated the agent's API key but never checked whether
the requested artifact belonged to the agent's assigned policy. This
allowed an agent enrolled under one policy to download artifacts from
a different policy if it knew the artifact ID and SHA256 hash.

Add authorizeArtifact implementation that fetches the agent's policy
from the in-memory policy monitor cache and verifies the requested
artifact appears in the policy's artifact_manifest before serving it.
Returns 403 Forbidden if the artifact is not in the agent's policy.

Resolves: elastic/security#8396

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ycombinator ycombinator requested a review from a team as a code owner May 11, 2026 22:44
@ycombinator ycombinator requested review from macdewee and samuelvl May 11, 2026 22:44
@mergify
Contributor

mergify Bot commented May 11, 2026

This pull request does not have a backport label. Could you fix it @ycombinator? 🙏
To fix up this pull request, add the backport labels for the needed branches, such as:

  • backport-\d.\d is the label to automatically backport to the 8.\d branch, where \d is a digit.
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

@ycombinator ycombinator requested review from michel-laterman and samuelvl and removed request for macdewee and samuelvl May 11, 2026 22:52
@ycombinator ycombinator added the bug (Something isn't working), Team:Elastic-Agent-Control-Plane (Label for the Agent Control Plane team), and backport-active-all (Automated backport with mergify to all the active branches) labels May 11, 2026
ycombinator and others added 2 commits May 11, 2026 15:57
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ycombinator ycombinator requested review from cmacknz and kruskall May 11, 2026 23:09
Comment thread internal/pkg/api/handleArtifacts.go Outdated
        if !ok {
            continue
        }
        amMap, ok := am.(map[string]interface{})
Contributor

@michel-laterman michel-laterman May 12, 2026


Go's newer conventions prefer any over interface{}.
This is enforced by the go fix check, which can be run with mage check:fix.
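For context, any has been a builtin alias for interface{} since Go 1.18, so the two spellings are fully interchangeable:

```go
package main

import "fmt"

func main() {
	// Since Go 1.18, any is declared in the builtin package as
	// `type any = interface{}`, so values flow between the two freely.
	var a any = 42
	var i interface{} = a
	fmt.Println(i == a) // true: same dynamic type and value
}
```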

Contributor Author


Updated in b00c5df.

Comment thread internal/pkg/api/handleArtifacts.go Outdated

func policyHasArtifact(pd *model.PolicyData, id, sha2 string) bool {
    for _, input := range pd.Inputs {
        am, ok := input["artifact_manifest"]
Contributor


Should we define this artifact_manifest as a struct somewhere?

Contributor Author


Added model.ArtifactManifest and model.ManifestEntry structs in 3e57a6e.

ycombinator and others added 2 commits May 12, 2026 05:49
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Defines model.ArtifactManifest and model.ManifestEntry structs so
policyHasArtifact no longer navigates untyped map[string]any chains.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Contributor

TL;DR

check-ci is failing because internal/pkg/api/handleArtifacts.go now references model.ArtifactManifest, but that type was added in a generated file (internal/pkg/model/schema.go) and gets removed during CI generation. Move the type to a non-generated model file (or add it to schema source + regenerate) so go fix can compile.

Remediation

  • Define ArtifactManifest/ManifestEntry in a non-generated file such as internal/pkg/model/ext.go or add equivalent definitions in model/schema.json and run mage generate so generated code consistently contains the type.
  • Re-run mage check:ci (or at least mage generate && go fix ./...) to confirm internal/pkg/api compiles under all tags.
Investigation details

Root Cause

handleArtifacts.go now uses model.ArtifactManifest in parseArtifactManifest:

  • internal/pkg/api/handleArtifacts.go:192
  • internal/pkg/api/handleArtifacts.go:201

But the type was introduced in internal/pkg/model/schema.go (:313), which is generated (Code generated by schema-generate. DO NOT EDIT. at :5). CI runs generation before fix:

  • magefile.go:640-643 (check:ci runs Generate first)
  • magefile.go:467-470 (Generate executes go generate ./...)

Because model/schema.json does not define this model, generation drops the manual type addition, and subsequent go fix fails with undefined symbols.

Evidence

internal/pkg/api/handleArtifacts.go:192:58: undefined: model.ArtifactManifest
internal/pkg/api/handleArtifacts.go:201:21: undefined: model.ArtifactManifest
Error: running "go fix ./..." failed with exit code 1

Verification

  • Local reproduction of the full CI flow was not completed due to environment/toolchain download latency in this run; findings are based on direct Buildkite log evidence plus source/CI target inspection.

Follow-up

  • If you choose the schema-driven route, keep schema.go generated-only and avoid manual edits to generated files going forward.


@ycombinator ycombinator enabled auto-merge (squash) May 12, 2026 19:50
ycombinator and others added 2 commits May 12, 2026 16:17
schema.go is code-generated and gets overwritten by mage generate.
Moving ArtifactManifest and ManifestEntry to ext.go keeps them stable.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ArtifactManifest and ManifestEntry are not ES document types and only
exist to support parsing within handleArtifacts.go, so they belong there
rather than in the model package.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Contributor

@michel-laterman michel-laterman left a comment


lgtm

Agents enrolled under dummy-policy cannot download Elastic Defend
artifacts because that policy has no artifact_manifest. Enroll the test
agent under security-policy (which has the Elastic Defend integration)
instead.

Add FleetPolicyHasArtifact scaffold helper that polls .fleet-policies
until the policy document references the artifact, ensuring fleet-server's
policy monitor cache is up-to-date before the download attempt. Also
retry the download on 403 to tolerate any remaining cache propagation lag.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>