fix: delete orphaned digest directories when last tag is removed #111

Open
hiroTamada wants to merge 4 commits into main from fix/gc-orphaned-image-digests

@hiroTamada hiroTamada commented Feb 27, 2026

Summary

  • Fix orphaned digest directory accumulation by implementing eager garbage collection
  • When DeleteImage removes the last tag referencing a digest, the digest directory is now automatically deleted
  • If multiple tags share the same digest, the digest directory is preserved until the last of those tags is removed

Background

Previously, DeleteImage only removed the tag symlink, leaving digest directories (containing erofs rootfs files) on disk indefinitely. This was documented in the README as "Old digests remain until explicitly garbage collected" but no GC was ever implemented.

At current volume (~20 GB/month), the 3.5 TB disk on prod-iad-hypeman-1 has runway, but this becomes a concern at higher deployment volumes post-rollout.

Changes

  • storage.go: Added countTagsForDigest() and deleteDigest() helper functions
  • manager.go: Updated DeleteImage() to check for orphaned digests and delete them
  • manager_test.go: Updated existing test + added TestDeleteImagePreservesSharedDigest
  • README.md: Updated documentation to reflect new behavior
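
The two storage helpers can be sketched roughly as follows. This is a minimal illustration, not the code in storage.go: it assumes a hypothetical layout in which each tag is a symlink under a `tags` directory pointing at a directory under `digests`, and the signatures, the digest name `abc123`, and the `runExample` helper are all invented for the example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// countTagsForDigest counts the symlinks in tagsDir whose target ends in digest.
// Assumed (hypothetical) layout: tagsDir/<tag> -> ../digests/<digest>.
func countTagsForDigest(tagsDir, digest string) (int, error) {
	entries, err := os.ReadDir(tagsDir)
	if err != nil {
		return 0, err
	}
	count := 0
	for _, e := range entries {
		if e.Type()&os.ModeSymlink == 0 {
			continue // not a tag symlink
		}
		target, err := os.Readlink(filepath.Join(tagsDir, e.Name()))
		if err != nil {
			return 0, err
		}
		if filepath.Base(target) == digest {
			count++
		}
	}
	return count, nil
}

// deleteDigest removes the digest directory and everything inside it.
func deleteDigest(digestsDir, digest string) error {
	return os.RemoveAll(filepath.Join(digestsDir, digest))
}

// runExample builds a throwaway layout with two tags sharing one digest,
// removes the tags one at a time, and reports the tag counts after each
// removal and whether the digest directory is gone at the end.
func runExample() (afterFirst, afterSecond int, digestGone bool, err error) {
	root, err := os.MkdirTemp("", "images-*")
	if err != nil {
		return
	}
	defer os.RemoveAll(root)

	tagsDir := filepath.Join(root, "tags")
	digestsDir := filepath.Join(root, "digests")
	const digest = "abc123" // placeholder digest name
	if err = os.MkdirAll(filepath.Join(digestsDir, digest), 0o755); err != nil {
		return
	}
	if err = os.MkdirAll(tagsDir, 0o755); err != nil {
		return
	}
	for _, tag := range []string{"v1", "latest"} {
		link := filepath.Join(tagsDir, tag)
		if err = os.Symlink(filepath.Join("..", "digests", digest), link); err != nil {
			return
		}
	}

	if err = os.Remove(filepath.Join(tagsDir, "v1")); err != nil {
		return
	}
	if afterFirst, err = countTagsForDigest(tagsDir, digest); err != nil {
		return
	}
	if err = os.Remove(filepath.Join(tagsDir, "latest")); err != nil {
		return
	}
	if afterSecond, err = countTagsForDigest(tagsDir, digest); err != nil {
		return
	}
	if afterSecond == 0 { // orphaned: no tag references the digest any more
		if err = deleteDigest(digestsDir, digest); err != nil {
			return
		}
	}
	_, statErr := os.Stat(filepath.Join(digestsDir, digest))
	digestGone = os.IsNotExist(statErr)
	return
}

func main() {
	a, b, gone, err := runExample()
	fmt.Println(a, b, gone, err)
}
```

Counting by scanning the tag directory keeps the sketch free of any separate reference-count state, at the cost of one directory read per delete.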

CI fixes (pre-existing issues on main)

  • go.mod: Updated oapi-codegen/runtime v1.1.2 → v1.2.0 (fixes missing StyleParamWithOptions)
  • Makefile: Pinned oapi-codegen to v2.5.1 (v2.6.0 generates incompatible struct tags)

Test plan

  • TestDeleteImage - verifies digest is deleted when orphaned
  • TestDeleteImagePreservesSharedDigest - verifies digest is preserved when other tags reference it
  • TestDeleteImageNotFound - unchanged, still passes
  • Manual test on staging: deploy, stop, verify disk space is reclaimed

Note on CI fixes

This PR includes fixes for two pre-existing CI issues unrelated to the GC changes:

  1. Runtime version: CI regenerates oapi.go with oapi-codegen@latest, which uses runtime.StyleParamWithOptions added in v1.2.0, but go.mod had v1.1.2.

  2. Generator version: oapi-codegen v2.6.0 changed struct tag generation (adds omitempty to some fields), causing type mismatches. Pinned to v2.5.1 to match committed code.

Both issues caused CI failures starting with commit 08958b8 on main.


Note

Medium Risk
Changes deletion semantics for on-disk image storage by removing digest directories automatically, with added locking to avoid races; bugs here could delete still-referenced data or leave dangling tags under concurrency.

Overview
Image deletion now performs eager cleanup. DeleteImage resolves the tag’s digest, removes the tag symlink, then (under createMu) counts remaining tags pointing at that digest and deletes the digest directory when it becomes orphaned.
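
The ordering described here (resolve, remove the tag, count, delete if orphaned) can be illustrated with an in-memory model. This is a sketch under stated assumptions rather than the code in manager.go: the `store` type, its maps, and `ErrNotFound` are hypothetical, and for simplicity the whole of `DeleteImage` holds `createMu`, whereas the summary above describes taking the lock around the count-and-delete step.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrNotFound is a hypothetical sentinel for an unknown tag.
var ErrNotFound = errors.New("image not found")

// store is an in-memory stand-in for the on-disk layout (an assumption):
// tags maps tag name -> digest, digests records which digest directories exist.
type store struct {
	createMu sync.Mutex
	tags     map[string]string
	digests  map[string]bool
}

func newStore() *store {
	return &store{tags: map[string]string{}, digests: map[string]bool{}}
}

// CreateImage records the digest and points the tag at it, under createMu.
func (s *store) CreateImage(tag, digest string) {
	s.createMu.Lock()
	defer s.createMu.Unlock()
	s.digests[digest] = true
	s.tags[tag] = digest
}

// DeleteImage removes the tag, then deletes the digest if no tag still uses it.
// The lock spans the count and the delete, so CreateImage cannot add a tag
// for the same digest in between.
func (s *store) DeleteImage(tag string) error {
	s.createMu.Lock()
	defer s.createMu.Unlock()

	digest, ok := s.tags[tag] // 1. resolve the tag's digest
	if !ok {
		return ErrNotFound
	}
	delete(s.tags, tag) // 2. remove the tag

	remaining := 0 // 3. count tags still pointing at the digest
	for _, d := range s.tags {
		if d == digest {
			remaining++
		}
	}
	if remaining == 0 { // 4. orphaned: delete the digest
		delete(s.digests, digest)
	}
	return nil
}

func main() {
	s := newStore()
	s.CreateImage("v1", "sha256:aaa")
	s.CreateImage("latest", "sha256:aaa")

	_ = s.DeleteImage("v1")
	fmt.Println("digest kept while shared:", s.digests["sha256:aaa"])
	_ = s.DeleteImage("latest")
	fmt.Println("digest kept after last tag:", s.digests["sha256:aaa"])
}
```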

Storage helpers + tests/documentation updated. Adds countTagsForDigest/deleteDigest, updates tests to assert digest dir removal and adds a shared-digest preservation test, and updates the images README to reflect the new GC behavior.

Build/CI stability tweaks. Pins oapi-codegen in the Makefile and bumps github.com/oapi-codegen/runtime to v1.2.0 to match generated code expectations.

Written by Cursor Bugbot for commit 0f6b308. This will update automatically on new commits.

Previously, DeleteImage only removed the tag symlink, leaving the digest
directory (containing the erofs rootfs) on disk indefinitely. This caused
orphaned digests to accumulate over time.

Now DeleteImage checks if the digest is orphaned after removing the tag,
and deletes the digest directory if no other tags reference it. This is
eager GC - cleanup happens immediately at delete time.

Made-with: Cursor
@hiroTamada hiroTamada marked this pull request as ready for review February 27, 2026 16:48
hiroTamada added 2 commits February 27, 2026 11:51
The CI workflow runs `make oapi-generate` which uses `oapi-codegen@latest`.
The latest generator outputs code using `runtime.StyleParamWithOptions` which
was added in runtime v1.2.0, but go.mod had v1.1.2 pinned.

This caused CI failures on main starting with commit 08958b8.

Made-with: Cursor
The CI workflow regenerates oapi.go using `oapi-codegen@latest`. When v2.6.0
was released, it changed struct tag generation (adding omitempty to some fields),
causing type mismatches with code in instances.go that uses inline struct literals.

Pinning to v2.5.1 ensures CI regenerates the same code that's committed.

Made-with: Cursor
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

1. Race condition (medium severity): Hold createMu during the orphan
   check and delete sequence to prevent a concurrent CreateImage from
   creating a tag pointing to the same digest between count and delete.

2. Silent errors (low severity): Actually log errors when countTagsForDigest
   or deleteDigest fails, instead of silently swallowing them.

Made-with: Cursor
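
The second fix, logging cleanup failures instead of swallowing them, might look something like the sketch below. It is an illustration only: `gcOrphanedDigest`, `runWith`, and `errDisk` are hypothetical names, the count and delete steps are injected as functions so the example runs without a filesystem, and treating cleanup as best-effort (log and continue rather than return the error) is an assumption about the intended behavior.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"log"
)

// errDisk is a placeholder failure used by the example.
var errDisk = errors.New("disk error")

// gcOrphanedDigest deletes digest when no tags reference it and reports
// whether it was removed. Failures are logged rather than silently dropped.
func gcOrphanedDigest(logger *log.Logger, digest string,
	count func(string) (int, error), remove func(string) error) bool {
	n, err := count(digest)
	if err != nil {
		logger.Printf("gc: count tags for digest %s: %v", digest, err)
		return false
	}
	if n > 0 {
		return false // still referenced by at least one tag
	}
	if err := remove(digest); err != nil {
		logger.Printf("gc: delete digest %s: %v", digest, err)
		return false
	}
	return true
}

// runWith runs one cleanup attempt with canned results and returns
// whether the digest was removed plus whatever was logged.
func runWith(countErr, removeErr error, tagCount int) (bool, string) {
	var buf bytes.Buffer
	logger := log.New(&buf, "", 0) // no prefix or timestamp, for stable output
	removed := gcOrphanedDigest(logger, "sha256:aaa",
		func(string) (int, error) { return tagCount, countErr },
		func(string) error { return removeErr },
	)
	return removed, buf.String()
}

func main() {
	removed, logged := runWith(nil, errDisk, 0)
	fmt.Printf("removed=%v logged=%q\n", removed, logged)
}
```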
2 participants