Use time-based retention for MSQ query result file cleanup#19074

Merged
cecemei merged 6 commits into apache:master from cecemei:retention
Mar 10, 2026

Conversation

Contributor

@cecemei cecemei commented Mar 3, 2026

Description

Adds a configurable retention duration (default: 6 hours) to DurableStorageCleanerConfig and updates the cleaner to retain query result files based on task creation time rather than checking known task IDs.

Release Note

The durable storage cleaner now supports configurable time-based retention for MSQ query results. Previously, query results were retained for every task in the known-tasks list, which was unreliable for completed tasks. With this change, query results are retained for a configurable period measured from the task creation time.

The new configuration property druid.msq.intermediate.storage.cleaner.durationToRetain controls the retention period for query results. The default retention period is 6 hours.
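The retention rule described above can be sketched as a small check. This is a minimal illustration, not the code merged in this PR: the class name `RetentionSketch` and the method `isWithinRetention` are hypothetical, and the sketch assumes the retention period is compared against the task creation time, as the release note states.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper for illustration; not the actual Druid implementation. */
final class RetentionSketch {
    private RetentionSketch() {}

    /**
     * Returns true if a query result file should be kept, i.e. the task that
     * produced it was created no earlier than (now - durationToRetain).
     */
    static boolean isWithinRetention(Instant taskCreatedTime, Instant now, Duration durationToRetain) {
        // Anything created before the cutoff has outlived the retention period.
        Instant cutoff = now.minus(durationToRetain);
        return !taskCreatedTime.isBefore(cutoff);
    }
}
```

With the default of 6 hours, a result file whose task was created 1 hour ago would be kept, while one whose task was created 7 hours ago would be eligible for deletion.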


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@github-actions github-actions Bot added Area - Batch Ingestion Area - Ingestion Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 labels Mar 3, 2026
@cecemei cecemei marked this pull request as ready for review March 4, 2026 01:00
@cecemei cecemei changed the title from "Use time-based retention for query result file cleanup, consistent with TaskLogAutoCleanerConfig." to "Use time-based retention for MSQ query result file cleanup" Mar 4, 2026
DurableStorageUtils.QUERY_RESULTS_DIR.equals(nextDirName)
&& DurableStorageUtils.isQueryResultFileActive(
currentFile,
taskId -> Optional.fromNullable(taskStorage.getTaskInfo(taskId)).transform(TaskInfo::getCreatedTime),
Contributor

This is going to be expensive; each usage of taskStorage is a call to the metadata store. It's better to batch these, such as by using one call to get all of the task IDs that have completed within the retention period. Then you can delete anything that doesn't appear in that list.

Contributor

Btw, ideally that call to get all those task IDs should only look at type query_controller. It will generally be a much smaller and more manageable list than "all tasks".

Contributor Author

Updated to fetch only tasks completed within the retention period. We don't have a filter by type in CompleteTaskLookup, though it wouldn't be too much work to add one. I'm just assuming that even without this filter we won't have too many tasks finished within the last 6 hours.
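The batching idea discussed in this thread can be sketched as follows. This is an illustration under stated assumptions, not the merged code: `BatchedCleanupSketch`, `filesToDelete`, and the `taskIdOf` extractor are hypothetical names, and the set of retained task IDs is assumed to have been fetched with a single metadata store call before the loop (and to include any tasks whose results must still be kept).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical sketch of batched cleanup; not the actual Druid implementation. */
final class BatchedCleanupSketch {
    private BatchedCleanupSketch() {}

    /**
     * Selects result files whose owning task is NOT in the retained set.
     *
     * @param resultFiles     paths of query result files found in durable storage
     * @param retainedTaskIds task IDs fetched once, up front (assumption: one batched lookup)
     * @param taskIdOf        hypothetical function mapping a file path to its task ID
     */
    static List<String> filesToDelete(
            List<String> resultFiles,
            Set<String> retainedTaskIds,
            Function<String, String> taskIdOf) {
        List<String> toDelete = new ArrayList<>();
        for (String file : resultFiles) {
            // In-memory set membership check; no per-file metadata store call.
            if (!retainedTaskIds.contains(taskIdOf.apply(file))) {
                toDelete.add(file);
            }
        }
        return toDelete;
    }
}
```

The point of the design is that the cost of the metadata lookup no longer grows with the number of files: one query builds the set, and each file costs only a hash lookup.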

Comment thread processing/src/main/java/org/apache/druid/frame/util/DurableStorageUtils.java Outdated
Contributor

@gianm gianm left a comment

Looks good to me but please consider the comments prior to merging. They could simplify things.

Comment thread processing/src/main/java/org/apache/druid/frame/util/DurableStorageUtils.java Outdated
@cecemei cecemei merged commit bfcc0e5 into apache:master Mar 10, 2026
36 of 37 checks passed
@github-actions github-actions Bot added this to the 37.0.0 milestone Mar 10, 2026
GWphua pushed a commit to GWphua/druid that referenced this pull request Mar 11, 2026
)

* retention

* final

* format

* review

* review
Labels

Area - Batch Ingestion Area - Ingestion Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 Release Notes

2 participants