Optimize ListRedisScheduleSource: replace SCAN with sorted set time index#121

Open
tmkarthi wants to merge 5 commits into taskiq-python:main from 8loop-ai:main
tmkarthi commented Feb 25, 2026

Summary

  • Replace the SCAN-based discovery of past time keys with a ZRANGEBYSCORE lookup on a new {prefix}:time_index sorted set, eliminating full keyspace scans that strain Redis when there are many keys
  • Track time keys in the sorted set automatically on add_schedule, and clean up stale entries (older than 5 minutes with empty lists) lazily from delete_schedule, rate-limited to once per minute to avoid excess Redis calls
  • Fetch past time schedules on every get_schedules call (not just the first run), so that a schedule added to an already-past minute after the previous call is still picked up
  • Pass current_time from get_schedules into _get_previous_time_schedules to prevent duplicate fetches when the minute boundary crosses between the two calls
  • Add populate_time_index constructor parameter for users upgrading from older versions: set to True once to backfill the index via a one-time SCAN, then False for all subsequent runs
  • Update README with new Redis key documentation and populate_time_index migration guide
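
To make the lookup concrete, here is a minimal sketch of the time-index idea. It uses an in-memory stand-in for the Redis sorted set so that it runs without a server. This is not the code from this PR: the `TimeIndex` class, the helper names, and the `{prefix}:time:<minute>` key format are all assumptions made for illustration. Only the `{prefix}:time_index` concept, `ZRANGEBYSCORE`, and passing `current_time` into the past-schedule lookup come from the description above.

```python
import bisect
from datetime import datetime, timedelta, timezone


class TimeIndex:
    """Hypothetical in-memory stand-in for the {prefix}:time_index sorted set."""

    def __init__(self) -> None:
        self._entries: list[tuple[float, str]] = []  # kept sorted by (score, member)

    def zadd(self, member: str, score: float) -> None:
        # Like ZADD: a member appears once; re-adding it updates its score.
        self._entries = [e for e in self._entries if e[1] != member]
        bisect.insort(self._entries, (score, member))

    def zrangebyscore(self, min_score: float, max_score: float) -> list[str]:
        # Like ZRANGEBYSCORE min max: both bounds are inclusive.
        start = bisect.bisect_left(self._entries, (min_score, ""))
        return [m for s, m in self._entries[start:] if s <= max_score]


def time_key(prefix: str, when: datetime) -> str:
    # Assumed key format: one list key per UTC minute.
    minute = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M")
    return f"{prefix}:time:{minute}"


def minute_score(when: datetime) -> float:
    # Score = Unix timestamp of the start of the minute.
    return when.replace(second=0, microsecond=0).timestamp()


def add_timed_schedule(index: TimeIndex, prefix: str, when: datetime) -> str:
    # Record the time key in the index whenever a timed schedule is added.
    key = time_key(prefix, when)
    index.zadd(key, minute_score(when))
    return key


def past_time_keys(index: TimeIndex, current_time: datetime) -> list[str]:
    # Everything strictly before the current minute. The caller passes the same
    # current_time it uses for the current-minute fetch, so a minute rollover
    # between the two steps cannot return the same key twice.
    cutoff = minute_score(current_time) - 60
    return index.zrangebyscore(float("-inf"), cutoff)


now = datetime(2026, 2, 25, 12, 30, 45, tzinfo=timezone.utc)
index = TimeIndex()
add_timed_schedule(index, "schedule", now - timedelta(minutes=2))  # past
add_timed_schedule(index, "schedule", now)                         # current minute
add_timed_schedule(index, "schedule", now + timedelta(minutes=5))  # future
print(past_time_keys(index, now))  # ['schedule:time:2026-02-25T12:28']
```

The range query touches only the indexed time keys, so its cost depends on the number of indexed minutes rather than on the size of the whole keyspace, which is the motivation for dropping SCAN.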

Test plan

  • Existing tests for cron, interval, timed schedules, removal, and migration all pass
  • test_time_index_populated_on_add — verifies add_schedule writes to the sorted set
  • test_time_index_not_eagerly_cleaned_on_delete — verifies no race-prone eager cleanup
  • test_cleanup_removes_old_empty_entries — stale empty entries older than 5 min are removed
  • test_cleanup_keeps_non_empty_entries — entries with remaining schedules are preserved
  • test_cleanup_keeps_recent_empty_entries — entries within 5-min window are not removed
  • test_past_schedules_found_via_time_index — past schedules discovered via sorted set
  • test_populate_time_index_from_existing_keys — populate_time_index=True backfills index
  • test_post_send_triggers_cleanup — full lifecycle: add → get → post_send → cleanup
  • test_cleanup_rate_limited — cleanup runs at most once per minute
  • test_cron_and_interval_not_in_time_index — non-time schedules don't pollute the index
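
The cleanup rules exercised by these tests can be sketched as follows. This is an illustrative model under stated assumptions, not the PR's implementation: plain dicts stand in for the sorted set and the per-minute lists, and `TimeIndexCleaner`, `cleanup`, and the two constants are hypothetical names. The 5-minute staleness threshold, the "empty list" condition, and the once-per-minute rate limit are taken from the description above.

```python
from typing import Optional

CLEANUP_INTERVAL = 60.0  # run cleanup at most once per minute
STALE_AFTER = 5 * 60.0   # only entries older than 5 minutes are candidates


def cleanup(index: dict[str, float], lists: dict[str, list[str]], now: float) -> list[str]:
    """Remove index entries that are both stale and point at an empty list."""
    removed = []
    for key, score in list(index.items()):
        is_stale = score < now - STALE_AFTER
        is_empty = not lists.get(key)  # a missing key counts as an empty list
        if is_stale and is_empty:
            del index[key]
            removed.append(key)
    return removed


class TimeIndexCleaner:
    """Hypothetical wrapper that rate-limits cleanup() calls."""

    def __init__(self) -> None:
        self._last_cleanup: Optional[float] = None

    def maybe_cleanup(
        self, index: dict[str, float], lists: dict[str, list[str]], now: float
    ) -> list[str]:
        # Skip if a cleanup already ran within the last CLEANUP_INTERVAL seconds.
        if self._last_cleanup is not None and now - self._last_cleanup < CLEANUP_INTERVAL:
            return []
        self._last_cleanup = now
        return cleanup(index, lists, now)


now = 1_000_000.0
index = {"old-empty": now - 600, "old-full": now - 600, "recent-empty": now - 60}
lists = {"old-full": ["schedule-id"]}
cleaner = TimeIndexCleaner()
print(cleaner.maybe_cleanup(index, lists, now))       # ['old-empty']
print(cleaner.maybe_cleanup(index, lists, now + 30))  # [] (rate limited)
```

Keeping recent empty entries, rather than deleting an index entry the moment its list empties, leaves a window in which a concurrent add to the same minute cannot lose its index entry; that is the race the lazy approach is meant to avoid.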

- Introduced `populate_time_index` parameter to backfill the time index from existing keys.
- Updated `startup` method to populate the time index if `populate_time_index` is set to True.
- Modified schedule addition and deletion to manage the time index sorted set.
- Added tests to verify time index population and cleanup behavior.
- Added `_maybe_cleanup_time_index` method to manage time index cleanup at most once per minute.
- Introduced `_cleanup_time_index` method to remove stale entries older than one hour with empty time key lists.
- Updated `delete_schedule` to call `_maybe_cleanup_time_index` for efficient cleanup.
- Enhanced tests to verify the behavior of the new cleanup methods, ensuring proper handling of stale and recent entries.
…ameter

- Updated the _get_previous_time_schedules method to take current_time as an argument, allowing for more precise cutoff calculations.
- Adjusted the logic to use the provided current_time for determining previous schedules, ensuring no overlap with the current window.
- Modified the call to _get_previous_time_schedules in the first run logic to pass the current_time parameter.
- Removed the `_is_first_run` flag to simplify the schedule fetching logic.
- Updated `get_schedules` to fetch past time schedules on every call, ensuring no schedules are missed within the current minute and previous minute.
- Enhanced documentation to reflect the new behavior of schedule retrieval.
- Expanded README to include details on interval tasks and the new `{prefix}:time_index` sorted set for tracking schedules.
- Updated cleanup logic in `ListRedisScheduleSource` to remove stale entries older than 5 minutes instead of 1 hour.
- Modified tests to reflect the new 5-minute threshold for cleanup, ensuring accurate verification of the time index behavior.
tmkarthi changed the title from "feat: replace SCAN with sorted set index for time-based schedule lookups" to "Optimize ListRedisScheduleSource: replace SCAN with sorted set time index" on Feb 27, 2026