Add DurableFuture#or_timeout + Restate::TimeoutError #12
Open
junyuanz1 wants to merge 3 commits into
Brings the Ruby SDK to feature parity with the TypeScript and Java SDKs for the "race a future against a deadline" use case. Today the Ruby SDK has no direct equivalent of:

* TypeScript: +RestatePromise.orTimeout(duration)+ → https://github.com/restatedev/sdk-typescript/blob/main/packages/libs/restate-sdk/src/promises.ts
* Java: +Awaitable.orTimeout(Duration)+ → https://github.com/restatedev/sdk-java/blob/main/sdk-common/src/main/java/dev/restate/sdk/common/TimeoutException.java

Ruby users currently have to hand-roll +Restate.sleep+ + +Restate.wait_any+ + +completed?+ branching at every call site, which is verbose and easy to get wrong (especially around when to .cancel the call invocation on timeout).

Changes:

* +Restate::TimeoutError+: subclass of +Restate::TerminalError+, default message "Timeout occurred", HTTP status 408. Inheriting from TerminalError lets the existing +rescue Restate::TerminalError+ idiom in user handlers catch timeouts uniformly with other terminal failures, matching the TS type hierarchy (+TimeoutError extends TerminalError+ in +types/errors.ts+).
* +DurableFuture#or_timeout(duration)+: race against +Restate.sleep+. Returns the future's value on win; raises +TimeoutError+ if the sleep wins.
* +DurableCallFuture#or_timeout(duration)+: refines the base to call +#cancel+ on the remote invocation when the timeout wins, so the callee doesn't keep running after the caller gave up. Same refinement TS makes; see the +InvocationPromise+ specialization in TS that calls +ctx.cancel(invocationId)+ on timeout.
* RSpec coverage at +spec/or_timeout_spec.rb+: 8 examples covering happy/timeout paths on both future types plus error-class invariants. Stubs +Restate.sleep+/+Restate.wait_any+ so the spec runs without a live VM, matching the existing +server_context_outbound_middleware_spec.rb+ style.
* +docs/USER_GUIDE.md+ "Timeouts" subsection with the usage pattern and a documented caveat about the orphan-sleep footprint (see "Design notes" below).
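A minimal sketch of the race that the base +or_timeout+ is described as performing: wait on the work future and a timer, then branch on +completed?+. Everything here is a stand-in so the snippet runs without the SDK or a live VM; +RaceFuture+, +RaceRuntime+, +RaceTimeoutError+ and +or_timeout_sketch+ are hypothetical names, not the SDK's API, and the timer outcome is passed in by hand rather than driven by a real clock.

```ruby
# Hypothetical stand-ins: none of these names come from the Restate SDK.
class RaceTimeoutError < StandardError
  def initialize(message = "Timeout occurred")
    super(message)
  end
end

# A future that is either completed with a value or still pending.
RaceFuture = Struct.new(:value, :done) do
  def completed?
    done
  end
end

module RaceRuntime
  # Pretend timer: the caller decides whether it has already fired.
  def self.sleep(_duration, fired:)
    RaceFuture.new(nil, fired)
  end

  # Pretend wait_any: returns once at least one future is completed.
  def self.wait_any(*futures)
    raise "nothing completed" unless futures.any?(&:completed?)
  end
end

# The race itself: wait for either side, then branch on completed?.
def or_timeout_sketch(work, duration, timer_fired:)
  timer = RaceRuntime.sleep(duration, fired: timer_fired)
  RaceRuntime.wait_any(work, timer)
  return work.value if work.completed?

  raise RaceTimeoutError
end

fast = RaceFuture.new("result", true)
slow = RaceFuture.new(nil, false)

puts or_timeout_sketch(fast, 5, timer_fired: false)
begin
  or_timeout_sketch(slow, 5, timer_fired: true)
rescue RaceTimeoutError => e
  puts "timed out: #{e.message}"
end
```

Note that in the winning case the timer object is simply dropped by the caller; nothing in this sketch cancels it, which is the same shape as the orphan-sleep footprint discussed in the design note.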
== Why HTTP status 408 (not 409)

408 (Request Timeout) is the correct HTTP semantic for a timeout and matches the TypeScript SDK (+packages/libs/restate-sdk/src/types/errors.ts+):

  export const TIMEOUT_ERROR_CODE = 408;
  export const CANCEL_ERROR_CODE = 409;

The Java SDK's +TimeoutException+ uses 409, but 409 is what TS reserves for +CancelledError+; Java appears to be the outlier and the choice there looks like a copy-paste from CancelledException. Picking 408 here keeps the Ruby SDK aligned with both standard HTTP semantics and the larger TS ecosystem.

== Design note: the orphan-sleep footprint

Both this implementation and the existing TS +orTimeout+ have the same property: when the work future wins the race, the sleep handle remains in the journal because +restate-sdk-shared-core+ 0.7.0 exposes no +sys_cancel_handle+ primitive. The wake-up is a no-op against a completed handler but keeps the invocation row alive in Restate's state until the timer fires, which is meaningful on long deadlines.

The TS implementation has this footprint too (see +packages/libs/restate-sdk/src/promises.ts+'s +orTimeout+, which uses raw +ctx.sleep+ inside the combinator). This PR matches that behavior 1:1 and documents the caveat + the workaround (cancellable-deadline pattern via a separate scheduled invocation + +SendHandle#cancel+) in the user guide.

A follow-up could either:

* Add a +#with_cancellable_deadline+ helper that routes the timer through a small bundled +DeadlineTrigger+ service, or
* Raise the gap against +restate-sdk-shared-core+ for a real +sys_cancel_handle+ primitive, which would let every SDK fix the leak at the source.

Out of scope for this PR.

Test results: +bundle exec rspec+: 82 examples, 0 failures.
Fills in the four remaining surfaces that callers of the new API touch:

* +sig/restate.rbs+: adds +TimeoutError < TerminalError+ and +DurableFuture#or_timeout+ / +DurableCallFuture#or_timeout+ signatures alongside the existing ones. +bundle exec steep check+ passes.
* +docs/USER_GUIDE.md+: adds a +TimeoutError+ subsection inside +## Error Handling+ so the +rescue Restate::TimeoutError+ pattern is discoverable from the canonical error docs, not just from the +Timeouts+ subsection. Also adds +or_timeout+ to the +service_communication.rb+ row in the examples-mapping table so the table stays accurate.
* +docs/INTERNALS.md+: extends the Durable Futures section so the +or_timeout+ method shows up on both +DurableFuture+ and +DurableCallFuture+ alongside the existing +cancel+ docs. Notes the orphan-handle footprint at the same source-of-truth as the rest of the future internals.
* +examples/service_communication.rb+: adds a +with_deadline+ handler that demonstrates +Worker.call.process(task).or_timeout(5)+ with a +rescue Restate::TimeoutError+ block, so the example matches the entry now listed in the user-guide table.

No code or behavior changes; pure docs/sig fill-in.

Test results: +bundle exec rspec+: 82 examples, 0 failures. +bundle exec steep check+: no type errors.
igalshilman
reviewed
May 21, 2026
Contributor
Thanks for your contribution @junyuanz1
couple of quick notes:
- re auto cancelation: I think that this is very much use-case specific, and some would find it surprising. For example, if you are racing in a loop between multiple calls, you might find it surprising that the losing calls were canceled.
- regarding `or_timeout`: in TypeScript and others, this itself returns a `DurableFuture` which can be combined later. Such functionality does not exist yet, therefore `Restate.any`/`Restate.race` is a top-level blocking operation.
- perhaps it will be simpler to do `Restate.with_timeout( durable future , timeout )`?

We'd need to make these futures combinable but it is a slightly more involved task.
Match TS RestatePromise.orTimeout / Java DurableFuture.withTimeout: the timer firing raises Restate::TimeoutError without cancelling the underlying call. Callers who want the remote invocation stopped rescue the error and invoke #cancel themselves.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
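The opt-in cancellation this commit describes could look roughly like the sketch below: the caller rescues the timeout and decides for itself whether to stop the callee. The classes are stand-ins so the snippet runs on its own; +DeadlineTerminalError+, +DeadlineTimeoutError+, +FakeCallFuture+ and +call_with_deadline+ are hypothetical names, and the fake future always loses the race.

```ruby
# Hypothetical stand-ins; only the rescue-then-cancel shape is the point.
class DeadlineTerminalError < StandardError; end
class DeadlineTimeoutError < DeadlineTerminalError; end

class FakeCallFuture
  attr_reader :cancelled

  def initialize
    @cancelled = false
  end

  # Always loses the race in this sketch and cancels nothing by itself.
  def or_timeout(_duration)
    raise DeadlineTimeoutError, "Timeout occurred"
  end

  def cancel
    @cancelled = true
  end
end

def call_with_deadline(future, seconds)
  future.or_timeout(seconds)
rescue DeadlineTimeoutError
  # Opt-in: stop the callee because this particular caller gave up.
  future.cancel
  :timed_out
end

future = FakeCallFuture.new
outcome = call_with_deadline(future, 5)
puts "#{outcome}, cancelled=#{future.cancelled}"
```

A caller racing several calls in a loop would simply omit the +cancel+ line and keep the losing invocations running, which is the case raised in review.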
igalshilman
approved these changes
May 22, 2026
Contributor
Awesome, thank you.
will merge and release.
Summary
Brings the Ruby SDK to feature parity with the TypeScript and Java SDKs for the race-a-future-against-a-deadline use case. Today Ruby users have to hand-roll `Restate.sleep` + `Restate.wait_any` + `completed?` branching at every call site, which is verbose and easy to get wrong.

Peer SDK references

| SDK | API | Source |
| --- | --- | --- |
| TypeScript | `RestatePromise.orTimeout(duration)` | `packages/libs/restate-sdk/src/promises.ts#L155-L172` |
| TypeScript | `class TimeoutError extends TerminalError` | `packages/libs/restate-sdk/src/types/errors.ts#L121-L126` |
| TypeScript | `TIMEOUT_ERROR_CODE = 408` / `CANCEL_ERROR_CODE = 409` | `packages/libs/restate-sdk/src/types/errors.ts#L17-L18` |
| Java | `DurableFuture.withTimeout(Duration)` / `await(Duration)` | `sdk-api/.../DurableFuture.java#L67-L84` |
| Java | `class TimeoutException extends TerminalException` | `sdk-common/.../TimeoutException.java#L11-L14` |

Docs context: Concurrent tasks (TS) · Durable timers (TS) · Competitive Racing pattern

Changes

* `Restate::TimeoutError`: subclass of `Restate::TerminalError`. Default message `"Timeout occurred"`, HTTP status `408`. Inheriting from `TerminalError` lets the existing `rescue Restate::TerminalError` idiom catch timeouts uniformly, matching TS's `TimeoutError extends TerminalError` and Java's `TimeoutException extends TerminalException`.
* `DurableFuture#or_timeout(duration)`: races against `Restate.sleep`. Returns the future's value on win; raises `TimeoutError` if the sleep wins. Shape matches TS's `orTimeout`.
* `DurableCallFuture#or_timeout(duration)`: overrides the base to also `#cancel` the remote invocation when the timeout wins, so the callee doesn't keep running after the caller gave up. This goes beyond what TS and Java do today: neither TS's `BaseRestatePromise.orTimeout` nor Java's `DurableFuture.withTimeout` cancels the work future on timeout; both leave the remote call running.

  This felt like a correctness win: leaving the callee running after the caller has given up wastes resources and produces spurious results that nobody is waiting for. But happy to drop the override if maintainers prefer strict TS/Java parity (1-line revert + spec adjustment).
* `spec/or_timeout_spec.rb`: 8 RSpec examples covering happy/timeout paths on both future types plus error-class invariants. Stubs `Restate.sleep`/`Restate.wait_any` so the spec runs without a live VM, matching `spec/server_context_outbound_middleware_spec.rb`.
* `docs/USER_GUIDE.md`: new "Timeouts" subsection right after "Sleep". Documents the API + the orphan-sleep caveat + the cancellable-deadline workaround pattern.

Design decisions
Inherits from `TerminalError` (not a fresh exception hierarchy)

Mirrors the TS hierarchy (`TimeoutError extends TerminalError extends RestateError`) and the Java one (`TimeoutException extends TerminalException`). Keeps the existing `rescue Restate::TerminalError` discipline working for users who don't care which terminal flavor they hit.

Method name `or_timeout` (matches TS) vs Java's `withTimeout`

There's actual divergence between peer SDKs here:

| SDK | Method |
| --- | --- |
| TypeScript | `orTimeout(duration)` (single method) |
| Java | `withTimeout(Duration)` plus a convenience overload `await(Duration)` |

Picked `or_timeout` (snake-case of TS's `orTimeout`) because:

* `future.or_timeout(5)` → "or, timeout after 5s"
* `with_timeout` would suggest a chainable builder, which doesn't match the imperative flow we want here

Happy to switch to `with_timeout` if maintainers prefer aligning with Java; happy to add both as aliases.

HTTP status code 408 (not 409)

408 (Request Timeout) is the correct HTTP semantic for a timeout and matches what TS ships: `TIMEOUT_ERROR_CODE = 408` and `CANCEL_ERROR_CODE = 409`.

Java's `TimeoutException` uses 409, but 409 is what TS reserves for `CancelledError`; Java appears to be the outlier and the choice there looks like a copy-paste from `CancelledException`. Picking 408 keeps Ruby aligned with both standard HTTP semantics and TS.

(Happy to flip to 409 if maintainers prefer matching Java; just flagging the cross-SDK divergence.)
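To make the first and third decisions concrete, here is a small sketch of a timeout error that subclasses a terminal error and carries status 408, so a plain terminal-error rescue still catches it. The classes are hypothetical stand-ins, not the SDK's definitions: `DemoTerminalError`, `DemoTimeoutError`, the `status_code` attribute name and the 500 default are all assumptions made for illustration.

```ruby
# Hypothetical stand-ins mirroring the hierarchy described above.
class DemoTerminalError < StandardError
  attr_reader :status_code

  # The 500 default is an assumption for this sketch.
  def initialize(message = "Terminal error", status_code: 500)
    super(message)
    @status_code = status_code
  end
end

class DemoTimeoutError < DemoTerminalError
  def initialize(message = "Timeout occurred")
    super(message, status_code: 408)
  end
end

# A handler that only knows about the terminal base class.
def describe_failure
  yield
rescue DemoTerminalError => e
  "#{e.class}: #{e.message} (#{e.status_code})"
end

puts describe_failure { raise DemoTimeoutError }
# prints "DemoTimeoutError: Timeout occurred (408)"
```

Code that wants to treat timeouts specially can rescue the subclass first; code that does not care keeps its existing rescue clause unchanged.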
Known limitation (documented, not fixed in this PR)
When the work future wins the race, the sleep handle remains in the journal because `restate-sdk-shared-core` 0.7.0 exposes no `sys_cancel_handle` primitive, only `sys_cancel_invocation` for a separate invocation. The wake-up is a no-op against a completed handler but keeps the invocation row alive in Restate's state until the timer fires.

Both peer SDKs have the same footprint:

* TS: `orTimeout` uses raw `ctx.sleep` inside the combinator with no cancellation.
* Java: `withTimeout` uses `ctx.timer(timeout, null)`, same VM primitive, same no-cancel.

This PR matches that behavior 1:1 and documents the workaround (cancellable-deadline pattern via a separate scheduled invocation + `SendHandle#cancel`) in the user guide.

Two reasonable follow-ups, out of scope here:

* Add a `#with_cancellable_deadline` helper to the Ruby SDK that bundles a tiny `DeadlineTrigger` service.
* Raise the gap against `restate-sdk-shared-core` for a `sys_cancel_handle` primitive, which would let every SDK fix the leak at the source.

Happy to do either as a separate PR / issue if there's interest.
Test plan
* `bundle exec rspec`: 82 examples, 0 failures
* `bundle exec rake compile` clean on arm64-darwin23