feat: auto-regenerate accounts on persistent failures #1942

Open
SantiagoPittella wants to merge 2 commits into next from santiagopittella-auto-regenerate-accounts

@SantiagoPittella
Collaborator

closes #1930

When the counter increment task fails repeatedly and wallet re-sync from RPC is ineffective (e.g., after a network reset or protocol upgrade), the monitor now automatically creates fresh wallet and counter accounts, deploys them, and re-initializes the increment task. This triggers after 10 consecutive failures and is rate-limited to once per hour to avoid loops when the network itself is down.
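The trigger described above (a failure threshold combined with a cooldown) can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `RegenerationPolicy` and all of its methods are hypothetical names, and resetting the failure counter after a regeneration is an assumption rather than something the description states.

```rust
use std::time::{Duration, Instant};

/// Hypothetical policy object; names and structure are illustrative only.
pub struct RegenerationPolicy {
    failure_threshold: u32,
    cooldown: Duration,
    consecutive_failures: u32,
    last_regeneration: Option<Instant>,
}

impl RegenerationPolicy {
    pub fn new(failure_threshold: u32, cooldown: Duration) -> Self {
        Self {
            failure_threshold,
            cooldown,
            consecutive_failures: 0,
            last_regeneration: None,
        }
    }

    /// A successful increment breaks the failure streak.
    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
    }

    /// A failed increment extends the failure streak.
    pub fn record_failure(&mut self) {
        self.consecutive_failures = self.consecutive_failures.saturating_add(1);
    }

    /// True once the streak reaches the threshold and the cooldown
    /// since the previous regeneration (if any) has elapsed.
    pub fn should_regenerate(&self, now: Instant) -> bool {
        if self.consecutive_failures < self.failure_threshold {
            return false;
        }
        match self.last_regeneration {
            None => true,
            Some(last) => now.duration_since(last) >= self.cooldown,
        }
    }

    /// Assumption: the streak is reset once accounts have been regenerated.
    pub fn mark_regenerated(&mut self, now: Instant) {
        self.last_regeneration = Some(now);
        self.consecutive_failures = 0;
    }
}
```

With the values from the description, the policy would be built as `RegenerationPolicy::new(10, Duration::from_secs(3600))`.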

@SantiagoPittella SantiagoPittella force-pushed the santiagopittella-auto-regenerate-accounts branch from 6824aa1 to b6ef68b Compare April 14, 2026 18:37
@Mirko-von-Leipzig
Collaborator

In theory an alternative would be to separately monitor the genesis hash (or the latest known local chain tip commitment)? This wouldn't catch a protocol upgrade, but in that situation won't we remain broken even if we regenerate?

@SantiagoPittella
Collaborator Author

> In theory an alternative would be to separately monitor the genesis hash (or the latest known local chain tip commitment)? This wouldn't catch a protocol upgrade, but in that situation won't we remain broken even if we regenerate?

The thing is that sometimes we restart the service but not the accounts after a protocol upgrade, causing the service to continue with the old accounts.

@JereSalo JereSalo self-requested a review April 16, 2026 19:23
Collaborator

@JereSalo JereSalo left a comment

I see that in run_counter_tracking_task we assign the counter_account variable. I wonder whether, after re-creating the accounts, it keeps pointing to the previous account, in which case we should update it. Perhaps there's something I'm missing though.

Not for this PR but maybe we should consider adding tests for these kinds of behavior (if you find it's worth it and not too complex).
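One way to avoid the stale-account concern raised above is for both tasks to hold a shared handle and read the current account on each iteration. The sketch below is only an illustration under assumptions: `CounterAccount` and `SharedCounterAccount` are hypothetical stand-ins, and the PR may already handle this differently.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for the real counter account type.
#[derive(Clone, Debug, PartialEq)]
pub struct CounterAccount {
    pub id: u64,
}

/// Cloneable handle; every clone observes the same underlying account.
#[derive(Clone)]
pub struct SharedCounterAccount(Arc<RwLock<CounterAccount>>);

impl SharedCounterAccount {
    pub fn new(account: CounterAccount) -> Self {
        Self(Arc::new(RwLock::new(account)))
    }

    /// Returns a snapshot of the account that is current right now.
    pub fn current(&self) -> CounterAccount {
        self.0.read().expect("lock poisoned").clone()
    }

    /// Swaps in a freshly regenerated account for all holders of the handle.
    pub fn replace(&self, account: CounterAccount) {
        *self.0.write().expect("lock poisoned") = account;
    }
}
```

Under this design the tracking task would call `current()` at the top of each loop iteration instead of capturing the account once at startup.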

consecutive_failures,
"re-sync ineffective, regenerating accounts from scratch"
);
last_regeneration = Some(Instant::now());
Collaborator

We could move this inside the Ok path of try_regenerate_accounts, so that if it fails with an error for some reason, we can retry after a short delay instead of having to wait an hour.

Collaborator

Perhaps it was intentional, though, and we still want to wait after a failure, but I'll leave this here just in case.
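The suggestion in this thread, updating the timestamp only on the Ok path, could look roughly like this. It is a sketch under assumptions: `regenerate_and_stamp` is a hypothetical helper, and the `attempt` closure stands in for `try_regenerate_accounts`, whose real signature is not shown in this conversation.

```rust
use std::time::Instant;

/// Runs a regeneration attempt and records the time only if it succeeds.
/// `attempt` is a hypothetical stand-in for `try_regenerate_accounts`.
pub fn regenerate_and_stamp<E>(
    last_regeneration: &mut Option<Instant>,
    attempt: impl FnOnce() -> Result<(), E>,
) -> Result<(), E> {
    // On error, return early and leave the timestamp untouched,
    // so the hourly rate limit does not block the next retry.
    attempt()?;
    *last_regeneration = Some(Instant::now());
    Ok(())
}
```

The trade-off noted in the follow-up comment still applies: without the timestamp update on failure, a persistently failing regeneration is retried as often as the failure threshold is reached, which may or may not be desirable when the network itself is down.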

Development

Successfully merging this pull request may close these issues.

Monitor: Auto-regenerate accounts after persistent failures

3 participants