Update the upload benchmark workflow to work with the BenchmarkLauncher #584

### Problem Description

Once #570 is resolved, we will have all the functionality needed to launch, manage, and terminate benchmarks. The goal of this issue is to update the logic for uploading internal benchmark results.

One of the main problems we want to address is the case where some instances run into out-of-memory (OOM) errors. Right now, when that happens, we have to manually add results for the missing jobs before the workflow can succeed. We want to remove this manual step and make sure benchmark results are uploaded after a defined amount of time, even if some instances did not finish or failed unexpectedly.

### Expected behavior

- When launching our internal benchmarks, we should save the BenchmarkLauncher instance used to launch the benchmark.
- Update the logic used by the [upload benchmark workflow](https://github.com/sdv-dev/SDGym/blob/main/.github/workflows/upload_benchmark_results.yml). The idea is:
   - Load the saved BenchmarkConfig daily.
   - Check the status of each instance (Running / Completed / Stopped).
   - If all instances have completed, upload the results.
   - Otherwise, if some instances are still running but we have reached the deadline for uploading the results (for example, the timeout plus one extra day), stop all remaining instances. Then update the results files: for each missing job, the `Error` column must be set to `Instance Error`.
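
The daily check described above could be sketched roughly as follows. This is only an illustration of the decision logic, not the BenchmarkLauncher API: `Status`, `decide_action`, `fill_missing_jobs`, and the `Synthesizer` / `Dataset` keys are hypothetical names chosen for the example, and results are modeled as plain dictionaries. Only the `Error` column and the `Instance Error` value come from this issue. The sketch also assumes that, once the deadline has passed, results are uploaded whenever any instance has not completed, whether it is still running or already stopped.

```python
from datetime import datetime, timedelta
from enum import Enum


class Status(Enum):
    """Hypothetical instance states, mirroring Running / Completed / Stopped."""
    RUNNING = 'Running'
    COMPLETED = 'Completed'
    STOPPED = 'Stopped'


def decide_action(statuses, launched_at, timeout, now, grace=timedelta(days=1)):
    """Return 'upload', 'stop_and_upload', or 'wait' for one daily check.

    The deadline is assumed to be launch time + timeout + a grace period
    (one extra day here, as in the example given in the issue).
    """
    if all(status is Status.COMPLETED for status in statuses):
        return 'upload'
    if now >= launched_at + timeout + grace:
        return 'stop_and_upload'
    return 'wait'


def fill_missing_jobs(results, expected_jobs):
    """Append a row with Error='Instance Error' for every job lacking a result.

    `results` is a list of dicts; `expected_jobs` is a list of
    (synthesizer, dataset) pairs. Both shapes are assumptions for this sketch.
    """
    seen = {(row['Synthesizer'], row['Dataset']) for row in results}
    filled = list(results)
    for synthesizer, dataset in expected_jobs:
        if (synthesizer, dataset) not in seen:
            filled.append({
                'Synthesizer': synthesizer,
                'Dataset': dataset,
                'Error': 'Instance Error',
            })
    return filled
```

For example, with a three-day timeout, a check made four days after launch with one instance still running would return `'stop_and_upload'`. The workflow would then stop the remaining instances and call something like `fill_missing_jobs` before uploading.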

### Additional context

In the future, we might consider saving the logs when deleting the remaining instances so we can parse them and identify the exact error that killed the kernel or disrupted the run. This is out of scope for this issue.

