agentbeats

Here are 11 public repositories matching this topic...

RDI-Foundation / amber

Capability-based compiler/runner for reproducible agent scenarios

benchmark multi-agent agentic-ai agentbeats

Updated May 22, 2026
Rust

Samir-atra / code_translator_green_agent

Code translator green agent (judge) for agent-as-a-judge evaluation of programming-language translation.

ai leaderboard evaluation gemini-api agentic-ai googleadk agentbeats

Updated Jan 28, 2026
Python

yonghongzhang-io / green-comtrade-bench-v2

Deterministic offline ComtradeBench judge for evaluating agent robustness under pagination, retries, duplicates, page drift, and totals traps.

benchmark comtrade data-quality agentbeats deterministic-evaluation api-faults robust-agents

Updated Mar 23, 2026
Python

abhishec / purple-agent-business-process-worker

Business Process AI Worker · τ²-Bench #1 globally (3/3, 100%) · CRMArenaPro Run 8 · Reflexive Agent Architecture

agent crm multi-agent workflow-automation claude ai-agent llm anthropic llm-agent agentbeats business-process-agent tau2-bench

Updated Apr 30, 2026
Python

shikibuton10x / tau2-green-agent

A2A-compatible Green Agent that runs τ²-bench end-to-end on AgentBeats

docker benchmark uv a2a tau2 agentbeats

Updated Jan 16, 2026
Python

dakshdoesdev / sre-enginnerllm

Fault-injecting OpenEnv training environment for vibe-coded SaaS incidents. 30 scenarios grounded in 2025-26 production failures. Drop-in OpenClaw-RL pool server. Claude Code skill included.

reinforcement-learning hackathon incident-response sre llm-agents grpo vibe-coding claude-code openenv agentbeats openenv-environment openenv-hackathon vibe-coded-saas

Updated Apr 25, 2026
Python

PowerForYou74 / agentbeats-debate-leaderboard

AgentBeats Debate Leaderboard — CellRepair AI Purple Agent (96.5% Win Rate, #1 Ranking)

competition leaderboard debate ai-agents llm agentbeats cellrepair

Updated Feb 18, 2026
Python

yonghongzhang-io / purple-comtrade-baseline-v2

Baseline purple agent for the ComtradeBench benchmark: UN Comtrade tool-use under adversarial API conditions.

benchmark comtrade tool-use llm-agent agentbeats baseline-agent

Updated Jan 27, 2026
Python

PowerForYou74 / cellrepair-agentx-purple

CellRepair AI – AgentX Purple Agent. 3-Layer Fallback. Zero Downtime.

competition multi-agent debate autonomous-agents ai-agents llm agentbeats cellrepair

Updated Feb 20, 2026
Python

Clinical-Quality-Artifical-Intelligence / NurseSim-RL

AI-powered clinical triage simulation using the Manchester Triage System (MTS). OpenEnv Challenge 2026 entry with A2A protocol support.

docker reinforcement-learning ai healthcare llama nursing medical-education triage gymnasium huggingface clinical-ai agent-to-agent openenv agentbeats manchester-triage

Updated Feb 26, 2026
Jupyter Notebook

yonghongzhang-io / agentbeats-leaderboard-v2

Leaderboard infrastructure for the ComtradeBench / AgentBeats agent-evaluation benchmark: task definitions, submission flow, and scoring.

benchmark leaderboard comtrade agent-evaluation agentbeats

Updated Jan 31, 2026
Python
