Repositories
- skillsbench Public
SkillsBench evaluates how well skills work and how effective agents are at using them
- mini-swe-agent Public Forked from SWE-agent/mini-swe-agent
The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo, but scores >74% on SWE-bench verified!
- benchflow Public
Framework for creating high fidelity and complex RL environments and evaluation tasks
- skillsbench-leaderboard Public
- benchmarks Public
- skillsbench-trajectories Public
- mockflow Public
- agent-client-protocol Public Forked from agentclientprotocol/agent-client-protocol
A protocol for connecting any editor to any agent
- cli Public Forked from googleworkspace/cli
Google Workspace CLI: one command-line tool for Drive, Gmail, Calendar, Sheets, Docs, Chat, Admin, and more. Dynamically built from Google Discovery Service. Includes AI agent skills.