2024-04-01: Coding Agents Tackle GitHub Issues
Fast Approaching Cliff for Computer Science
Here’s today at a glance:
🗃️ Coding Agents Tackle GitHub Issues
Paper: MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue ReSolution
Who: Fudan University et al.
What: A multi-agent AI system that automatically resolves GitHub issues requiring code changes across an entire repository
How:
MAGIS consists of four types of AI agents that collaborate:
Manager: Plans the resolution and assembles a team of developer agents
Repository Custodian: Locates code files relevant to the issue
Developers: Modify the code based on the plan
Quality Assurance (QA): Reviews and validates the code changes
The agents work together to analyze issues, find relevant files, create a plan, implement code changes, and verify the modifications. The inspiration comes from GitHub Flow, a commonly used workflow paradigm.
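To make the division of labor concrete, here is a minimal sketch of the four roles as plain Python functions. It is not the paper's implementation: every name below (Issue, Task, locate_files, plan_tasks, develop, qa_review, resolve) is hypothetical, and each LLM-backed step is replaced by a trivial stub so the control flow can be run as-is.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Issue:
    title: str
    body: str


@dataclass
class Task:
    path: str
    instruction: str


def locate_files(issue: Issue, repo: Dict[str, str]) -> List[str]:
    """Repository Custodian (stub): keep files that mention a keyword from the issue title."""
    keywords = {w.lower() for w in issue.title.split() if len(w) > 3}
    return sorted(p for p, src in repo.items() if any(k in src.lower() for k in keywords))


def plan_tasks(issue: Issue, paths: List[str]) -> List[Task]:
    """Manager (stub): one task, and so one developer, per located file."""
    return [Task(path=p, instruction=f"Address '{issue.title}' in {p}") for p in paths]


def develop(task: Task, source: str) -> str:
    """Developer (stub): a real agent would ask an LLM for the edit; here we append a marker."""
    return source + f"\n# TODO: {task.instruction}\n"


def qa_review(old: str, new: str) -> bool:
    """QA (stub): accept any candidate that actually changes the file."""
    return new != old


def resolve(issue: Issue, repo: Dict[str, str], max_rounds: int = 2) -> Dict[str, str]:
    """Run locate -> plan -> develop -> review and return the patched repository."""
    patched = dict(repo)
    for task in plan_tasks(issue, locate_files(issue, repo)):
        for _ in range(max_rounds):
            candidate = develop(task, repo[task.path])
            if qa_review(repo[task.path], candidate):
                patched[task.path] = candidate
                break  # QA approved; move on to the next task
    return patched
```

Calling resolve with an issue titled "Parse crashes on empty input" and a two-file repository would touch only the file whose source mentions one of the title's keywords. In a real system each stub would be a model call, and the QA step would presumably send feedback to the developer instead of a yes/no answer.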
The research group:
Evaluated MAGIS on the SWE-bench dataset of real GitHub issues and compared it to LLMs like GPT-3.5, GPT-4, and Claude-2
MAGIS successfully resolved 13.94% of issues, an 8-fold improvement over applying GPT-4 directly, demonstrating the effectiveness of the multi-agent approach
Ablation studies showed that the QA agent and the ability to use information from comments each boosted performance
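For a sense of scale, the two numbers above pin down a rough baseline. The snippet below is back-of-the-envelope arithmetic only, and it assumes "8-fold" means exactly 8x, which is likely a rounded figure; the variable names are illustrative.

```python
# Figures quoted in the summary above
magis_rate = 13.94  # percent of issues resolved by MAGIS
fold = 8            # assumption: "8-fold" taken as exactly 8x

# Implied resolution rate for GPT-4 applied directly
implied_baseline = magis_rate / fold
print(f"{implied_baseline:.2f}%")  # prints 1.74%
```

In other words, the direct GPT-4 baseline resolves somewhere under 2 in 100 issues, against roughly 14 in 100 for the multi-agent setup.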
MAGIS marks a significant step towards using AI to automatically resolve complex software issues that span multiple files and require understanding the relationships between code components.
If we are at 14% now... we will probably fully solve coding within 24 months at this rate. Ridiculous.
🗞️ Things Happen
Cryptic post from the head of Tesla’s self-driving team






