Bugs, Debugs, and Rebugs: AI Does Great Work

April 21, 2025

AI algorithms have already been assimilated into everyday technology, but AI still has problems. You could say there’s a bug in its code. TechCrunch tells us that “AI Models Still Struggle To Debug Software, Microsoft Study Shows.” Large AI models from companies such as Anthropic and OpenAI are used for programming. Mark Zuckerberg of Meta plans to deploy AI coding models at his company, while Sundar Pichai, the Google CEO, said that 25% of new code at Google is AI generated.

AI algorithms are great at automating tasks, but they shouldn’t be relied on 100% for all programming projects. Microsoft Research released a new study that found AI models like Claude 3.7 Sonnet and o3-mini fail to resolve many of the debugging problems in SWE-bench Lite, a software development benchmark. Humans still beat technology when it comes to coding. Here’s what the study did and found:

“The study’s co-authors tested nine different models as the backbone for a “single prompt-based agent” that had access to a number of debugging tools, including a Python debugger. They tasked this agent with solving a curated set of 300 software debugging tasks from SWE-bench Lite. According to the co-authors, even when equipped with stronger and more recent models, their agent rarely completed more than half of the debugging tasks successfully. Claude 3.7 Sonnet had the highest average success rate (48.4%), followed by OpenAI’s o1 (30.2%), and o3-mini (22.1%).”

What is the problem? It’s one that AI has faced since it was first programmed: a lack of training data, in this case data that shows how people actually work through a debugging session.

Other studies show that AI-generated code can introduce security vulnerabilities too. Is anyone surprised? (Just the AI marketers who do not understand why their assertions don’t match reality.)

Whitney Grace, April 21, 2025
