Less is more: Meta study shows shorter reasoning improves AI accuracy by up to 34%


Researchers from Meta’s FAIR team and The Hebrew University of Jerusalem have discovered that forcing large language models to “think” less actually improves their performance on complex reasoning tasks.

The study released today found that shorter reasoning processes in AI systems lead to more accurate results while significantly reducing computational costs.

“In this work, we challenge the assumption that long thinking chains [result] in better reasoning capabilities,” write the authors in their paper titled “Don’t Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning.”

The research contradicts the prevailing trend in AI development, where companies have invested heavily in scaling up computing resources to allow models to perform extensive reasoning through lengthy “thinking chains” — detailed step-by-step trajectories that AI systems use to solve complex problems.

AI accuracy jumps up to 34% when models use shorter reasoning chains

The researchers discovered that within the same reasoning task, “shorter reasoning chains are significantly more likely to yield correct answers — up to 34.5% more accurate than the longest chain sampled for the same question.” This finding held true across multiple leading AI models and benchmarks.
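To make that comparison concrete, here is a minimal Python sketch of how such a per-question evaluation could be scored. It assumes a hypothetical record of sampled chains with lengths and correctness labels; it is illustrative only, not the paper’s actual evaluation harness:

```python
def shortest_vs_longest_accuracy(per_question_chains):
    """per_question_chains: for each question, a list of sampled chains,
    each a dict {"length": tokens_in_chain, "correct": bool}.
    Returns (accuracy of shortest chains, accuracy of longest chains)."""
    short_hits = long_hits = 0
    for chains in per_question_chains:
        # Compare the shortest and longest chain sampled for the same question.
        short_hits += min(chains, key=lambda c: c["length"])["correct"]
        long_hits += max(chains, key=lambda c: c["length"])["correct"]
    n = len(per_question_chains)
    return short_hits / n, long_hits / n
```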

“While demonstrating impressive results, [extensive reasoning] incurs significant computational costs and inference time,” the authors note, pointing to a substantial inefficiency in how these systems are currently deployed.

Based on these findings, the team developed a novel approach called “short-m@k,” which executes multiple reasoning attempts in parallel but halts computation once the first few processes complete. The final answer is then selected through majority voting among these shorter chains.
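As a rough illustration of the idea, the following Python sketch launches k reasoning attempts concurrently, keeps the first m that finish, and majority-votes over their answers. Here `generate_chain` is an assumed async stub standing in for a sampled model call; this is a sketch of the described procedure, not the authors’ implementation:

```python
import asyncio
from collections import Counter

async def short_m_at_k(generate_chain, question, k=6, m=3):
    """Majority-vote over the first m of k parallel reasoning chains.

    generate_chain: assumed async callable that samples one thinking
    chain for `question` and returns its final answer string.
    """
    tasks = [asyncio.ensure_future(generate_chain(question)) for _ in range(k)]
    answers = []
    for finished in asyncio.as_completed(tasks):
        answers.append(await finished)
        if len(answers) >= m:  # first m to finish ~ the shortest chains
            break
    for t in tasks:            # stop the longer, still-running chains
        t.cancel()
    return Counter(answers).most_common(1)[0][0]
```

Because decoding time grows with chain length, completion order serves in this sketch as a cheap proxy for chain length; setting m=1 would correspond to the short-1@k variant.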

New ‘short-m@k’ method slashes computing costs by 40% while boosting performance

For organizations deploying large AI reasoning systems, the implications could be substantial. The researchers found their method could reduce computational resources by up to 40% while maintaining the same level of performance as standard approaches.

“Short-3@k, while slightly less efficient than short-1@k, consistently surpasses majority voting across all compute budgets, while still being substantially faster (up to 33% wall time reduction),” the paper states.

Michael Hassid, the paper’s lead author, and his team also discovered that training AI models on shorter reasoning examples improved their performance — challenging another fundamental assumption in AI development.

“Training on the shorter ones leads to better performance,” the researchers write. “Conversely, finetuning on S1-long increases reasoning time with no significant performance gains.” (S1 refers to a public reasoning dataset; S1-long is the variant built from its longer thinking chains.)
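A simple way to act on that observation, sketched below, is to keep only the shortest correct chain per question when assembling a finetuning set. The field names are hypothetical and this is not the paper’s S1 pipeline, just the general filtering idea:

```python
def build_short_sft_set(samples):
    """samples: dicts with hypothetical keys "question", "chain",
    "answer", and "correct"; keeps the shortest correct chain per
    question as a (prompt, completion) finetuning pair."""
    best = {}
    for s in samples:
        if not s["correct"]:
            continue
        q = s["question"]
        if q not in best or len(s["chain"]) < len(best[q]["chain"]):
            best[q] = s
    return [
        {"prompt": s["question"], "completion": s["chain"] + "\n" + s["answer"]}
        for s in best.values()
    ]
```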

Tech giants could save millions by implementing “don’t overthink it” approach

The findings come at a critical time for the AI industry, as companies race to deploy increasingly powerful models that consume enormous computational resources.

“Our findings suggest rethinking current methods of test-time compute in reasoning LLMs, emphasizing that longer ‘thinking’ does not necessarily translate to improved performance and can, counter-intuitively, lead to degraded results,” the researchers conclude.

This research stands in contrast to other prominent approaches. Previous influential studies, including Google’s work on “chain-of-thought” prompting and “self-consistency” methods, have generally advocated for more extensive reasoning processes. It also builds upon recent work like Princeton and Google DeepMind’s “Tree of Thoughts” framework and Carnegie Mellon’s “Self-Refine” methodology, which have explored different approaches to AI reasoning.

For technical decision makers evaluating AI investments, the research suggests that bigger and more computationally intensive isn’t always better. The study points toward potential cost savings and performance improvements by optimizing for efficiency rather than raw computing power.

In an industry obsessed with scaling up, it turns out that teaching AI to be more concise doesn’t just save computing power — it makes the machines smarter too. Sometimes, even artificial intelligence benefits from the age-old wisdom: don’t overthink it.



