Rebuilding Alexa: How Amazon is mixing models, agents and browser-use for smarter AI


Amazon is betting on agent interoperability and model mixing to make the new Alexa more effective, retooling its flagship voice assistant with agentic capabilities and the ability to complete tasks in a browser.

This new Alexa has been rebranded to Alexa+, and Amazon is emphasizing that this version “does more.” For instance, it can now proactively tell users if a new book from their favorite author is available, or that their favorite artist is in town — and even offer to buy a ticket. Alexa+ reasons through instructions and taps “experts” in different knowledge bases to answer user questions and complete tasks like “Where is the nearest pizza place to the office? Will my coworkers like it? — Make a reservation if you think they will.”

In other words, Alexa+ blends AI agents, computer use capabilities and knowledge it learns from the larger Amazon ecosystem to be what Amazon hopes is a more capable and smarter home voice assistant. 

Alexa+ currently runs on Amazon’s Nova models and models from Anthropic. However, Daniel Rausch, Amazon’s VP of Alexa and Echo, told VentureBeat that the device will remain “model agnostic” and that the company could introduce other models (at least models available on Amazon Bedrock) to find the best one for accomplishing tasks.

“[It’s about] choosing the right integrations to complete a task, figuring out the right sort of instructions, what it takes to actually complete the task, then orchestrating the whole thing,” said Rausch. “The big thing to understand about it is that Alexa will continue to evolve with the best models available anywhere on Bedrock.”

What is model mixing?

Model mixing, also called model routing, lets enterprises and other users choose the most appropriate AI model on a query-by-query basis. Developers increasingly turn to model mixing to cut costs. After all, not every prompt needs to be answered by a reasoning model, and some models perform certain tasks better than others. 

Amazon’s cloud and AI unit, AWS, has long been a proponent of model mixing. Recently, it announced a feature on Bedrock called Intelligent Prompt Routing, which directs prompts to the best model and model size to resolve the query.
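Bedrock's Intelligent Prompt Routing is a managed feature, but the basic idea behind query-by-query routing can be sketched in a few lines of Python. The model names and the complexity heuristic below are purely illustrative, not Amazon's actual routing logic:

```python
# Illustrative sketch of query-by-query model routing.
# Model identifiers and the scoring heuristic are hypothetical.

# Keywords that suggest the query needs multi-step reasoning.
REASONING_HINTS = ("why", "plan", "compare", "step by step", "reserve")

def route(prompt: str) -> str:
    """Pick a model for this prompt: a cheap, fast model for simple
    lookups, a stronger (pricier) model when the query needs reasoning."""
    needs_reasoning = (
        len(prompt.split()) > 40
        or any(hint in prompt.lower() for hint in REASONING_HINTS)
    )
    return "large-reasoning-model" if needs_reasoning else "small-fast-model"

print(route("Turn on the kitchen lights"))
# -> small-fast-model
print(route("Compare nearby pizza places and plan a team dinner"))
# -> large-reasoning-model
```

In a production router the "heuristic" would itself typically be a small classifier model scoring each prompt, but the cost logic is the same: send the easy queries to the cheap model and reserve the expensive one for queries that need it.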

And it appears to be working. “I can tell you that I cannot say for any given response from Alexa on any given task what model it’s using,” said Rausch. 

Agentic interoperability and orchestration

Rausch said Alexa+ brings agents together in three different ways. The first is the traditional API; the second is deploying agents that can navigate websites and apps like Anthropic’s Computer Use; the third is connecting agents to other agents. 
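The three integration paths Rausch describes can be sketched as a simple dispatcher, with an orchestrating model deciding which path fits each task. Every class and function name below is hypothetical; none of this is a real Alexa or Bedrock API:

```python
# Hypothetical sketch of the three agent-integration paths:
# (1) traditional API calls, (2) computer-use agents that drive a
# browser, and (3) delegation to external third-party agents.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Task:
    description: str
    kind: str  # "api", "browser", or "agent"

def call_api(task: Task) -> str:
    # Path 1: a traditional API integration (e.g. a ticketing endpoint).
    return f"API call for: {task.description}"

def drive_browser(task: Task) -> str:
    # Path 2: a computer-use agent navigating a website for the user.
    return f"Browser session for: {task.description}"

def delegate_to_agent(task: Task) -> str:
    # Path 3: hand off to a third-party agent, possibly built on a
    # different underlying model.
    return f"Delegated to external agent: {task.description}"

HANDLERS: dict[str, Callable[[Task], str]] = {
    "api": call_api,
    "browser": drive_browser,
    "agent": delegate_to_agent,
}

def orchestrate(task: Task) -> str:
    """In Alexa+, a state-of-the-art LLM plays this role: picking an
    integration path per task. Here the choice is just a dict lookup."""
    return HANDLERS[task.kind](task)

print(orchestrate(Task("buy concert tickets", "api")))
print(orchestrate(Task("book a table at the pizza place", "browser")))
```

The real system replaces the dict lookup with an LLM that reasons about which integration can actually complete the task, but the shape of the orchestration layer is the same.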

“But at the center of it all, orchestrating across all those different kinds of experiences are these baseline, very capable, state-of-the-art LLMs,” said Rausch. 

He added that if a third-party application already has its own agent, that agent can still talk to the agents working inside Alexa+ even if the external agent was built using a different model. 

Rausch emphasized that the Alexa team used Bedrock’s tools and technology, including new multi-agent orchestration tools.

Anthropic CPO Mike Krieger told VentureBeat that earlier versions of Claude couldn’t have accomplished what Alexa+ aims to do. 

“A really interesting ‘Why now?’ moment is apparent in the demo, because, of course, the models have gotten better,” said Krieger. “But if you tried to do this with 3.0 Sonnet or our 3.0 level models, I think you’d struggle in a lot of ways to use a lot of different tools all at once.”

Although neither Rausch nor Krieger would confirm which specific Anthropic model Amazon used to build Alexa+, it’s worth pointing out that Anthropic released Claude 3.7 Sonnet on Monday, and it is available on Bedrock. 

Large investments in AI 

Many users’ first brush with AI came through voice assistants like Alexa, Google Home or even Apple’s Siri. These assistants let people offload simple tasks, like turning on lights. I do not own an Alexa or Google Home device, but I learned how convenient having one could be when staying at a hotel recently. I could tell Alexa to stop the alarm, turn on the lights and open a curtain while still under the covers. 

But while Alexa, Google Home devices, and Siri became ubiquitous in people’s lives, they began showing their age when generative AI became popular. Suddenly, people wanted more real-time answers from AI assistants and demanded smarter task resolutions, such as adding multiple meetings to calendars without the need for much prompting.

Amazon admitted that the rise of gen AI, especially agents, has made it possible for Alexa to finally meet its potential. 

“Until this moment, we were limited by the technology in what Alexa could be,” Panos Panay, Amazon’s devices and services SVP, said during a demo. 

Rausch said the hope is that Alexa+ will keep improving, take on new models and make more people comfortable with what the technology can do. 
