Microsoft's new generative AI agents are more autonomous than Copilot

By Axios | Created at 2024-10-21 09:35:53 | Updated at 2024-10-21 11:40:08

While some are eager to declare the era of autonomous agents upon us, the reality is a bit more complicated — and that's probably a good thing.

Why it matters: Giving autonomy to generative AI tools opens up a range of tantalizing possibilities for increased productivity, but also vastly increases the potential for catastrophic risk.


Driving the news: Microsoft on Monday announced a new series of semi-autonomous agents that business customers can either configure to their liking or use straight out of the box.

  • Microsoft's fresh crop of agents will qualify sales leads, communicate with suppliers and understand customer intent. Next month Microsoft will release a tool in public preview that will allow users to customize agents in Copilot Studio.
  • Some of the new agents are included as part of Microsoft's $30-per-worker-per-month Microsoft 365 Copilot. Other agents work within Microsoft Dynamics and other tools and are priced separately ($200 for up to 25,000 messages per month).

The big picture: Agents that can act autonomously (within confined boundaries) are the logical next evolution of generative AI, which has thus far largely been limited to providing information for humans to act on.

  • Agents, by contrast, are designed to operate partly or entirely without direct human intervention, though best practices call for thorough testing and close oversight.
  • Sierra, a start-up from former Salesforce executive and OpenAI chair Bret Taylor and ex-Google exec Clay Bavor, has been focused on AI agents from the start, while Salesforce, Google and others have been heavily touting the approach only in recent weeks.

Zoom in: On the plus side, agents can work 24/7 and a small number of humans can theoretically oversee vast numbers of AI agents. Even with a great AI assistant, there are finite limits to human productivity.

  • The risk of agents, however, is that generative AI, by its nature, doesn't always respond predictably — and without a human to approve each action, it could act in harmful ways. A Google DeepMind paper from this year highlighted such concerns.
  • Companies try to mitigate this by having agents perform a small set of known tasks, often with specific rules and limits. For example, an AI customer service agent might be able to answer a range of questions about orders, but only provide refunds or discounts up to a set amount (see the sketch after this list).
  • Some companies learned this lesson on guardrails the hard way by empowering chatbots to, for example, sell a plane ticket or car for well below normal cost.
  • Being able to highlight when human intervention is necessary is key, says Microsoft corporate VP Charles Lamanna. "Because if, say, the agent can do the work 90% of the time, if it didn't have a way to call a person to help with that 10%, you couldn't actually use it."
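
To make the guardrail and escalation pattern above concrete, here is a minimal sketch in Python. It is not Microsoft's implementation, and the MAX_AUTO_REFUND cap, AgentAction type, and handle_action function are hypothetical names; the sketch only illustrates how an agent can act freely within fixed limits and hand a request to a person when it falls outside them.

```python
# Hypothetical sketch of the guardrail-plus-escalation pattern described above.
# None of these names come from Microsoft's products; they are illustrative only.
from dataclasses import dataclass

MAX_AUTO_REFUND = 50.00  # assumed policy cap: refunds above this amount need a person


@dataclass
class AgentAction:
    kind: str          # e.g. "answer_question" or "issue_refund"
    amount: float = 0.0
    details: str = ""


def handle_action(action: AgentAction) -> str:
    """Apply simple rules: let the agent act within limits, escalate otherwise."""
    if action.kind == "answer_question":
        return f"Agent replies directly: {action.details}"
    if action.kind == "issue_refund":
        if action.amount <= MAX_AUTO_REFUND:
            return f"Agent issues ${action.amount:.2f} refund automatically."
        # Requests the agent can't finish on its own are routed to a human reviewer.
        return f"Escalated to human: refund of ${action.amount:.2f} exceeds the cap."
    return "Escalated to human: unrecognized action."


if __name__ == "__main__":
    print(handle_action(AgentAction("issue_refund", amount=25.0)))   # auto-approved
    print(handle_action(AgentAction("issue_refund", amount=400.0)))  # escalated
```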

Between the lines: Microsoft says it sees agents as separate from, and an addition to, highly personalized AI copilots that help an individual worker with their tasks.

  • "You don't want just a copilot or just an agent," Lamanna told Axios. "You want both."
  • "You'll want every employee to have an AI copilot to make them more productive," he said. "And you'll want every process or every department to have a multitude of agents which are able to complete tasks."

The other side: Salesforce CEO Marc Benioff, meanwhile, has been bashing both the notion of a copilot and Microsoft's interpretation, comparing it to Clippy, the company's ill-fated Office assistant.

  • "When you look at how Copilot has been delivered to customers, it's disappointing.," he said on X (formerly Twitter) this past week, echoing comments he has been making since August. "It just doesn't work, and it doesn't deliver any level of accuracy."
  • Benioff also claims that Copilots are spilling corporate data, a charge that Microsoft strongly denies.

What's next: Consumer chatbots aren't going anywhere, nor are copilots, but AI agents will likely start to attract even more of the buzz. Over time, their purview could also grow to handling larger tasks and taking more steps without involving people.
