a16z's latest perspective: When AI Agents become the primary users of software

Written by: DeepThink Circle

Have you ever wondered whether the entire logic behind how we build software might need to be completely overturned? For the past few decades, all software has been designed for humans. We’ve poured enormous effort into optimizing user interfaces—so buttons are easier to find, menus are clearer, and workflows run more smoothly. But what if, in the future, the main users of software are no longer humans, but AI agents? If a company has 100 people, but 1,000 AI agents doing the work, should we still focus our main effort on optimizing human interfaces?

Recently, on an episode of a16z’s podcast, Erik Torenberg, Steven Sinofsky, and Martin Casado had an extremely insightful conversation with Box CEO Aaron Levie. The core issue they discussed was this: when AI agents become the primary users of enterprise software, how will the entire software industry be restructured? This conversation made me realize that we’re on the edge of a paradigm shift that’s more intense than most people imagine. This isn’t simply about adding AI features to existing software; it’s about fundamentally rethinking how software should be built, how it should interact, and how it should be used.

Software Must Be Built for AI Agents

Aaron Levie put forward a viewpoint that made me think deeply: if you have 100 times or even 1,000 times more AI agents than employees, then your software must be built for agents. This isn’t a multiple-choice question—it’s an inevitable trend. Box is now spending about as much time thinking about agent interfaces as it does thinking about human interfaces. The speed of this shift is far beyond my expectations.

The underlying logic is actually quite simple. When AI agents become the primary users of software, they will interact with systems through APIs, the CLI (command-line interface), or protocols like MCP (Model Context Protocol). So what paradigm looks most effective today? Give a code-writing agent access to your SaaS tools, so it can reach your knowledge-work processes and contextual information. Such an agent can not only read and understand information; more importantly, it can complete tasks by writing code or calling APIs.
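To make the idea concrete, here is a minimal sketch of what an agent-facing tool surface might look like: a catalog of tools with machine-readable descriptions, plus a dispatcher for the agent's calls. The tool names, schemas, and dispatch shape are entirely illustrative assumptions, not any real vendor's or protocol's API.

```python
import json

# Hypothetical tool catalog an agent could discover and call.
TOOLS = {
    "search_documents": {
        "description": "Full-text search over the workspace.",
        "parameters": {"query": "string", "limit": "integer"},
    },
    "read_document": {
        "description": "Return the contents of one document by id.",
        "parameters": {"doc_id": "string"},
    },
}

def handle_tool_call(name: str, args: dict) -> str:
    """Dispatch an agent's tool call to a backend handler (stubbed here)."""
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    # A real server would validate args against the schema and hit storage;
    # this stub just echoes the call back as a successful result.
    return json.dumps({"tool": name, "args": args, "status": "ok"})

print(handle_tool_call("search_documents", {"query": "Q3 sales", "limit": 5}))
```

The point of this shape is that the agent reasons over the `description` fields and emits structured calls, so the "interface" is a contract of capabilities rather than a screen.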

Anthropic’s Claude Code, OpenAI’s developing super app, and Perplexity’s computing capabilities are all moving in this direction. I believe the compound growth of this kind of capability is just beginning. Imagine an agent that not only understands your request, “Help me analyze last quarter’s sales data,” but can also write code to extract the data, perform analysis, generate visual charts, and even proactively discover trends you hadn’t noticed. Where are the boundaries of this capability? I can’t see them clearly yet.

But there’s a key question that keeps nagging at me: people often say we should “build things for agents,” “market to agents,” and “have good APIs and interface description languages.” Martin Casado offered a counterargument that I strongly agree with: that framing is almost completely wrong. Why? Because agents are precisely best at finding the right tools and backend systems. They won’t choose you just because your API documentation looks pretty; they’ll choose based on substantive factors like cost parameters, system reliability, and data persistence. What agents have is the collective intelligence of humans using these platforms.

This viewpoint was an eye-opener for me. As an industry, we’re too focused on interfaces and APIs, while ignoring the essence: we need to build better systems themselves. Agents will push us back to the technical core, rather than marketing packaging. In the past, enterprise software purchasing decisions were often influenced by sales capability, brand influence, or even business banquets. But in an agent-led world, the weight of these factors will drop dramatically. Agents will make more rational choices based on technical merits. This is a huge opportunity for companies that truly focus on technology itself.

The Barrier of Algorithmic Thinking: Not Everyone Can Command an AI Agent

There was a discussion in the conversation that left a particularly strong impression on me: the real-world challenges of non-technical people using AI agents. Steven Sinofsky made a sharp point: algorithmic thinking is truly, very, very difficult for most people who have jobs. If you ask any person to draw a flowchart for the tasks they need to complete, they’re very likely to fail.

This observation hits the core. Imagine a team of 50 marketers handling a massive product line. Perhaps only one person truly understands and can document the entire workflow. If you put these collaboration tools or AI agents in front of ordinary employees and ask them to create automated workflows, their ability to clearly explain what needs to be done is actually very limited.

Aaron Levie’s response was: this is just work moving up one level—you need to learn a whole new set of skills. This is no different from every technological change in history. He gave an interesting example: an Anthropic growth marketer—one person—used Claude Code to do work that previously required five to ten people. This example matters because that person was already a systems thinker, and already understood enough technical concepts to pull it off.

But the key question is this: if you imagine an infinite pool of engineering resources beside every job role that can automatically handle any work the person wants to automate, what would that role look like in the future? I think this is a question worth thinking about. Maybe agents will become increasingly good at guiding users toward systematic thinking—but at least at this stage, only a small number of people can effectively use these tools.

Steven Sinofsky shared a brilliant analogy. His sister graduated from an elite business school and joined her first job right as the computing era was beginning. She hadn’t used spreadsheets in graduate school. In her first year of work, she was told she could hire any number of interns, so she managed an entire room of “agents”—those college students who did all the spreadsheet work. The magic part is that over the next few years, she and her colleagues became spreadsheet experts too. The abstraction layer moved upward. What interns used to do with calculators and HP financial calculators, she now does herself with spreadsheets—and she can do 30 iterations instead of just 2.

This story made me realize that we’re in a similar stage of AI agent development. You might think you need 50 small agents coordinated by one super-smart person. But soon, these will fold into a set of skills—a piece of code that we call an agent—one that understands marketing. You can ask it marketing-related questions, and then, for the next step, you let it execute tasks.

I think the key turning point is: right now, you have to be on the level of a rocket scientist to create 42 agents and get them all running. But that “rocket science” barrier will disappear quickly, and a large chunk of domain expertise will return to domain experts. This evolution path is almost identical to how spreadsheets evolved.

Enterprise Fears: A Nightmare of Uncontrolled Integration and Permissions

There was a scene in the conversation that really struck me. Aaron Levie said that he recently expressed a similar optimistic viewpoint in front of a room full of CFOs and CIOs, but then six people ran over and said, “You’re crazy—you’ve lost all credibility with me.” Why? Because he said the integration problem will become much easier.

These concerns from enterprise IT leaders are not without reason. They’re afraid not only of the agents themselves, but also of the fact that humans will be authorized to do integrations. When you let employees create new integrations, you’re basically saying, “Please come and break my core systems.” Imagine someone creating a new API connection between system 27 and system 38. If it’s only used to generate reports, then if that person makes a mistake, it’s on them. But what if it involves write operations?

Aaron Levie believes that for the next N years (where N is a very large number), agent integrations will be read-only. Many AI applications today sit at the consumption layer, with humans as the end consumers. But even at that level, enterprises face new challenges.

Box just rolled out an official CLI. Aaron described a scenario: you give Claude Code the Box CLI, and then you can interact with the entire Box system in natural language, with a powerful model like Opus 4.6 orchestrating a series of actions. It sounds cool: you can say, “Upload this entire folder on my desktop to Box,” or “Handle all documents in this folder,” and it can do it.

But the headaches are what come next. Imagine a company with 5,000 employees, where everyone can access a shared repository of engineering documents and marketing materials, and everyone is using the CLI. Now we face some very interesting new challenges: how do you coordinate potentially 10,000 requests per hour to the system? This isn’t just a performance issue; it’s about ensuring that when one person’s agent is moving files, another person’s agent isn’t simultaneously trying to write to the same folder while a third person’s agent tries to delete something. When these agents are running wild, this becomes a new issue that every CFO and CIO has to solve with hair-on-fire urgency.
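One classic answer to this coordination problem is pessimistic locking: before an agent mutates a shared resource, it must hold that resource's lock, so conflicting operations serialize instead of interleaving. The sketch below is a deliberately tiny, in-process illustration of the idea (the folder name and in-memory "storage" are made up); a real system would need distributed locks, leases, and timeouts.

```python
import threading

# One lock per shared folder; in production this would be a distributed
# lock service, not an in-process dict.
folder_locks = {"shared/marketing": threading.Lock()}
folder_contents = {"shared/marketing": []}  # stand-in for real storage

def agent_write(agent: str, folder: str, filename: str) -> None:
    # Hold the folder's lock so concurrent agents can't interleave
    # a move, a write, and a delete on the same folder.
    with folder_locks[folder]:
        folder_contents[folder].append(f"{filename} (by {agent})")

threads = [
    threading.Thread(
        target=agent_write,
        args=(f"agent-{i}", "shared/marketing", f"draft-{i}.md"),
    )
    for i in range(10)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(folder_contents["shared/marketing"]))  # all 10 writes land: 10
```

The trade-off is throughput: serializing every mutation per folder is exactly why 10,000 agent requests per hour against shared state becomes an architecture problem rather than a capacity problem.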

Aaron Levie encountered this problem himself during testing. When he tried to create a sample directory structure for a marketing plan, he got stuck in a loop, repeatedly creating nested folders. He joked, “I want to know what Box’s limit is on the depth of nested directories, because I’m about to hit it.”

This little aside reflects a bigger issue: when agents are given execution capability, they may do things we didn’t expect. And that kind of unpredictability is exactly what enterprises fear most.

Treating AI Agents Like Employees? It’s Not That Simple

There was a segment in the conversation about how to manage AI agents that I found especially interesting. When everyone starts using personal agents, they give them their own API keys and their own email addresses. So how do you prevent them from doing things they shouldn’t?

Martin Casado shared a practical approach: give your agent its own phone number, its own credit card (ideally a prepaid Visa card bought from CVS), and its own Gmail account. Gmail actually has many RBAC (role-based access control) permission mechanisms. You could argue that we’ve already built a lot of such permission systems, and that we should treat agents as an independent person.
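Treating the agent as "an independent person" means making it a first-class principal in an RBAC system: the agent gets its own identity, and permissions attach to that identity rather than to its owner. Here is a minimal sketch of such a check; the role names, permission strings, and layout are hypothetical.

```python
# Illustrative RBAC table: each agent identity maps to a set of
# "action:resource" permissions. Names are made up for the example.
ROLES = {
    "marketing-agent": {"read:marketing", "write:marketing"},
    "finance-agent": {"read:finance"},
}

def is_allowed(principal: str, action: str, resource: str) -> bool:
    """Check whether an agent principal may perform an action on a resource."""
    perms = ROLES.get(principal, set())
    return f"{action}:{resource}" in perms

print(is_allowed("marketing-agent", "write", "marketing"))  # True
print(is_allowed("finance-agent", "write", "finance"))      # False: read-only
```

The appeal of this model is that all the existing enterprise permission machinery applies unchanged; the flaw, as the next paragraph notes, is that agents don't behave like the employees that machinery was designed for.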

But Aaron Levie immediately pointed out the flaw with this model. In a 50-person team, will there be 100 “people” collaborating in the same shared space—50 humans and 50 agents? I obviously have full oversight of my agent, but what if my agent collaborates with others and accidentally accesses resources it shouldn’t? Now this autonomous, stateful agent is handling other people’s information.

There’s a fundamental contradiction here. When it comes to real employees, you can’t view their Slack channels, you can’t log in as them, and you can’t supervise every move they make. They are responsible for their own execution, and in the real world, you don’t get punished because they messed up something. But for agents, you bear responsibility for everything they do. You need to have complete oversight, and they don’t have privacy rights.

So some contradictions appear. I need to be able to grant the agent access permissions, but I also need to be able to log in as it at any time—for example, “No, you messed up the whole thing, I need to revoke all operations.” But if I can log in as it, how can it collaborate with other people in the real world while still keeping any information confidential or secure? That’s why, in practice, it’s almost impossible for an agent not to be an extension of you.

Aaron Levie also raised a deeper security issue: we still don’t know how to make agents keep secrets. If you tell an agent, “Don’t reveal information X in the context window,” there is no reliable way to enforce that. If anything can enter an agent’s context window because the agent has access to resources, then you should assume that information could leak through prompt injection.

What does that mean? It means that if I know your new agent’s email address, I can email it and social engineer it—which is ten times easier than social engineering a human. That makes it very risky to also give such an agent access to sensitive information like M&A documents.
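The conversation treats this as unsolved, and I agree; but the defensive posture it implies ("assume anything in the context can leak") at least suggests one crude mitigation: redact tagged secrets before any text enters the agent's context window. The tag format and function below are invented for illustration and would not stop a determined injection attack, only accidental exposure.

```python
import re

# Hypothetical convention: sensitive spans are tagged at the source.
SECRET_TAG = re.compile(r"\[SECRET\](.*?)\[/SECRET\]", re.DOTALL)

def sanitize_for_context(text: str) -> str:
    """Strip tagged secrets before text is handed to an agent's context."""
    return SECRET_TAG.sub("[REDACTED]", text)

doc = "Q3 plan. [SECRET]M&A target: Acme Corp[/SECRET] Budget: 2M."
print(sanitize_for_context(doc))  # Q3 plan. [REDACTED] Budget: 2M.
```

The obvious limitation is the one Levie names: once information is *in* the context window for any legitimate reason, no filter downstream of the model can guarantee it stays there.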

I think this is one of the biggest technical obstacles AI agents face today. Until this is fundamentally solved, agents will find it hard to be granted truly independent decision-making power and resource access. They will exist as extensions of humans, not as independent entities.

Startups’ Advantage: Embracing AI Agents Without Hesitation

There was a point in the conversation that really resonated with me: the diffusion speed of AI capabilities will be much slower than people in Silicon Valley realize. The reason is that the constraints faced by startups and large enterprises are completely different.

Aaron Levie said that we see startups able to build from scratch, without the risks we discussed, because they have nothing to break. That’s how we view our trajectory. But when you go to JPMorgan and ask how they would set up NanoClaw (a hypothetical AI agent) to automate their business, you’ll find a huge gap.

Where does this gap show? Large enterprises have 75 legacy systems that need integration, strict compliance requirements, decades of accumulated data security standards, and complex permission management mechanisms. More critically, they have too much at stake to afford failure. If a startup’s agent goes wrong, at worst it’s a joke, maybe even an episode of something like “Silicon Valley.” But if an agent at a big bank leaks customer data, it could be a catastrophe that leads to the company shutting down.

Steven Sinofsky made a great prediction: startups will burn through available capital while pretending that computation costs aren’t an issue. Many large companies will freeze out of fear and do nothing. Then ordinary employees will start buying and using these tools themselves—doing the things that big companies have money for but are too afraid to spend it on.

In between, there will be some companies that are willing to bet for all sorts of reasons, because their financial situation allows it. These companies will become leaders in their respective fields, as long as they can maintain financial health. There won’t be situations where no one dares to enter because the CFO is afraid of being fired. Mistakes by CFOs will happen, but that’s normal.

I think this will create a very interesting market split. Medium-sized companies that are willing to invest early and take risks may gain a competitive advantage over large enterprises. They have enough resources to invest in AI, but they aren’t constrained in the same way by legacy systems and risk aversion that giant companies face.

At the same time, a new batch of service companies will emerge. Imagine creating from zero a marketing agency, an engineering consultancy, or an architecture design company, built entirely on first principles of AI agents. There are no information barriers or boundaries: you can provide the agent with all the context it needs to get the work done, and you can write software on demand for specific needs. For a period of time, companies like this could be highly disruptive—until those larger existing enterprises can break free from their constraints.

Token Budget: The New Battlefield of Engineering Management

A discussion about token budgets in the conversation felt both realistic and absurd. Aaron Levie said, “Token budget discussions about engineering compute will be one of the craziest conversations in the next few years.”

Why would he say that? Because engineering expenses account for 14% to 30% of the revenue of any publicly listed technology company. If compute cost is twice as high as other engineering-related costs—or only 3% higher—then the difference could be the entire EPS (earnings per share) of the company.
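Levie's point is easy to check with back-of-envelope arithmetic. All the numbers below are hypothetical, chosen only to show the mechanism: when engineering is 14-30% of revenue, a few percentage points of compute overrun can rival a company's entire earnings per share.

```python
# Illustrative figures, not any real company's financials.
revenue = 10_000_000_000        # $10B annual revenue (hypothetical)
engineering_share = 0.20        # engineering at 20% of revenue (mid-range)
shares_outstanding = 500_000_000

engineering_cost = revenue * engineering_share   # $2B
extra_compute = engineering_cost * 0.03          # compute runs just 3% over
eps_impact = extra_compute / shares_outstanding

print(f"EPS impact: ${eps_impact:.2f} per share")  # $0.12 per share
```

For a company earning a few dimes per share, a swing of that size is the difference between beating and missing the quarter, which is why Levie expects token budgets to become board-level conversations.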

But Steven Sinofsky believes we still don’t have the answer, and CFOs always want answers to questions nobody can answer yet. Wall Street will force them to produce a number, they’ll be held responsible for it, they’ll get fired, and the cycle will continue. This is nothing new; we’ve gone through the same thing with every new resource, like internet bandwidth, vacuum tubes, transistors, and the number of programmers.

But Aaron Levie insists that this time is different. He made a good point: we have never had a moment where every end user in an organization has a fully elastic ability to spin up resources on their own behalf. And in many cases, spinning up those resources is completely reasonable.

This is indeed similar to the early-2000s cloud computing transition. Back then, we moved from CapEx (capital expenditures) to OpEx (operating expenditures), and then to effectively unlimited spending. Aaron recalled how, at Box back then, CFOs would say, “You don’t understand—we’re an agriculture company, we only understand CapEx,” or “We’re an OpEx company—we love the cloud.” Differences in accounting rules truly affect technology adoption.

But the token budget issue is even more granular. As an engineering leader, you now need to decide: should engineers factor compute budgets into every prompt? Do you want prompts that run for a long time, or short ones? Do you want to parallelize? How tolerant are you of wasting tokens?

Aaron said that his current stance is that it’s acceptable to waste a lot of tokens, because that means we’re trying new things. Should engineering leadership be happy to run 10 experiments in parallel? Even if it obviously wastes 90% of the tokens, you’ll still choose a successful path. Or should you tell the team to truly design a perfect system before they run it?
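The "waste 90% of the tokens" trade-off can be made explicit with a toy model. Assume (purely for illustration) that each experiment costs a fixed token budget and succeeds independently with some probability: running a batch of 10 in parallel burns more total tokens than iterating serially, but reaches a success in far fewer rounds of calendar time.

```python
# Toy model; all numbers are made up for illustration.
tokens_per_experiment = 1_000_000
p_success = 0.1   # each experiment independently succeeds 10% of the time

# Serial: expected attempts until first success is 1/p.
serial_rounds = 1 / p_success
serial_expected_tokens = serial_rounds * tokens_per_experiment

# Parallel batch of 10: pay for all 10 each round, but a round
# succeeds with probability 1 - (1 - p)^10.
batch = 10
p_batch = 1 - (1 - p_success) ** batch
parallel_rounds = 1 / p_batch
parallel_expected_tokens = parallel_rounds * batch * tokens_per_experiment

print(f"serial:   ~{serial_rounds:.1f} rounds, "
      f"{serial_expected_tokens:,.0f} tokens expected")
print(f"parallel: ~{parallel_rounds:.1f} rounds, "
      f"{parallel_expected_tokens:,.0f} tokens expected")
```

Under these assumptions, parallelism costs roughly 50% more tokens but cuts the expected number of rounds from ten to about one and a half, which is the case for Levie's "wasting tokens is acceptable" stance whenever iteration speed is worth more than compute.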

As this conversation was recorded, people were panicking over Claude Code’s new Max plan because it limits them after three prompts. This will be a very real topic until we can truly build enough data center capacity.

But I agree with Steven Sinofsky’s long-term view: this problem will eventually disappear. The biggest reason is that you have to do Benioff-style math. If you pay a company salesperson $1,000,000 per year, you have to ask what their tools are worth. If you pay an engineer X dollars per year, then at some point their tools will absolutely be worth that investment.

And the law of large numbers will solve this. Eventually, with enough engineers using this many compute resources, things will balance out. We are in a transition phase: most people two years ago thought AI spending was at the level of a chatbot. But they were wrong, because they treated it as a specific use case—whereas in reality it’s a platform-level shift.

The Future of SaaS Systems: The Value of the Data Layer Returning

A discussion about the future of enterprise systems left a deep impression. Martin Casado proposed that current SaaS vendors face an interesting problem: they don’t actually sell line-of-business data. They sell intelligence, domain expertise, and the entire system. But agents just want to buy the data, want authorized and unlimited access to it—and that has never been these vendors’ business model.

This has long been the tension point with systems like Workday and SAP—how much API access should be allowed. Salesforce has gone through three major platform redesigns because of this. It’s a particularly interesting question at the technology layer: when people want to access data, what does a system of record mean?

Steven Sinofsky put it bluntly: “It’s absurd to think you can build an SAP-like system with vibe coding.” All the domain knowledge in SAP isn’t just sitting inside some carefully orchestrated data layer. It exists in the UI, it exists in the middle layer, and it exists in the way you use it.

But Aaron Levie has a different view. He believes that if you iterate enough, agents will ultimately be largely responsible for choosing the tools they want to implement and use. While agents can’t replace enterprise systems, after enough generations of development, agents might encounter so many obstacles in your software that they’ll say, “You need to finally retire your legacy HR system, otherwise I can’t automate this workflow for you.”

That’s a disruptive viewpoint. Imagine what happens when the number of agents is 100 times or 1,000 times that of humans—if this keeps happening repeatedly, then ultimately you’ll need to build software stacks for agents. Maybe a few systems will remain as last strongholds—like a few ERP systems—but everything else in your business performance will be tied to how well your agents can access the information they need to do work. That means your enterprise IT stack must be set up in a way that supports these agents to operate effectively.

Martin Casado proposed a nuance that I strongly agree with. People often talk abstractly about “now you’re marketing to the agent,” “you need to be an API,” and “you need a good interface description language.” He believes this is almost completely wrong. What agents are truly good at is finding the right backend. They won’t say, “This interface is good; the documentation is good.” They’ll say, “These cost parameters make sense, that one is persistent.” In practice, they actually have the collective intelligence of how we use these platforms.

He gave an example: whenever he asks an agent to choose a cloud platform, the agent uses meaningful criteria instead of interface-related criteria. So as an industry, we focus too much on these interfaces and talk as if “you need to market to the agent” is the main point. But in reality, we’ll be pushed to build better systems—that’s what will get chosen.

I think this viewpoint is very profound. In the age of agents, technical superiority will become even more important, while the importance of marketing and packaging will decline. Those products that are truly competitive in technology will stand out, while those mainly driven by sales will face challenges.

My Thoughts: We’re Underestimating the Scale of This Transformation

After listening to the entire conversation, my biggest feeling is that Wall Street and the whole industry are using the wrong framework to understand the economic impact of this shift. Aaron Levie is right: the biggest problem is that everyone is trying to figure out the economic benefits of all this, but their estimates of the scale of the opportunity are off by at least an order of magnitude.

Steven Sinofsky illustrated this with historical examples. When people looked at PCs, they thought the consumption of MIPS (million instructions per second) represented a limited market, not realizing what would happen if we put all those MIPS onto every desktop. And people thought software came along with MIPS—only one person (referring to Bill Gates and Paul Allen) thought of the possibility of selling software separately.

The same thing happened with cloud computing. When people looked at cloud, they thought we were just moving server business—about 60,000 servers per year—to someone else’s data center. No one thought usage would grow by 1,000 times.

With AI, the same thing is happening. The Wall Street model has a fixed revenue pie, based on zero-sum thinking. They think the amount each company will spend annually on something is fixed. But when cloud computing arrived, Salesforce was facing a $2 billion CRM business built on buying all those servers, Oracle licenses, painful deployments, and years of consulting. If salespeople could simply sign up on their own instead, adoption would happen without friction—and that’s what’s happening now.

I believe AI agents will bring a market expansion of similar or even larger scale. When every knowledge worker has one or more agents working alongside them, the use of software, the processing of data, and the consumption of compute will all grow exponentially. This isn’t a zero-sum game—not just shifting work from humans to agents, but creating entirely new possibilities and value.

Aaron Levie mentioned that, as an investor, he has come into contact with about 240 infrastructure companies, and over the past six months they’ve all shown asymptotic growth. Why? Because the amount of software being written now is greater than ever. With more software and more agents, there will be more compute resources consumed. When everyone’s phones consume AI heavily, and device-side AI becomes real-time, usage will increase by a billion times.

I believe we’re going through a “transistor moment.” Steven Sinofsky used vacuum tubes as an example: there was once a time when people imagined the entire Dakotas covered with vacuum-tube warehouses, with workers on roller skates replacing tubes in the aisles, all in preparation to fight World War II over again. Then someone said, “Why don’t we just use transistors?”

Tokens might be like IBM’s MIPS back then. IBM sold more MIPS at lower prices each year, but they still priced mainframes by MIPS—until someone pointed out their curve was declining because they were producing MIPS faster than they could charge for them. The same thing will happen with tokens.

In the short term, we will see massive chaos and uncertainty. Enterprises will struggle between how much to invest, how to control costs, and how to manage risks. Startups will make bold bets and move fast. There will be failures and successes. But in the long run, the direction is clear: software must be built for agents, APIs will become more important than UI, system quality will be more important than marketing, and compute costs will keep declining while usage increases exponentially.

We are not experiencing a simple tool upgrade—we’re experiencing a fundamental shift in the computing paradigm. Companies and individuals that understand this and take action will define the technology landscape of the next decade. Those who still think in old frameworks may find themselves left far behind.

This transformation has only just begun.
