Microsoft Build 2026 put developers face-to-face with two uncomfortable AI realities: building smarter systems now requires constant model evaluation, while software itself may be easier than ever to copy.
Across sessions featuring Yina Arenas, Naomi Moneypenny, Sharmila Chockalingam, and Chip Huyen, the message was clear: AI builders need better architecture, sharper workflows, and stronger reasons for their products to exist.
TL;DR
- Microsoft Foundry is being positioned as a governed platform for choosing, testing, routing, and optimizing AI models at scale.
- Foundry’s model catalog includes more than 11,000 models, with flagship collaborations with OpenAI, Anthropic, and Hugging Face alongside Claude on Azure, Microsoft multimodal models, Aurora 1.5, NVIDIA physical AI models, and Foundry Labs experiments.
- Chip Huyen’s session questioned whether AI-generated code makes software less defensible, while arguing that user understanding, execution, empathy, and real-world problem solving still matter.
Microsoft Build 2026 Frames AI Development As A Systems Challenge
Microsoft Build 2026 took place in person from June 2 to 3 at Fort Mason Center in San Francisco, with keynote and select sessions also available online.
Microsoft positioned the event around AI-powered tools, platforms, and developer workflows, making these two sessions part of a broader conversation about how software creation is changing.
In “Build smarter AI systems in Foundry as models and costs evolve,” Yina Arenas, CVP of Microsoft Foundry, and Naomi Moneypenny, Senior Director of Product Development, framed the current AI landscape as one where deploying a chatbot is no longer the hard part.
The challenge now is building systems that can scale, stay cost-effective, adapt as models change, and remain reliable under real enterprise conditions.
The session focused on Microsoft Foundry as the control layer for this complexity, helping developers choose models, benchmark performance, integrate workflows, and improve AI systems over time.
Microsoft Foundry Pushes Developers Beyond One-Size-Fits-All AI Models
The Foundry session leaned heavily into a practical point: model selection should be based on the task, not brand recognition or raw size.
Foundry’s catalog includes more than 11,000 models, including flagship collaborations with OpenAI, Anthropic, and Hugging Face. The platform also includes newer options such as Claude on Azure, Microsoft’s multimodal models for image, code, and audio, Aurora 1.5 for geospatial use cases, NVIDIA’s physical AI models, and Foundry Labs from Microsoft Research.
That breadth matters because enterprise AI teams are no longer building around a single model. They are assembling systems where different models handle different jobs, under consistent APIs, governance controls, and production-grade infrastructure.
The session’s travel app example showed how that thinking works in practice. Developers first define success criteria such as correctness and policy compliance, then test a large model such as GPT-4.1 before optimizing. The presenters compared overusing a large model to “using a Ferrari for a grocery run,” pointing to the risk of paying premium prices for tasks that smaller models can handle.
Foundry’s Evaluation-First Approach Targets Cost, Latency, And Quality
The strongest takeaway from the Foundry session was that optimization begins before deployment.
Arenas and Moneypenny showed how developers can use built-in Foundry evaluators, custom evaluators, prompts, code, and rubrics to measure whether an AI system is actually doing what the business needs. Newly announced rubric-based evaluation can infer performance dimensions from agent definitions and weigh requirements such as compliance or correctness.
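To make the idea of weighted rubric scoring concrete, here is a minimal sketch in plain Python. It is not Foundry's evaluator API: the `Criterion` class, the two check functions, and the weights are all hypothetical, chosen only to show how compliance can be weighted more heavily than other requirements.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Criterion:
    """One rubric dimension (hypothetical structure, not a Foundry type)."""
    name: str
    weight: float
    check: Callable[[str], float]  # returns a score between 0.0 and 1.0


def score_response(response: str, rubric: Sequence[Criterion]) -> float:
    """Return the weighted average of all criterion scores."""
    total_weight = sum(c.weight for c in rubric)
    if total_weight <= 0:
        raise ValueError("rubric weights must sum to a positive number")
    weighted = sum(c.weight * c.check(response) for c in rubric)
    return weighted / total_weight


# Illustrative checks for a travel assistant; real evaluators would be richer.
def mentions_refund_policy(response: str) -> float:
    return 1.0 if "refund" in response.lower() else 0.0


def is_concise(response: str) -> float:
    return 1.0 if len(response.split()) <= 50 else 0.0


# Assumed weighting: policy compliance counts three times as much as brevity.
RUBRIC = [
    Criterion("policy_compliance", 3.0, mentions_refund_policy),
    Criterion("conciseness", 1.0, is_concise),
]

if __name__ == "__main__":
    print(score_response("Your ticket is eligible for a refund.", RUBRIC))
    print(score_response("Enjoy your trip!", RUBRIC))
```

A response that is concise but ignores the refund policy earns only a quarter of the available score here, which is the kind of trade-off a weighted rubric is meant to surface before deployment.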
From there, teams can decompose tasks and route them to smaller, more efficient models through automatic or custom routers. This approach can reduce latency and cost-per-task while protecting output quality.
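A custom router of the kind described above might look like the following sketch. The model names, per-token prices, keyword list, and word-count threshold are illustrative assumptions, not Foundry defaults or real pricing; a production router would more likely rely on a classifier or on evaluation results than on keywords.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_1k_tokens: float  # hypothetical price, for illustration only


SMALL = ModelTier("small-model", 0.10)
LARGE = ModelTier("large-model", 2.00)

# Assumed signals that a travel request needs multi-step reasoning.
COMPLEX_HINTS = ("itinerary", "multi-city", "rebook", "compare")


def route(task: str, max_simple_words: int = 30) -> ModelTier:
    """Send long or complex-looking tasks to the large model, the rest to the small one."""
    text = task.lower()
    if len(text.split()) > max_simple_words:
        return LARGE
    if any(hint in text for hint in COMPLEX_HINTS):
        return LARGE
    return SMALL


def estimated_cost(tasks: Iterable[str], tokens_per_task: int = 500) -> float:
    """Rough spend estimate, assuming every task uses the same number of tokens."""
    return sum(route(t).cost_per_1k_tokens * tokens_per_task / 1000 for t in tasks)


if __name__ == "__main__":
    tasks = [
        "What is the baggage allowance?",
        "Plan a multi-city itinerary across Japan",
    ]
    for t in tasks:
        print(t, "->", route(t).name)
    print("estimated cost:", estimated_cost(tasks))
```

The point of the sketch is the shape of the decision, not the heuristic itself: simple lookups go to the cheaper tier, and only tasks that appear to need deeper reasoning pay for the larger model.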
The session also covered fine-tuning, distillation, serverless APIs, and reinforcement learning options. The goal is not just to make models cheaper, but to embed domain-specific knowledge into smaller models without losing reliability.
Microsoft also emphasized operational discipline through tracing, monitoring, Azure Monitor, structured outputs, token budgeting, batch inference, caching, and Azure Context Cache for explicit prompt caching.
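Two of these practices, token budgeting and caching, can be illustrated with a small application-side wrapper. This is a sketch under stated assumptions: `CachedClient` and the `call_model` callback are hypothetical, the token count is a crude whitespace estimate rather than a real tokenizer, and the in-memory response cache shown here is not the same mechanism as Azure Context Cache.

```python
import hashlib
from typing import Callable, Dict


class CachedClient:
    """Wraps any model-calling function with a prompt budget and a response cache."""

    def __init__(self, call_model: Callable[[str], str], max_prompt_tokens: int = 200):
        self._call_model = call_model          # hypothetical callback to a real model
        self._max_prompt_tokens = max_prompt_tokens
        self._cache: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def estimate_tokens(prompt: str) -> int:
        # Crude assumption: one token per whitespace-separated word.
        return len(prompt.split())

    def complete(self, prompt: str) -> str:
        # Enforce the budget before spending money on a model call.
        if self.estimate_tokens(prompt) > self._max_prompt_tokens:
            raise ValueError("prompt exceeds token budget")
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        result = self._call_model(prompt)
        self._cache[key] = result
        return result


if __name__ == "__main__":
    # Stand-in for a real model call, used only to demonstrate the wrapper.
    client = CachedClient(lambda p: p.upper(), max_prompt_tokens=10)
    client.complete("what is the baggage allowance")
    client.complete("what is the baggage allowance")
    print("hits:", client.hits, "misses:", client.misses)
```

Repeated identical prompts are served from the cache, and oversized prompts are rejected before any model call is made.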
In other words, the long-term AI advantage may come less from chasing every new model and more from continuously improving the system around it.
Chip Huyen Questions Software Defensibility As AI Coding Gets Cheaper
While Foundry focused on how to build better AI systems, Chip Huyen’s session “Software Defensibility in the era of AI coding” asked a sharper business question: “If the cost of building software is approaching 0, is the value of building software also approaching 0?”
Huyen, a builder at Stealth and former core developer of NVIDIA’s NeMo, described a world where AI tools can generate code, clone products, and compress software development cycles almost overnight. She used the example of a weekend project that went viral before being copied by others using AI tools, highlighting how fragile many software moats have become.
The session challenged traditional defensibility claims, including proprietary data, distribution, branding, trust, customer service, expertise, and execution speed. Huyen argued that some of these moats are weaker than founders assume, especially when wealthy labs or competitors can buy data, replicate features, or move faster with AI.
AI May Weaken Software Moats, But It Expands The Problem Space
Huyen’s argument was not that builders should stop building. It was that builders need to rethink what makes software valuable.
As AI makes software production cheaper, defensibility may shift toward long-tail problems, cultural nuance, user empathy, workflow design, and the ability to solve overlooked needs. She pointed to challenges such as natural multilingual voice chatbots, differences in conversational timing across cultures, and the difficulty of designing AI that works smoothly in physical environments.
The session also explored how developer workflows are changing. Engineering artifacts may move beyond static code into prompts, specifications, and AI-readable systems. Tools such as terminals, IDEs, GitHub-style collaboration platforms, APIs, modular codebases, and even real-world infrastructure may evolve to make environments easier for AI agents to use.
In robotics, Huyen used examples such as delivery robots that cannot press crosswalk buttons to show how digital intelligence still struggles with messy physical reality. She also highlighted safety and irreversible actions, including her own experience losing a database due to an AI coding assistant.
The combined message from Microsoft Build 2026 was pointed: AI may make software easier to create, but it also raises the bar for what survives. Foundry’s answer is governed, evaluated, optimized AI systems. Huyen’s answer is deeper defensibility through execution, empathy, and real-world problem selection.

