
Takeaways from NVIDIA GTC 2025

Last week, Distributional joined other cutting-edge companies at NVIDIA GTC, where we chatted with leaders in AI about what they’re building and the challenges they’re facing, reconnected with friends and former colleagues, and shared a live demo of our latest capabilities at our booth.
With over 17K people showing up to Jensen Huang’s keynote at SAP Center this year, it’s no exaggeration to say that GTC is essentially the Super Bowl of AI. Personally, I find it an interesting place to learn where the industry is headed next and to gain insight into the challenges and opportunities leaders are thinking about. I was especially looking forward to learning more about NIMs, seeing how enterprises are tackling latency and design challenges, and hearing where companies are placing their AI bets these days. Looking back at last week, here are my top takeaways from GTC 2025.

NIMs simplify standing up endpoints
Across the sessions I attended, a pattern kept surfacing around the use of NVIDIA Inference Microservices, or NIMs. This tooling and abstraction layer lets teams stand up endpoints for open-weight foundation models, like those from Meta or Mistral, and enterprises are using it to gain more control over how those endpoints are deployed and used. This is a common need we’ve heard from customers, and it connects well with what we’ve built at Distributional to help them track the stability of these kinds of endpoints.
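For anyone who hasn’t tried them, NIMs expose an OpenAI-compatible API, so querying a self-hosted endpoint takes just a few lines. Here’s a minimal sketch; the base URL and model name are placeholders, not a specific deployment:

```python
# A minimal sketch of querying a self-hosted NIM endpoint. NIMs expose an
# OpenAI-compatible API, so the standard openai client works against them.
# The base URL and model name below are placeholders, not a real deployment.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # hypothetical local NIM deployment
    api_key="not-needed-locally",  # local NIMs typically don't check this
)

response = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",  # example open-weight model served via NIM
    messages=[{"role": "user", "content": "Summarize our Q3 support tickets."}],
)
print(response.choices[0].message.content)
```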
Real-time latency is a key consideration for user experience
Something else that stood out from the talks I attended was the importance of real-time latency for user-facing applications. People don’t have much patience for an app that takes more than a second or so to synthesize information or respond with dialogue; if your app is slow, you’ll lose that connection and delight with customers pretty quickly. It’s interesting to see how much providing a good user experience is driving technical decisions in these systems. It’s also a sign of the wider range of roles now involved in building and launching AI products.
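As a concrete example of what teams measure here, time to first token is often the number that matters most for perceived responsiveness, and streaming is the usual lever for keeping it low. A rough sketch, assuming any OpenAI-compatible endpoint and a placeholder model name:

```python
# A rough sketch of one latency metric teams track: time to first token.
# Streaming lets the UI start rendering output before the full response is
# ready, which is often what keeps perceived latency under that one-second bar.
import time

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set; any OpenAI-compatible endpoint works

start = time.perf_counter()
stream = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Summarize this account activity."}],
    stream=True,
)

first_token_time = None
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta and first_token_time is None:
        first_token_time = time.perf_counter() - start
        print(f"time to first token: {first_token_time:.2f}s")
    # ...render `delta` incrementally in the UI...
```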
Token use is increasing dramatically
During the keynote, Huang spoke about how token use is increasing dramatically, particularly in agentic flows and in systems doing deeper reasoning. This increase in test-time compute means teams will need to build systems that can handle high token throughput: a lot of tokens in and a lot of tokens out, and the right measurements to track that behavior will be critical.
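As a small illustration, most OpenAI-compatible endpoints return a usage block with each response, which is the natural place to start tracking throughput. A minimal sketch with a placeholder model name:

```python
# A minimal sketch of tracking tokens in and out per request, assuming an
# OpenAI-compatible endpoint that returns a `usage` block with each response.
# Agentic flows multiply these numbers across every step, so aggregating
# them is where the real throughput picture emerges.
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Plan a three-step research task."}],
)

usage = response.usage
print(f"tokens in: {usage.prompt_tokens}, tokens out: {usage.completion_tokens}")
# In an agentic loop, sum these across steps to see the full cost of one task.
```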

Big bets on open-weight models
In a talk with Prem Natarajan, VP of AI at Capital One, I was impressed to hear how the digital bank is using cutting-edge technology to deploy customer support agents that accelerate its operations. Natarajan shared how the company is making big bets on open-weight models. Their thesis is that customizing these models is critical to reaching production-level quality, and that merely tweaking a third-party commercial model at arm’s length won’t be effective enough in an application vertical like theirs.

The data flywheel is spinning up
Natarajan also did a good job of articulating what a lot of people call the data flywheel (Karpathy calls it the data engine): a flow for collecting real usage and examples from your AI system. That data helps companies identify where their app is defective and where customers are asking for things that aren’t yet possible, then use it to consistently evaluate and test their system. It was really cool to see how Capital One is making progress here, and I was particularly struck by how Natarajan sees this as critical for the business as a whole, not just individual teams. He shared, “Any company that will be in the lead has to invest in a centralized AI strategy.”
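To make the flywheel idea concrete, the collection stage can start as simply as logging every production interaction for later labeling and evaluation. A bare-bones sketch; the file path and record fields are illustrative assumptions, not anyone’s actual design:

```python
# A bare-bones sketch of the "collect" stage of a data flywheel: log every
# production prompt/response pair to a JSONL file so it can later be labeled,
# turned into eval cases, and used to test new versions of the system.
# The file path and record fields here are illustrative, not a real design.
import json
import time

def log_interaction(prompt: str, response: str, path: str = "flywheel.jsonl") -> None:
    record = {
        "ts": time.time(),
        "prompt": prompt,
        "response": response,
        # room for downstream labels: user feedback, failure tags, eval verdicts
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")

log_interaction("What's my card's APR?", "Your current APR is ...")
```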

AI from large to small
Last but not least, we hosted a Braindate with an open call to discuss the AI platform stack and some of its friction points. This led me to connect with Kun-Lin Hsieh, CEO of BookAi, an interactive books product. It was interesting to hear about his pain points in building and scaling an AI app, particularly around the publishing industry: publishers don’t want their content cast in an unflattering light, so he was driven to look for solutions that would ease that concern. All in all, it was a prime example of how GTC covers the full spectrum, from startups whose entire companies are built on AI to large enterprises like Capital One making very serious investments in the space.
That’s it for now! See you at GTC next year, hopefully. If we missed you this year, please reach out to learn more about how Distributional’s adaptive testing solution can help you tap into the future of AI.