As enterprises move from experimenting with generative AI to deploying agentic systems in production, the conversation is shifting. The question executives are asking is no longer “Can this model reason?” but “Can this system be trusted?”
To explore what that shift really means, I sat down with Maria Zervou, Chief AI Officer for EMEA at Databricks. Maria works closely with customers across regulated and fast-moving industries and spends her time at the intersection of AI architecture, governance, and real-world execution.
Throughout the conversation, Maria kept returning to the same point: success with agentic AI isn’t about the model. It’s about the systems around it: data, engineering discipline, and clear accountability.
Catherine Brown: Many executives I speak with still equate AI quality with how impressive the model seems. You’ve argued that’s the wrong frame. Why?
Maria Zervou: The biggest misunderstanding I see is people confusing a model’s cleverness or perceived reasoning ability with quality. These are not the same thing.
Quality, especially in agentic systems, is about compounding reliability. You’re no longer evaluating a single response. You’re evaluating a system that may take hundreds of steps: retrieving data, calling tools, making decisions, escalating issues. Even small errors can compound in unpredictable ways.
So the questions change. Did the agent use the right data? Did it find the right resources? Did it know when to stop or escalate? That’s where quality really lives.
And importantly, quality means different things to different stakeholders. Technical teams often focus on KPIs like cost, latency, or throughput. End users care about brand compliance, tone, and legal constraints. If these perspectives aren’t aligned, you end up optimizing the wrong thing.
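To make the compounding point concrete, here is a minimal sketch of the arithmetic. It assumes every step succeeds independently with the same probability, which is a simplification introduced for illustration and not something claimed in the interview; the function name and the example rates are hypothetical.

```python
def chain_success_probability(per_step_success: float, steps: int) -> float:
    """Probability that all steps succeed.

    Assumption (ours, for illustration): steps are independent and share
    the same success rate. Real agent steps are often correlated.
    """
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


if __name__ == "__main__":
    # Illustrative numbers only: a 99%-reliable step repeated many times.
    for steps in (1, 10, 100, 300):
        probability = chain_success_probability(0.99, steps)
        print(f"{steps:>3} steps at 99% each -> {probability:.1%} end to end")
```

Under these assumptions, a step that is right 99% of the time yields an end-to-end success rate of roughly 37% over 100 steps, which is why per-step quality alone says little about the whole system.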
Catherine: That’s interesting, especially because many leaders assume AI systems have to be “perfect” to be usable, particularly in regulated environments. How should companies in highly regulated industries approach AI initiatives?
Maria: In highly regulated sectors, you do need very high accuracy, but the first benchmark should be human performance. Humans make mistakes today, all the time. If you don’t anchor expectations in reality, you’ll never move forward.
What matters more is traceability and accountability. When something goes wrong, can you trace why a decision was made? Who owns the outcome? What data was used? If you can’t answer these questions, the system isn’t production-ready, regardless of how impressive the output looks.
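One way to picture the three traceability questions is as required fields on a per-decision record. The sketch below is an assumption-laden illustration, not a Databricks API: the `DecisionTrace` class, its field names, and the `unanswered_questions` helper are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Tuple


@dataclass(frozen=True)
class DecisionTrace:
    """Hypothetical audit record for a single agent decision."""
    decision: str
    rationale: str                 # why the decision was made
    owner: str                     # who owns the outcome
    data_sources: Tuple[str, ...]  # what data was used
    recorded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def unanswered_questions(trace: DecisionTrace) -> List[str]:
    """Return the traceability questions this record cannot answer."""
    gaps = []
    if not trace.rationale.strip():
        gaps.append("Why was the decision made?")
    if not trace.owner.strip():
        gaps.append("Who owns the outcome?")
    if not trace.data_sources:
        gaps.append("What data was used?")
    return gaps


if __name__ == "__main__":
    # Illustrative values only.
    trace = DecisionTrace(
        decision="escalate_to_human",
        rationale="Claim amount exceeds the auto-approval threshold",
        owner="claims-operations-lead",
        data_sources=("claims.policy_limits",),
    )
    print(unanswered_questions(trace) or "Record answers all three questions")
```

A record with gaps is a signal that the agent is not yet production-ready in the sense described above, however good its outputs look.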
Catherine: You talk a lot about domain-specific agents versus general-purpose models. How should executives think about that distinction?
Maria: A general-purpose model is essentially a very capable reasoning engine trained on very large and diverse datasets. But it doesn’t understand your business. A domain-specific agent uses the same base models, but it becomes more powerful through context. You force it into a predefined use case. You limit the space it can search. You teach it what your KPIs mean, what your terminology means, and what actions it’s allowed to take.
That constraint is actually what makes it better. By narrowing the domain, you reduce hallucinations and increase the reliability of outputs. Most of the value doesn’t come from the model itself. It comes from the proprietary data it can securely access, the semantic layer that defines meaning, and the tools it’s allowed to use. Essentially, it can reason on your data. That’s where competitive advantage lives.
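The constraints described here (a predefined use case, defined terminology, and a limited set of permitted actions) can be sketched as a small scope object that an agent consults before acting. This is a hypothetical illustration only; `DomainScope`, its fields, and the example terms and actions are assumptions, not part of any Databricks product.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass
class DomainScope:
    """Hypothetical description of what a domain-specific agent may do."""
    use_case: str
    glossary: Dict[str, str]         # business terms and their meanings
    allowed_actions: FrozenSet[str]  # explicit allowlist of tool names

    def authorize(self, action: str) -> bool:
        """Permit an action only if it is on the allowlist."""
        return action in self.allowed_actions

    def define(self, term: str) -> Optional[str]:
        """Look up a business term, ignoring case; None if undefined."""
        return self.glossary.get(term.lower())


if __name__ == "__main__":
    # Illustrative values only.
    scope = DomainScope(
        use_case="invoice dispute triage",
        glossary={"dso": "days sales outstanding"},
        allowed_actions=frozenset({"lookup_invoice", "escalate_to_human"}),
    )
    for action in ("lookup_invoice", "issue_refund"):
        verdict = "allowed" if scope.authorize(action) else "blocked"
        print(f"{action}: {verdict}")
```

The design choice is deny-by-default: anything not explicitly listed is blocked, which narrows the space the agent can act in.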
Catherine: Where do you typically see AI agent workflows break when organizations try to move from prototype to production?
Maria: There are three main failure points. The first is pace mismatch. The technology moves faster than most organizations. Teams jump into building agents before they’ve done the foundational work on data access, security, and structure.
The second is tacit knowledge. A lot of what makes employees effective lives in people’s heads or scattered documents. If that knowledge isn’t codified in a form an agent can use, the system will never behave the way the business expects.
The third is infrastructure. Many teams don’t plan for scale or real-world usage. They build something that works once, in a demo, but collapses under production load.
All three issues tend to show up together.
Catherine: You’ve said before that capturing business knowledge is as important as choosing the right model. How do you see organizations doing that well?
Maria: It starts with recognizing that AI systems are not one-off projects. They’re living systems. One practical approach is to record and transcribe meetings and treat that as raw material. You then structure, summarize, and tag that information so the system can retrieve it later. Over time, you’re building a knowledge base that reflects how the business actually thinks.
Equally important is how you design evaluations. Early versions of an agent should be used by business stakeholders, not just engineers. Their feedback (what feels right, what doesn’t, why something is wrong) becomes training data.
Building an effective evaluation system, customized to that agent’s specific purpose, is essential to ensuring high-quality outputs, which is ultimately critical for any AI project in production. Our own usage data shows that customers who use AI evaluation tools get nearly 6x more AI projects into production than those who don’t.
In effect, you’re codifying the business brain into evaluation criteria.
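As a rough illustration of turning stakeholder feedback into evaluation criteria, the sketch below expresses each piece of feedback as a named check that can be run against an agent’s output. It is not the evaluation tooling referred to above; the `Criterion` type, the example checks, and the stakeholder labels are all hypothetical, and real criteria would be far richer than keyword tests.

```python
from typing import Callable, Dict, List, NamedTuple


class Criterion(NamedTuple):
    """One piece of business feedback expressed as a repeatable check."""
    name: str
    raised_by: str                # which stakeholder group supplied it
    check: Callable[[str], bool]  # returns True when the output passes


# Hypothetical criteria, invented for illustration.
CRITERIA: List[Criterion] = [
    Criterion(
        name="no_guarantee_language",
        raised_by="legal",
        check=lambda text: "guarantee" not in text.lower(),
    ),
    Criterion(
        name="cites_policy",
        raised_by="customer_support",
        check=lambda text: "policy" in text.lower(),
    ),
]


def evaluate(output: str, criteria: List[Criterion]) -> Dict[str, bool]:
    """Run every criterion against one agent output."""
    return {criterion.name: criterion.check(output) for criterion in criteria}


def pass_rate(results: Dict[str, bool]) -> float:
    """Fraction of criteria that passed; 0.0 when there are none."""
    return sum(results.values()) / len(results) if results else 0.0


if __name__ == "__main__":
    sample = "Per policy 4.2, refunds are usually processed within five days."
    results = evaluate(sample, CRITERIA)
    print(results, f"pass rate: {pass_rate(results):.0%}")
```

Keeping the originating stakeholder on each criterion preserves the link between a failing check and the person who can say whether it still reflects what the business wants.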
Catherine: That sounds expensive and time-consuming. How do you balance rigor with speed?
Maria: This is where I talk about minimum viable governance. You don’t solve governance for the entire enterprise on day one. You solve it for the specific domain and use case you’re working on. You make sure the data is controlled, traceable, and auditable for that agent. Then, as the system proves valuable, you expand.
What helps is having repeatable building blocks: patterns that already encode good engineering and governance practices. That’s the thinking behind approaches like Agent Bricks, where teams can start from refined foundations instead of reinventing workflows, evaluations, and controls from scratch each time.
Executives should still insist on a few non-negotiables up front: clear business KPIs, a named executive sponsor, evaluations built with business users, and strong software engineering fundamentals. The first project will be painful, but it sets the pattern for everything that follows and makes subsequent agents much faster to deploy.
If you skip that step, you end up with what I call “demo ware”: impressive prototypes that never quite become real.
Catherine: Can you share examples where agents have materially changed how work gets done?
Maria: Internally at Databricks, we’ve seen this in a few places. In Professional Services, agents are used to scan customer environments during migrations. Instead of engineers manually reviewing every schema and system, the agent generates recommended workflows based on best practices. That dramatically reduces time spent on repetitive analysis.
In Field Engineering, agents automatically generate demo environments tailored to a customer’s industry and use case. What used to take hours of manual prep now happens much faster, with greater consistency.
In both cases, the agent didn’t replace expertise; it amplified it.
Catherine: If you had to distill this for a CIO or CDO just starting down this path, what should they focus on first?
Maria: Start with the data. Trusted agents require a unified, controlled, and auditable data foundation. If your data is fragmented or inaccessible, the agent will fail, no matter how good the model is. Second, be clear about ownership. Who owns quality? Who owns outcomes? Who decides when the agent is “good enough”? And finally, remember that agentic AI is not about showing how smart the system is. It’s about whether the system reliably helps the business make better decisions, faster, without introducing new risk.
Closing Thoughts
Agentic AI represents a real shift, from tools that assist humans to systems that act on their behalf. But as Maria makes clear, success depends far less on model sophistication than on discipline in data, in governance, and in engineering.
For executives, the challenge is not whether agents are coming. It’s whether their organizations are ready to build systems that can be trusted once they arrive.
To learn more about building an effective operating model, download the Databricks AI Maturity Model.