AI Agents in the Enterprise: Lessons from the Field
Insights from practitioners who’ve taken AI agents from prototype to production.
Today I listened to a panel discussion hosted by OpenPipe on Lessons Learned from Building Enterprise AI Agents. It featured three panelists with firsthand experience in building and deploying agents at scale:
Sandy Besson – Applied AI Research Engineer at IBM Research
Chris Chileles – CEO at Zig
Boaz Ashkenazi – CEO at Augmented AI Labs
The discussion focused on the practical challenges of building AI agents for enterprise use—not theoretical architectures or future visions, but the realities of deployment, safety, measurement, and cost. Topics included everything from managing expectations and ROI, to protocol choices like MCP, to what they’d build differently now.
Below is a distilled summary of the key insights they shared.
1. Demos Don’t Reflect Reality
Clients love polished demos—but want strict guardrails in production, especially for customer-facing agents.
Leaders often expect fast ROI. But value takes time, preparation, and honest risk evaluation.
Systems that work in testing often become brittle or expensive at scale.
2. Success Requires Trade-Offs
You can’t maximize speed, accuracy, and cost all at once.
One panelist referenced the Pareto frontier as a framework for guiding teams through meaningful trade-offs.
In legal, finance, and other high-risk areas, human oversight is essential. A simple rule:
If you wouldn’t trust an intern to do it alone, don’t let the AI.
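To make the trade-off framing concrete, here's a minimal sketch (my own illustration, not from the panel) of how a team might compute the Pareto frontier over candidate agent configurations. The configuration names and scores are hypothetical; each candidate is scored on latency, error rate, and cost, all lower-is-better.

```python
# Hypothetical sketch of the Pareto-frontier framing: a candidate
# configuration survives only if no other candidate beats it on
# every axis at once.

def pareto_frontier(candidates):
    """Return the non-dominated candidates.

    candidates: dict mapping name -> (latency_s, error_rate, cost_usd).
    """
    def dominates(a, b):
        # a dominates b if it is no worse on every axis and
        # strictly better on at least one.
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))

    return {
        name: scores
        for name, scores in candidates.items()
        if not any(dominates(other, scores)
                   for other_name, other in candidates.items()
                   if other_name != name)
    }

# Illustrative numbers only.
configs = {
    "small-model":  (0.4, 0.12, 0.001),  # fast, cheap, less accurate
    "large-model":  (2.5, 0.04, 0.020),  # slow, costly, more accurate
    "large-cached": (2.5, 0.04, 0.030),  # same quality as above, costs more
}

frontier = pareto_frontier(configs)
# "large-cached" is dominated by "large-model" and drops off the frontier.
```

Anything off the frontier can be discarded outright; what remains is the set of honest choices the business has to pick among.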
3. Measuring Quality Isn’t Simple
Standard metrics (accuracy, F1 score) help, but don’t tell the full story.
Human feedback, logging, and telemetry are key to assessing response quality.
Business teams should define what “good” means—not just engineers.
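One way to act on the logging-and-telemetry point is to record every interaction in a form reviewers can score later. A hedged sketch, with illustrative field names (none of this comes from the panel):

```python
# Sketch: append one JSON line per agent interaction, carrying the
# signals a human reviewer needs to judge response quality later.
import io
import json
import time

def log_interaction(sink, prompt, response, latency_s, reviewer_rating=None):
    """Write a single JSON-lines record for later human review."""
    record = {
        "ts": time.time(),
        "prompt": prompt,
        "response": response,
        "latency_s": latency_s,
        "reviewer_rating": reviewer_rating,  # filled in by human feedback
    }
    sink.write(json.dumps(record) + "\n")

# In-memory sink for illustration; production code would use a real store.
log = io.StringIO()
log_interaction(log, "Where is my order?", "It shipped yesterday.", 1.2)
```

The `reviewer_rating` field is the hook for the business team's definition of "good": engineers capture the data, but someone closer to the customer decides what counts as a quality answer.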
4. Handle Data with Care
Sensitive data is driving many teams back to on-prem infrastructure.
Limit what the model sees and what it can output. Apply the least-access principle.
In many cases, small, focused ML models outperform general-purpose LLMs—especially when reliability and privacy matter.
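The least-access principle can be as simple as an explicit allowlist applied before any record reaches the model. A minimal sketch, with hypothetical field names:

```python
# Illustrative sketch of least access: the agent only ever sees
# fields on an explicit allowlist, so sensitive columns never
# reach the model. Field names are hypothetical.

ALLOWED_FIELDS = {"ticket_id", "subject", "body"}

def minimize_record(record: dict) -> dict:
    """Drop every field the model has no need to see."""
    return {k: v for k, v in record.items() if k in ALLOWED_FIELDS}

ticket = {
    "ticket_id": "T-1042",
    "subject": "Refund request",
    "body": "Please refund my last order.",
    "customer_ssn": "***-**-1234",   # must never reach the model
    "internal_notes": "VIP account",
}

safe_ticket = minimize_record(ticket)
```

Allowlisting (rather than blocklisting) is the safer default here: a new sensitive field added to the record later is excluded automatically instead of leaking by omission.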
5. Design for Safety and Modularity
Guardrails are needed both before and after the model runs.
Don’t treat models as all-knowing. Treat them as interns, with clear boundaries.
Use synthetic data to model edge cases without exposing sensitive records.
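The "guardrails before and after" idea can be sketched as a thin wrapper around the model call: an input check that refuses out-of-scope requests before the model runs, and an output check that scrubs anything the model should not emit. The policy terms, `call_model` parameter, and regex below are all my own illustrative assumptions:

```python
# Minimal sketch of pre- and post-model guardrails. call_model is a
# hypothetical callable standing in for the real model client.
import re

BLOCKED_TOPICS = ("wire transfer", "password reset")  # hypothetical policy
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def guarded_agent(user_input: str, call_model) -> str:
    # Pre-model guardrail: refuse out-of-scope requests outright.
    if any(topic in user_input.lower() for topic in BLOCKED_TOPICS):
        return "I can't help with that. Escalating to a human agent."

    raw_output = call_model(user_input)

    # Post-model guardrail: redact anything that looks like an SSN.
    return SSN_PATTERN.sub("[REDACTED]", raw_output)

# Stub model for illustration only.
reply = guarded_agent(
    "What is the status of my order?",
    lambda prompt: "Order shipped. Ref SSN 123-45-6789.",
)
```

The point is that neither check lives inside the model: the boundaries are enforced by ordinary code around it, which is exactly how you would supervise an intern.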
6. Stay Flexible with Tools and Protocols
New agent communication protocols (like MCP, A2A, ACP) are emerging, but no dominant standard exists yet.
Choose simple, open tools with strong community backing.
Avoid overbuilding. Many orchestration tools built last year are already obsolete.
7. SaaS Isn’t Dead—But It’s Changing
SaaS still matters, especially for small to mid-sized teams.
But usage-based AI workloads don’t fit neatly into per-user pricing models.
In regulated industries, adoption is slower than it seems. Risk, compliance, and culture slow change.
8. Build for What’s Next
Many custom tools built today may become unnecessary tomorrow as models improve.
Before building, ask: “Can we wait six months?”
Favor modularity and future adaptability over quick fixes or heavy infrastructure.
Closing Thought
The discussion remained focused on practical issues—what it takes to build and deploy AI agents responsibly in complex enterprise settings. For those working on agent frameworks, enterprise AI platforms, or real-world deployments, it offers several useful takeaways.