Trust, Safety, and Standards: Closing the Gap Between AI Pilots and Real ROI

The conversation is shifting from what AI can do to whether organizations, institutions, and society are ready to trust it. Deployment is accelerating, but so are the risks.

From the AI Institute Report: Artificial Intelligence at an Inflection Point

As artificial intelligence moves from experimental technology to operational reality, the conversation is shifting from what AI can do to whether organizations, institutions, and society are ready to trust it. Deployment is accelerating, but so are the risks—technical debt, safety failures, liability gaps, and a growing public demand for accountability. The difference between the organizations capturing durable value from AI and those stalling out is no longer a question of access to models. It is a question of governance, trust, and standards.

We’re highlighting Theme 3: Trust, Safety, and Standards from Artificial Intelligence at an Inflection Point: Infrastructure Constraints, Workforce Transformation, and Trust—examining why enterprises are struggling to convert AI adoption into outcomes, how the threat landscape is evolving in parallel, and what it will take to build the institutional trust that responsible deployment requires.

Many companies acknowledge that  AI has progressed from a testing-phase technology to an operational priority. Even so, a noticeable gap remains between the excitement of early pilots and the tangible value achieved once solutions reach production. 

AI coding assistants represent the first category to achieve both mass adoption and measurable productivity gains. However, quality metrics reveal a troubling pattern: change failure rates have increased, signaling that productivity without governance creates technical debt and operational risk.

A similar ideological pattern extends across the enterprise. Organizations report high adoption rates but struggle to demonstrate return on investment. The root cause is not technological—foundation models are increasingly capable—but organizational. Companies treat AI as a deployment problem rather than a transformation challenge, underinvesting in enablement by as much as 25% relative to successful peers and measuring usage instead of outcomes.

Organizations reporting positive, “non-traditional” ROIs pursue AI value across three concurrent horizons:

  • Immediate efficiency gains (0–90 days): Translation, summarization, retrieval-augmented search, and template generation deliver rapid OPEX reduction. One manufacturing firm eliminated $12 million in annual translation costs within 60 days. These wins require minimal organizational change and build credibility for deeper transformation.
  • Human-centric transformation (90–270 days): Redesigning workflows and roles to integrate AI into standard operating procedures unlocks the next tier of value. This phase demands significant change management investment—structured training, prompt libraries, feedback mechanisms, and cross-functional enablement. Organizations that increase enablement investment by 25% see measurable improvements across all productivity and quality metrics.
  • Business model reimagination (6–24 months): AI-native business models are emerging but remain nascent. The start-up phase for most companies is underway, with truly transformative applications just entering the market. This horizon requires non-institutionalized thinking, often from external partners, and tolerance for experimentation in ring-fenced environments.

However, AI trust and safety concerns remain, with three threat categories escalating simultaneously: 

  • Democratization of abuse: Barriers to conducting phishing campaigns, deploying malware, and generating coordinated misinformation have collapsed. Individuals without technical skills can now execute attacks previously requiring specialized knowledge.
  • Amplification of social harms: Child safety risks, self-harm guidance, and harassment automation have triggered lawsuits against AI companies. Foundation model providers are responding reactively, implementing age restrictions and content filters only after legal and reputational damage.
  • Novel attack vectors: Deepfake technology has enabled foreign operatives to secure employment at technology companies using synthetic video interviews. CFO impersonation attacks have resulted in fraudulent wire transfers. Most recently, a company documented the first complex cyber-espionage campaign conducted almost entirely by AI with minimal human oversight—a watershed moment indicating autonomous threat actors are no longer theoretical.

Moreover, the AI industry faces a standards and liability crisis. Over 2.2 million models exist on platforms, each with inconsistent or absent safety documentation. When asked how they determine model safety, leading providers historically answered “because we said so”—an untenable position as AI moves into mission-critical and regulated workflows.

Liability questions compound the challenge. When an AI agent makes a decision that causes harm, who bears responsibility—the foundation model provider, the platform operator, the enterprise customer, or the individual who deployed the agent? Jurisdictional boundaries for AI decisions remain undefined, creating legal uncertainty that slows adoption in risk-sensitive sectors.

Public sentiment reveals a paradox: 80-90% of stakeholders across all demographics support AI regulation, yet fewer than 20% trust any entity to regulate it. Big tech companies—currently the de facto regulators through self-governance—earn complete trust from under 7% of respondents, while 35% express no trust at all. This trust vacuum paralyzes institutional adoption, fragments policy development, and undermines the collaborative governance essential for responsible deployment. 

Recommendations: 

  • Establish loose-tight governance. Balance is essential. Loose controls enable experimentation, allowing employees to discover novel applications and reinvent their roles. Tight controls systematize successful patterns, enforce safety boundaries, and enable scale. Organizations succeeding with AI operate Centers of Excellence that intake bottom-up use cases, apply security and policy guardrails, and productionize solutions with shared services (RAG infrastructure, prompt registries, evaluation harnesses).
  • Measure outcomes, not adoption. Usage metrics (daily active users, prompts per employee) are vanity metrics. Business outcomes matter: cycle time reduction, defect density, first-pass yield, compliance incident rates, customer satisfaction, and cost per task. A/B testing and contribution analysis isolate AI impact from confounding variables.
  • Adopt safety-by-design. Integrate trust and safety at the design phase, not after incidents. Threat modeling for LLM-specific risks (prompt injection, tool abuse, data exfiltration), policy-as-code controls, least-privilege API access, and continuous red-teaming are table stakes. Use AI models themselves to detect misuse patterns and iterate defenses. 


AI’s biggest hurdles are no longer technical; they’re about governance, trust, and accountability. Organizations that move beyond simply measuring adoption and focus on human-centric outcomes will be the ones that pull ahead. But that progress can’t come at the cost of safety. As the threat landscape grows more complex and public trust continues to erode, building AI responsibly isn’t just the ethical choice, it’s the strategic one. The path forward requires balanced governance, meaningful metrics, and safety integrated from day one.

This is the work the AI Institute was created to advance. At the 2026 Marconi Awards Gala & Institute Forums, global leaders from industry, academia, government, and society will convene to translate these themes into concrete frameworks, partnerships, and standards—and to recognize the innovators shaping what responsible AI looks like in practice.