AT&T Slashes AI Costs by 90% with Smart Orchestration and Small Language Models

06 Mar, 2026
Artificial Intelligence

AT&T recently unveiled an approach to AI orchestration that has streamlined its operations and, by the company's account, cut costs by 90%. The effort, driven by Chief Data Officer Andy Markus and his team, centers on a re-architecture of AT&T's AI systems that moves away from relying solely on massive, resource-intensive large language models (LLMs).

The Challenge of Scale: 8 Billion Tokens a Day

The sheer volume of data AT&T processes daily – an average of 8 billion tokens – presented a significant scalability and cost challenge. Pushing every query and task through large reasoning models was becoming economically infeasible. To tackle this, AT&T rethought the orchestration layer of its internal "Ask AT&T" personal assistant.

A Multi-Agent Approach with Small Language Models (SLMs)

The solution involved building a multi-agent stack on LangChain. Instead of one monolithic LLM, AT&T implemented a system in which powerful "super agents" direct smaller, specialized "worker" agents. Each worker agent is designed for a concise, purpose-driven task. According to AT&T, this modular approach significantly improved latency and response times.
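The super-agent and worker split can be pictured with a minimal Python sketch. It is not AT&T's implementation and does not use LangChain: the class names, the keyword-based routing rule, and the lambda "workers" are all hypothetical stand-ins for real model calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Worker:
    """A small, single-purpose agent (hypothetical stand-in for an SLM call)."""
    name: str
    keywords: tuple[str, ...]
    handle: Callable[[str], str]


class SuperAgent:
    """Routes each request to the first worker whose keywords match."""

    def __init__(self, workers: list[Worker], fallback: Callable[[str], str]):
        self.workers = workers
        self.fallback = fallback  # e.g. a larger, more expensive model

    def route(self, request: str) -> tuple[str, str]:
        text = request.lower()
        for worker in self.workers:
            if any(keyword in text for keyword in worker.keywords):
                return worker.name, worker.handle(request)
        return "fallback", self.fallback(request)


workers = [
    Worker("sql", ("table", "query", "rows"), lambda r: f"[sql worker] {r}"),
    Worker("docs", ("document", "pdf", "contract"), lambda r: f"[docs worker] {r}"),
]
super_agent = SuperAgent(workers, fallback=lambda r: f"[large model] {r}")

print(super_agent.route("How many rows are in the billing table?"))
print(super_agent.route("Summarize this contract"))
print(super_agent.route("Plan a migration strategy"))
```

In a production system the routing decision would more likely come from a model than from keyword matching, but the shape is the same: a cheap dispatch step, narrow workers, and an expensive fallback used only when nothing else fits.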

Markus emphasized that the future lies in small language models (SLMs). "We find small language models to be just about as accurate, if not as accurate, as a large language model on a given domain area," he stated. This shift to intelligently orchestrated SLMs is the key to the cost savings.
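To see how routing most traffic to smaller models could produce savings of this order, here is a back-of-the-envelope sketch. Only the 8 billion tokens per day comes from the article; the per-token prices and the traffic split are invented for illustration and are not AT&T's figures.

```python
TOKENS_PER_DAY = 8_000_000_000  # figure quoted in the article

# Hypothetical prices in dollars per million tokens (not AT&T's numbers).
LARGE_PRICE = 10.00
SMALL_PRICE = 0.50


def daily_cost(small_share: float) -> float:
    """Daily cost when `small_share` of tokens go to the small model."""
    millions = TOKENS_PER_DAY / 1_000_000
    blended = small_share * SMALL_PRICE + (1 - small_share) * LARGE_PRICE
    return millions * blended


def savings(small_share: float) -> float:
    """Fractional saving relative to sending everything to the large model."""
    return 1 - daily_cost(small_share) / daily_cost(0.0)


for share in (0.0, 0.5, 0.95):
    print(f"small-model share {share:.0%}: "
          f"${daily_cost(share):,.0f}/day, saving {savings(share):.1%}")
```

Under these made-up prices, sending 95% of tokens to the small model saves roughly 90% of the daily bill. The real ratio depends on actual prices and on how much traffic a small model can handle accurately.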

Ask AT&T Workflows: Empowering Employees

Building on this re-architected stack, AT&T, in partnership with Microsoft Azure, developed and deployed Ask AT&T Workflows. This intuitive, drag-and-drop agent builder empowers employees to automate a wide range of tasks. These agents leverage AT&T's proprietary tools for tasks like document processing, natural language-to-SQL conversion, and image analysis. Crucially, AT&T's own data remains at the core of these decision-making processes, ensuring focus and accuracy.

A vital aspect of this system is the human-in-the-loop oversight. While agents operate autonomously to a degree, a human always monitors the process, ensuring checks and balances are in place. All agent actions are logged, data is kept isolated, and role-based access controls are strictly enforced.
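A minimal sketch of how these three controls (role-based access, human approval, and action logging) might fit together follows. The role names, permissions, and `GuardedAgent` class are hypothetical; the article does not describe AT&T's actual mechanism.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical role-to-permission mapping.
ROLE_PERMISSIONS = {
    "engineer": {"read_logs", "open_ticket"},
    "analyst": {"read_logs"},
}


@dataclass
class GuardedAgent:
    role: str
    audit_log: list[dict] = field(default_factory=list)

    def _record(self, action: str, outcome: str) -> None:
        # Every attempt is logged, including denied and pending ones.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "role": self.role,
            "action": action,
            "outcome": outcome,
        })

    def perform(self, action: str, approved_by_human: bool = False) -> str:
        """Run an action only if the role allows it and a human approved it."""
        if action not in ROLE_PERMISSIONS.get(self.role, set()):
            outcome = "denied"
        elif not approved_by_human:
            outcome = "pending_approval"
        else:
            outcome = "executed"
        self._record(action, outcome)
        return outcome


engineer_agent = GuardedAgent("engineer")
print(engineer_agent.perform("open_ticket"))
print(engineer_agent.perform("open_ticket", approved_by_human=True))
print(GuardedAgent("analyst").perform("open_ticket", approved_by_human=True))
```

The permission check runs before the approval check, so a human cannot approve an action the role is not entitled to in the first place.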

Strategic Model Selection and Agility

AT&T's strategy isn't about reinventing the wheel. The company prioritizes "interchangeable and selectable" models over building everything from scratch. As industry standards mature, it is prepared to adopt off-the-shelf solutions, recognizing the rapid pace of change in the AI landscape. This agile approach allows AT&T to integrate new advancements quickly and deprecate less efficient homegrown tools.
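One common way to keep models "interchangeable and selectable" is to have callers depend on a small interface rather than on a specific model. The sketch below shows that idea with hypothetical class and registry names; it is an assumption about how such a design could look, not a description of AT&T's code.

```python
from typing import Protocol


class TextModel(Protocol):
    """Anything with a `complete` method can be plugged in."""

    def complete(self, prompt: str) -> str: ...


class HomegrownModel:
    def complete(self, prompt: str) -> str:
        return f"homegrown: {prompt}"


class OffTheShelfModel:
    def complete(self, prompt: str) -> str:
        return f"off-the-shelf: {prompt}"


# Hypothetical registry: swapping a model is a one-line change here.
MODEL_REGISTRY: dict[str, TextModel] = {
    "homegrown": HomegrownModel(),
    "vendor": OffTheShelfModel(),
}


def answer(prompt: str, model_name: str) -> str:
    """Callers depend on the interface, so the model is a config choice."""
    return MODEL_REGISTRY[model_name].complete(prompt)


print(answer("classify this ticket", "homegrown"))
print(answer("classify this ticket", "vendor"))
```

Deprecating a homegrown tool then means removing one registry entry rather than rewriting every caller.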

Both external and internal solutions undergo rigorous evaluation. AT&T's "Ask Data with Relational Knowledge Graph" has even topped leaderboards for text-to-SQL accuracy. For its agentic tools, the company relies on frameworks like LangChain, grounds model responses with retrieval-augmented generation (RAG), and uses Microsoft's search functionality as its vector store.
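The retrieval step behind RAG can be illustrated with a toy example. This sketch uses word counts and cosine similarity in place of a learned embedding model and a real vector store, and the sample documents are invented; it shows the pattern only, not AT&T's or Microsoft's tooling.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would use a learned model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[word] * b[word] for word in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, documents: list[str]) -> str:
    """Prepend retrieved context so the model answers from company data."""
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


DOCS = [
    "fiber outage tickets are stored in the incidents table",
    "employee travel policy requires manager approval",
]
print(build_prompt("which table stores outage tickets", DOCS))
```

The model's weights are untouched in this pattern. The relevant data is looked up at question time and placed in the prompt, which is why retrieval can keep answers tied to an organization's own data.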

Markus cautioned against over-engineering AI solutions. The guiding principles remain:

Accuracy: Ensuring the AI performs tasks correctly.
Cost: Maintaining economic feasibility.
Responsiveness: Delivering timely results.

The question for developers should always be: "Does this truly need to be agentic?" Breaking down complex problems into smaller, more manageable pieces often leads to more accurate and efficient solutions.

Real-World Impact: Productivity Gains for 100,000 Employees

Ask AT&T Workflows has been rolled out to over 100,000 employees, with more than half using it daily. Active users report productivity increases of up to 90%. The agent builder offers both pro-code and no-code options, and surprisingly, even technically proficient users are gravitating towards the simpler, drag-and-drop interface.

Examples of its use are diverse: a network engineer might use agents to automate alert responses and customer reconnection, correlating telemetry data, checking logs, opening trouble tickets, and even suggesting code patches. A human engineer oversees this automated workflow, ensuring everything operates as intended.
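A workflow like this one can be modeled as an ordered list of small steps that pass a shared context along and stop at a human review state. The step names mirror the tasks listed above, but the function bodies, device name, and ticket placeholder are hypothetical stubs.

```python
from typing import Any, Callable

Context = dict[str, Any]
Step = Callable[[Context], Context]


def correlate_telemetry(ctx: Context) -> Context:
    return {**ctx, "suspect": "router-7"}  # placeholder finding


def check_logs(ctx: Context) -> Context:
    return {**ctx, "log_evidence": f"errors found on {ctx['suspect']}"}


def open_ticket(ctx: Context) -> Context:
    return {**ctx, "ticket": "TICKET-PLACEHOLDER"}


def suggest_patch(ctx: Context) -> Context:
    # The patch is only proposed; an engineer decides whether to apply it.
    return {**ctx, "patch": "proposed config change",
            "status": "awaiting_engineer_review"}


def run_workflow(alert: str, steps: list[Step]) -> Context:
    """Run each step in order, recording which steps ran."""
    ctx: Context = {"alert": alert, "trace": []}
    for step in steps:
        ctx = step(ctx)
        ctx["trace"] = [*ctx["trace"], step.__name__]
    return ctx


result = run_workflow(
    "link down",
    [correlate_telemetry, check_logs, open_ticket, suggest_patch],
)
print(result["trace"])
print(result["status"])
```

Keeping each step small and single-purpose matches the article's broader point: the pieces are easy to test in isolation, and the recorded trace gives the overseeing engineer a clear record of what ran.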

AI-Fueled Coding: Reshaping Software Development

The same principle of breaking complex tasks into smaller, purpose-built components is also changing how AT&T writes code. "AI-fueled coding" combines agile development methods with function-specific AI archetypes, producing code that is nearly production-ready in a single iteration. According to AT&T, this approach significantly shortens development timelines and improves output quality. Even non-technical teams can use plain-language prompts to build software prototypes: one internal data product was created in just 20 minutes, a task that would have taken six weeks without AI.

AT&T's innovative approach to AI orchestration, powered by SLMs and a flexible multi-agent architecture, is not just cutting costs but also unlocking new levels of productivity and efficiency across the organization.