AI-Powered Automated Phone Calls: How n8n, Twilio, and Ultravox AI Transform Customer Communication

by
Alex Gurbych

Article by Oleksandr Gurbich, PhD, CEO, an expert in artificial intelligence systems.

The quality and speed of handling phone inquiries directly impact a company's financial performance. According to research, poor customer service leads to up to $75 billion in annual losses due to customer churn. Operational inefficiencies, such as missed calls or long on-hold times, turn the telephone channel from a potential source of revenue into an unjustified expense.

Modern technologies offer a solution to this problem: voice AI agents. These are automated systems capable of conducting meaningful dialogues with customers 24/7. They handle typical requests, qualify leads, and integrate with corporate databases.

In this article, we will analyze how this technology works, the economic advantages it provides, and how it can be integrated into business processes.

For a detailed technical overview of the setup process, we recommend watching the video version.

Operational Challenges and Financial Losses in Traditional Telephony

The standard model for handling calls has several bottlenecks that lead to direct and indirect financial losses for businesses:

  1. Limited Availability. 90% of customers consider an immediate response most important, yet most companies cannot provide round-the-clock support because of limited business hours.
  2. Inefficient Use of Resources. Up to 80% of support inquiries are routine and repetitive. Assigning qualified employees to handle them is not cost-effective.
  3. Poor Customer Experience. Having to wait on hold for a long time and repeat their issue to multiple operators is the primary source of frustration for nearly 70% of customers.

Furthermore, more than half of customers will go to a competitor after just one negative experience with a company.

| Business Metric | Traditional Call Center (per operator) | Call Center with Voice AI | Economic Effect |
| --- | --- | --- | --- |
| Cost per minute of conversation | ~$3.00 – $6.50 (salary, taxes, office) | ~$0.05 – $0.25 (API cost) | 📉 Reduction in call processing costs by up to 90%. |
| Availability | 8 hours/day, 5 days/week | 24 hours/day, 7 days/week | 📈 Customer reach outside of business hours, increasing the number of potential inquiries. |
| First Response Time | 2–5 minutes (on-hold time) | Instant | ✅ Increased customer satisfaction. 89% of consumers note that response speed influences their purchasing decision. |
| % of Lost Calls | ~15–20% (outside business hours, all lines busy) | 0% | 💸 Minimized lost opportunities and potential revenue. |
| Manager's Time Allocation | 70% on routine queries, 30% on complex tasks | 10% on AI supervision, 90% on sales and complex cases | 💼 Increased productivity. Employees focus on high-value-added tasks. |
| ROI (Return on Investment) | Depends on employee efficiency | Payback on implementation averages 4–6 months. | 🔥 Fast return on investment and the ability to scale without a proportional increase in staff. |
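
The per-minute figures in the comparison above can be checked with simple arithmetic. The sketch below is illustrative only: it takes the cost ranges from the table as inputs and assumes, as a simplification, that the AI-side cost consists of API charges alone. The function name is hypothetical.

```python
def cost_reduction_pct(human_cost: float, ai_cost: float) -> float:
    """Percentage saved per minute when ai_cost replaces human_cost."""
    return (human_cost - ai_cost) / human_cost * 100


# Per-minute cost ranges (USD) taken from the comparison table.
HUMAN_LOW, HUMAN_HIGH = 3.00, 6.50
AI_LOW, AI_HIGH = 0.05, 0.25

# Least favourable pairing: cheapest operator minute vs. most expensive AI minute.
conservative = cost_reduction_pct(HUMAN_LOW, AI_HIGH)
# Most favourable pairing: most expensive operator minute vs. cheapest AI minute.
optimistic = cost_reduction_pct(HUMAN_HIGH, AI_LOW)

print(f"Per-minute reduction: {conservative:.1f}% to {optimistic:.1f}%")
```

On these inputs alone the reduction comes out above 90% in both pairings. Telephony charges, platform fees, and supervision time are not included here, so a real deployment would likely land lower.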

What's the difference between IVR and Conversational AI?

Traditional Interactive Voice Response (IVR) systems usually greet callers with a message like "press 1 for...". They represent the first generation of automation: they follow rigidly scripted scenarios and cannot handle requests that fall outside predefined limits.

Conversational AI, by contrast, interprets free-form speech and responds in a natural dialogue. The difference is comparable to the evolution from a landline telephone to a smartphone: both perform the same basic function, but their capabilities are fundamentally different.

Architecture and Operating Principles of a Voice AI Agent

Creating a Voice AI Agent system does not require deep programming or software knowledge, as modern platforms offer ready-made tools. Let's examine the system's architecture using three key services as an example.

  • Twilio – Telephony Gateway: This service provides a virtual phone number and acts as the entry point for all calls.
  • Ultravox AI – The System's Core: This is the platform where the AI agent itself is constructed and hosted. Its logic, behavior scenarios, and knowledge base are defined here.
  • n8n – Automation Platform: This tool serves as the connecting link, transmitting data between Twilio and Ultravox to ensure they interact seamlessly.

Step 1. Agent Configuration in Ultravox AI

At this stage, the main parameters of the digital assistant are defined.

  1. Basic Settings: Choosing the voice, language, accent, and other parameters that shape the agent's persona.
  2. Defining the Role (System Prompt): The key step is writing detailed instructions for the AI: a text description that defines its role, goals, communication style, and course of action.

Example of an effective prompt: "You are Max, a consultant for the investment company Flex Funds. Your goal is to qualify incoming inquiries. Your communication style is formal. Algorithm: 1. Greet the caller. 2. Clarify the client's investment goals, budget, and timeframe. 3. If the budget does not meet the minimum threshold of €169,000, offer alternative products. 4. If the criteria are met, collect contact information (full name, phone, email) to pass on to a manager."
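
The branching in steps 3 and 4 of this prompt can be written out explicitly. In the running system the language model follows the prompt text, so the sketch below is not part of the Ultravox configuration. It is a hypothetical helper (all names are illustrative) that could be used, for example, to check whether a recorded conversation took the expected branch.

```python
from dataclasses import dataclass

MIN_BUDGET_EUR = 169_000  # minimum threshold quoted in the example prompt


@dataclass
class Lead:
    budget_eur: float


def next_step(lead: Lead) -> str:
    """Return the branch of the prompt's algorithm that applies to this lead."""
    if lead.budget_eur < MIN_BUDGET_EUR:
        # Step 3: budget below the threshold, so offer alternative products.
        return "offer_alternative_products"
    # Step 4: criteria met, so collect full name, phone, and email.
    return "collect_contact_details"


print(next_step(Lead(budget_eur=50_000)))
print(next_step(Lead(budget_eur=200_000)))
```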

Step 2. Training the Agent with Corporate Data (RAG Technology)

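To provide accurate answers, the agent must have access to up-to-date company information. Retrieval-Augmented Generation (RAG) provides this without retraining the model: when a question comes in, the agent retrieves relevant passages from a knowledge base and uses them to compose its answer.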

  1. Creating a Knowledge Base: A knowledge source is created in Ultravox.
  2. Indexing Data: The system is given a link to the corporate website. It then automatically scans the site's content, structures the information, and transforms it into a knowledge base.

This way, the agent can independently find answers to specific customer questions, such as "What are the investment conditions for Fund X?" or "What is the yield of Portfolio Y?", without manually pre-programmed answers.
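
To make the retrieval step concrete, here is a toy sketch. It does not show how Ultravox implements its knowledge base; production systems typically rank passages by embedding similarity, whereas this example ranks by plain word overlap. The snippets are placeholder text rather than real fund data, and the function names are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, snippets: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k snippets sharing the most words with the question."""
    q_words = tokenize(question)
    ranked = sorted(snippets, key=lambda s: len(q_words & tokenize(s)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, snippets: list[str]) -> str:
    """Combine the retrieved context and the question into one model prompt."""
    context = "\n".join(f"- {s}" for s in retrieve(question, snippets))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Placeholder knowledge base standing in for indexed website content.
SNIPPETS = [
    "Fund X: placeholder text describing investment conditions.",
    "Portfolio Y: placeholder text describing the yield.",
]

print(build_prompt("What is the yield of Portfolio Y?", SNIPPETS))
```

The point of the design is that the answer text lives in the knowledge base, not in the prompt or the model, so updating the website content updates what the agent can say.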

Step 3. Setting Up Service Interaction in n8n

The n8n platform allows you to visually design the call processing logic using a workflow diagram.

  1. "Webhook" Node: Receives an HTTP request from Twilio when a new call comes in.
  2. "HTTP Request" Node: Sends a request to the Ultravox API to activate the corresponding AI agent.
  3. "Respond to Webhook" Node: Returns a response to Twilio with a link to the media stream created by Ultravox, establishing a direct connection between the client and the AI agent.
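
The three nodes above can be sketched in plain Python to show what data moves between the services. This is a sketch under stated assumptions, not the n8n workflow itself: the Ultravox endpoint URL, the `X-API-Key` header, and the `systemPrompt`, `medium`, and `joinUrl` field names are assumptions that should be verified against the current Ultravox documentation. The response to Twilio uses TwiML's `<Connect><Stream>` element. The API call is replaced by a stand-in function so the example runs without network access.

```python
import json
import xml.etree.ElementTree as ET

# Assumed endpoint; verify against the current Ultravox documentation.
ULTRAVOX_CALLS_URL = "https://api.ultravox.ai/api/calls"


def build_create_call_request(system_prompt: str, api_key: str) -> dict:
    """What the "HTTP Request" node would send to start an agent session."""
    return {
        "method": "POST",
        "url": ULTRAVOX_CALLS_URL,
        "headers": {"X-API-Key": api_key, "Content-Type": "application/json"},
        # Assumed field names: the prompt text plus a flag requesting a
        # Twilio-compatible media stream.
        "body": json.dumps({"systemPrompt": system_prompt, "medium": {"twilio": {}}}),
    }


def build_twiml(join_url: str) -> str:
    """What the "Respond to Webhook" node would return to Twilio."""
    response = ET.Element("Response")
    connect = ET.SubElement(response, "Connect")
    ET.SubElement(connect, "Stream", {"url": join_url})
    return ET.tostring(response, encoding="unicode")


def handle_incoming_call(create_call, system_prompt: str, api_key: str) -> str:
    """Webhook handler: create the agent session, then answer with TwiML.

    `create_call` is injected so the sketch runs offline; in production it
    would perform the HTTP request and return the parsed JSON response.
    """
    request = build_create_call_request(system_prompt, api_key)
    session = create_call(request)
    return build_twiml(session["joinUrl"])


def fake_create_call(request: dict) -> dict:
    """Stand-in for the real API call, returning a made-up stream URL."""
    return {"joinUrl": "wss://example.invalid/stream/123"}


twiml = handle_incoming_call(
    fake_create_call, "You are Max, a consultant for Flex Funds.", "test-key"
)
print(twiml)
```

Once Twilio receives this response, it opens the media stream itself, so the call audio flows directly between Twilio and the agent rather than through n8n.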

Example of the System in Action

Let's look at a simulated dialogue with the configured agent:

AI Agent: "Hello, you've reached Flex Funds. My name is Max. How can I help you?"

Client: "Good day, I'm interested in investments. My budget is around €50,000."

AI Agent: "Thank you for the information. The minimum threshold for investment in real estate co-ownership is €169,000. However, for your budget, we can offer participation in real estate funds. Are you interested in this option?"

Client: "Yes, possibly."

AI Agent: "Excellent. To provide you with detailed information, please state your full name and email address."

The agent correctly identified the client's budget, offered a relevant product, and then proceeded to the data collection stage, which aligns with its goal of lead qualification.

Conclusion

AI-based voice automation is not a replacement for human staff. Rather, it is a tool that augments business specialists.

Implementing AI systems allows businesses to reduce operational costs, increase the efficiency of inquiry processing, and improve the customer experience.

The AI takes over routine work, freeing qualified employees to resolve complex issues and perform work that requires human interaction. This makes voice agents a vital element of a modern business strategy aimed at optimization and growth.

Written by

Alex Gurbych
Chief Solutions Architect