Every helpdesk vendor sells AI Agent now. Gorgias has one. Zendesk has Advanced AI. Intercom has Fin. Front has its own. The pitch is always the same: "automate 50% of your tickets, save 30% on team cost".
Some of those claims are true. Some aren't. After a year of shipping AI Agent on 60+ Gorgias accounts, here's the honest version.
When AI Agent actually works
There's a clean profile of brand and inbox where AI Agent ships measurable value. If you fit it, expect 50–60% repeat-ticket coverage in 90 days. If you don't, expect to be disappointed.
The fit:
- DTC brand on Shopify with €2M+ GMV.
- 1,000+ tickets a month.
- Top 10 ticket categories cover 70%+ of volume.
- Refund and return policies are written down somewhere a human can point to.
- Brand voice is consistent and documented (or can be inferred from 90 days of agent replies).
If those five lines describe you, AI Agent is going to do real work.
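The third criterion is the one most brands can't answer from memory. A minimal sketch of how to check it, assuming you can export one category label per ticket for a month (the category names and counts below are made up for illustration):

```python
from collections import Counter

def top_n_coverage(categories, n=10):
    """Share of tickets that fall in the n most common categories."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counts.most_common(n))
    return top / total

# Hypothetical one-month export: one category label per ticket.
tickets = (
    ["order_status"] * 400
    + ["returns"] * 250
    + ["sizing"] * 150
    + ["wholesale"] * 120
    + ["press"] * 80
)

coverage = top_n_coverage(tickets, n=3)
meets_fit = coverage >= 0.70  # the 70% bar from the fit list
print(f"Top 3 categories cover {coverage:.0%} of volume, fit: {meets_fit}")
```

On a real account you'd run this with n=10 over the last 90 days of tagged tickets. If the number comes back well under 70%, the inbox is fragmented and the coverage figures below won't transfer.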
The numbers we typically see on a tuned account:
- 50–60% of repeat tickets handled end-to-end without human touch.
- 30–40% of total inbox volume off the agent team's plate.
- First response time on AI-handled tickets under 60 seconds.
- CSAT on AI-handled tickets equal to or higher than human-handled.
That last one surprises people. On well-tuned accounts, AI Agent often scores higher CSAT than human agents. Why: it's never tired, never grumpy, never inconsistent. The customer doesn't care that it's a bot. They care that the answer was right and arrived in under 60 seconds.
When AI Agent doesn't work
Three failure modes we see consistently. If your brand is in any of these, slow down.
1. Inconsistent policies
If your refund policy lives in one person's head and changes by mood, AI Agent can't help you. It needs deterministic rules. "Refund up to €100 if the customer has fewer than 2 prior refunds and uploads a photo of the defect." That's a rule. "We refund when it feels right" is not.
The fix isn't more AI tuning. It's writing your policies down first. Brands that skip this step ship an AI Agent that contradicts itself. Customers complain, and the team turns it off.
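To show what "deterministic" means in practice, here is the example refund rule above written as a function. This is a sketch of the rule itself, not of how Gorgias stores or evaluates it; the class and field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class RefundRequest:
    amount_eur: float
    prior_refunds: int
    has_defect_photo: bool

MAX_REFUND_EUR = 100.0   # "refund up to €100"
MAX_PRIOR_REFUNDS = 2    # "fewer than 2 prior refunds"

def can_auto_refund(req: RefundRequest) -> bool:
    """True only when every condition of the written policy holds."""
    return (
        req.amount_eur <= MAX_REFUND_EUR
        and req.prior_refunds < MAX_PRIOR_REFUNDS
        and req.has_defect_photo
    )

print(can_auto_refund(RefundRequest(80.0, 1, True)))   # within all three limits
print(can_auto_refund(RefundRequest(80.0, 2, True)))   # too many prior refunds
```

Same inputs, same answer, every time. "We refund when it feels right" can't be written this way, which is the test of whether a policy is ready.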
2. Brand voice that's a religion
Some brands have a voice that's intentionally idiosyncratic. Intentional typos, inside jokes, very specific punctuation tics. AI Agent can be tuned to mimic it, but the tuning effort is 3× a normal account, and the quality bar is harder to hit.
We've taken on accounts like this and shipped AI Agent that works. We've also told brands honestly: "your voice is so specific that the cost of getting AI Agent on-brand exceeds the cost savings for the next 18 months". One brand a year falls in this bucket.
3. High-touch luxury
A €4,000 watch brand asking customers to upload a defect photo and processing the refund via bot is a brand mistake, not a CX optimization. Some categories live or die on the human moment. If your AOV is €1,000+ and your customer expects a concierge experience, AI Agent's role shrinks dramatically — it should triage and pre-fill, not close.
What "tuned" means
The biggest difference between AI Agent that works and AI Agent that gets switched off is tuning. Most brands enable AI Agent, give it three FAQ articles, watch it answer "what are your shipping times" successfully, and conclude they've shipped AI.
That's not tuned. That's the demo.
Tuning means:
Knowledge base. A real document. Refund policy, return windows, shipping cut-offs, sizing notes per category, defect handling, exception rules. Versioned, dated, owned by someone. AI Agent reads this before answering anything.
Tone training. 90 days of resolved tickets fed in as exemplars. AI Agent picks up rhythm, signature, exclamation policy. If the output doesn't sound like you, you keep tuning until it does.
Action permissions. Refund up to what amount, exchanges within what window, address changes within what SLA. Each action has a rail. Outside the rails, escalate.
Categorization training. AI Agent classifies the incoming ticket before it answers. If the classifier is wrong, the answer is wrong. We tune the classifier on the brand's last 90 days of tagged tickets.
Quality monitoring. Weekly audit of every AI-handled ticket flagged for tone or accuracy. Drift caught before it scales.
A typical Gorgias AI Agent setup ships in 4 weeks. A tuned one ships in 6–8. Those extra two to four weeks are the difference between 25% repeat-ticket coverage and 60%.
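The action permissions item is the easiest to picture as a table of limits with one default: anything outside the table escalates. A rough sketch under that assumption follows. The action names, field names and every limit except the €100 refund cap are placeholders, not recommended values and not Gorgias configuration syntax.

```python
# Hypothetical rails: one numeric limit per action.
RAILS = {
    "refund": {"field": "amount_eur", "max": 100},
    "exchange": {"field": "days_since_delivery", "max": 30},
    "address_change": {"field": "hours_since_order", "max": 2},
}

def decide(action: str, request: dict) -> str:
    """Return "execute" inside the rail, "escalate" in every other case."""
    rail = RAILS.get(action)
    if rail is None:
        return "escalate"  # no rail defined for this action
    value = request.get(rail["field"])
    if value is None or value > rail["max"]:
        return "escalate"  # missing data or over the limit
    return "execute"

print(decide("refund", {"amount_eur": 60}))    # inside the rail
print(decide("refund", {"amount_eur": 250}))   # over the cap
print(decide("cancel_order", {}))              # unknown action
```

The design choice that matters is the default. Unknown actions and missing data go to a human rather than being guessed at.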
The hallucination question
The thing every CFO asks. "What if the AI invents a refund policy that doesn't exist?"
Real risk. Gorgias's AI Agent is grounded in your knowledge base — it answers from sourced content rather than from open-domain training. But "grounded" doesn't mean "incapable of being wrong". It means "less likely to be wrong, and traceable when it is".
The mitigations we put in production:
- AI Agent escalates rather than guesses on low-confidence inputs.
- Action permissions cap blast radius (it can refund €100, not €10,000).
- Weekly quality audit catches drift before it scales.
- A "kill switch" that disables AI Agent within 60 seconds if a quality issue surfaces.
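Two of these mitigations, escalating on low confidence and the kill switch, come down to a routing decision made before anything is sent. A minimal sketch, assuming each draft reply carries a confidence score between 0 and 1 (whether and how your platform exposes such a score is an assumption here, and the 0.8 threshold is a placeholder):

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.8  # hypothetical cut-off, tune per account

@dataclass
class Draft:
    category: str
    reply: str
    confidence: float  # assumed score in [0, 1]

class Guard:
    """Decides whether a draft is auto-sent or handed to a human."""

    def __init__(self, threshold: float = CONFIDENCE_THRESHOLD):
        self.threshold = threshold
        self.enabled = True  # kill switch state

    def kill(self) -> None:
        """Kill switch: route everything to humans from now on."""
        self.enabled = False

    def route(self, draft: Draft) -> str:
        if not self.enabled:
            return "human"
        if draft.confidence < self.threshold:
            return "human"  # escalate rather than guess
        return "auto_send"

guard = Guard()
print(guard.route(Draft("order_status", "Your parcel ships today.", 0.95)))
print(guard.route(Draft("refund", "We can refund you.", 0.40)))
guard.kill()
print(guard.route(Draft("order_status", "Your parcel ships today.", 0.95)))
```

After `kill()` even a high-confidence draft goes to a human, which is the behaviour you want while a quality issue is being investigated.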
In one year of shipping this on 60+ accounts, we've had two material quality incidents. Both were caught in the weekly audit. Neither involved a customer-facing chargeback. Cost: roughly two ticket-level apologies. That's the realistic risk profile.
What "AI replaces my CX team" misses
The pitch you'll hear from vendors: "AI Agent handles 50% of tickets, you cut your team in half".
The reality:
- AI Agent handles the easy 50% — order status, address change, sizing, simple refunds.
- Your team is now full-time on the hard 50% — escalations, complex returns, defects, edge cases, partnership ops, B2B.
- The hard 50% takes more effort per ticket, not less.
- Your team headcount probably stays flat, but skill mix shifts senior. Junior triage roles disappear. CX architects, problem-solvers and account-level relationship people grow.
What changes is what you can do at your existing headcount. A 10-person team with AI Agent can handle the volume of a 16-person team without it. The savings show up as growth absorbed without backfill, not as redundancies.
This is the version of "AI replaces work" that's true. The version where you cut your CX team in half on day one is the version that puts the AI Agent in a position to fail in week three because there's no human safety net behind it.
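The 10-person versus 16-person comparison follows from simple arithmetic. A sketch, under the simplifying assumption that every ticket costs the same effort (the function name is ours for illustration):

```python
def equivalent_team_size(headcount: float, share_automated: float) -> float:
    """Team size that would be needed without automation to handle the
    same total volume, assuming equal effort per ticket."""
    if not 0 <= share_automated < 1:
        raise ValueError("share_automated must be in [0, 1)")
    return headcount / (1 - share_automated)

# 30-40% of total inbox volume off the team's plate, per the numbers above.
for share in (0.30, 0.375, 0.40):
    size = equivalent_team_size(10, share)
    print(f"{share:.1%} automated: 10 people cover the volume of {size:.1f}")
```

At 37.5% of volume automated, 10 people cover what would otherwise take 16. The 30–40% range gives roughly 14 to 17. Because the remaining tickets are the hard ones and take more effort each, treat these figures as an upper bound rather than a forecast.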
What to ship first
If you're considering AI Agent and want a tight first project, here's the 4-week minimum viable scope:
Week 1. Knowledge base draft. Refund policy, return windows, shipping cut-offs, sizing notes, escalation criteria. One Notion doc, signed off by ops and CX.
Week 2. Tone training. Feed AI Agent 90 days of resolved tickets. Tune until it sounds like you. Configure action permissions on refunds, address changes and exchanges.
Week 3. AI Agent in shadow mode. Drafts replies that go to human approval. Weekly quality audit on the drafts. No customer-facing automation yet.
Week 4. Flip auto-send on the highest-confidence categories — order status, address change, sizing FAQ. Watch quality. If it holds, expand the next week.
By week 8, you should be at 30%+ repeat-ticket coverage. By week 12, at 50%+. By month 6, you should be down to quarterly fine-tuning.
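The week 3 to week 4 handover is a decision about which categories have earned auto-send. One way to make that decision explicit, sketched under our own assumptions: log whether a human approved each shadow-mode draft without edits, then promote only categories with enough drafts and a high enough approval rate. The thresholds below are placeholders, not figures from our accounts.

```python
from collections import defaultdict

def categories_ready_for_auto_send(reviews, min_drafts=50, min_approval=0.95):
    """reviews: iterable of (category, approved_without_edits) pairs
    collected during shadow mode. Returns categories that pass both bars."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for category, ok in reviews:
        totals[category] += 1
        if ok:
            approved[category] += 1
    return sorted(
        c for c in totals
        if totals[c] >= min_drafts and approved[c] / totals[c] >= min_approval
    )

# Hypothetical shadow-mode log.
reviews = (
    [("order_status", True)] * 60
    + [("returns", True)] * 40
    + [("returns", False)] * 20
    + [("sizing", True)] * 10
)
print(categories_ready_for_auto_send(reviews))
```

In this made-up log only order_status qualifies: returns has too many edited drafts, and sizing has too few drafts to judge. Rerunning the same check each week gives a concrete meaning to "if it holds, expand".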
When you should not ship AI Agent yet
A short list:
- You're below €2M GMV. Hand-craft your top 30 macros instead.
- Your refund policy isn't written down.
- Your top 10 ticket categories don't cover 70% of volume — your inbox is too fragmented.
- You don't have someone who'll own the knowledge base after launch.
- Your team is at war about whether to ship AI at all. Settle that first.
Each of those is a "fix this first" signal. Shipping AI on top of any one of them while it's still unresolved is the easiest way to spend €15K and conclude "AI doesn't work for our brand".
What we'd actually do
If you want a 30-minute read on whether AI Agent fits your account, we'll do that. We'll look at your top 10 ticket categories, your policy maturity, your team capacity, and tell you on the call whether to ship AI Agent now, in 90 days, or not yet. If "not yet" is the answer, we'll tell you exactly what to fix first.
If you sell on Shopify and you're serious about CX, you talk to custo.tech.
Book a demo — 30 minutes, no deck.
