Metrics handbook
v1 · 2026-05-06
data-driven decisions
owners · product, engineering
We measure what customers pay for.
One metric leads. The rest catch us when it lies. This is the handbook for how we make product decisions, what we celebrate internally, and how we'll price the product when revenue arrives.
The north star
single number · self-correcting · pricing-aligned
North Star Metric
Customer-Verified Qualified Meetings
per week
CVQMs = meetings booked by celliq agents AND confirmed as qualified by the customer's AE within 7 days of the meeting taking place
Why this and not "calls made" or "meetings booked": RevOps teams benchmark their human SDRs on qualified meetings, not call volume. Beating the SDR baseline on this metric is the entire pitch, so optimising the same number aligns us with the customer. And it's hard to game: a flood of bad bookings doesn't raise it, because AEs won't verify them.
Booking ≠ value.
An AE-verified qualified meeting is the smallest unit of actual outcome.
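As a sanity check on the definition, here is a minimal sketch of how the weekly CVQM count could be computed from meeting records. The field names (held_at, ae_verified_at, ae_verified_qualified) are illustrative placeholders, not our actual CRM schema; the only rule that matters is the 7-day AE-verification window defined above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

VERIFICATION_WINDOW = timedelta(days=7)  # from the CVQM definition

@dataclass
class Meeting:
    booked_by_agent: bool                # booked by a celliq agent
    held_at: Optional[datetime]          # when the meeting actually happened
    ae_verified_at: Optional[datetime]   # when the AE recorded a verdict
    ae_verified_qualified: bool          # AE's verdict: qualified or not

def is_cvqm(m: Meeting) -> bool:
    """A meeting counts as a CVQM only if the customer's AE confirmed it
    as qualified within 7 days of the meeting taking place."""
    if not (m.booked_by_agent and m.held_at and m.ae_verified_at):
        return False
    return (m.ae_verified_qualified
            and m.ae_verified_at - m.held_at <= VERIFICATION_WINDOW)

def weekly_cvqms(meetings: list[Meeting], week_start: datetime) -> int:
    """Count CVQMs among meetings held in the 7 days from week_start."""
    week_end = week_start + timedelta(days=7)
    return sum(1 for m in meetings
               if m.held_at and week_start <= m.held_at < week_end and is_cvqm(m))
```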
The metric tree
six levers · each owned by a specific system
When CVQMs drop, the diagnosis is to find which multiplier broke. Every product investment maps to one of these six levers; that mapping is the gate for any roadmap proposal.
L1
Calls dialed
target · per agent / day
Owned by
trigger ingestion + scheduler
L2
Connect rate
target · ≥ 30%
Owned by
opener · voice trust · time-to-call · caller-ID
L3
Qualification capture
target · ≥ 85%
Owned by
planner + state machine + qual fields
L4
Book rate (of qualified)
target · ≥ 60%
Owned by
live tools · calendar · AE fit · slot density
L5
Show rate
target · ≥ 75%
Owned by
Calendar Concierge · pre-meeting confirms · no-show rescue
L6
AE-verification rate
target · ≥ 80%
Owned by
summary quality · picklist mapping · AE trust
= Customer-Verified Qualified Meetings · owned by product
Worked benchmark · per agent
≈ 5 CVQMs / day
200 dials · 30% connect · 85% qual · 60% book · 75% show · 80% verify
Per customer · per week
25 → 50
5 working days · 1–2 active agents per customer at MVP
Cost per CVQM · target
£8 → £14
Twilio + LLM + STT + TTS · before our margin
SDR replacement break-even
~ £40k / yr
Per customer · what celliq saves vs hiring an SDR
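The benchmark above is the six levers multiplied together. Below is a rough sketch of the arithmetic, using the rates as listed. One caveat: the six stated rates alone compound to roughly 18 meetings per agent-day, so the ≈ 5/day planning figure implicitly assumes that only a fraction of connects are genuinely qualified prospects; that factor is our inference, not a lever the tree lists. The 25 → 50 weekly range follows from the ≈ 5/day figure.

```python
# Worked-benchmark arithmetic (illustrative; rates taken from the lever targets above).
dials_per_agent_day = 200
connect_rate        = 0.30   # L2
qual_capture        = 0.85   # L3
book_rate           = 0.60   # L4, of qualified
show_rate           = 0.75   # L5
verify_rate         = 0.80   # L6

# Multiplying the stated rates alone gives an upper bound of ~18 / agent-day.
upper_bound = (dials_per_agent_day * connect_rate * qual_capture
               * book_rate * show_rate * verify_rate)
print(f"upper bound from stated rates: {upper_bound:.1f} / agent-day")  # ≈ 18.4

# The ≈5 / agent-day planning figure implies an extra "fraction of connects that
# are genuinely qualified prospects" of roughly 5 / 18.4 ≈ 0.27
# (our inference, not a lever the tree lists).
cvqms_per_agent_day = 5

# Per customer, per week: 5 working days × 1–2 active agents at MVP.
weekly_low  = cvqms_per_agent_day * 5 * 1   # 25
weekly_high = cvqms_per_agent_day * 5 * 2   # 50
print(f"per customer / week: {weekly_low}–{weekly_high}")
```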
Counter-metrics
NSM growth that breaks any of these is fool's gold
Compliance event rate
DNC matches that should have been caught earlier, calling-window violations, missed disclosures.
owned by · compliance layer · runtime gates
Cost per CVQM
Twilio + LLM + STT + TTS spend ÷ verified meetings. A climb here kills margin without anyone noticing.
owned by · provider abstraction · model selection · prompt size
AE override rate
% of celliq-written CRM records the AE deletes or rewrites. Climbing override rate = trust eroding.
owned by · outcome writer · picklist mapping · summary quality
Review queue ratio
% of calls auto-flagged for human review. Some flags are healthy; runaway flags mean extraction is decaying.
owned by · post-call extraction · confidence thresholds
24h cancel rate
% of meetings cancelled by the prospect within 24h of booking. Catches "agent booked something the prospect immediately regretted."
owned by · planner · qualification · slot offer
p95 voice latency
User finishes speaking → agent starts speaking. Beyond ~1.5s, callers feel the lag and disengagement climbs.
owned by · voice runtime · provider choice · region
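p95 voice latency is the counter-metric most prone to measurement ambiguity, so here is a minimal sketch of one way to compute it per call. The turn timestamps (user_speech_end, agent_speech_start) are illustrative names, not our runtime's actual event schema.

```python
import math

def p95_voice_latency(turns: list[tuple[float, float]]) -> float:
    """p95 of (agent_speech_start - user_speech_end) across turns, in seconds.
    Each tuple is (user_speech_end, agent_speech_start) as epoch timestamps."""
    gaps = sorted(agent_start - user_end for user_end, agent_start in turns)
    if not gaps:
        raise ValueError("no turns to measure")
    # Nearest-rank p95: the smallest gap with at least 95% of gaps at or below it.
    rank = math.ceil(0.95 * len(gaps))
    return gaps[rank - 1]

# Flag against the ~1.5s threshold from the counter-metric definition.
ALERT_THRESHOLD_S = 1.5
```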
How the NSM evolves
stage-appropriate · what we measure today vs tomorrow
Now · pre-revenue
00
Pilots running with CVQMs > 0
We haven't earned the right to optimise CVQMs at scale. First goal: the system books a real qualified meeting at all.
Pilot · weeks 6–8
01
CVQMs per customer per week
Per-customer view, because at 1–3 customers averages mean nothing. Each pilot has its own number.
Steady · 5+ customers
02
Total CVQMs / week + per-customer trend
Both growth (total) and per-customer health (trend). Either failing is a real problem.
Maturity · attribution clean
03
Net Pipeline Influenced (£)
Once attribution is reliable, money is the better truth. CVQMs become a leading indicator under the new star.
Pricing alignment
customer pays for the same number we optimise
Outcome-based pricing follows the NSM. If we charge per CVQM (rather than per minute, per call, or per seat), the customer pays for what they actually wanted, and we have a unit-economics model that improves as the product improves. Every multiplier we optimise in the metric tree directly reduces our cost per CVQM and increases the customer's value.
Pricing model · £ per CVQM
Indicative · UK SaaS · £40 – £80 / CVQM
Margin floor at maturity · ≥ 70%
Vs. per-minute pricing · + aligned · - punishes long good calls
Per-minute is what most voice-agent vendors quote. It punishes thoughtful long calls and rewards fast bad ones. Per-CVQM rewards the agent that does the thing the customer hired it for.
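As a quick cross-check of these figures, margin per CVQM follows directly from price and cost per CVQM. The arithmetic below is illustrative only, using the numbers quoted above.

```python
# Gross margin per CVQM = (price - cost) / price, using the figures above.
def margin(price_gbp: float, cost_gbp: float) -> float:
    return (price_gbp - cost_gbp) / price_gbp

# Cost per CVQM target £8–£14, indicative price £40–£80:
print(f"low price, high cost: {margin(40, 14):.0%}")   # 65%, under the 70% floor
print(f"high price, low cost: {margin(80, 8):.0%}")    # 90%

# To clear the ≥70% floor at the bottom of the price band (£40 / CVQM),
# cost per CVQM has to land at £12 or below.
```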
What we don't chase
explicit · so we don't celebrate the wrong numbers
calls made
Pure noise.
Volume without quality is a vendor metric, not a customer metric. A team that hits a calls-made target by lowering qualification is destroying value.
meetings booked
Gameable.
Without the AE-verification step, "book at all costs" becomes a tactic. We've seen this play out in the human-SDR world; celliq doesn't repeat it.
£ pipeline influenced
Right metric · wrong stage.
It's where we end up at maturity, but during MVP it's too laggy and contentious for product feedback. We track it as a secondary, not a star.
connect rate
Input · not outcome.
Belongs in the tree (L2), not as the star. A team optimising for connect rate alone could ship a worse product overall.
customer NPS
Lagging · low resolution.
Doesn't capture daily product progress, doesn't move week-on-week, and is more about CSM relationship than product quality.
minutes of call audio
Vendor metric.
Talking longer ≠ doing better. This is the one most likely to be quoted at us by competitors and the one we deliberately don't price on.
How this drives decisions
PRDs · prioritisation · A/B · dashboards
PRDs
Which lever does this move?
Every PRD declares the metric-tree lever (L1–L6) it improves and by how much. A PRD that touches no lever and breaks no counter-metric is an honest signal that we shouldn't build it.
Prioritisation
Which lever is currently weakest?
We invest where the multiplication chain has its smallest number this quarter. If show rate (L5) is dragging, Calendar Concierge work outranks all other agent improvements until L5 recovers.
A/B testing
Did the variant raise its target lever without breaking counters?
Variants are graded on the specific lever they target plus all six counter-metrics. A variant that improves connect rate but lifts override rate is rejected (a sketch of the grading rule follows this list).
Dashboards
Can a customer see the same numbers we see?
Customer-facing analytics shows L1–L6 + CVQMs, alongside the same counter-metrics. Trust is built by sharing the diagnosis, not just the conclusion.
Pricing reviews
Has cost-per-CVQM moved enough to reprice?
Quarterly. If our cost falls 30%+ we either lower price (compete) or hold price (margin). The choice is explicit, made on the data.
Hiring
Which lever does this person own?
Roles are scoped to a specific lever or counter-metric. A hire whose work doesn't tie to one of the numbers on this page is mis-scoped.
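A minimal sketch of the A/B grading rule referenced above: accept a variant only if its target lever improved and no counter-metric regressed. The metric names and the tolerance default are placeholders, not our actual experiment config.

```python
from dataclasses import dataclass

COUNTER_METRICS = [
    "compliance_event_rate", "cost_per_cvqm", "ae_override_rate",
    "review_queue_ratio", "cancel_24h_rate", "p95_voice_latency",
]

@dataclass
class VariantResult:
    target_lever: str                 # e.g. "L2 connect rate"
    lever_delta: float                # relative change vs control, e.g. +0.04
    counter_deltas: dict[str, float]  # relative change per counter-metric

def grade(variant: VariantResult, counter_tolerance: float = 0.0) -> bool:
    """Accept only if the target lever improved and no counter-metric
    degraded beyond tolerance (all six counters are bad when they rise)."""
    if variant.lever_delta <= 0:
        return False
    return all(variant.counter_deltas.get(m, 0.0) <= counter_tolerance
               for m in COUNTER_METRICS)
```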
Approval
v1 · 2026-05-06 · revisit quarterly
This handbook becomes the canonical reference for product decisions. When something doesn't fit — a customer wants us to optimise something not in this tree, or an internal proposal doesn't tie to a lever — that's a flag, not an exception. Either the proposal is wrong, or the handbook is wrong. We update one or the other; we don't make quiet exceptions.