Tanay Arora
AI & Data Strategy · Part 1

From dashboards to decisions.

There's a gap between data existing and data mattering. It shows up on Monday mornings, when a question the data should answer in seconds takes two days and a ticket. This piece covers the infrastructure that closes that gap: a semantic layer, an automated weekly review, and an intervention engine.

The gap worth closing

The pipelines were running. Product data existed. But metric definitions lived in fifteen different BI workbooks, nobody agreed on what "activation" meant, and every business question went through the data team. A ticket, a wait, an answer that arrived after the decision had already been made.

The first job was to fix the foundation — a single semantic layer where every metric was defined once and owned. The second was to make the data reachable without an engineer in the loop.

Stage 1, Report: raw SQL queries
Stage 2, Observe: dbt models + dashboards, self-service (most teams stop here)
Stage 3, Automate: semantic layer + AI agents, auto-reports, WBR workflow
Stage 4, Intervene: churn signals, risk scoring, auto-trigger Intercom/Slack, outcome loop
Fig 1 — The four maturity stages: from reporting to automated intervention

I built the data platform at Lyrebird from scratch. Getting to stage two was the easy part — pipelines running, dashboards live, tickets getting closed. But it was clear that wasn't enough. For data to matter strategically it needs three things: to be accurate enough that people trust it, accessible enough that they don't need an engineer to get it, and oriented around the questions the business actually cares about. That's what was worth building.

Building the semantic layer first

Before any AI agent can query your data reliably, the data has to mean something. That sounds obvious, but most warehouses are a graveyard of inconsistently named tables, duplicated metric definitions, and logic that lives in fifteen different BI workbooks. An LLM pointed at that will hallucinate confidently and be wrong.

The fix is a semantic layer — a governed interface where every metric is defined once, in one place, with one owner. I built this using Snowflake semantic views: SQL views with explicit metric definitions, business-friendly column names, and row-level access controls that enforce what each role is allowed to see.
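To make "defined once, in one place, with one owner" concrete, here is a minimal sketch of the rule in Python. It is an illustration of the contract only, not how Snowflake semantic views are implemented; the `MetricDefinition` and `MetricRegistry` names and the example SQL expression are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    name: str   # business-friendly metric name
    owner: str  # the single team accountable for the definition
    sql: str    # the one governed expression for this metric


class MetricRegistry:
    """Holds exactly one definition per metric name."""

    def __init__(self) -> None:
        self._metrics: dict[str, MetricDefinition] = {}

    def register(self, metric: MetricDefinition) -> None:
        # Refuse a second definition instead of silently overwriting the first.
        if metric.name in self._metrics:
            existing = self._metrics[metric.name]
            raise ValueError(
                f"'{metric.name}' is already defined and owned by {existing.owner}"
            )
        self._metrics[metric.name] = metric

    def get(self, name: str) -> MetricDefinition:
        return self._metrics[name]


registry = MetricRegistry()
registry.register(
    MetricDefinition(
        name="activation_rate",
        owner="Product",
        # Hypothetical expression; the real definition lives in the semantic view.
        sql="COUNT_IF(saved_note_day_1) / COUNT(*)",
    )
)
```

A second team trying to register its own `activation_rate` gets an error naming the current owner, which is the conversation the semantic layer is meant to force.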

Sources: Stripe · Product DB · Gong / CRM · BP EMR · HubSpot · Amplitude · Linear · Intercom
Transform: dbt transformation layer (FCT_WEEKLY_USER_BUSINESS_METRICS · FCT_GONG_CALLS · staging models · CI/CD pipeline)
Semantic: Snowflake Semantic Views (1 definition per metric · governed access controls · business-friendly names)
Consumers: BI / Dashboards (Finance / Product) · WBR AI Agent (Claude via MCP)
Fig 2 — Semantic layer architecture: one governed interface for every consumer, including AI agents

The critical design decision was treating AI agents as first-class consumers of the semantic layer, not an afterthought. When I connected Claude to Snowflake via MCP, it queried the semantic views — not raw tables. That means it sees business-friendly metric names, enforced access controls, and pre-validated logic. The agent can't accidentally query a table it shouldn't see, and it can't invent a metric definition that contradicts the one Finance is using.
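The same idea can be sketched on the agent side: the agent only ever builds queries against semantic views its role is granted, and only for named metrics. This is a hedged illustration, assuming hypothetical role names and view names (`SV_WEEKLY_BUSINESS_METRICS`, `SV_REVENUE`); in the setup described above the real enforcement sits in Snowflake's access controls, not in application code.

```python
# Hypothetical role-to-view grants; in practice the warehouse enforces these.
ALLOWED_VIEWS: dict[str, frozenset[str]] = {
    "wbr_agent": frozenset({"SV_WEEKLY_BUSINESS_METRICS"}),
    "finance": frozenset({"SV_WEEKLY_BUSINESS_METRICS", "SV_REVENUE"}),
}


def build_agent_query(role: str, view: str, metrics: list[str]) -> str:
    """Return a SELECT against a semantic view, or refuse if the role lacks access."""
    if view not in ALLOWED_VIEWS.get(role, frozenset()):
        raise PermissionError(f"role '{role}' may not query '{view}'")
    if not metrics:
        raise ValueError("at least one metric is required")
    # Metric names must be plain identifiers so free-form SQL cannot be passed in.
    for metric in metrics:
        if not metric.isidentifier():
            raise ValueError(f"'{metric}' is not a valid metric name")
    return f"SELECT {', '.join(metrics)} FROM {view}"
```

The agent picks from governed names rather than composing SQL against raw tables, so a request outside its grants fails loudly instead of returning a plausible wrong answer.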

The insight: A semantic layer built only for BI tools will fail when you add AI. Build it for the most demanding consumer first — an LLM that will query it at 3am without a human in the loop — and the BI use case becomes easy.

8 data sources feeding the layer
31 internal users on one platform
1 source of truth per metric

Agentic analytics: the WBR that runs itself

The first thing I automated once the semantic layer was in place was the Weekly Business Review. Every Monday, a senior leadership team at Lyrebird would spend hours compiling metrics across six business sections — product, marketing, SMB sales, enterprise, conversion funnel, and north star. The output was a PowerPoint. The insight arrived 48 hours after the week ended.

I replaced it with an AI agent that queries the semantic layer, calculates 13-week trend tables, scores every metric against its vPlan target with red/amber/green (RAG) scoring, and generates a shareable Claude artifact and a formatted Word document. It runs automatically, triggered on Monday morning.

Trigger: Monday 07:00 AEST, trigger fires
Snowflake query: semantic views · 13-week window
Claude agent: RAG scoring · trend analysis · vPlan delta
Outputs: Claude artifact (interactive · shareable) · Word document (leadership-ready)
Stage labels: Cron · ~40s · ~90s · Delivered
Fig 3 — The autonomous WBR pipeline: from cron trigger to shareable outputs in under 3 minutes
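The scoring and trend steps in the pipeline can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the agent's actual logic: the 10% amber band, the "higher is better" direction, and the function names `rag_status` and `trend_table` are all hypothetical.

```python
def rag_status(actual: float, target: float, amber_band: float = 0.10) -> str:
    """Score a metric against its plan target as green, amber or red.

    Assumptions: higher is better, the target is positive, and anything
    within `amber_band` (10% by default) below target counts as amber.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    delta = (actual - target) / target  # relative gap to plan
    if delta >= 0:
        return "green"
    if delta >= -amber_band:
        return "amber"
    return "red"


def trend_table(weekly_values: list[float], window: int = 13) -> list[dict]:
    """Return the last `window` weeks, each with its week-over-week change."""
    recent = weekly_values[-window:]
    rows = []
    for i, value in enumerate(recent):
        previous = recent[i - 1] if i > 0 else None
        rows.append(
            {
                "week_index": i + 1,
                "value": value,
                # The first week in the window has nothing to compare against.
                "wow_change": None if previous is None else value - previous,
            }
        )
    return rows
```

Keeping the thresholds in one function is what makes the weekly output consistent: the same rule scores every metric, every week.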

The human still writes the narrative: decisions, highlights, lowlights. That's appropriate. But the 3 hours of data compilation that preceded that thinking have dropped to zero. The agent is also more consistent than a human: it never misses a metric, never miscalculates a vPlan score, and never formats a table differently from the previous week.

The intervention engine

The semantic layer and the automated WBR freed up enough time and trust in the data to tackle the problems that actually mattered commercially. At Lyrebird there were two: activation rates that were stalling out, and paying GPs going quiet with nobody watching.

Problem 1 — GP activation wasn't converting

GP activation rates had been declining, from 61% in January 2026 to 44% by May. The core issue: 43–55% of new GPs never got a note saved to Best Practice (BP) on day one, which meant they never reached the "dopamine hit" moment that drives retention. No note saved to BP, no value felt, no reason to come back.

The solution was a structured day-one onboarding flow with two diverging paths based on whether the GP reached that value moment or not.

ENTRY · COHORT DAY 1: user signs up & completes first session
→ User signs up
→ Completes first listening session (session 1)
PATH 1, VALUE MOMENT REACHED: gets the "Dopamine Hit"
→ Note saved to BP successfully
→ Trigger: dopamine hit notification sent
→ Show to-do list for next steps
→ Social proofing content surfaced
PATH 2, NO VALUE MOMENT: session done, note not saved
→ Session completed but note not saved to BP
→ Falls into EOD Day 1 comms flow
→ Research prompt triggered: "Why wasn't your note saved?"
END OF DAY 1 · COMMUNICATION BRANCHES: three outreach paths based on integration status
→ 2.1A Email (integrated): Coming survey · trigger integration follow-up
→ 2 Email (not integrated): nudge toward integrated recordings
→ 2.1B Call-up: high-touch outreach, feeds into Day 2 re-engagement
OPEN QUESTIONS: what we still need to answer
• Why are notes not being saved? (core blocker to reaching the "Dopamine Hit" path)
• Investigate integration gaps and recording issues for non-integrated users
• Can social proofing / to-do list nudges help close the activation gap for Path 1 users?
43–55% non-activated (never saved)
Fig 4 — GP day-one onboarding flow: two paths based on whether the "dopamine hit" value moment is reached

The onboarding flow above defines the logic — who gets what, and when. What follows is the architecture that makes it run. Snowflake does the qualification, HubSpot executes the comms, and the outcome data flows back into Snowflake to close the loop.

The Engine Behind the Flow
→ Snowflake: batch job qualifies users on business criteria · splits into experiment & control groups
→ Sync to HubSpot: user segments pushed · experiment / control flags included
→ HubSpot Workflows: triggers comms per path · Path 1: dopamine hit flow · Path 2: EOD rescue flow · Control: no comms
→ Back to Snowflake: HubSpot outcomes re-ingested · experiment lift measured
Experiment group: receives comms
Control group: no comms, measures natural recovery
Fig 4a — Activation engine architecture: Snowflake qualifies and splits users, HubSpot executes, outcomes flow back for measurement

The 43–55% of GPs who never saved a note to BP on day one were the problem population. For them, the flow triggered a research prompt ("Why wasn't your note saved?") and an end-of-day comms sequence segmented by integration status — integrated GPs got a follow-up survey, non-integrated got a nudge toward BP setup, and the highest-touch segment got a direct call routed into the day-two re-engagement flow.
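The day-one routing described above can be sketched as a small decision function. This is a minimal sketch, not the production HubSpot workflow: the field names and route labels are hypothetical, and `high_touch` is an assumed flag, since the text does not say how the call-up segment is selected or whether it takes priority over the email branches.

```python
from dataclasses import dataclass


@dataclass
class DayOneGP:
    completed_first_session: bool
    note_saved_to_bp: bool
    integrated: bool
    high_touch: bool = False  # assumption: selection rule for calls is not specified


def day_one_route(gp: DayOneGP) -> str:
    """Return the comms path a GP falls into at the end of day one."""
    if not gp.completed_first_session:
        return "not_in_flow"
    if gp.note_saved_to_bp:
        # Path 1: value moment reached.
        return "path_1_value_moment"
    # Path 2: session done, note not saved. Branch on outreach type.
    if gp.high_touch:
        return "path_2_call"
    if gp.integrated:
        return "path_2_email_integrated_survey"
    return "path_2_email_non_integrated_nudge"
```

Writing the branches as one function makes the open questions visible: every GP who lands in a `path_2_*` route is someone whose note was not saved, which is the population the research prompt targets.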

Problem 2 — Paying GPs going quiet with no one watching

The second problem was harder to solve with comms alone. These were already-paying GPs — some with months of tenure — who had simply stopped using the product. No single trigger, no obvious day-one failure. Just a slow drift toward zero consults over several weeks, and a 5.4-week average window before they cancelled.

With 674 GPs showing warning signals across the Critical, High, and Medium risk tiers, and each percentage point of activation worth roughly 0.71 points of paid conversion, the cost of a manual spot-check process was too high. So I built a weekly churn risk engine and wired it directly to the CS team's workflow.

Snowflake: weekly refresh · 600+ paying GPs
Risk Engine: Critical (0 WAU 4wk+) · High (0 WAU 2–3wk) · Medium (declining) · Watch (early signals)
Weekly Skill: user-level risk list · MRR exposure · delivered to CS lead
CS Lead: emails · calls · records outcome · signal → HubSpot
HubSpot → Snowflake: HubSpot data flows back into Snowflake, merged with product usage data to track whether outreach moved the needle
Tier counts: Critical 145 users (highest MRR exposure) · High 217 users · Medium 312 users
674 total at-risk · 5.4-week avg intervention window
Fig 5 — GP churn risk engine: weekly scoring feeds a CS skill, outreach signals flow back through HubSpot into Snowflake for effectiveness tracking

Every Monday after the Snowflake refresh, the engine classifies every non-enterprise paying GP into one of four risk tiers based on consecutive weeks of zero usage and MRR exposure. The output is a structured weekly skill delivered directly to the CS team lead — a prioritised list with user context, tenure, and risk tier. The lead then works through it: emails, calls, records the outcome signals in HubSpot.
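A hedged sketch of the tiering logic follows. The Critical and High thresholds mirror the tier labels in Fig 5 (zero usage for 4+ weeks, and for 2–3 weeks). The rules for Medium and Watch are assumptions for illustration only, since the source describes them only as "declining" and "early signals". Function names and the dictionary fields are hypothetical.

```python
from typing import Optional

TIER_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Watch": 3}


def consecutive_zero_weeks(weekly_consults: list[int]) -> int:
    """Count trailing weeks (most recent last) with zero consults."""
    count = 0
    for consults in reversed(weekly_consults):
        if consults != 0:
            break
        count += 1
    return count


def risk_tier(weekly_consults: list[int]) -> Optional[str]:
    """Classify a GP from weekly consult counts, oldest week first."""
    zero_weeks = consecutive_zero_weeks(weekly_consults)
    if zero_weeks >= 4:
        return "Critical"
    if zero_weeks >= 2:
        return "High"
    # Assumed rule: two consecutive weekly declines count as "declining".
    if len(weekly_consults) >= 3:
        a, b, c = weekly_consults[-3:]
        if a > b > c:
            return "Medium"
    # Assumed rule: a single zero week or a single drop is an early signal.
    dropped = len(weekly_consults) >= 2 and weekly_consults[-1] < weekly_consults[-2]
    if zero_weeks == 1 or dropped:
        return "Watch"
    return None


def prioritise(gps: list[dict]) -> list[dict]:
    """Build the weekly list: worst tier first, then highest MRR first."""
    scored = [{**gp, "tier": risk_tier(gp["weekly_consults"])} for gp in gps]
    at_risk = [gp for gp in scored if gp["tier"] is not None]
    return sorted(at_risk, key=lambda gp: (TIER_ORDER[gp["tier"]], -gp["mrr"]))
```

Sorting by tier and then MRR is one plausible reading of "prioritised list"; the actual ordering the CS lead receives may weigh tenure or other context differently.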

The part that closes the loop: HubSpot data flows back into Snowflake, where it's merged with product usage data. That means we can see whether a CS outreach actually moved the needle — did the GP who got a call in week one start using the product again in weeks two and three? The signal from the CS team's work becomes data in the same system that generated the risk score.
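The question in that paragraph, whether a GP contacted in week one came back in weeks two and three, reduces to a small check once outreach and usage sit in the same place. A minimal sketch, assuming usage is available as a week-number-to-consults mapping; the function name and the two-week follow-up window default are hypothetical.

```python
def reengaged_after_outreach(
    outreach_week: int,
    usage_by_week: dict[int, int],
    follow_up_weeks: int = 2,
) -> bool:
    """True if the GP logged any consults in the weeks right after outreach."""
    following = range(outreach_week + 1, outreach_week + 1 + follow_up_weeks)
    # Weeks missing from the mapping are treated as zero usage.
    return any(usage_by_week.get(week, 0) > 0 for week in following)
```

Usage that resumes only after the follow-up window does not count here, which keeps the attribution to the outreach conservative.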

The business case

Activation rate had fallen 17 percentage points over five months. When I ran the correlation against paid conversion, the relationship was clear: the two moved together, with a correlation of 0.73. At the same time, the churn risk engine was flagging roughly a quarter of GP MRR (about 26%) as at risk. The cost of doing nothing had a number attached to it.

Key Signals · GP Business Health
−17pp: activation rate decline over 5 months
0.73: correlation between activation rate and paid conversion
~26%: of GP MRR showing active churn warning signals
Fig 6 — The signals that made the business case: activation decline, its correlation with conversion, and MRR at risk
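For readers who want to reproduce this kind of check on their own data, a Pearson correlation between two weekly series can be computed as below. This is a generic sketch: the source does not say which correlation measure or tooling was used, so Pearson and the function name `pearson` are assumptions.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / sqrt(var_x * var_y)
```

Passing weekly activation rates as `xs` and weekly paid conversion rates as `ys` gives a value between -1 and 1. A correlation on its own does not establish that activation drives conversion, which is part of why the holdout design matters.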

The conversation became straightforward. Instead of "we think this will reduce churn," it was a correlation chart and a percentage of MRR at risk on the table. That's what gets work like this on the roadmap.

The design principle: Every cohort is split — 70% receive the intervention, 30% are held out as a control. Without that holdout you can't tell whether the outreach worked or whether those users would have recovered anyway. The experiment structure is what turns an intervention programme into something you can actually learn from.
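One common way to implement a 70/30 split is deterministic hashing, so a user always lands in the same group no matter when the batch job runs. The source does not say how users were assigned, so treat this as a sketch of one option; the function names and the `experiment` label parameter are hypothetical.

```python
import hashlib


def assign_group(user_id: str, experiment: str, treatment_share: float = 0.70) -> str:
    """Deterministically place a user in 'experiment' or 'control'."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "experiment" if bucket < treatment_share else "control"


def lift(
    recovered_experiment: int,
    n_experiment: int,
    recovered_control: int,
    n_control: int,
) -> float:
    """Absolute lift: recovery rate with outreach minus natural recovery rate."""
    if n_experiment == 0 or n_control == 0:
        raise ValueError("both groups need at least one user")
    return recovered_experiment / n_experiment - recovered_control / n_control
```

Including the experiment name in the hash keeps assignments independent across experiments, so the same GPs are not always the ones held out. The lift figure is only a point estimate; a real readout would also need a significance test sized to the cohort.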

What I'd build next

The four-phase roadmap I set for Lyrebird has one phase remaining. Phases one through three — command centre, agentic WBR, and the intervention engine — are complete or in active use. Phase four is the forecasting layer: user-level churn probability scores updated weekly, 30-day MRR forecasts from leading indicators, and automated anomaly detection that fires before a human notices the chart moving.

The pattern I'd apply to any health or SaaS company at this stage is the same: build the semantic layer first so AI has something trustworthy to consume, automate the recurring reporting workflows to free up analyst time, then turn that freed time toward building the signal detection and intervention machinery. In that order. Companies that try to skip to the intervention engine without the semantic layer build something that confidently acts on bad data.

The infrastructure I'd use wouldn't change much either — Snowflake for the warehouse and semantic layer, dbt for the transformation layer with a proper CI/CD pipeline and testing strategy, GitHub Actions for orchestration, and whatever communication tools the customer-facing team is already using for the intervention delivery layer. The stack isn't exotic. What's hard is the design: knowing which signals matter, what the intervention playbook should contain, and how to close the loop so the system gets smarter over time.

Tanay Arora
Senior Data Engineer · Melbourne, AU