The dbt project structure that scales

The gap worth closing

The pipelines were running. Product data existed. But metric definitions lived in fifteen different BI workbooks, nobody agreed on what "activation" meant, and every business question went through the data team. A ticket, a wait, an answer that arrived after the decision had already been made.

The first job was to fix the foundation — a single semantic layer where every metric was defined once and owned. The second was to make the data reachable without an engineer in the loop.

Fig 1 — The four maturity stages: from reporting to automated intervention

I built the data platform at Lyrebird from scratch. Getting to stage two was the easy part — pipelines running, dashboards live, tickets getting closed. But it was clear that wasn't enough. For data to matter strategically it needs three things: to be accurate enough that people trust it, accessible enough that they don't need an engineer to get it, and oriented around the questions the business actually cares about. That's what was worth building.

Building the semantic layer first

Before any AI agent can query your data reliably, the data has to mean something. That sounds obvious, but most warehouses are a graveyard of inconsistently named tables, duplicated metric definitions, and logic that lives in fifteen different BI workbooks. An LLM pointed at that will hallucinate confidently and be wrong.

The fix is a semantic layer — a governed interface where every metric is defined once, in one place, with one owner. I built this using Snowflake semantic views: SQL views with explicit metric definitions, business-friendly column names, and row-level access controls that enforce what each role is allowed to see.
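To make "defined once, in one place, with one owner" concrete, here is a minimal Python sketch of the rule itself. It is not how Snowflake semantic views are declared. The `Metric` and `MetricRegistry` names, and the SQL expression and column name inside it, are hypothetical and exist only to illustrate the constraint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str   # business-friendly name exposed to consumers
    owner: str  # the single team accountable for the definition
    sql: str    # the one expression that computes the metric


class MetricRegistry:
    """Holds at most one definition per metric name (case-insensitive)."""

    def __init__(self) -> None:
        self._metrics: dict[str, Metric] = {}

    def register(self, metric: Metric) -> None:
        key = metric.name.strip().lower()
        if key in self._metrics:
            existing = self._metrics[key]
            raise ValueError(
                f"{metric.name!r} is already defined and owned by {existing.owner}"
            )
        self._metrics[key] = metric

    def get(self, name: str) -> Metric:
        return self._metrics[name.strip().lower()]


registry = MetricRegistry()
registry.register(
    Metric(
        name="activation_rate",
        owner="Product",
        # Hypothetical definition and column name, for illustration only.
        sql="COUNT_IF(saved_note_day_one) / COUNT(*)",
    )
)
```

A second attempt to register `activation_rate`, under any casing or owner, raises a `ValueError` instead of quietly creating a competing definition.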

Sources: Stripe, Product DB, Gong / CRM, BP EMR, HubSpot, Amplitude, Linear, Intercom
Transform: dbt transformation layer (FCT_WEEKLY_USER_BUSINESS_METRICS, FCT_GONG_CALLS, staging models, CI/CD pipeline)
Semantic: Snowflake Semantic Views (1 definition per metric, governed access controls, business-friendly names)
Consumers: BI / Dashboards (Finance / Product), WBR AI Agent (Claude via MCP)

Fig 2 — Semantic layer architecture: one governed interface for every consumer, including AI agents

The critical design decision was treating AI agents as first-class consumers of the semantic layer, not an afterthought. When I connected Claude to Snowflake via MCP, it queried the semantic views — not raw tables. That means it sees business-friendly metric names, enforced access controls, and pre-validated logic. The agent can't accidentally query a table it shouldn't see, and it can't invent a metric definition that contradicts the one Finance is using.

The insight: A semantic layer built only for BI tools will fail when you add AI. Build it for the most demanding consumer first — an LLM that will query it at 3am without a human in the loop — and the BI use case becomes easy.


Agentic analytics: the WBR that runs itself

The first thing I automated once the semantic layer was in place was the Weekly Business Review. Every Monday, a senior leadership team at Lyrebird would spend hours compiling metrics across six business sections — product, marketing, SMB sales, enterprise, conversion funnel, and north star. The output was a PowerPoint. The insight arrived 48 hours after the week ended.

I replaced it with an AI agent that queries the semantic layer, calculates 13-week trend tables, scores every metric against its vPlan target with a red/amber/green (RAG) status, and generates a shareable Claude artifact and a formatted Word document. It runs automatically, triggered on Monday morning.
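The scoring step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the agent's actual logic: the 90% amber threshold, the higher-is-better convention, and the function names `rag_score` and `trend_table` are all assumptions made for the example.

```python
def rag_score(actual: float, target: float, amber_floor: float = 0.90) -> str:
    """Score a higher-is-better metric against its vPlan target.

    amber_floor is an assumed threshold: at or above target is green,
    at least 90% of target is amber, anything lower is red.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    attainment = actual / target
    if attainment >= 1.0:
        return "green"
    if attainment >= amber_floor:
        return "amber"
    return "red"


def trend_table(weekly_values: list[float], target: float, window: int = 13) -> list[dict]:
    """Build one row per week for the most recent `window` weeks."""
    recent = weekly_values[-window:]
    rows = []
    for i, value in enumerate(recent):
        previous = recent[i - 1] if i > 0 else None
        rows.append(
            {
                "week_index": i + 1,
                "value": value,
                # Week-over-week change; undefined for the first row.
                "wow_change": None if previous is None else value - previous,
                "vplan_delta": value - target,
                "rag": rag_score(value, target),
            }
        )
    return rows
```

Keeping the thresholds in one function is what makes the weekly output consistent: every metric in every section is scored by the same rule.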

Pipeline stages: Monday 07:00 AEST trigger fires → Snowflake query (semantic views, 13-week window) → Claude agent (RAG scoring, trend analysis, vPlan delta) → Claude artifact (interactive, shareable) and Word document (leadership-ready)
Timeline labels: Cron · ~40s · ~90s · Delivered

Fig 3 — The autonomous WBR pipeline: from cron trigger to shareable outputs in under 3 minutes

The human still writes the narrative — decisions, highlights, lowlights. That's appropriate. But the 3 hours of data compilation that preceded that thinking is now zero. The agent is also more consistent than a human: it never misses a metric, never miscalculates a vPlan score, and never formats a table differently from the previous week.

The intervention engine

The semantic layer and the automated WBR freed up enough time and trust in the data to tackle the problems that actually mattered commercially. At Lyrebird there were two: activation rates that were stalling out, and paying GPs going quiet with nobody watching.

Problem 1 — GP activation wasn't converting

GP activation rates had been declining, from 61% in January 2026 to 44% by May. The core issue: 43–55% of new GPs never got a note saved to Best Practice (BP) on day one, which meant they never reached the "dopamine hit" moment that drives retention. No note saved to BP, no value felt, no reason to come back.

The solution was a structured day-one onboarding flow with two diverging paths based on whether the GP reached that value moment or not.

Entry (cohort day 1): user signs up and completes the first listening session.
Path 1, value moment reached: note saved to BP successfully → "dopamine hit" notification sent → to-do list for next steps shown → social proofing content surfaced.
Path 2, no value moment: session completed but note not saved to BP → falls into the end-of-day (EOD) Day 1 comms flow → research prompt triggered: "Why wasn't your note saved?"
End of Day 1 communication branches, by integration status: 2.1A email, integrated (survey, trigger integration follow-up); 2 email, not integrated (nudge toward integrated recordings); 2.1B call-up (high-touch outreach, feeds into Day 2 re-engagement).
Open questions: Why are notes not being saved (the core blocker to reaching the "dopamine hit" path)? What integration gaps and recording issues affect non-integrated users? Can social proofing / to-do list nudges help close the activation gap for Path 1 users?
Callout: 43–55% non-activated (never saved a note).

Fig 4 — GP day-one onboarding flow: two paths based on whether the "dopamine hit" value moment is reached

The onboarding flow above defines the logic — who gets what, and when. What follows is the architecture that makes it run. Snowflake does the qualification, HubSpot executes the comms, and the outcome data flows back into Snowflake to close the loop.

The Engine Behind the Flow
Snowflake: batch job qualifies users on business criteria and splits them into experiment and control groups.
Sync to HubSpot: user segments pushed, with experiment / control flags included.
HubSpot Workflows: triggers comms per path (Path 1: dopamine hit flow; Path 2: EOD rescue flow; control: no comms).
Back to Snowflake: HubSpot outcomes re-ingested, experiment lift measured.
Legend: the experiment group receives comms; the control group receives no comms and measures natural recovery.

Fig 4a — Activation engine architecture: Snowflake qualifies and splits users, HubSpot executes, outcomes flow back for measurement

The 43–55% of GPs who never saved a note to BP on day one were the problem population. For them, the flow triggered a research prompt ("Why wasn't your note saved?") and an end-of-day comms sequence segmented by integration status — integrated GPs got a follow-up survey, non-integrated got a nudge toward BP setup, and the highest-touch segment got a direct call routed into the day-two re-engagement flow.
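The branching described above can be sketched as a small routing function. The real execution lives in HubSpot workflows, so this is illustrative only. The `high_touch` flag is hypothetical, because the source does not say how the call-up segment is chosen. Checking it before integration status is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class DayOneUser:
    note_saved_to_bp: bool    # did the GP reach the value moment?
    integrated: bool          # is the Best Practice integration set up?
    high_touch: bool = False  # hypothetical flag for the call-up segment


def day_one_actions(user: DayOneUser) -> list[str]:
    """Return the day-one actions for a GP who completed a first session."""
    if user.note_saved_to_bp:
        # Path 1: value moment reached.
        return [
            "dopamine_hit_notification",
            "next_steps_todo_list",
            "social_proof_content",
        ]

    # Path 2: session done, note not saved.
    actions = ["research_prompt_why_note_not_saved"]
    if user.high_touch:
        actions.append("direct_call_into_day_two_reengagement")
    elif user.integrated:
        actions.append("email_follow_up_survey")
    else:
        actions.append("email_nudge_toward_bp_setup")
    return actions
```

Every Path 2 GP gets the research prompt plus exactly one end-of-day branch, which keeps the segments mutually exclusive when the outcomes are measured later.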

Problem 2 — Paying GPs going quiet with no one watching

The second problem was harder to solve with comms alone. These were already-paying GPs — some with months of tenure — who had simply stopped using the product. No single trigger, no obvious day-one failure. Just a slow drift toward zero consults over several weeks, and a 5.4-week average window before they cancelled.

With 674 GPs showing warning signals across the Critical, High, and Medium risk tiers, and each percentage point of activation worth roughly 0.71 points of paid conversion, relying on manual spot-checks was too costly. So I built a weekly churn risk engine and wired it directly to the CS team's workflow.

Snowflake: weekly refresh, 600+ paying GPs.
Risk Engine tiers: Critical (0 WAU for 4+ weeks), High (0 WAU for 2–3 weeks), Medium (declining), Watch (early signals).
Weekly Skill: user-level risk list with MRR exposure, delivered to the CS lead.
CS Lead: emails and calls, records the outcome, signal goes to HubSpot.
HubSpot → Snowflake: HubSpot data flows back into Snowflake, merged with product usage data to track whether outreach moved the needle.
At-risk counts: Critical 145 users (highest MRR exposure), High 217 users, Medium 312 users. 674 total at-risk, 5.4-week average intervention window.

Fig 5 — GP churn risk engine: weekly scoring feeds a CS skill, outreach signals flow back through HubSpot into Snowflake for effectiveness tracking

Every Monday after the Snowflake refresh, the engine classifies every non-enterprise paying GP into one of four risk tiers based on consecutive weeks of zero usage and MRR exposure. The output is a structured weekly skill delivered directly to the CS team lead: a prioritised list with user context, tenure, and risk tier. The lead then works through it, sending emails, making calls, and recording the outcome signals in HubSpot.
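A rough Python sketch of the tiering follows. The Critical and High rules come from the tier definitions in Fig 5 (zero usage for 4+ weeks and for 2–3 weeks). The rules used here for "declining" and "early signals" are assumptions, since the source does not define them, and MRR is used only to order the list within a tier. Function and field names are hypothetical.

```python
TIER_ORDER = {"critical": 0, "high": 1, "medium": 2, "watch": 3}


def classify_risk(weekly_usage: list[int]) -> str:
    """Classify one GP from weekly usage counts, oldest week first."""
    # Count consecutive zero-usage weeks, starting from the most recent week.
    zero_streak = 0
    for count in reversed(weekly_usage):
        if count > 0:
            break
        zero_streak += 1

    if zero_streak >= 4:
        return "critical"
    if zero_streak >= 2:
        return "high"
    # Assumed rule for "declining": three strictly falling weeks in a row.
    if len(weekly_usage) >= 3 and weekly_usage[-3] > weekly_usage[-2] > weekly_usage[-1]:
        return "medium"
    # Assumed rule for "early signals": one zero week, or one week-over-week drop.
    if zero_streak == 1 or (len(weekly_usage) >= 2 and weekly_usage[-1] < weekly_usage[-2]):
        return "watch"
    return "healthy"


def weekly_risk_list(gps: list[dict]) -> list[dict]:
    """Return at-risk GPs, most severe tier first, highest MRR first within a tier."""
    rows = []
    for gp in gps:
        tier = classify_risk(gp["weekly_usage"])
        if tier in TIER_ORDER:  # healthy GPs are left off the list
            rows.append({"gp_id": gp["gp_id"], "tier": tier, "mrr": gp["mrr"]})
    return sorted(rows, key=lambda r: (TIER_ORDER[r["tier"]], -r["mrr"]))
```

Sorting by tier and then by MRR means the CS lead starts each week with the accounts where silence is both longest and most expensive.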

The part that closes the loop: HubSpot data flows back into Snowflake, where it's merged with product usage data. That means we can see whether a CS outreach actually moved the needle — did the GP who got a call in week one start using the product again in weeks two and three? The signal from the CS team's work becomes data in the same system that generated the risk score.

The business case

Activation rate had fallen 17 percentage points over five months. When I ran the correlation against paid conversion, the relationship was clear: the two moved together, with a correlation of 0.73. At the same time, the churn risk engine was flagging roughly a quarter of GP MRR as at risk. The cost of doing nothing had a number attached to it.

Key Signals · GP Business Health
−17pp: activation rate decline over 5 months
0.73: correlation between activation rate and paid conversion
~26%: of GP MRR showing active churn warning signals

Fig 6 — The signals that made the business case: activation decline, its correlation with conversion, and MRR at risk

The conversation became straightforward. Instead of "we think this will reduce churn," it was a correlation chart and a percentage of MRR on the table. That's what gets it on the roadmap.

The design principle: Every cohort is split — 70% receive the intervention, 30% are held out as a control. Without that holdout you can't tell whether the outreach worked or whether those users would have recovered anyway. The experiment structure is what turns an intervention programme into something you can actually learn from.
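One way to implement a stable 70/30 split is to hash the user ID, so a GP lands in the same group on every batch run. This is a sketch under assumptions: the source says the split happens in a Snowflake batch job but not how, and the function names, the experiment-name salt, and the simple difference-in-rates lift are all illustrative choices.

```python
import hashlib


def assign_group(user_id: str, experiment: str, treatment_share: float = 0.70) -> str:
    """Deterministically assign a user to 'experiment' or 'control'."""
    # Salting with the experiment name gives each experiment an independent split.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "experiment" if bucket < treatment_share else "control"


def lift(
    recovered_experiment: int,
    n_experiment: int,
    recovered_control: int,
    n_control: int,
) -> float:
    """Absolute lift: experiment recovery rate minus control recovery rate."""
    if n_experiment <= 0 or n_control <= 0:
        raise ValueError("both groups must be non-empty")
    return recovered_experiment / n_experiment - recovered_control / n_control
```

Because the control group captures natural recovery, the lift is the part of the improvement that can be attributed to the outreach. A real analysis would also want a confidence interval before acting on a small difference.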
