SQL for Product Managers and Business Analysts
The 4 queries that cover 90% of what PMs and BAs actually need — feature adoption, funnel drop-off, retention cohorts, and revenue attribution. Written in plain English first. SQL second. With AI doing the heavy lifting.
Why PMs and BAs should learn SQL (and what to skip)
You don't need to become a data engineer. You need to answer your own questions without opening a ticket. That's a different bar — and a much lower one.
The goal is not to write perfect SQL. The goal is to write correct enough SQL to validate a hypothesis, answer a leadership question, or build the context you need before handing off to engineering. Claude writes the first draft. You read it, verify one number by hand, and move on.
What you need to learn: SELECT, WHERE, GROUP BY, JOIN, aggregate functions, and CTEs. That's it. Everything else you ask Claude. The 4 queries below demonstrate all of these in realistic PM/BA contexts.
Before you write queries: read your schema
SQL works on your actual table names and column names — not generic ones. Before writing any query, run this in your database dashboard (Supabase, Metabase, DBeaver, etc.):
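The exact snippet depends on your database. For Postgres-compatible databases (Supabase included), a common version queries information_schema; adjust `table_schema` if your tables don't live in `public`:

-- List every table, column, and data type in the public schema
SELECT table_name, column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'public'
ORDER BY table_name, ordinal_position;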
Paste the output into Claude: "Here is my schema. Now write me a query that [your question]." This one habit will make every query 10x more accurate.
The 4 queries
Who is using this feature?
Feature adoption · Usage segmentation · Rollout tracking
The situation
You shipped a feature 3 weeks ago. Leadership asks: "Who's using it?" You need a count by cohort (new vs. returning users, plan tier, signup date) — not a screenshot from Amplitude.
Plain English (give this to Claude first)
"In the last 30 days, how many distinct users clicked feature X, broken down by plan tier? Also show total clicks and the first and last date each tier used it."
SQL
SELECT
  u.plan_tier,
  COUNT(DISTINCT e.user_id) AS users_who_used_feature,
  COUNT(e.id) AS total_events,
  MIN(e.created_at)::date AS first_used,
  MAX(e.created_at)::date AS last_used
FROM events e
JOIN users u ON u.id = e.user_id
WHERE e.event_name = 'feature_x_clicked'
  AND e.created_at >= NOW() - INTERVAL '30 days'
GROUP BY u.plan_tier
ORDER BY users_who_used_feature DESC;
What this teaches you
SELECT, FROM, JOIN, WHERE, GROUP BY, COUNT(DISTINCT). These six building blocks appear in almost every PM query you'll ever write.
Where are users dropping off?
Funnel analysis · Onboarding drop-off · Checkout abandonment
The situation
You have a 5-step onboarding flow. 2,400 users start it. 310 finish. Where exactly are they leaving — and is it getting better or worse over time?
Plain English (give this to Claude first)
"Of everyone who hit onboarding step 1 in the last 30 days, how many reached each later step, how many completed, and what's the overall completion rate?"
SQL
WITH funnel AS (
SELECT
user_id,
MAX(CASE WHEN event_name = 'onboarding_step_1' THEN 1 ELSE 0 END) AS step_1,
MAX(CASE WHEN event_name = 'onboarding_step_2' THEN 1 ELSE 0 END) AS step_2,
MAX(CASE WHEN event_name = 'onboarding_step_3' THEN 1 ELSE 0 END) AS step_3,
MAX(CASE WHEN event_name = 'onboarding_step_4' THEN 1 ELSE 0 END) AS step_4,
MAX(CASE WHEN event_name = 'onboarding_completed' THEN 1 ELSE 0 END) AS completed
FROM events
WHERE created_at >= NOW() - INTERVAL '30 days'
GROUP BY user_id
)
SELECT
SUM(step_1) AS entered_step_1,
SUM(step_2) AS reached_step_2,
SUM(step_3) AS reached_step_3,
SUM(step_4) AS reached_step_4,
SUM(completed) AS completed_onboarding,
ROUND(SUM(completed)::numeric / NULLIF(SUM(step_1), 0) * 100, 1) AS completion_rate_pct
FROM funnel;
What this teaches you
CTEs (WITH), CASE WHEN, conditional aggregation, NULLIF to avoid divide-by-zero. These are the patterns that turn raw events into funnel charts.
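If NULLIF is new, here it is in isolation: NULLIF(x, 0) returns NULL when x is 0, and dividing by NULL yields NULL instead of an error. Using the numbers from the scenario above:

-- Without NULLIF the first line would raise divide-by-zero; with it, you get NULL
SELECT ROUND(310::numeric / NULLIF(0, 0) * 100, 1);     -- NULL, not a crash
SELECT ROUND(310::numeric / NULLIF(2400, 0) * 100, 1);  -- 12.9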
Are users coming back?
Retention analysis · Cohort tables · Churn prediction
The situation
Your sign-up numbers look healthy. But are those users still around in week 2? Week 4? A retention cohort shows the real picture — and it's almost always harder to hear than the signup chart.
Plain English (give this to Claude first)
"Group users into cohorts by signup week. For each cohort in the last 12 weeks, how many came back (had a session) exactly 1 week later, and how many 4 weeks later?"
SQL
WITH cohorts AS (
SELECT
user_id,
DATE_TRUNC('week', MIN(created_at)) AS cohort_week
FROM events
WHERE event_name = 'account_created'
GROUP BY user_id
),
activity AS (
SELECT
e.user_id,
DATE_TRUNC('week', e.created_at) AS activity_week
FROM events e
WHERE e.event_name = 'session_start'
GROUP BY e.user_id, DATE_TRUNC('week', e.created_at)
)
SELECT
c.cohort_week,
COUNT(DISTINCT c.user_id) AS cohort_size,
COUNT(DISTINCT CASE WHEN a.activity_week = c.cohort_week + INTERVAL '1 week'
THEN a.user_id END) AS week_1_retained,
COUNT(DISTINCT CASE WHEN a.activity_week = c.cohort_week + INTERVAL '4 weeks'
THEN a.user_id END) AS week_4_retained
FROM cohorts c
LEFT JOIN activity a ON a.user_id = c.user_id
WHERE c.cohort_week >= NOW() - INTERVAL '12 weeks'
GROUP BY c.cohort_week
ORDER BY c.cohort_week;
What this teaches you
DATE_TRUNC for time-based grouping, LEFT JOIN to include users with no activity, cohort analysis pattern. This query is the foundation of every retention deck you'll ever build.
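To see what DATE_TRUNC actually does, run it on a literal. In Postgres, 'week' truncates to the preceding Monday:

SELECT DATE_TRUNC('week', TIMESTAMP '2024-03-14 10:30:00');
-- 2024-03-11 00:00:00: every timestamp in the same ISO week collapses to this value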
What's driving revenue?
Revenue attribution · Plan upgrades · Expansion MRR
The situation
Finance wants to know which features correlate with upgrades. You want to know if the users who hit your key activation event are 3x more likely to pay — and you want to show the data, not guess.
Plain English (give this to Claude first)
"Take everyone who signed up in the last 90 days. Split them by whether they hit the key activation event. For each group, what's the paid conversion rate and average lifetime value?"
SQL
SELECT
CASE
WHEN activated.user_id IS NOT NULL THEN 'Activated'
ELSE 'Not Activated'
END AS activation_status,
COUNT(DISTINCT u.id) AS users,
COUNT(DISTINCT CASE WHEN u.plan_tier = 'paid'
THEN u.id END) AS paid_users,
ROUND(
COUNT(DISTINCT CASE WHEN u.plan_tier = 'paid' THEN u.id END)::numeric
/ NULLIF(COUNT(DISTINCT u.id), 0) * 100, 1
) AS conversion_rate_pct,
ROUND(AVG(u.lifetime_value_cents) / 100.0, 2) AS avg_ltv_usd
FROM users u
LEFT JOIN (
SELECT DISTINCT user_id
FROM events
WHERE event_name = 'key_activation_event'
AND created_at >= NOW() - INTERVAL '90 days'
) activated ON activated.user_id = u.id
WHERE u.created_at >= NOW() - INTERVAL '90 days'
GROUP BY activation_status;
What this teaches you
LEFT JOIN with a subquery, CASE WHEN for segmentation, AVG with arithmetic for unit conversion. This is the activation → revenue correlation every growth PM needs.
The 5-step workflow for writing SQL with AI
Don't start with the SQL. Start with the question. This workflow works for every query you'll ever write.
Write the question in plain English
Start here: "I want to know which plan tier has the highest feature adoption rate in the last 30 days." This is your spec. Don't touch SQL yet.
Describe your schema to Claude
Paste your table definitions. "I have a `users` table (id, plan_tier, created_at) and an `events` table (id, user_id, event_name, created_at). Write a query for my question above."
Ask Claude to explain before you run
"Before I run this, explain what each clause does in plain English. I want to understand it, not just copy it." This catches errors and teaches you the pattern.
Run it against a date-limited sample first
Add `LIMIT 100` and a tight date range before running against your full table. A runaway query on 10M rows can bog down the database for everyone else.
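For example, a throwaway exploratory pass on the hypothetical `events` table from earlier might look like this before you commit to the full 30-day run:

-- Exploratory pass: one day of data, hard row cap
SELECT user_id, event_name, created_at
FROM events
WHERE created_at >= NOW() - INTERVAL '1 day'
LIMIT 100;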
Verify one number by hand
Pick one row in the result. Count the raw events manually (or spot-check against a known source). One verified data point is worth 10 unverified ones.
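As a sketch: if the feature-adoption query reported 412 users on a `pro` tier, re-derive just that one cell with the narrowest possible query. (Both the number and the tier name here are stand-ins for your own result.)

-- Re-derive a single cell from query 1: distinct 'pro' users in the last 30 days
SELECT COUNT(DISTINCT e.user_id)
FROM events e
JOIN users u ON u.id = e.user_id
WHERE u.plan_tier = 'pro'  -- hypothetical tier name; use one from your own result
  AND e.event_name = 'feature_x_clicked'
  AND e.created_at >= NOW() - INTERVAL '30 days';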
3 mistakes that waste hours
Querying without a LIMIT
Always add LIMIT 100 when exploring. A full table scan on 50M rows will time out, burn compute credits, and frustrate your data team. Add the LIMIT first, verify the shape of the data, then remove it for the real run.
Using SELECT * in production queries
SELECT * pulls every column — including blobs, PII, and columns you don't need. Name your columns explicitly. It's better documentation and significantly faster on wide tables.
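A minimal before/after, using the hypothetical `users` table from earlier:

-- Instead of this:
SELECT * FROM users WHERE created_at >= NOW() - INTERVAL '30 days';

-- name exactly what you need:
SELECT id, plan_tier, created_at
FROM users
WHERE created_at >= NOW() - INTERVAL '30 days';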
Trusting the first number without a sanity check
AI-written SQL is usually correct. Usually. Before putting a number in a deck or a message to leadership, verify one row by hand against a source of truth (your BI tool, a raw export, a known data point). One verified number is worth 10 unverified ones.
The one skill that makes everything else faster
Learn to read a database schema like a PM reads a PRD — not to memorize it, but to understand the nouns and verbs of the system. Tables are nouns (users, orders, events). Joins are relationships between them. Once you see the schema as a data model instead of a technical artifact, every question you have maps naturally to a query structure.
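Here's that reading applied to the hypothetical two-table schema used throughout this piece. The foreign key is the relationship connecting the nouns:

-- Noun: users
CREATE TABLE users (
  id BIGINT PRIMARY KEY,
  plan_tier TEXT,
  created_at TIMESTAMPTZ
);

-- Noun: events; the REFERENCES line is the relationship ("a user performs events")
CREATE TABLE events (
  id BIGINT PRIMARY KEY,
  user_id BIGINT REFERENCES users(id),
  event_name TEXT,
  created_at TIMESTAMPTZ
);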
This skill — reading schemas, understanding data models, knowing what's possible — is what separates PMs who wait for dashboards from PMs who build them. Zero to Ship Module 12 covers the full data product stack: schema design, query patterns, and shipping interactive data apps to real URLs.
Ready to build the full data product?
Module 12 (Data Products) takes you from SQL queries to a deployed interactive data app — charts, filters, drill-downs, and a live URL you can share with stakeholders.