Ecommerce Retention Analytics: GA4 + BigQuery Case Study

How a warehouse-first GA4 + BigQuery architecture improved ecommerce retention analytics, LTV, and channel profitability through cohort and user-level modelling.

Mujaheed Abdul Wahab

2/12/2026 · 3 min read

Most ecommerce brands believe retention is a marketing problem.

They launch email campaigns.

They test discounts.

They optimize creatives.

Yet repeat purchase rate barely moves.

The reality is this:

Retention is rarely a campaign issue. It is usually a measurement and modelling issue.

Without user-level modelling, identity stitching, and structured cohort analysis, brands optimize for first purchase conversions while remaining blind to long-term customer value.

This case study breaks down how AnalyticsFlow rebuilt the analytics foundation for a growing ecommerce brand and turned retention from a vague KPI into a measurable growth lever.

The Business Context

The brand was scaling aggressively across paid channels.

• Revenue growing steadily

• Strong top-of-funnel performance

• Healthy reported ROAS

• Increasing customer acquisition costs

However:

• Repeat purchase rate was stagnant

• Customer lifetime value was unclear

• Leadership lacked visibility into which channels drove high-value customers

The executive team began asking the right question:

Are we scaling profitably, or just buying revenue?

The Root Cause: Measurement Architecture Gaps

Our audit revealed structural issues across the stack.

1. Fragmented Tracking

• Web and app tracking were not unified

• Inconsistent event naming conventions

• No standardized ecommerce schema

• Data layer inconsistencies

2. Identity Breakdown

• No enforced user_id strategy

• Cross-device behavior fragmented

• Returning users were misclassified as new users

This made retention analysis unreliable.

3. Over-Reliance on GA4 Interface Reporting

• Session-level reporting dominated analysis

• No customer-level modelling

• No cohort structure

• No lifetime value calculations

The organization was optimizing on surface metrics.

Without warehouse modelling, retention insights were guesswork.

We rebuilt the analytics system across four layers.

Layer 1: Unified Event Architecture

We designed a standardized event schema across the web and app.

Core ecommerce events were normalized:

• view_item

• add_to_cart

• begin_checkout

• purchase

Additional lifecycle events were defined:

• first_purchase

• repeat_purchase

• subscription_renewal

Every event followed strict naming, parameter governance, and documentation standards.

This created a clean, scalable measurement foundation.
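
To make "parameter governance" concrete, here is a minimal sketch of a schema check that could run in QA or in a server-side pipeline. The required-parameter sets and the `validate_event` helper are illustrative assumptions, not the schema used in this engagement.

```python
# Hypothetical schema registry: event name -> required parameter names.
# The parameter sets are illustrative assumptions, not the client's schema.
EVENT_SCHEMA = {
    "view_item": {"item_id"},
    "add_to_cart": {"item_id", "quantity"},
    "begin_checkout": {"value", "currency"},
    "purchase": {"transaction_id", "value", "currency"},
    "first_purchase": {"transaction_id", "value", "currency"},
    "repeat_purchase": {"transaction_id", "value", "currency"},
    "subscription_renewal": {"transaction_id", "value", "currency"},
}


def validate_event(name: str, params: dict) -> list[str]:
    """Return a list of problems; an empty list means the event is valid."""
    # Names are matched exactly, so "Purchase" and "purchase" are different.
    if name not in EVENT_SCHEMA:
        return [f"unknown event name: {name}"]
    missing = sorted(EVENT_SCHEMA[name] - set(params))
    return [f"missing parameter: {p}" for p in missing]
```

Calling `validate_event("add_to_cart", {"item_id": "SKU1"})` returns `["missing parameter: quantity"]`, which is the kind of failure a naming and parameter standard is meant to surface before the data reaches the warehouse.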

Layer 3: BigQuery Data Modelling

This is where the transformation happened.

Instead of relying on GA4 dashboards, we used the raw GA4 export in BigQuery and built structured data models designed for retention analysis.

Customer Fact Table

Included:

• user_id

• first_purchase_date

• last_purchase_date

• total_orders

• total_revenue

• acquisition_channel

• days_since_last_purchase

This allowed user-level segmentation and lifecycle tracking.
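
The sketch below shows the aggregation logic behind a customer fact table of this shape. In a warehouse this would normally be written in SQL; Python is used here only to make the logic easy to follow. The input field names (`order_date`, `revenue`, `channel`) and the rule that the acquisition channel is the channel of the first order are assumptions for illustration.

```python
from datetime import date


def build_customer_facts(orders: list[dict], as_of: date) -> dict[str, dict]:
    """Collapse order rows into one row per user_id."""
    facts: dict[str, dict] = {}
    # Process orders chronologically so the first row seen is the first purchase.
    for o in sorted(orders, key=lambda o: o["order_date"]):
        uid = o["user_id"]
        if uid not in facts:
            facts[uid] = {
                "user_id": uid,
                "first_purchase_date": o["order_date"],
                # Assumption: acquisition channel = channel of the first order.
                "acquisition_channel": o["channel"],
                "total_orders": 0,
                "total_revenue": 0.0,
            }
        f = facts[uid]
        f["last_purchase_date"] = o["order_date"]
        f["total_orders"] += 1
        f["total_revenue"] += o["revenue"]

    # Recency is measured against an explicit reference date.
    for f in facts.values():
        f["days_since_last_purchase"] = (as_of - f["last_purchase_date"]).days
    return facts
```

Passing `as_of` explicitly, rather than reading today's date inside the function, keeps the table reproducible when it is rebuilt for a past reporting date.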

Order Fact Table

Included:

• order_id

• user_id

• revenue

• product breakdown

• timestamp

This enabled product-level retention analysis.

Cohort Retention Model

We constructed a cohort table structured by:

• cohort_month

• retention_month_index

• repeat_purchase_rate

• revenue_per_user

This allowed visualization of retention curves over time.
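
As a rough sketch of how such a cohort table can be derived from order rows, the function below uses one possible set of definitions, all of which are assumptions: `cohort_month` is the month of a user's first order, `retention_month_index` is the number of calendar months since that month, `repeat_purchase_rate` is the share of the cohort placing a non-first order in that month, and `revenue_per_user` divides all revenue in that month by cohort size.

```python
from collections import defaultdict
from datetime import date


def month_number(d: date) -> int:
    """Map a date to a running month count so months can be subtracted."""
    return d.year * 12 + d.month


def build_cohort_table(orders: list[dict]) -> list[dict]:
    first_order = {}                  # user_id -> date of first order
    cohort_size = defaultdict(int)    # cohort -> number of users
    repeaters = defaultdict(set)      # (cohort, index) -> users with a repeat order
    revenue = defaultdict(float)      # (cohort, index) -> revenue from all orders

    for o in sorted(orders, key=lambda o: o["order_date"]):
        uid = o["user_id"]
        is_first = uid not in first_order
        if is_first:
            first_order[uid] = o["order_date"]
        first = first_order[uid]
        cohort = (first.year, first.month)
        if is_first:
            cohort_size[cohort] += 1
        index = month_number(o["order_date"]) - month_number(first)
        revenue[(cohort, index)] += o["revenue"]
        if not is_first:
            repeaters[(cohort, index)].add(uid)

    rows = []
    for cohort, index in sorted(revenue):
        size = cohort_size[cohort]
        rows.append({
            "cohort_month": f"{cohort[0]}-{cohort[1]:02d}",
            "retention_month_index": index,
            "repeat_purchase_rate": len(repeaters[(cohort, index)]) / size,
            "revenue_per_user": revenue[(cohort, index)] / size,
        })
    return rows
```

Each output row is one point on a retention curve: plotting `repeat_purchase_rate` against `retention_month_index` for each `cohort_month` gives the familiar cohort chart. Only cohort and month combinations that contain at least one order produce a row.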

LTV by Acquisition Source Model

We calculated:

• Revenue per user by channel

• Repeat rate by channel

• Time to second purchase

• 30-, 60-, and 90-day LTV

This reframed marketing performance evaluation entirely.
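
The metrics above can be sketched as a single pass over each user's order history. This is a simplified illustration under stated assumptions: a user is attributed to the channel of their first order, windows are inclusive and measured in days from the first order, and time to second purchase is summarized with a median. The field names are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def ltv_by_channel(orders: list[dict], windows: tuple = (30, 60, 90)) -> dict:
    # Group each user's orders in chronological order.
    by_user = defaultdict(list)
    for o in sorted(orders, key=lambda o: o["order_date"]):
        by_user[o["user_id"]].append(o)

    agg = defaultdict(
        lambda: {"users": 0, "repeaters": 0, "gaps": [], "rev": defaultdict(float)}
    )
    for user_orders in by_user.values():
        first = user_orders[0]
        a = agg[first["channel"]]  # assumption: first-order channel attribution
        a["users"] += 1
        if len(user_orders) > 1:
            a["repeaters"] += 1
            gap = user_orders[1]["order_date"] - first["order_date"]
            a["gaps"].append(gap.days)
        for o in user_orders:
            age_days = (o["order_date"] - first["order_date"]).days
            for w in windows:
                if age_days <= w:
                    a["rev"][w] += o["revenue"]

    report = {}
    for channel, a in agg.items():
        report[channel] = {
            "repeat_rate": a["repeaters"] / a["users"],
            "median_days_to_second_purchase": median(a["gaps"]) if a["gaps"] else None,
            **{f"ltv_{w}d": a["rev"][w] / a["users"] for w in windows},
        }
    return report
```

One design caveat: users acquired fewer than 90 days before the data cut-off have not had the full window to repurchase, so a production model would typically restrict each window to users old enough to have completed it.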

The Insights That Changed the Growth Strategy

Once the modelling layer was live, insights became actionable.

1. Paid Social Was Driving Low-Retention Users

A large percentage of paid social customers never made a second purchase.

ROAS looked acceptable on first purchase metrics, but long-term value was weak.

2. Organic Search Delivered 2x Higher LTV

Organic customers:

• Purchased more frequently

• Returned faster

• Generated higher cumulative revenue

This shifted investment toward SEO and high-intent acquisition.

3. The 45-Day Retention Window

Repeat purchases were heavily concentrated within the first 45 days.

After that, the probability dropped sharply.

This insight reshaped lifecycle marketing timing.
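
A window like this can be located by looking at the cumulative share of repeat purchases at several checkpoints. The sketch below is a simplified illustration that takes the number of days between each repeating customer's first and second order; using the second order as the proxy for "repeat purchase" and the checkpoint values are both assumptions.

```python
def cumulative_repeat_share(gaps_in_days, checkpoints=(15, 30, 45, 60, 90)):
    """Share of second purchases that happened within each checkpoint (inclusive)."""
    total = len(gaps_in_days)
    if total == 0:
        return {}
    return {c: sum(g <= c for g in gaps_in_days) / total for c in checkpoints}


# Hypothetical gaps for eight repeat customers (not client data).
example_gaps = [5, 12, 20, 33, 41, 44, 70, 120]
curve = cumulative_repeat_share(example_gaps)
```

When the curve flattens after a checkpoint, most customers who are going to return have already done so, which is the signal to concentrate lifecycle messaging before that point.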

4. Product-Level Retention Triggers

A small subset of SKUs consistently triggered second purchases.

These became focal products in bundles and upsell flows.
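
One simple way to surface such SKUs is to compare second-purchase rates by the products in a customer's first order. The sketch below assumes each order row carries a `skus` list and a sortable `order_date`; the `min_buyers` threshold is a hypothetical guard against reading too much into SKUs with very few buyers.

```python
from collections import defaultdict


def second_purchase_rate_by_first_sku(orders, min_buyers=1):
    """For each SKU in a first order, the share of its buyers who ordered again."""
    # order_date can be any sortable value, such as a date or an ISO date string.
    by_user = defaultdict(list)
    for o in sorted(orders, key=lambda o: o["order_date"]):
        by_user[o["user_id"]].append(o)

    buyers = defaultdict(int)    # SKU -> users whose first order contained it
    returned = defaultdict(int)  # SKU -> those users who placed a second order
    for user_orders in by_user.values():
        came_back = len(user_orders) > 1
        # set() avoids double counting a SKU listed twice in one order.
        for sku in set(user_orders[0]["skus"]):
            buyers[sku] += 1
            returned[sku] += came_back
    return {sku: returned[sku] / n for sku, n in buyers.items() if n >= min_buyers}
```

The output ranks first-order SKUs by how often their buyers come back. It shows association rather than proof that the product caused the return, so bundle and upsell changes based on it are best validated with a test.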

Business Impact

With retention modelling integrated into decision-making:

• Budget allocation shifted toward high-LTV channels

• Lifecycle campaigns were redesigned around the 45-day window

• Product bundling strategy evolved

• Leadership moved from ROAS-based decisions to profitability-based decisions

Retention improved, but more importantly, growth became more predictable and capital-efficient.

Introduction: The Real Retention Problem
The AnalyticsFlow Solution: A Warehouse-First Retention Framework
Layer 2: Identity and Data Integrity

Retention modelling depends on reliable identity stitching.

We implemented:

• Consistent user_id assignment

• Cross-device identity enforcement

• Login-based stitching strategy

• Server-side tracking for reliability

• Event QA validation framework

Now, users were tracked as users, not sessions.
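
Login-based stitching can be sketched as a two-pass backfill. The GA4 BigQuery export carries both `user_pseudo_id` (device or browser scoped) and `user_id` (set after login); the `stitch_identities` helper and the `resolved_id` field below are hypothetical names, and the rule that the latest login on a device wins is a simplifying assumption.

```python
def stitch_identities(events: list[dict]) -> list[dict]:
    """Backfill a user-level identifier onto events recorded before login."""
    # Pass 1: learn which device identifier belongs to which logged-in user.
    # Simplification: if several users log in on one device, the last one wins.
    device_to_user = {}
    for e in events:
        if e.get("user_id"):
            device_to_user[e["user_pseudo_id"]] = e["user_id"]

    # Pass 2: resolve every event to the best identifier available.
    stitched = []
    for e in events:
        resolved = e.get("user_id") or device_to_user.get(e["user_pseudo_id"])
        # Fall back to the device id so anonymous traffic is still counted.
        stitched.append({**e, "resolved_id": resolved or e["user_pseudo_id"]})
    return stitched
```

With this in place, a pre-login `view_item` on the web and a later `purchase` in the app resolve to the same `resolved_id`, provided the user logged in on both devices at some point.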

Why Most Ecommerce Brands Struggle With Retention Analytics

The problem is not a lack of tools.

The problem is a lack of modelling.

Common gaps we see:

• No unified event schema

• No warehouse layer

• No user-level identity strategy

• No cohort modelling

• No integration between marketing and data engineering

Without these layers, retention remains anecdotal.

The AnalyticsFlow Approach

At AnalyticsFlow, we do not treat analytics as reporting.

We engineer data systems that support:

• Customer lifetime value modelling

• Cohort retention analysis

• Channel profitability evaluation

• Product contribution analysis

• Executive-level growth decisions

Retention becomes measurable, not assumed.

Conclusion: Retention Is a Data Infrastructure Problem

Ecommerce growth does not break because of poor marketing alone.

It breaks when businesses scale without a reliable user-level data architecture.

If your analytics stack stops at GA4 dashboards, you are likely optimizing first purchases while ignoring long-term value.

A warehouse-first approach changes that.

Retention is not a campaign tactic.

It is an engineered system.