- DBT Github - https://github.com/shubhM13/saas_account_analytics
- Colab Notebooks Github - https://github.com/shubhM13/saas_account_analytics/tree/main/colab_notebooks
- Weekly Cohort Retention Analytics - here
- Monthly Cohort Retention Analytics - here
- 7d_rolling_active_users Analytics - here
- Basic Account Analysis - here
Data from a Banking SaaS application includes:
- account_created
- account_closed
- account_reopened
- account_transactions
To develop a reliable data model representing different Monzo accounts. Ensure accuracy, completeness, usability, and documentation. Implement tests to validate data integrity.
To calculate the 7d_active_users metric: users with transactions over the past 7 days divided by users with at least one open account.
Requirements:
- Intuitive and flexible data model
- Exclude users with only closed accounts
- Calculate the metric for any date
- Ensure historical consistency
Implemented in DBT on BigQuery with advanced analytics in Google Colab.
##Architecture
The solution uses a Medallion architecture:
- Bronze (Raw): Raw transactional tables.
- Silver (Clean): Cleaned and deduplicated data.
- Gold (Mart): Aggregated facts and dimensions.
The approach follows Ralph Kimball’s dimensional modelling with fact and dimension tables for efficient querying and reporting.
Fact Tables
fact_active_user_transactions
mt_daily_active_users
mt_daily_cum_active_accounts
mt_rolling_7day_active_users
Dimension Tables
dim_account
dim_date
dim_users
Raw Data (Bronze)
account_created
account_closed
account_reopened
account_transactions
Clean Data (Silver)
account_created.sql
account_closed.sql
account_reopened.sql
account_transactions.sql
Marts (Gold)
dim_account.sql
dim_date.sql
dim_users.sql
fact_valid_user_transactions.sql
mt_daily_active_users.sql
mt_daily_cum_active_accounts.sql
mt_rolling_7day_active_users.sql
Cohorts
cohort_retention_monthly.sql
cohort_retention_weekly.sql
user_activity.sql
user_first_action.sql
Implemented in schema.yml
for uniqueness, completeness, accepted values, integrity, consistency, and range/format.
Analyzed user activity by day and month to identify engagement patterns and inform strategic decisions. Seasonality analysis helps understand high/low activity periods.
Identify trends and patterns using 7d_rolling_active_users metric. Box plots show distribution of user activity by day and month.
Cohort analysis helps understand user behavior over time, focusing on retention, activation, and activity. It informs strategies for improving customer engagement and retention.
Visualize retention trends over time using log scales for a nuanced view of user engagement.
Heatmap visualizes retention percentages for each cohort and period, guiding strategic decisions to enhance user engagement.
More detailed analysis: