
Uses BQML (k-means) to generate customer segments from BigQuery public data, then calls Gemini in Vertex AI through SQL to create human-readable headlines and summaries for each segment.


amithmathew/bq-customer-segmentation


Simple ML Customer Segmentation - all in BigQuery!

This repo demonstrates how to do ML-based customer segmentation using only BigQuery ML. To make things more interesting, it also uses BigQuery's ability to call Vertex AI LLMs (such as Gemini) directly from SQL: sample rows from each generated segment are sent to the model, which returns a human-readable headline and summary for that segment.
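As a rough illustration of the two steps described above, the Python sketch below only assembles the kind of SQL statements involved; it does not run anything against BigQuery. The model names (customer_segments, gemini_model), the feature columns, the sample_json column, and the samples table are assumptions made for illustration and may not match what bigquery_sql/BigQuery.sql in this repo actually uses.

```python
# Illustrative only: builds SQL strings, does not contact BigQuery.
# All model, table, and column names below are assumptions.

PROJECT = "my-project"   # placeholder used throughout this README
DATASET = "my-dataset"   # placeholder used throughout this README


def kmeans_model_sql(num_clusters: int = 5) -> str:
    """CREATE MODEL statement that trains a k-means model with BigQuery ML."""
    return f"""
CREATE OR REPLACE MODEL `{PROJECT}.{DATASET}.customer_segments`
OPTIONS (model_type = 'kmeans', num_clusters = {num_clusters}) AS
SELECT
  COUNT(DISTINCT order_id) AS order_count,  -- assumed feature
  SUM(sale_price) AS total_spend            -- assumed feature
FROM `bigquery-public-data.thelook_ecommerce.order_items`
GROUP BY user_id
""".strip()


def segment_summary_sql(samples_table: str) -> str:
    """ML.GENERATE_TEXT call asking a remote Gemini model to describe each segment."""
    return f"""
SELECT centroid_id, ml_generate_text_llm_result AS summary
FROM ML.GENERATE_TEXT(
  MODEL `{PROJECT}.{DATASET}.gemini_model`,
  (
    SELECT
      centroid_id,
      CONCAT('Write a headline and a short summary for this customer segment: ',
             sample_json) AS prompt
    FROM `{PROJECT}.{DATASET}.{samples_table}`
  ),
  STRUCT(TRUE AS flatten_json_output)
)
""".strip()


print(kmeans_model_sql(num_clusters=5))
print(segment_summary_sql("segment_samples"))
```

The second query assumes a remote model over a Cloud Resource connection already exists in the dataset; the prerequisites below cover that connection.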

Prerequisites

You need a GCP project and access to the BigQuery public dataset, specifically bigquery-public-data.thelook_ecommerce. You should also create a BigQuery dataset in the US region within your project to hold the model we'll create, as well as other artifacts.

You also need a Cloud Resource connection set up in your project so that BigQuery can call Vertex AI. The Google Cloud documentation on creating a Cloud resource connection walks you through this process.

The Streamlit app uses service account credentials that you specify in .streamlit/secrets.toml. To set this up, create a service account with permission to query BigQuery and download its credentials JSON file. Then rename .streamlit/secrets.toml.template to .streamlit/secrets.toml and replace the empty strings in that file with the values from the downloaded credentials JSON file.
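If copying the values by hand is error-prone (the private key in particular contains embedded newlines), a small helper along these lines can render the downloaded JSON as TOML for you to paste. This is a minimal sketch: the table name gcp_service_account is an assumption, so use whatever section and key names the repo's secrets.toml.template actually contains.

```python
import json


def service_account_to_toml(key_json: str, table: str = "gcp_service_account") -> str:
    """Render the string fields of a service account key as a TOML table.

    The default table name is an assumption; match it to the template file.
    """
    info = json.loads(key_json)
    lines = [f"[{table}]"]
    for name, value in info.items():
        # json.dumps quotes and escapes a string value (including the
        # newlines in private_key) in a form that is also a valid TOML
        # basic string.
        lines.append(f"{name} = {json.dumps(value)}")
    return "\n".join(lines) + "\n"


# Made-up key material, for illustration only.
sample = json.dumps({"type": "service_account", "project_id": "my-project"})
print(service_account_to_toml(sample))
```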

You also need to specify your project ID and dataset ID. Open the bigquery_sql/BigQuery.sql file and replace all occurrences of my-project with your project ID and my-dataset with your dataset ID. Do the same in app.py.
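The find-and-replace can also be scripted. The sketch below is one possible way to do it from the repository root; the IDs in the example call are hypothetical, and the helper names are introduced here for illustration only.

```python
from pathlib import Path


def fill_placeholders(text: str, project_id: str, dataset_id: str) -> str:
    """Replace the my-project / my-dataset placeholders in a string."""
    return text.replace("my-project", project_id).replace("my-dataset", dataset_id)


def update_files(paths, project_id: str, dataset_id: str) -> None:
    """Rewrite each file in place with the placeholders filled in."""
    for path in map(Path, paths):
        original = path.read_text(encoding="utf-8")
        updated = fill_placeholders(original, project_id, dataset_id)
        path.write_text(updated, encoding="utf-8")


# Example call with hypothetical IDs, run from the repository root:
# update_files(["bigquery_sql/BigQuery.sql", "app.py"], "acme-analytics", "segmentation")
```

Because the files are overwritten in place, it is worth checking the result with git diff before running the queries.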

Install the Python prerequisites with:

pip install -r requirements.txt

To run the Streamlit app:

streamlit run app.py

Enjoy!
