WBS Data Science Bootcamp Portfolio

Primer: A conversation with ChatGPT 3.5 about SQL challenges

It's interesting to see how well ChatGPT works and where its (current) limitations are, so I chatted with it about some of our SQL challenges. I found it a fun read!

Primer: No-Hangman

At the end of the two-week primer course on SQL, Tableau and Python, everybody builds a simple text-based Hangman (click on the image for my take on it):

Chapter 1: Eniac expansion from Spain to Brazil

In this case study, the company Eniac wants to expand its business to Brazil and evaluates whether Magist is a suitable after-sales fulfillment partner.

Figure: scatter plot of all sellers. The x-axis shows the fraction of products sold in tech categories, the y-axis the average monthly sales.

Chapter 2: Introduction to pandas

As the second basic data-handling tool after SQL, we were introduced to the Python pandas library. Read about our challenges here!

Chapter 3: Data Cleaning and Storytelling

In this quite intense two-week chapter, we dove deep into a sales database with severe problems and learned how to still extract useful conclusions from it. Take a look!

Chapter 4: A/B Testing

In this one-week chapter, we learned a lot about the foundations of inferential statistics and about deciding whether the outcome of a UI experiment is statistically significant. Because the methods are applicable to a much larger class of problems, I found the material very helpful.
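As an illustration of what "statistical significance" means for a UI experiment, here is a minimal sketch of a two-sided two-proportion z-test using only the standard library. The click counts and the 5% significance level are made-up assumptions, and the chapter itself may well have used other tests.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical click counts for two button variants (invented numbers).
ALPHA = 0.05  # assumed significance level
z, p_value = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=260, n_b=5000)
print(f"z = {z:.2f}, p = {p_value:.4f}, significant: {p_value < ALPHA}")
```

A p-value below the chosen level means the observed difference would be unlikely if both variants really converted equally well.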

Chapter 5: Data Pipelines on the Cloud

In this two-week project, we learned and practiced ETL data-engineering skills, i.e. extracting, transforming and loading data into storage for comprehensive analysis. We scraped the web, used public APIs, transformed and augmented the data, and stored it in an SQL database. The finished ETL process was then wrapped into a Google Cloud Function for automatic execution, and I went a step further and produced automatically updated reports on the data.
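The three ETL steps can be sketched roughly as follows. This is only an illustration under stated assumptions: a hard-coded JSON string stands in for a scraped page or API response, the weather-like fields and the `weather` table are hypothetical, and an in-memory SQLite database replaces the project's actual SQL database.

```python
import json
import sqlite3

# Extract: a hard-coded JSON string stands in for a scraped page or API response.
RAW_RESPONSE = '[{"city": "Berlin", "temp_k": 293.15}, {"city": "Hamburg", "temp_k": 288.15}]'

def extract(raw):
    """Parse the raw response into a list of dictionaries."""
    return json.loads(raw)

def transform(records):
    """Convert Kelvin to Celsius and keep only the columns we want to store."""
    return [(r["city"], round(r["temp_k"] - 273.15, 2)) for r in records]

def load(rows, conn):
    """Write the rows to the database; repeated runs refresh instead of duplicating."""
    conn.execute("CREATE TABLE IF NOT EXISTS weather (city TEXT PRIMARY KEY, temp_c REAL)")
    conn.executemany("INSERT OR REPLACE INTO weather (city, temp_c) VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_RESPONSE)), conn)
rows = conn.execute("SELECT city, temp_c FROM weather ORDER BY city").fetchall()
print(rows)
```

Keeping the three steps in separate functions makes it easy to call them in sequence from a scheduled entry point such as a cloud function.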

One of the deliverables was a blog post which I wrote on dev.to.

Chapter 6: Unsupervised ML - Clustering Songs

In this one-week project, we learned about high-dimensional distances, scaling, PCA, k-means, the inertia elbow method, the silhouette score and the Spotify API.
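To make scaling, k-means and inertia a bit more concrete, here is a small pure-Python sketch. The song features (tempo in BPM and an energy value) are invented for illustration, and in practice a library such as scikit-learn would normally be used instead of a hand-written loop.

```python
import random

def min_max_scale(rows):
    """Scale every column to [0, 1] so that no single feature dominates the distance."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / span for v, lo, span in zip(row, lows, spans)] for row in rows]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Assignment step: index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
            for p in points]

def k_means(points, k, iterations=100, seed=0):
    """Lloyd's algorithm; returns the cluster labels and the inertia."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iterations):
        labels = assign(points, centroids)
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            updated.append([sum(col) / len(members) for col in zip(*members)]
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    # Inertia: sum of squared distances of all points to their centroid.
    inertia = sum(squared_distance(p, centroids[label]) for p, label in zip(points, labels))
    return labels, inertia

# Hypothetical songs as [tempo in BPM, energy]; the numbers are made up.
songs = [[60, 0.20], [65, 0.25], [70, 0.30], [170, 0.80], [175, 0.85], [180, 0.90]]
scaled = min_max_scale(songs)
for k in (1, 2, 3):
    labels, inertia = k_means(scaled, k)
    print(k, labels, round(inertia, 4))
```

Plotting the inertia against k and looking for the bend in the curve is the "elbow" heuristic for choosing the number of clusters.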

My special treat was to apply harmony theory to order songs by harmonic distance.
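The text does not say which notion of harmonic distance was used. As one possible reading, the sketch below measures the number of steps between two keys on the circle of fifths and orders songs greedily by that distance. The pitch-class encoding (0 = C up to 11 = B), the song list and the greedy ordering are all assumptions for illustration, and major versus minor mode is ignored.

```python
def fifths_position(key):
    """Position of a pitch class (0 = C, ..., 11 = B) on the circle of fifths."""
    # A perfect fifth spans 7 semitones, and 7 is its own inverse modulo 12,
    # so multiplying by 7 maps a pitch class to its index on the circle.
    return (key * 7) % 12

def harmonic_distance(key_a, key_b):
    """Number of steps between two keys on the circle of fifths (0 to 6)."""
    diff = abs(fifths_position(key_a) - fifths_position(key_b))
    return min(diff, 12 - diff)

def order_by_harmony(songs):
    """Greedy ordering: always continue with the harmonically closest remaining song."""
    remaining = list(songs)
    ordered = [remaining.pop(0)]
    while remaining:
        current_key = ordered[-1][1]
        nearest = min(remaining, key=lambda song: harmonic_distance(current_key, song[1]))
        remaining.remove(nearest)
        ordered.append(nearest)
    return ordered

# Hypothetical songs as (title, key as pitch class): C, F#, G and F.
songs_by_key = [("Song A", 0), ("Song B", 6), ("Song C", 7), ("Song D", 5)]
playlist = order_by_harmony(songs_by_key)
print([title for title, _ in playlist])
```

With this measure, neighbouring keys such as C and G are one step apart, while C and F# are as far apart as possible.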

Chapter 7: Supervised ML - Housing Prices and Mushroom classification

Two weeks were devoted to, and crammed with, insights into supervised machine learning. We learned about

training data preparation
classification, regression
prediction metrics
decision trees, gradient boosted random forests
linear and logistic regression
support vector classifiers
one-hot and ordinal encoding
parameter optimization and cross-validation

and even pickling data and creating classifiers as web apps with Streamlit! Our model data sets were selling prices of houses 🏰 and poisonous vs. edible mushrooms 🍄.
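Of the topics listed above, the difference between ordinal and one-hot encoding is easy to show in a few lines. The sketch below is pure Python with hypothetical feature columns that are not taken from the actual data sets; libraries such as scikit-learn offer ready-made encoders for real work.

```python
def ordinal_encode(values, order):
    """Map ordered categories to integers that preserve their ranking."""
    rank = {category: i for i, category in enumerate(order)}
    return [rank[v] for v in values]

def one_hot_encode(values):
    """Map unordered categories to one 0/1 column per category."""
    categories = sorted(set(values))
    return categories, [[int(v == c) for c in categories] for v in values]

# Hypothetical feature columns (invented for illustration).
kitchen_quality = ["good", "poor", "excellent", "good"]  # has a natural order
cap_color = ["brown", "white", "brown", "red"]           # has no natural order

quality_codes = ordinal_encode(kitchen_quality, order=["poor", "good", "excellent"])
color_names, color_matrix = one_hot_encode(cap_color)

print(quality_codes)
print(color_names)
print(color_matrix)
```

Ordinal encoding keeps the ranking in a single column, whereas one-hot encoding avoids suggesting an order that the categories do not have.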

Chapter 8: Recommender systems

This week, we learned about different ways to derive movie recommendations for the fictitious WBSFLIX online DVD rental shop from previous movie ratings. Read all about it here and check out the recommendation app!
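One common way to get recommendations from previous ratings is item-based collaborative filtering. The sketch below is a minimal version of that idea, not necessarily one of the approaches used in the chapter: the users, movies and ratings are invented, and the similarity-weighted sum is just one simple scoring choice.

```python
import math

# Hypothetical ratings: user -> {movie: rating on a 1-5 scale}.
ratings = {
    "ana":   {"Alien": 5, "Aliens": 5, "Amelie": 1},
    "ben":   {"Alien": 4, "Aliens": 5, "Amelie": 2},
    "carla": {"Alien": 1, "Aliens": 2, "Amelie": 5},
    "dave":  {"Alien": 5},
}

def cosine_similarity(movie_a, movie_b):
    """Cosine similarity of two movies over the users who rated both."""
    common = [u for u, r in ratings.items() if movie_a in r and movie_b in r]
    if not common:
        return 0.0
    dot = sum(ratings[u][movie_a] * ratings[u][movie_b] for u in common)
    norm_a = math.sqrt(sum(ratings[u][movie_a] ** 2 for u in common))
    norm_b = math.sqrt(sum(ratings[u][movie_b] ** 2 for u in common))
    return dot / (norm_a * norm_b)

def recommend(user, top_n=1):
    """Rank unseen movies by the similarity-weighted sum of the user's own ratings."""
    seen = ratings[user]
    all_movies = {movie for user_ratings in ratings.values() for movie in user_ratings}
    scores = {}
    for candidate in all_movies - set(seen):
        scores[candidate] = sum(cosine_similarity(candidate, movie) * rating
                                for movie, rating in seen.items())
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend("dave"))
```

Here the user who liked one movie is pointed to the movie that other users rated most similarly to it.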

Chapter 9: Advanced SQL 🗄️

I actually do like SQL a lot, so this one-week reinforcement on advanced SQL topics was good fun!

Final project: Wave-energy converter power optimization with deep learning 🌊

coming soon


tvogel/datascience-bootcamp
