A repository dedicated to maintaining resources about Data Science for beginner, intermediate, and advanced practitioners

Sayar1106/awesome-data-scientist

awesome-data-scientist

A repository dedicated to providing a roadmap for prospective Data Science practitioners, as well as maintaining resources about Data Science for beginner, intermediate, and advanced practitioners.

Table Of Contents 🗂

Motivation ✊🏼

The primary motivation behind this repository is to help beginners in the field of Data Science find their way along the path to becoming a Data Scientist. In today's information age, it is very easy to get lost in the vast amount of content available online. This repository helps you navigate that content by providing a structure for learning Data Science.

Path to becoming a Data Scientist 🚀

This section will try to develop a path towards becoming a Data Scientist.

We will assume a window of about 24 months for this.

Depending on your starting point, you can choose where to begin. Even if you are an intermediate Data Science practitioner, I would highly recommend going through this path from the start.

The roadmap presented in this repo isn't a one-size-fits-all curriculum. Some of you coming from CS/Math/Statistics backgrounds may choose to cherry-pick some resources while discarding others. This is perfectly okay. 👍🏼

Beginner (3-6 Months) ⭐️

Start here if you are a complete beginner to Data Science, or if you have recently completed an introductory Machine Learning course like the one offered by Microsoft Azure on Udacity.

Learn Python 🐍

python-crash-course

free-ebook-link

A very hands-on book, Python Crash Course helps readers get familiar with the language and understand many of the best practices followed when writing Python code.

lpthw

free-ebook-link

Learn Python the Hard Way is perhaps the best way to learn Python 3 for someone with no coding experience. The author takes a careful, methodical, and somewhat rigid approach to explaining concepts. This style of teaching helps build good habits for learning Python the right way.

automate-the-boring-stuff

online-book-link

udemy-course-link

Automate the Boring Stuff with Python is a book that takes a more project-based approach to the language. Readers can expect to learn a variety of ways to use Python to create mini automation projects. A really good read!

Learn the Mathematics 🔢

Khan Academy

Books

mml

free-ebook-link

This book covers almost every mathematical concept required to understand how Machine Learning algorithms work. A really great book designed specifically for ML.

Learn Data Analysis Libraries 📈

Books

Python Data Analysis

free-ebook-link

Written by Wes McKinney, the creator of pandas, Python for Data Analysis is a must-read for Data Scientists. The book delves into numerous features of the NumPy and pandas libraries.

Courses

Python Data Analysis

Python Data Science

These two courses are available for free from freeCodeCamp on YouTube. I found the content really easy for beginners to understand. They are lengthy, but they provide fantastic coverage of data analysis using Python.

Kaggle Notebooks and/or Project 1 🗒

At this point, which should be around the 2-4 month mark, you should be able to start working on Kaggle. By this, I do not mean entering competitions.

Kaggle users post datasets, and other users can create exploratory data analysis (EDA) notebooks on them. With your knowledge of Python, you can start creating these notebooks and build a reputation in the Kaggle community.

You may also start working on EDA-based projects, where you find a dataset that interests you and analyze it.
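To make the idea of a first EDA pass concrete, here is a minimal sketch that uses only the Python standard library. The CSV text, the column names, and the `summarize` helper are all made up for illustration; in a real notebook you would more likely load a downloaded dataset with pandas and use its built-in summaries.

```python
import csv
import io
import statistics
from collections import Counter

# Made-up toy data standing in for a downloaded dataset (not real measurements).
RAW_CSV = """species,body_mass_g
Adelie,3750
Adelie,3800
Gentoo,5000
Gentoo,5400
Chinstrap,3700
"""


def summarize(csv_text):
    """Return the row count, category counts, and basic stats for one numeric column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    masses = [float(row["body_mass_g"]) for row in rows]
    return {
        "rows": len(rows),
        "species_counts": Counter(row["species"] for row in rows),
        "mean_mass": statistics.mean(masses),
        "median_mass": statistics.median(masses),
        "min_mass": min(masses),
        "max_mass": max(masses),
    }


summary = summarize(RAW_CSV)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Counting rows, checking how categories are distributed, and looking at the center and range of numeric columns is usually a reasonable first step before any plotting or modeling.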

Learn Machine Learning Concepts 💻

Courses

Books

100-page-ml-book

link-to-booksite (not free)

A book that does exactly what its title says: The Hundred-Page Machine Learning Book covers the breadth of Machine Learning in the space of about 100 pages! Anybody who wants a high-level understanding of Machine Learning will love this book.

intro-to-ml-with-sklearn-tensorflow

link-to-ebook

Perhaps the most well-known book on practical Machine Learning, Aurélien Géron's masterpiece provides impressive coverage of two of the most beloved Machine Learning libraries, scikit-learn and TensorFlow.

Note that for the above book, it is advisable to read only up to the end of Part 1 initially. This covers the scikit-learn library.

You can move on to Part 2, which covers Deep Learning with TensorFlow, in the next 6 months.

Intermediate (6-12 Months) ⭐️

At this stage, you will be comfortable working with Data Science libraries and different datasets. Now, we will be moving into slightly more advanced topics like Deep Learning.

Learn Deep Learning Concepts and Libraries 👨🏽‍💻👩🏽‍💻

Courses

Books

intro-to-ml-with-sklearn-tensorflow

link-to-ebook

This book makes a second appearance here. This time, you will want to read the second part of the book, which focuses on TensorFlow.

Deep Learning

free-link-ebook

Another book written by the author of a famous library, Deep Learning with Python offers a deep dive into the Keras library.

Note that a newer edition of the book is set to be released, which will cover the use of Keras in TensorFlow 2.0.

Learn Deep Learning and Machine Learning Math (Optional) ⨊∏∀

This section is optional because these books take a lot of effort to read and understand. They involve a lot of math, but working through it really helps you understand what is happening underneath the Machine Learning libraries we have learned so far.

Deep Learning

free-ebook-link

The most famous book about Deep Learning, it is a must-read for all Deep Learning enthusiasts. Although the book presents a significant amount of mathematics, the approach is easy to grasp.

elements-of-statistical-learning

free-ebook-link

Another very famous book that focuses on the mathematics behind classical machine learning models, The Elements of Statistical Learning is a classic.
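As a small taste of what "underneath the libraries" can mean, the sketch below fits a straight line with the closed-form least-squares formulas, slope = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and intercept = ȳ - slope · x̄, using plain Python. The `fit_line` helper and the toy points are assumptions made for illustration; this is not how any particular library implements regression internally.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (closed form)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations of x and y, and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy points that lie exactly on y = 2x + 1 (made up for illustration).
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept)
```

The optional books above derive formulas like this one and generalize them to many features and to other models.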

Project 2 💻

Depending on how soon you complete the material (excluding the optional section), you might want to consider building another project at this point. You can pick either a classical ML project or a Deep Learning one.

Advanced (12-18 Months) ⭐️

At this stage, you will have sufficient knowledge of the breadth of Data Science and Machine Learning. There are a few paths you could take from here, depending on your interests and goals.

Project 3 and/or 4 💻

These projects could be domain specific such as:

  • Computer Vision
  • NLP
  • Reinforcement Learning

Data Science Competitions 🦾

Look up competitions on websites like Kaggle, HackerEarth, and DrivenData, and take part! If you are in college, you could also look at college-level Data Science competitions.

Implementing a research paper 🤖

This is technically also a project, but in this case you take a research paper that interests you and try to implement the concepts presented in it. This is definitely the hardest of the three tracks mentioned here. However, having a research paper implementation on your resume is pretty awesome, as it will help you stand out from the crowd.

Professional / Job Ready (18-24 Months) 💼

This section focuses on making you a well-rounded Data Scientist. By now, your resume should have at least 2-4 Data Science projects. In these sections, we will try to cover the remaining aspects of landing that Data Science job.

Learning and Practicing SQL 📊

Many roadmaps include this step at the start. I include it at the end because I feel the best way to learn SQL is to practice it. After this stage, most of you will be lining up interviews for Data Science positions, and most companies will have a round of SQL questions in their interviews. By practicing now, you keep the practical concepts fresh.
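One low-friction way to practice is Python's built-in `sqlite3` module, which needs no database server. The sketch below runs a typical aggregation query (GROUP BY with HAVING) against an in-memory database; the `orders` table, its columns, and its rows are made up for illustration.

```python
import sqlite3

# In-memory database with a hypothetical `orders` table (made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 30.0), ("bob", 12.5), ("alice", 20.0), ("carol", 7.5), ("bob", 2.5)],
)

# Total spend per customer, keeping only customers whose total exceeds a threshold.
query = """
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    HAVING SUM(amount) > ?
    ORDER BY total DESC
"""
top_customers = conn.execute(query, (10.0,)).fetchall()
conn.close()

for customer, total in top_customers:
    print(customer, total)
```

Rewriting the same question with a different clause (for example a subquery instead of HAVING) is a simple way to turn one exercise into several.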

Courses

Practice

Data Structures and Algorithms 🧩

Again, this is one of those topics that a lot of roadmaps either include in the initial stages or discard entirely. Either way, having a strong foundation in Data Structures and Algorithms is very important.

Data Science roles may not test these concepts as rigorously as Software Engineering roles do. However, most FAANG companies, as well as a lot of startups, do include a round of coding interview questions even for Data Science roles.

If you are from a CS background, you will most likely be familiar with all the topics presented here.
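For a sense of what such a coding round can look like, here is a sketch of the well-known "two sum" exercise, solved in one pass with a dictionary. It is only an illustrative example of the genre, not a question taken from any specific company, and the `two_sum` name is a choice made here.

```python
def two_sum(numbers, target):
    """Return indices (i, j) with i < j and numbers[i] + numbers[j] == target, or None."""
    seen = {}  # maps a value to the index of its first occurrence
    for j, value in enumerate(numbers):
        complement = target - value
        if complement in seen:
            return seen[complement], j
        seen.setdefault(value, j)
    return None


print(two_sum([2, 7, 11, 15], 9))
print(two_sum([1, 2], 10))
```

The dictionary lookup replaces the inner loop of the brute-force solution, trading O(n) extra memory for O(n) expected time instead of O(n²).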

Courses

Practice

There will be more resources for this topic in the resources section.

Git and GitHub 👻

Version control is an essential skill. Every company that works with software uses it, so a sound knowledge of Git is a must.

Books

Pro Git

free-book-link

In my opinion, the best book there is to learn Git.

Courses

git-crash-course

A really nice 30-minute crash course for understanding Git.

Next Steps

This is where most of you can start applying for Data Science jobs. Congratulations!!! 🥳 You have officially completed the journey to becoming a Data Scientist 😁😁😁

Bonus (sprinkle in) 😻

If you have more time left on your hands, this bonus material may help strengthen your profile even more before applying for Data Science jobs. This section might be more relevant for more experienced folks in the field as well.

Command Line and Software Engineering

Cloud Computing

MLOps

DevOps

Web Development

Research papers

Miscellaneous Resources

To be filled soon!
