ML pipeline to classify disaster categories from text messages

besson/disaster_response_detector

Disaster Response Pipeline Project

A machine learning project that classifies labeled disaster messages from Figure Eight. The project is part of the Udacity Data Science Nanodegree and has three main components:

  1. ETL scripts to process the data
  2. Machine learning pipeline to extract features and to train and optimize a classifier model using grid search
  3. Web application to evaluate the model and get statistics about the training set data

Table of Contents

  1. Instructions
  2. File Descriptions
  3. Web app
  4. Licensing, Authors, and Acknowledgements

Instructions

  1. Run the following commands in the project's root directory to set up your database and model.

    • To run the ETL pipeline that cleans the data and stores it in the database:
      python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db
    • To run the ML pipeline that trains the classifier and saves it:
      python models/train_classifier.py data/DisasterResponse.db models/classifier.pkl

  2. Run the following command in the app's directory to run your web app:
      python run.py

  3. Go to http://0.0.0.0:3001
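
To make the ETL step more concrete, here is a minimal sketch of what a script like data/process_data.py might do. It is not the repository's code: it uses only the Python standard library, and the column names (`id`, `message`, `categories`), the `name-flag;name-flag` category encoding, and the `messages` table name are all assumptions made for illustration.

```python
import csv
import io
import sqlite3


def parse_categories(raw):
    """Turn 'water-1;transport-0' into {'water': 1, 'transport': 0}."""
    pairs = (item.rsplit("-", 1) for item in raw.split(";") if item)
    return {name: int(flag) for name, flag in pairs}


def run_etl(messages_file, categories_file, conn):
    """Join messages and categories on `id`, drop duplicates, store in SQLite."""
    # Assumed layout: one row per message id with an encoded category string.
    categories = {row["id"]: parse_categories(row["categories"])
                  for row in csv.DictReader(categories_file)}
    names = sorted({n for labels in categories.values() for n in labels})

    # One integer column per category; `messages` is a hypothetical table name.
    col_defs = ["id INTEGER PRIMARY KEY", "message TEXT"]
    col_defs += [f'"{n}" INTEGER' for n in names]
    conn.execute(f"CREATE TABLE messages ({', '.join(col_defs)})")

    seen = set()
    for row in csv.DictReader(messages_file):
        if row["id"] in seen or row["id"] not in categories:
            continue  # skip duplicate or unlabeled messages
        seen.add(row["id"])
        labels = categories[row["id"]]
        values = [int(row["id"]), row["message"]]
        values += [labels.get(n, 0) for n in names]
        placeholders = ", ".join("?" for _ in values)
        conn.execute(f"INSERT INTO messages VALUES ({placeholders})", values)
    conn.commit()
    return names


# Tiny in-memory example with made-up rows (message 1 appears twice).
messages = io.StringIO(
    "id,message\n1,We need water\n2,Road is blocked\n1,We need water\n"
)
categories = io.StringIO(
    "id,categories\n1,water-1;transport-0\n2,water-0;transport-1\n"
)
conn = sqlite3.connect(":memory:")
names = run_etl(messages, categories, conn)
rows = conn.execute("SELECT * FROM messages ORDER BY id").fetchall()
```

Storing one 0/1 column per category keeps the table directly usable as a multi-label target when the classifier is trained in the next step.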

File Descriptions

  • data/: ETL scripts to process data and save it into a relational database
  • models/: Machine learning pipeline code for extracting features and for training and optimizing the model using grid search
  • logs/: Output of best model performance (precision, recall and f1-score) using test data
  • plot_data/: Scripts to wrangle data and prepare web app plots
  • app/: Flask app code
  • test/: Unit tests to validate customized tokenizers and sklearn estimators
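
The models/ and logs/ entries mention grid search and precision, recall and f1-score. The sketch below illustrates both ideas together without any external library; the project itself presumably relies on scikit-learn, so this is only an illustration, not its implementation. The scores, labels, and the `threshold` parameter are made-up values.

```python
from itertools import product


def precision_recall_f1(y_true, y_pred):
    """Binary precision, recall and F1 for one category (labels are 0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def grid_search(param_grid, score_fn):
    """Score every parameter combination and return the best one."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:  # ties keep the first combination found
            best_params, best_score = params, score
    return best_params, best_score


# Made-up validation data: model scores and true labels for one category.
scores = [0.9, 0.8, 0.4, 0.35, 0.2, 0.1]
y_true = [1, 1, 1, 0, 0, 0]


def f1_at(params):
    """F1 obtained when scores at or above the threshold count as positive."""
    y_pred = [1 if s >= params["threshold"] else 0 for s in scores]
    return precision_recall_f1(y_true, y_pred)[2]


best_params, best_score = grid_search({"threshold": [0.3, 0.5, 0.7]}, f1_at)
print(best_params, round(best_score, 3))
```

With more than one entry in the grid, `product` enumerates every combination, which is why the cost of grid search grows multiplicatively with each added parameter.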

Web app

Evaluating the model

For a given message, the app runs the model and outputs the predicted categories.
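
As a rough illustration of that last step, a multi-label model typically returns one 0/1 flag per category, which the app then has to map back to category names. The function and the category names below are hypothetical and not taken from the app's code.

```python
def predicted_categories(category_names, prediction):
    """Return the names whose predicted flag is 1, preserving column order."""
    if len(category_names) != len(prediction):
        raise ValueError("one prediction flag is expected per category")
    return [name for name, flag in zip(category_names, prediction) if flag == 1]


# Hypothetical category names and a made-up prediction for one message.
names = ["related", "request", "water", "shelter"]
flags = [1, 0, 1, 0]
labels = predicted_categories(names, flags)
print(labels)
```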

Training set statistics

Overview of training set distribution and input message content.

Licensing, Authors, and Acknowledgements

Credit goes to Figure Eight and Udacity for providing the dataset.

This project is licensed under the MIT License.
