Athens Airbnb Data Analysis and Recommendation System Implementation based on the description of each Airbnb using TF–IDF and Cosine Similarity metric.

alexiszamanidis/airbnb_data_analysis

Athens Airbnb Data Analysis

The goal of this project is to analyze my country's Airbnb data set (Athens). It also includes a simple recommendation system that compares listings by their descriptions, using Term Frequency–Inverse Document Frequency (TF-IDF) vectors and the cosine similarity metric.

Data sets

I downloaded the data sets for my country from here. You can do the same for your own country or for any other country you want to analyze.

There are three data sets:

  • listings.csv
  • calendar.csv
  • reviews.csv

The whole procedure of each notebook consists of two common steps:

  1. Loading the data sets
  2. Dropping any rows that contain a NaN value

followed by the steps specific to each notebook:

  • word_cloud.ipynb
      1. Merging data sets
      2. Text preprocessing
      3. Generating Word Clouds
  • recommendation.ipynb
      1. Concatenating the name and description columns
      2. Text preprocessing
      3. TF-IDF vectorization
      4. Calculating the similarity of each Airbnb with all the others
      5. Storing the 100 most similar Airbnbs for each one (linear time)
  • listings.ipynb
      1. Cleaning the price column
      2. Data analysis
  • calendar.ipynb
      1. Cleaning the price column and separating the date into year, month and day columns
      2. Data analysis
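
The recommendation.ipynb steps above can be sketched roughly as follows. This is a minimal, dependency-free illustration, not the notebook's actual code: the function names (`tokenize`, `tfidf_vectors`, `cosine`, `most_similar`), the smoothed IDF formula, and the sample listing texts are all assumptions made for the example, and the notebook itself may well rely on a library vectorizer instead.

```python
import heapq
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only (very simple preprocessing)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse {term: weight} dict per document, L2-normalized."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Assumed smoothed IDF: log((1 + N) / (1 + df)) + 1, so no weight is zero.
        vec = {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else {})
    return vectors


def cosine(u, v):
    """For L2-normalized sparse vectors, cosine similarity is the dot product."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    return sum(w * v.get(term, 0.0) for term, w in u.items())


def most_similar(documents, top_k=100):
    """Map each document index to the indices of its top_k most similar documents."""
    vectors = tfidf_vectors(documents)
    result = {}
    for i, u in enumerate(vectors):
        scores = [(cosine(u, v), j) for j, v in enumerate(vectors) if j != i]
        # nlargest keeps only the best top_k pairs instead of sorting everything.
        result[i] = [j for _, j in heapq.nlargest(top_k, scores)]
    return result


# Hypothetical "name + description" strings, one per listing.
listings = [
    "Sunny apartment near Acropolis with balcony",
    "Cozy studio near Acropolis and metro",
    "Beach house with sea view in Glyfada",
    "Sea view villa close to the beach",
]
neighbours = most_similar(listings, top_k=2)
print(neighbours)
```

Here `top_k=2` only keeps the toy example readable; the notebook stores 100 neighbours per listing. Normalizing the vectors up front means each similarity is a single sparse dot product.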

Word Clouds

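Description

[Image: word cloud of the description column]

Last Review

[Image: word cloud of the last review column]

Neighbourhood

[Image: word cloud of the neighbourhood column]

Transit

[Image: word cloud of the transit column]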
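
A word cloud sizes each word by how often it occurs, so the text preprocessing and counting behind clouds like the ones above can be sketched as below. This is only an illustration under stated assumptions: the tiny stop-word list, the function names (`preprocess`, `word_frequencies`) and the sample transit notes are hypothetical and do not come from word_cloud.ipynb.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real run would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "is", "in", "of", "to", "with", "from"}


def preprocess(text):
    """Lowercase, keep alphabetic tokens, drop stop words and one-letter tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 1]


def word_frequencies(texts):
    """Count how often each remaining word occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(preprocess(text))
    return counts


# Hypothetical values standing in for one text column.
transit_notes = [
    "The metro station is a 5 minute walk from the apartment.",
    "Bus and metro stops are close to the apartment.",
]
frequencies = word_frequencies(transit_notes)
print(frequencies.most_common(3))
```

If the third-party `wordcloud` package is installed, a mapping like `frequencies` could be passed to `WordCloud().generate_from_frequencies(frequencies)` to render the image. Whether the notebook does it this way is an assumption.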
