GitHub - JoyAlbertini/IMDB_Political_Reviews_Classifier: IMDb reviews scraper and NLP political bias analysis

IMDB Political Review Classifier

Disclaimers

Disclaimer: Non-Political Affiliation

The content and context of this project do not reflect any political affiliations or ideologies of the creator. Any interpretations or assumptions linking the project's material to specific political views are unfounded and not endorsed.

Content Sensitivity Disclaimer

Please be advised that this project may contain material that could be considered offensive to some individuals. This includes content derived from user contributions on IMDb. Viewer discretion is advised. The content is presented for educational, informational, or research purposes only and does not intend to harm, offend, or disparage any group or individual.

Overview

The application scrapes, classifies, and analyzes IMDb film reviews to determine how much political bias influences film ratings. It uses natural language processing (NLP) to categorize each review as neutral, left-leaning, or right-leaning based on its content. The film rating is then recalculated from this classification to give a less biased view, one that reflects the film's content rather than political sentiment.
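The recalculation idea can be sketched in a few lines of Python. This is only an illustration, not the project's actual implementation: the `Review` type, the label strings, and the choice of a plain mean over neutral reviews are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional

# Hypothetical label names; the project may use different ones.
NEUTRAL, LEFT, RIGHT = "neutral", "left", "right"


@dataclass
class Review:
    rating: int  # star rating given by the reviewer (1 to 10 on IMDb)
    label: str   # predicted political orientation of the review text


def adjusted_rating(reviews: list[Review]) -> Optional[float]:
    """Mean rating over reviews labeled neutral, or None if there are none."""
    neutral = [r.rating for r in reviews if r.label == NEUTRAL]
    return float(mean(neutral)) if neutral else None


# Made-up reviews, for illustration only.
reviews = [
    Review(9, NEUTRAL),
    Review(1, LEFT),
    Review(7, NEUTRAL),
    Review(1, RIGHT),
]
print("raw mean:", mean(r.rating for r in reviews))
print("neutral-only mean:", adjusted_rating(reviews))
```

Dropping politically labeled reviews is the simplest possible adjustment; down-weighting them instead would be another reasonable design.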

The main goal of this tool is to provide film enthusiasts, critics, and researchers with a clearer, more objective perspective on film ratings by filtering out potential biases introduced by politically charged reviews. This could help users make better-informed decisions about which films to watch, based on ratings that truly reflect viewers' opinions about the film's content rather than its political implications.

Overview Video

Full Document

Application PDF

IMDB Link

For the Jupyter Notebook and the application site, the correct link to give is

Installation

Create a new virtual environment:

python -m venv .venv

Activate the new environment (on Linux or macOS):

source .venv/bin/activate

With the environment active, install the dependencies:

pip install -r requirements.txt
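If you are unsure whether the environment is really active before installing, a quick check from Python can help. This is a small convenience sketch, not part of the project; the function name `in_virtualenv` is made up for the example.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
```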

This project uses Selenium to scrape data (see IMDB_Extended.py), so the Chrome web browser must be installed on your machine. Scraping also relies on ChromeDriver, the WebDriver executable for Chrome, which the code is designed to install automatically. If you prefer not to test the scraping capabilities of the application, you can use the already scraped data: as long as you do not add a new link, ChromeDriver will not be installed.

Train the model

To create and train the model, run `political_bias_train.ipynb`.

Run the application

For this project, I have also developed a user interface that allows you to run the project outside of Jupyter Notebooks. To start, execute the following command:

python Interface.py

This will launch a webpage at http://127.0.0.1:8080/.

Scraping an IMDb page and generating its report can take a while, especially for pages with many reviews.

Notebooks

This project contains several Jupyter notebooks:

political_bias_train.ipynb : trains the model and generates the data.
political_review.ipynb : paste the link of a film and the notebook analyzes it. It can cause problems in PyCharm because of the HTML output; if PyCharm marks the notebook as untrusted, trust it and re-run it, or open it in Visual Studio Code, where it works well.
rating_analysis.ipynb : creates the distributions of ratings.
political_review_display.ipynb : do NOT run it. It is used to create the HTML pages and can make the editor lag if run.
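To give a feel for what a rating distribution is, here is a minimal standard-library sketch. It is not taken from rating_analysis.ipynb: the function name `rating_distribution` and the sample ratings are invented for the example.

```python
from collections import Counter


def rating_distribution(ratings: list[int]) -> dict[int, float]:
    """Share of reviews at each star value from 1 to 10."""
    counts = Counter(ratings)
    total = len(ratings)
    if total == 0:
        return {star: 0.0 for star in range(1, 11)}
    return {star: counts.get(star, 0) / total for star in range(1, 11)}


# Made-up ratings, for illustration only.
sample = [10, 10, 1, 8, 1, 1, 7, 10]
for star, share in rating_distribution(sample).items():
    # One '#' per 5% of reviews gives a rough text histogram.
    print(f"{star:>2} {'#' * round(share * 20)} {share:.0%}")
```

Comparing such a distribution for all reviews against the one for neutral reviews only is one way to see where politically charged reviews concentrate.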

Technologies

spaCy: Handles text analysis and classification of reviews by political orientation.
Python Dash: Builds the interactive user interface for displaying results and settings.
Plotly: Creates visual representations of data, such as rating distributions.
HTML/CSS: Styles the web interface, enhancing visual appeal and usability.
Selenium: Automates the scraping of IMDb reviews to gather necessary data.

Analysis

In the Data folder, you will find several films already scraped, along with generated HTML pages for the political analysis.

