A Flask-based web scraper that fetches the latest news headlines from The Atlantic and displays them on a webpage. The application integrates Google OAuth and reCAPTCHA for security.

mai-repo/Newscraper

News Scraper Application

This Flask-based application scrapes the latest news headlines and descriptions from The Atlantic, stores the data, and then renders it on a webpage.
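The scraping step can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: it assumes headlines sit in `<h2>` elements, and it uses the standard library's `html.parser` as a dependency-free stand-in for beautifulsoup4 so the sketch runs on its own. `HeadlineParser`, `extract_headlines`, and `SAMPLE_HTML` are hypothetical names.

```python
from html.parser import HTMLParser


class HeadlineParser(HTMLParser):
    """Collects the text of every <h2> element (assumed headline tag)."""

    def __init__(self):
        super().__init__()
        self._in_headline = False
        self._buffer = []
        self.headlines = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_headline = True
            self._buffer = []

    def handle_data(self, data):
        # Text inside nested tags (e.g. <a>) also arrives here.
        if self._in_headline:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "h2" and self._in_headline:
            # Collapse runs of whitespace into single spaces.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.headlines.append(text)
            self._in_headline = False


def extract_headlines(html):
    """Return the headline strings found in an HTML document."""
    parser = HeadlineParser()
    parser.feed(html)
    parser.close()
    return parser.headlines


# Hypothetical markup standing in for a fetched page.
SAMPLE_HTML = """
<article><h2><a href="/a">First  headline</a></h2><p>Summary one.</p></article>
<article><h2>Second headline</h2><p>Summary two.</p></article>
"""

if __name__ == "__main__":
    print(extract_headlines(SAMPLE_HTML))
```

In the real application the HTML would come from a `requests` GET call, and the selector would need to match the markup The Atlantic actually serves.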

Requirements

The following Python packages are required to run the application:

  • Flask: Web framework for Python.
  • requests: HTTP library for sending GET requests to fetch the news data.
  • beautifulsoup4: Library for parsing HTML and scraping data.
  • google-auth: Library for authenticating with Google services.
  • google-auth-oauthlib: Library for OAuth 2.0 authentication with Google.
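To confirm that the packages listed above are present in the active environment, a small check along these lines may help. It is a sketch using only the standard library (`importlib.metadata`, Python 3.8+); the `REQUIRED` list mirrors the names above, and `installed_versions` is a hypothetical helper, not part of this project.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as listed in the Requirements section.
REQUIRED = [
    "Flask",
    "requests",
    "beautifulsoup4",
    "google-auth",
    "google-auth-oauthlib",
]


def installed_versions(packages=REQUIRED):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```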

Setup Instructions

Follow these steps to get the application up and running:

1. Clone the Repository

Clone this repository to your local machine:

git clone https://github.com/mai-repo/RG-Knowledge-Check-1.git

2. Create a virtual environment

python -m venv .venv

3. Create a .gitignore

  • Add .venv to the .gitignore file so the virtual environment is not committed.

4. Install Dependencies

  • Install all the required Python packages
pip3 install -r requirements.txt

5. Set Up Google Applications and Keys

  • Follow the instructions in the "Google Application Setup and .env Configuration" section to set up the Google applications and obtain the keys needed for authentication.

Google Application Setup and .env Configuration

1. Create a Google Cloud Project

  • Go to the Google Cloud Console.
  • Create a new project and enable the following APIs:
    • Google Identity Services API
    • reCAPTCHA API

2. Create OAuth 2.0 Credentials

  • Go to Credentials in the Google Cloud Console.
  • Create OAuth 2.0 Client ID for a web app.
  • Add authorized JavaScript origins (e.g., http://127.0.0.1:5000, with no path or trailing slash).
  • Download the JSON file with client secrets.

3. Set Up reCAPTCHA
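
Once a reCAPTCHA key pair exists, the server typically checks each token submitted by the browser against Google's `siteverify` endpoint using the secret key. The following is a hedged sketch of that check with the standard library only. `build_verify_request`, `is_human`, and the `opener` parameter are hypothetical names introduced here for illustration, and the application itself may do this differently (for example with `requests`).

```python
import json
import urllib.parse
import urllib.request

# Google's documented verification endpoint for reCAPTCHA tokens.
VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_verify_request(secret_key, client_token):
    """Build the form-encoded POST request for the siteverify endpoint."""
    data = urllib.parse.urlencode(
        {"secret": secret_key, "response": client_token}
    ).encode("ascii")
    return urllib.request.Request(VERIFY_URL, data=data, method="POST")


def is_human(secret_key, client_token, opener=urllib.request.urlopen):
    """Return True if the verification response reports success.

    `opener` is injectable so the function can be exercised without
    network access; by default it performs a real HTTP call.
    """
    request = build_verify_request(secret_key, client_token)
    with opener(request, timeout=10) as response:
        payload = json.load(response)
    return bool(payload.get("success"))
```

Here `secret_key` would be the value stored as RECAPTCHA_SECRET_KEY, and `client_token` the token the reCAPTCHA widget adds to the submitted form.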

4. Create .env File

  • Create a .env file in the root directory of your project.
  • Add the following variables:
GOOGLE_CLIENT_SECRET=your-google-client-secret
RECAPTCHA_SECRET_KEY=your-recaptcha-secret-key
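
One way the application might read these values at startup is sketched below. It assumes the variables are already present in the process environment; note that `flask run` only loads a .env file automatically when python-dotenv is installed, which is not among the packages listed above. `REQUIRED_KEYS` and `load_settings` are hypothetical names.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_KEYS = ("GOOGLE_CLIENT_SECRET", "RECAPTCHA_SECRET_KEY")


def load_settings(env=None):
    """Return the required settings, failing fast if any are missing or empty."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}
```

Failing at startup with a clear message tends to be easier to debug than an authentication error later in a request. Because the .env file holds secrets, it is worth keeping it out of version control as well.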

6. Run the Flask Application

```bash
export FLASK_APP=main.py
flask run
```

7. Open your web browser

Open http://127.0.0.1:5000/ (Flask's default address). The page presents a button; clicking it scrapes The Atlantic and returns the latest headlines as JSON.

Stretch Goals

  • Allow users to choose from a variety of news sites
  • Add a music player so users can listen to music while browsing articles
