GraduateNU Major Scraper

This repo houses GraduateNU's major requirements scraper. It scrapes the Northeastern Academic Catalog.

Setup

Clone the repo and run:
pnpm install

Running

After installing dependencies, you can run the scraper with:
pnpm scrape

The scraper scrapes the current catalog by default, but you can pass one or more years as command line arguments. For example, to scrape the catalogs for 2021, 2022, and the current year, run:
pnpm scrape 2021 2022 current

This will populate the results folder with parsed JSON files and the catalogCache folder with cached HTML.
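
To illustrate the argument handling described above, here is a minimal sketch of how year arguments could be interpreted. It is not taken from the scraper's source: the `CatalogYear` type and the `parseYearArgs` function are hypothetical names, and the four-digit validation is an assumption about what counts as a valid year.

```typescript
// Hypothetical helper, for illustration only; the real scraper may differ.
type CatalogYear = number | "current";

function parseYearArgs(args: string[]): CatalogYear[] {
  // No arguments: fall back to the current catalog, matching the documented default.
  if (args.length === 0) {
    return ["current"];
  }
  return args.map((arg): CatalogYear => {
    if (arg === "current") {
      return "current";
    }
    // Assumption: a valid year is exactly four digits, such as 2021.
    if (!/^\d{4}$/.test(arg)) {
      throw new Error(`Invalid catalog year: ${arg}`);
    }
    return Number(arg);
  });
}

// Mirrors the example invocation: pnpm scrape 2021 2022 current
console.log(parseYearArgs(["2021", "2022", "current"]));
```

Under Node.js, the arguments after the script name would typically come from `process.argv.slice(2)`.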