Unfurl

Extract rich metadata from URLs.

Installation

npm install @borderless/unfurl --save

Usage

Unfurl attempts to parse and extract rich structured metadata from URLs.

import { scraper, urlScraper } from "@borderless/unfurl";
import * as plugins from "@borderless/unfurl/dist/plugins";

Scraper

scraper accepts a request function and a list of plugins to use. The request function is expected to return a "page" object, which has the same shape as the input to scrape(page).

const scrape = scraper({
  request,
  plugins: [plugins.htmlmetaparser, plugins.exifdata],
});

const res = await fetch("http://example.com"); // E.g. `fetch` from `popsicle`, not the built-in `fetch`.

await scrape({
  url: res.url,
  status: res.status,
  headers: res.headers.asObject(),
  body: res.stream(), // Stream the response body instead of buffering it, to support large responses.
});
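
The example above leaves request undefined. As a rough sketch only, here is one way a request function could build a "page" from the built-in fetch in Node.js 18+. The Page interface and the toPage helper are hypothetical names inferred from the fields passed to scrape above, not types exported by the library; check the package's own type definitions before relying on this shape.

```typescript
import { Readable } from "node:stream";

// Hypothetical page shape, inferred from the fields passed to `scrape` above.
interface Page {
  url: string;
  status: number;
  headers: Record<string, string>;
  body: Readable;
}

// Convert a WHATWG `Response` into a page without buffering the body.
function toPage(res: Response, fallbackUrl: string): Page {
  if (!res.body) throw new Error("Response has no body");

  // Copy the headers into a plain object.
  const headers: Record<string, string> = {};
  res.headers.forEach((value, key) => {
    headers[key] = value;
  });

  return {
    url: res.url || fallbackUrl, // `res.url` is empty for hand-built responses.
    status: res.status,
    headers,
    // Wrap the web stream as a Node.js stream so the body is read lazily.
    body: Readable.fromWeb(
      res.body as unknown as Parameters<typeof Readable.fromWeb>[0]
    ),
  };
}

// Assumed signature: takes a URL and resolves to a page.
async function request(url: string): Promise<Page> {
  return toPage(await fetch(url), url);
}
```

Keeping the body as a stream matches the note above about large responses: nothing is read until the consumer pulls from it.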

URL Scraper

A simpler wrapper around scraper that automatically calls request(url) to fetch the page.

const scrape = urlScraper({ request });

await scrape("http://example.com");
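
Conceptually, the wrapper described above composes two steps: request the page, then scrape it. The following is an illustrative sketch of that composition using stubs, not the library's source. composeUrlScraper, MinimalPage, fakeRequest and fakeScrape are all hypothetical names introduced for the example.

```typescript
// Hypothetical page shape; only the fields needed for this sketch.
interface MinimalPage {
  url: string;
  status: number;
}

// Compose a `request` function with a page-level `scrape` function.
// This mirrors the behaviour described above; it is not the library's code.
function composeUrlScraper<T>(
  request: (url: string) => Promise<MinimalPage>,
  scrape: (page: MinimalPage) => Promise<T>
): (url: string) => Promise<T> {
  return async (url: string) => scrape(await request(url));
}

// Stubs standing in for a real HTTP client and a real scraper.
const fakeRequest = async (url: string): Promise<MinimalPage> => ({
  url,
  status: 200,
});
const fakeScrape = async (page: MinimalPage): Promise<string> =>
  `${page.status} ${page.url}`;

// The composed function takes a URL directly, like the wrapper above.
const scrapeUrl = composeUrlScraper(fakeRequest, fakeScrape);
```

Injecting request this way also makes the scraper easy to exercise in tests without touching the network.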

License

Apache 2.0
