jsonl is a Python library that simplifies reading and writing data in the JSON Lines format.
- 🌎 Provides an API similar to Python's standard json module.
- 🚀 Supports custom serialization/deserialization callbacks, with the standard json module as the default.
- 🗜️ Supports compression and decompression using gzip, bzip2, and xz formats.
- 🔧 Can load files with broken lines, skipping any malformed entries.
- 📦 Includes an easy-to-use utility for incrementally writing to multiple JSON Lines files.
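The compression and broken-line features listed above can be pictured with a minimal standard-library sketch. It deliberately does not call jsonl, so it says nothing about the library's own API or implementation; the helper names write_gzip_jsonl and read_gzip_jsonl_tolerant are hypothetical and exist only for illustration.

```python
import gzip
import json
import os
import tempfile


def write_gzip_jsonl(records, path):
    """Write one JSON document per line to a gzip-compressed text file."""
    with gzip.open(path, "wt", encoding="utf-8") as fp:
        for record in records:
            fp.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_gzip_jsonl_tolerant(path):
    """Yield parsed records, skipping lines that are not valid JSON."""
    with gzip.open(path, "rt", encoding="utf-8") as fp:
        for line in fp:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                continue  # malformed entry: skip it and keep reading


def demo():
    path = os.path.join(tempfile.mkdtemp(), "sample.jsonl.gz")
    write_gzip_jsonl([{"name": "Gilbert"}, {"name": "May"}], path)
    # Append a deliberately truncated line to simulate a damaged file.
    with gzip.open(path, "at", encoding="utf-8") as fp:
        fp.write('{"name": "Trunc\n')
    return list(read_gzip_jsonl_tolerant(path))
```

Calling demo() returns only the two well-formed records; the truncated third line is dropped rather than aborting the read.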
To install jsonl using pip, run the following command:
pip install py-jsonl
Dumping data to a JSON Lines File
Use jsonl.dump to write an iterable of dictionaries to a JSON Lines file:
import jsonl
data = [
    {"name": "Gilbert", "wins": [["straight", "7♣"], ["one pair", "10♥"]]},
    {"name": "May", "wins": []},
]
jsonl.dump(data, "file.jsonl")
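The JSON Lines format stores one JSON document per line. As a rough picture of what ends up in file.jsonl, the sketch below serializes the same data by hand with the standard json module. The exact bytes jsonl.dump writes (key spacing, escaping of non-ASCII characters such as ♣) depend on its defaults and are not claimed here; to_json_lines is a hypothetical helper.

```python
import json

data = [
    {"name": "Gilbert", "wins": [["straight", "7♣"], ["one pair", "10♥"]]},
    {"name": "May", "wins": []},
]


def to_json_lines(records):
    """Serialize each record as one newline-terminated line of JSON."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


text = to_json_lines(data)
print(text, end="")
# Prints two lines, one per record:
# {"name": "Gilbert", "wins": [["straight", "7♣"], ["one pair", "10♥"]]}
# {"name": "May", "wins": []}
```

Because every line is a complete JSON document, a reader can parse the file one line at a time without loading it whole.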
Loading data from a JSON Lines File
Use jsonl.load to load a JSON Lines file into an iterable of objects:
import jsonl
iterable = jsonl.load("file.jsonl")
print(tuple(iterable))
Dumping data to Multiple JSON Lines Files
This example uses jsonl.dump_fork to incrementally write daily temperature data for multiple cities to separate JSON Lines files, exporting records for the first days of the specified years.
Each city gets its own file, and because the records come from generators and are written incrementally, the full dataset never has to be held in memory at once.
import datetime
import itertools
import random
import jsonl
def fetch_temperature_by_city():
    """
    Yield (filename, records) pairs, one per year and city, where records holds
    the daily temperature data for the first days of that year.
    """
    years = [2023, 2024]
    first_days = 10
    cities = ["New York", "Los Angeles", "Chicago"]
    for year, city in itertools.product(years, cities):
        start = datetime.datetime(year, 1, 1)
        dates = (start + datetime.timedelta(days=day) for day in range(first_days))
        # Lazy generator: each record is built only when it is written
        daily_temperature = (
            {"date": date.isoformat(), "city": city, "temperature": round(random.uniform(-10, 35), 2)}
            for date in dates
        )
        yield (f"{city}.jsonl", daily_temperature)
# Write the generated data to files in JSON Lines format
jsonl.dump_fork(fetch_temperature_by_city())
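The pattern behind writing to several files from one stream of (filename, records) pairs can be sketched with the standard library alone. This illustrates the idea only and is not how jsonl.dump_fork is implemented; fork_write is a hypothetical helper, and appending when the same filename appears more than once is an assumption of this sketch.

```python
import json
import tempfile
from pathlib import Path


def fork_write(pairs, directory):
    """Consume (filename, records) pairs, streaming each record to its file.

    Files are opened in append mode, so a filename that appears in several
    pairs accumulates all of its records (an assumption of this sketch).
    Returns the number of records written per filename.
    """
    counts = {}
    for filename, records in pairs:
        path = Path(directory) / filename
        with open(path, "a", encoding="utf-8") as fp:
            for record in records:  # records may be a lazy generator
                fp.write(json.dumps(record) + "\n")
                counts[filename] = counts.get(filename, 0) + 1
    return counts


def make_records(city, year, days=3):
    """Lazily produce a few placeholder records for one city and year."""
    return ({"city": city, "year": year, "day": day} for day in range(days))


def demo():
    pairs = [
        (f"{city}.jsonl", make_records(city, year))
        for year in (2023, 2024)
        for city in ("Chicago", "New York")
    ]
    with tempfile.TemporaryDirectory() as tmp:
        return fork_write(pairs, tmp)
```

In this sketch each city's file receives the records for both years, three per year, and only one record is in memory at a time.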
For more detailed information and usage examples, refer to the project documentation.
To contribute to the project, you can run the following commands for testing and documentation:
Install the development dependencies and run the tests:
(env)$ pip install -r requirements-dev.txt # Skip if already installed
(env)$ pytest tests/
(env)$ pytest --cov jsonl # Run tests with coverage
To build the documentation locally, use the following commands:
(env)$ pip install -r requirements-docs.txt # Skip if already installed
(env)$ mkdocs serve # Start live-reloading docs server
(env)$ mkdocs build # Build the documentation site
This project is licensed under the MIT license.