This repository has samples that demonstrate various aspects of the new AWS Glue service, as well as various AWS Glue utilities.
These are all licensed under the Amazon Software License (the "License"). They may not be used except in compliance with the License, a copy of which is included here in the LICENSE file.
You can find the AWS Glue open-source Python libraries in a separate repository at: awslabs/aws-glue-libs.
-
Helps you get started using the many ETL capabilities of AWS Glue, and answers some of the more common questions people have.
-
Join and Relationalize Data in S3
This sample ETL script shows you how to use AWS Glue to load, transform, and rewrite data in Amazon S3 so that it can be queried and analyzed easily and efficiently.
-
This sample ETL script shows you how to take advantage of both Spark and AWS Glue features to clean and transform data for efficient analysis.
-
This sample explores all four of the ways you can resolve choice types in a dataset using DynamicFrame's resolveChoice method.
-
This utility can help you migrate your Hive metastore to the AWS Glue Data Catalog.
-
These scripts can undo or redo the results of a crawl under some circumstances.
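
To give a feel for what the resolveChoice sample listed above covers: a "choice" type arises when a column holds values of more than one type, and the four strategies are usually named cast, project, make_cols, and make_struct. The sketch below mimics those ideas on plain Python dictionaries so it can run anywhere. It does not call the awsglue library. The helper names, the sample rows, and the exact handling of unconvertible values and output column names are all assumptions made for illustration; the behavior of the real resolveChoice method may differ in detail, so treat the sample itself as the reference.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]

# Hypothetical data: "provider_id" is sometimes an int, sometimes a string.
ROWS: List[Record] = [
    {"provider_id": 10001},
    {"provider_id": "10002"},
    {"provider_id": "unknown"},
]


def resolve_cast(rows: List[Record], col: str) -> List[Record]:
    """Analogue of 'cast:long': convert each value to int, or None on failure."""
    out = []
    for row in rows:
        new = dict(row)
        try:
            new[col] = int(row[col])
        except (TypeError, ValueError):
            new[col] = None  # assumption: unconvertible values become null
        out.append(new)
    return out


def resolve_project(rows: List[Record], col: str) -> List[Record]:
    """Analogue of 'project:long' (one reading): keep ints, null out the rest."""
    return [
        {**row, col: row[col] if type(row[col]) is int else None}
        for row in rows
    ]


def resolve_make_cols(rows: List[Record], col: str) -> List[Record]:
    """Analogue of 'make_cols': split the column into one column per type."""
    out = []
    for row in rows:
        value = row[col]
        new = {k: v for k, v in row.items() if k != col}
        new[f"{col}_long"] = value if type(value) is int else None
        new[f"{col}_string"] = value if isinstance(value, str) else None
        out.append(new)
    return out


def resolve_make_struct(rows: List[Record], col: str) -> List[Record]:
    """Analogue of 'make_struct': replace the value with a struct keyed by type."""
    out = []
    for row in rows:
        value = row[col]
        struct = {
            "long": value if type(value) is int else None,
            "string": value if isinstance(value, str) else None,
        }
        out.append({**row, col: struct})
    return out


if __name__ == "__main__":
    for fn in (resolve_cast, resolve_project, resolve_make_cols, resolve_make_struct):
        print(fn.__name__, fn(ROWS, "provider_id"))
```

Roughly, the cast and project analogues keep a single column and lose information that does not fit the target type, while the make_cols and make_struct analogues keep every value at the cost of a wider or nested schema.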