feat(MLOP-2604): Create delta writer #410

Open: wants to merge 3 commits into base: staging

@ralphrass (Contributor) commented Feb 20, 2025

Why? 📖

Currently, Delta operations are not treated as a standard writer in Butterfree, making it difficult to handle Delta tables in the same way as other feature store writers. This PR introduces a Delta Feature Store Writer, enabling seamless integration of Delta operations into the existing pipeline.

What? 🔧

This PR introduces:

  • DeltaConfig: A configuration class for Delta table operations, covering merge strategies and schema enforcement.
  • DeltaFeatureStoreWriter: A new writer to handle Delta tables as a feature store, supporting deduplication, conditional updates, and schema evolution.
  • DeltaWriter: Implements merge logic and optimization operations (such as VACUUM and OPTIMIZE) for Delta tables.
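
As a rough illustration of the merge logic listed above, the sketch below builds a Delta Lake `MERGE INTO` statement from a small settings object. The names `MergeSettings` and `build_merge_sql` are hypothetical and are not taken from this PR; the actual `DeltaConfig` and `DeltaWriter` interfaces may look quite different. Only the SQL shape (`MERGE INTO ... WHEN MATCHED THEN UPDATE SET * WHEN NOT MATCHED THEN INSERT *`) is standard Delta Lake syntax.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MergeSettings:
    """Hypothetical stand-in for a Delta merge configuration (not the PR's DeltaConfig)."""

    table: str
    keys: List[str]
    # Optional extra predicate so matched rows are only updated conditionally.
    update_condition: Optional[str] = None


def build_merge_sql(settings: MergeSettings, source_view: str) -> str:
    """Build a Delta Lake MERGE statement that upserts source_view into the table."""
    if not settings.keys:
        raise ValueError("at least one merge key is required")

    # Rows are matched on every key column.
    on_clause = " AND ".join(f"target.{k} = source.{k}" for k in settings.keys)

    matched = "WHEN MATCHED"
    if settings.update_condition:
        matched += f" AND {settings.update_condition}"

    return (
        f"MERGE INTO {settings.table} AS target "
        f"USING {source_view} AS source "
        f"ON {on_clause} "
        f"{matched} THEN UPDATE SET * "
        "WHEN NOT MATCHED THEN INSERT *"
    )
```

Assuming a Spark session with Delta Lake enabled and a temporary view holding the incoming rows, the resulting string could be passed to `spark.sql(...)`. Keeping statement construction separate from execution also makes the merge logic easy to unit test without Spark.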

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update
  • Release

How was everything tested? 📏

  • Added unit tests covering all new functionality.
  • Mocked Spark interactions to ensure correct behavior without requiring a running Delta Lake environment.
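
The mocking approach described above can be sketched roughly as follows. The helper `optimize_and_vacuum` is a hypothetical example written for this illustration, not code from the PR; the point is only that a `MagicMock` can stand in for a `SparkSession`, so the test checks which SQL statements were issued without a running Delta Lake environment.

```python
from unittest.mock import MagicMock


def optimize_and_vacuum(spark, table: str, retain_hours: int = 168) -> None:
    """Hypothetical helper: compact a Delta table, then remove stale files."""
    spark.sql(f"OPTIMIZE {table}")
    spark.sql(f"VACUUM {table} RETAIN {retain_hours} HOURS")


def test_optimize_and_vacuum_issues_expected_sql():
    # MagicMock records every call, so no Spark cluster is needed.
    spark = MagicMock()

    optimize_and_vacuum(spark, "fs.user", retain_hours=24)

    # Collect the first positional argument of each spark.sql(...) call, in order.
    issued = [call.args[0] for call in spark.sql.call_args_list]
    assert issued == ["OPTIMIZE fs.user", "VACUUM fs.user RETAIN 24 HOURS"]
```

A test like this verifies the statements sent to Spark, not their effect on a real table, so it complements integration tests against an actual Delta table and does not replace them.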

@ralphrass self-assigned this Feb 20, 2025
@ralphrass marked this pull request as ready for review February 20, 2025 17:57
@ralphrass requested a review from a team as a code owner February 20, 2025 17:57