Pull requests: NVIDIA/NeMo-Curator
Pull requests list
Remove lingering DASK_DATAFRAME__QUERY_PLANNING environment variables
#346
opened Nov 5, 2024 by
sarahyurick
Synthetic Data Generation for Retriever Evaluation
#338
opened Oct 30, 2024 by
vinay-raman
3 tasks done
Convert translation_example.py into a Jupyter Notebook tutorial
#336
opened Oct 29, 2024 by
sarahyurick
Draft
Add READMEs to examples/ and nemo_curator/scripts directories
#332
opened Oct 28, 2024 by
sarahyurick
Add codepath for computing buckets without int conversion
#326
opened Oct 25, 2024 by
ayushdg
3 tasks done
Dapt data curation tutorial fuzzy and semantic dedupe
gpuci
Run GPU CI/CD on PR
#322
opened Oct 24, 2024 by
ruchaa-apte
[WIP] Retiring text_bytes_aware_shuffle to use shuffle directly
gpuci
#316
opened Oct 21, 2024 by
praateekmahajan
Draft
3 tasks
[WIP] MinHash improvement using minhash_permuted
#313
opened Oct 18, 2024 by
praateekmahajan
Draft
3 tasks
[DRAFT] Passing meta to map_partitions for read_data
#291
opened Oct 9, 2024 by
praateekmahajan
Draft
3 tasks
[DRAFT] Trying dask_cudf's read_json / read_parquet
#285
opened Oct 8, 2024 by
praateekmahajan
Draft
3 tasks
Added example notebook for translation with ct2 model.
documentation
Improvements or additions to documentation
Add Multiple Model Classification example
documentation
#173
opened Jul 30, 2024 by
sarahyurick
Adding an example for executing NeMo modules using kubernetes Python …
documentation
#148
opened Jul 9, 2024 by
dpadmanabhan03
2 of 3 tasks