From c82b0f1653c6383770bb7534d1c149e31937bdb4 Mon Sep 17 00:00:00 2001
From: Eduardo Salinas
Date: Fri, 13 Sep 2024 16:29:44 -0400
Subject: [PATCH 1/3] remove wheel from requirements.txt

wheel is only needed for packaging
---
 requirements.txt | 1 -
 1 file changed, 1 deletion(-)

diff --git a/requirements.txt b/requirements.txt
index 6b7863f..ad44f66 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -3,4 +3,3 @@ isort==5.9.3
 black==24.3.0
 autoflake==1.7.5
 mypy==1.10.1
-wheel
\ No newline at end of file

From faa3209d5a53b8dbb77bb267365a863a92a92ee9 Mon Sep 17 00:00:00 2001
From: Eduardo Salinas
Date: Mon, 16 Sep 2024 11:57:03 -0400
Subject: [PATCH 2/3] Update README.md

---
 README.md | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 2bd8f84..5a46ca4 100644
--- a/README.md
+++ b/README.md
@@ -28,9 +28,10 @@ To get started, clone this repository to your local machine and navigate to the
 ### 📦 Generate wheel package to share with others
 1. *activate venv*
 2. Update version inside setup.py if needed.
-3. ```python setup.py bdist_wheel```
-4. Fetch from dir dist/ the .whl
-5. This file can be installed via `pip install eureka_ml_insights.whl`
+3. Install wheel package via ```pip install wheel```
+4. ```python setup.py bdist_wheel```
+5. Fetch from dir dist/ the .whl
+6. This file can be installed via `pip install eureka_ml_insights.whl`
 
 ## 💥 Quick start
 To reproduce the results of a pre-defined experiment pipeline, you can run the following command:
@@ -122,4 +123,4 @@ If you use this framework in your research, please cite the following paper:
     url={TODO},
 }
 
-```
\ No newline at end of file
+```

From 1e500c0c9a51cf23de10cc9c3081a6661ae5dd13 Mon Sep 17 00:00:00 2001
From: Eduardo Salinas
Date: Mon, 16 Sep 2024 11:58:10 -0400
Subject: [PATCH 3/3] Update README.md

---
 README.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/README.md b/README.md
index 9cb7a4f..ee8a00b 100644
--- a/README.md
+++ b/README.md
@@ -143,7 +143,6 @@ If you use this framework in your research, please cite the following paper:
 }
 ```
 
-
 # Responsible AI Considerations
 A cross-cutting dimension for all capability evaluations is the evaluation of several aspects of model behavior important for the responsible fielding of AI systems. These consideration include the fairness, reliability, safety, privacy, and security of models. While evaluations through the Toxigen dataset (included in Eureka-Bench) capture notions of representational fairness for different demographic groups and, to some extent, the ability of the model to generate safe language despite non-safe input triggers in the prompt, other aspects or nuances of fairness and safety require further evaluation and additional clarity, which we hope to integrate in future versions and welcome contributions for. We are also interested in expanding Eureka-Bench with tasks where fairness and bias can be studied in more benign settings that simulate how risks may appear when humans use AI to assist them in everyday tasks (e.g. creative writing, information search etc.) and subtle language or visual biases encoded in training data might be reflected in the AI's assistance.
 
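For context, the packaging workflow documented by PATCH 2/3 (and enabled by dropping `wheel` from requirements.txt in PATCH 1/3) can be sketched end to end as below. This is a minimal sketch following the README steps, assuming the virtual environment is already activated and the version in setup.py is current; the exact wheel filename under dist/ depends on that version, so the glob install is used here for illustration.

```sh
# Build-and-install sketch following the updated README steps.
pip install wheel            # needed only for packaging; no longer pinned in requirements.txt (PATCH 1/3)
python setup.py bdist_wheel  # writes the eureka_ml_insights wheel into dist/
pip install dist/*.whl       # install the generated .whl from dist/
```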