Commit

feat: adding AI evaluations based on Microsoft.Extensions.AI.Evaluation and Aspire.Hosting.Testing (#8)
RicardoNiepel authored Jan 17, 2025
1 parent 44fc1df commit cc9fa66
Showing 27 changed files with 558 additions and 28 deletions.
4 changes: 3 additions & 1 deletion .gitignore
@@ -398,4 +398,6 @@ FodyWeavers.xsd
*.sln.iml

**/appsettings.Development.json
.azure
.azure

src/ChatApp.EvaluationTests/report.html
65 changes: 51 additions & 14 deletions README.md
@@ -30,13 +30,16 @@ The frontend of the application is developed using React and Vite.

- [Features](#features)
- [Getting Started](#getting-started)
- [Prerequisites](#prerequisites)
- [Try it out](#try-it-out)
- [Prerequisites for running experiments](#prerequisites-for-running-experiments)
- [Try it out (experimentation phase)](#try-it-out-experimentation-phase)
- [Local Development](#local-development)
- [Prerequisites](#prerequisites-1)
- [Prerequisites for local development](#prerequisites-for-local-development)
- [Running the app](#running-the-app)
- [Evaluating results](#evaluating-results)
- [Running the Evaluation Tests](#running-the-evaluation-tests)
- [Generating the Evaluation report](#generating-the-evaluation-report)
- [Azure Deployment](#azure-deployment)
- [Prerequisites](#prerequisites-2)
- [Prerequisites for deployment](#prerequisites-for-deployment)
- [Instructions](#instructions)
- [Sample Product Data](#sample-product-data)
- [Guidance](#guidance)
@@ -70,25 +73,24 @@ The first step for getting started with this template are the notebooks which ca

![Notebook preview](./images/notebook_preview.png)

### Prerequisites
### Prerequisites for running experiments

- .NET 9 SDK
- VSCode
- [Polyglot Notebooks Extension](https://marketplace.visualstudio.com/items?itemName=ms-dotnettools.dotnet-interactive-vscode)
- [Azure Developer CLI (azd)](https://aka.ms/install-azd)
- [Node](https://docs.npmjs.com/downloading-and-installing-node-js-and-npm)

### Try it out
### Try it out (experimentation phase)

Open the notebooks under `./experiments/` and follow their instructions.

## Local Development

### Prerequisites
### Prerequisites for local development

- .NET 9 SDK
- VSCode or Visual Studio 2022 17.12
- Node.js 22
- [Node.js 22](https://docs.npmjs.com/downloading-and-installing-node-js-and-npm)
- [Azure CLI (az)](https://aka.ms/install-azcli)
- [Azure Developer CLI (azd)](https://aka.ms/install-azd)

@@ -135,11 +137,46 @@ If you want to use existing Azure resource, but their endpoints below the Azure
}
```

### Evaluating results

Creative Writer Assistant uses evaluators to assess application response quality.
The 4 metrics the evaluators in this project assess are **Coherence, Fluency, Relevance and Groundedness**.

To understand what is being evaluated, open the `.\src\data\test\eval_inputs.json` file.
Observe that 3 examples of research, product and assignment context are stored in this file as different scenarios.
This data is sent to the API so that the evaluations run for each example and **incorporate all of the context, research, products, and final article when grading the response**.

#### Running the Evaluation Tests

1. Make sure the Creative Writer application is configured and able to run on your local machine before running the tests.
The tests will call into the Creative Writer APIs to collect AI responses using a .NET Aspire test host.
2. The evaluation process will use the same Azure OpenAI model deployment which is used by the main application.
3. Run the tests from Visual Studio, VS Code, or `dotnet test`.

#### Generating the Evaluation report

1. Navigate into the `src\ChatApp.EvaluationTests` folder
2. Update your dotnet tools by running

```shell
dotnet tool restore
```

3. Run the aieval report command to generate a report file.

```shell
dotnet aieval report --path .\bin\Debug\net9.0\cache --output .\report.html
```

4. Open the `report.html` file in your web browser.

![AI Evaluations](./images/ai_evaluations.png)

## Azure Deployment

![Architecture](./images/container_architecture.png)

### Prerequisites
### Prerequisites for deployment

- [Azure CLI (az)](https://aka.ms/install-azcli)
- [Azure Developer CLI (azd)](https://aka.ms/install-azd)
@@ -151,14 +188,14 @@ Navigate into `./ChatApp.AppHost/`.

1. Sign in to your Azure account. You'll need to log in to both the Azure Developer CLI and Azure CLI:
i. First with Azure Developer CLI
i. First with Azure Developer CLI
```shell
azd auth login
```
ii. Then sign in with Azure CLI
ii. Then sign in with Azure CLI
```shell
az login --use-device-code
```
@@ -182,7 +219,7 @@ To load sample product data into Azure AI Search as vector store, use the notebo
This template uses `gpt-4o` and `text-embedding-3-large` which may not be available in all Azure regions. Check for [up-to-date region availability](https://learn.microsoft.com/azure/ai-services/openai/concepts/models#standard-deployment-model-availability) and select a region during deployment accordingly
* we recommend using eastus2 or swedencentral
- we recommend using eastus2 or swedencentral
### Costs
Binary file added images/ai_evaluations.png
2 changes: 1 addition & 1 deletion src/ChatApp.AppHost/Program.cs
@@ -11,7 +11,7 @@

var vectorStoreCollectionName = Environment.GetEnvironmentVariable("VectorStoreCollectionName") ?? "products";

var exisitingOpenAi = !builder.Configuration.GetSection("ConnectionStrings")["openAi"].IsNullOrEmpty();
var exisitingOpenAi = false;//!builder.Configuration.GetSection("ConnectionStrings")["openAi"].IsNullOrEmpty();
var openAi = !builder.ExecutionContext.IsPublishMode && exisitingOpenAi
? builder.AddConnectionString("openAi")
: builder.AddAzureOpenAI("openAi")
13 changes: 13 additions & 0 deletions src/ChatApp.EvaluationTests/.config/dotnet-tools.json
@@ -0,0 +1,13 @@
{
"version": 1,
"isRoot": true,
"tools": {
"microsoft.extensions.ai.evaluation.console": {
"version": "0.9.56-preview",
"commands": [
"aieval"
],
"rollForward": false
}
}
}
50 changes: 50 additions & 0 deletions src/ChatApp.EvaluationTests/ChatApp.EvaluationTests.csproj
@@ -0,0 +1,50 @@
<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<TargetFramework>net9.0</TargetFramework>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
<IsPackable>false</IsPackable>
<IsTestProject>true</IsTestProject>
</PropertyGroup>

<ItemGroup>
<PackageReference Include="Aspire.Hosting.Testing" Version="9.0.0" />
<PackageReference Include="coverlet.collector" Version="6.0.2" />
<PackageReference Include="Microsoft.NET.Test.Sdk" Version="17.10.0" />
<PackageReference Include="xunit" Version="2.9.2" />
<PackageReference Include="xunit.runner.visualstudio" Version="2.8.2" />
<PackageReference Include="Microsoft.Extensions.AI.OpenAI" Version="9.1.0-preview.1.25064.3" />
<PackageReference Include="Microsoft.Extensions.AI.Evaluation" Version="0.9.56-preview" />
<PackageReference Include="Microsoft.Extensions.AI.Evaluation.Quality" Version="0.9.56-preview" />
<PackageReference Include="Microsoft.Extensions.AI.Evaluation.Reporting" Version="0.9.56-preview" />
<PackageReference Include="Microsoft.ML.Tokenizers" Version="1.0.1" />
<PackageReference Include="Microsoft.ML.Tokenizers.Data.O200kBase" Version="1.0.1" />
<PackageReference Include="YamlDotNet" Version="16.0.0" />
</ItemGroup>

<ItemGroup>
<ProjectReference Include="..\ChatApp.AppHost\ChatApp.AppHost.csproj" />
<ProjectReference Include="..\ChatApp.ServiceDefaults\ChatApp.ServiceDefaults.csproj" />

<AssemblyAttribute Include="System.Reflection.AssemblyMetadataAttribute">
<_Parameter1>EvalQuestionsJsonPath</_Parameter1>
<_Parameter2>$(ProjectDir)..\data\test\eval_inputs.json</_Parameter2>
</AssemblyAttribute>
</ItemGroup>

<ItemGroup>
<None Update="appsettings.json">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
</ItemGroup>

<ItemGroup>
<Using Include="System.Net" />
<Using Include="Microsoft.Extensions.DependencyInjection" />
<Using Include="Aspire.Hosting.ApplicationModel" />
<Using Include="Aspire.Hosting.Testing" />
<Using Include="Xunit" />
</ItemGroup>

</Project>
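
The `AssemblyAttribute` item in the project file above causes the build to emit an `AssemblyMetadataAttribute` with the key `EvalQuestionsJsonPath` into the test assembly. The following is a minimal sketch of how that value could be read back at runtime. It is not the commit's own lookup code, which is not visible in this diff; the `TestDataPaths` class and `GetMetadata` method are hypothetical names used only for illustration.

```csharp
#nullable enable
using System.Linq;
using System.Reflection;

// Hypothetical helper (not part of the commit): looks up a value that the
// build stored in the assembly via AssemblyMetadataAttribute.
public static class TestDataPaths
{
    // Returns the metadata value for the given key, or null when the
    // assembly carries no AssemblyMetadataAttribute with that key.
    public static string? GetMetadata(Assembly assembly, string key) =>
        assembly
            .GetCustomAttributes<AssemblyMetadataAttribute>()
            .FirstOrDefault(attribute => attribute.Key == key)
            ?.Value;
}
```

Inside the test project, a call such as `TestDataPaths.GetMetadata(typeof(TestDataPaths).Assembly, "EvalQuestionsJsonPath")` would then be expected to yield the path configured in `_Parameter2`, assuming the helper is compiled into that same assembly.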
38 changes: 38 additions & 0 deletions src/ChatApp.EvaluationTests/EvalInput.cs
@@ -0,0 +1,38 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License.

using Xunit.Abstractions;

namespace ChatApp.EvaluationTests;

public class EvalQuestion : IXunitSerializable
{
public required int ScenarioId { get; set; }

public required string ResearchContext { get; set; }

public required string ProductContext { get; set; }

public required string AssignmentContext { get; set; }

void IXunitSerializable.Deserialize(IXunitSerializationInfo info)
{
ScenarioId = info.GetValue<int>("ScenarioId");
ResearchContext = info.GetValue<string>("ResearchContext");
ProductContext = info.GetValue<string>("ProductContext");
AssignmentContext = info.GetValue<string>("AssignmentContext");
}

void IXunitSerializable.Serialize(IXunitSerializationInfo info)
{
info.AddValue("ScenarioId", ScenarioId);
info.AddValue("ResearchContext", ResearchContext);
info.AddValue("ProductContext", ProductContext);
info.AddValue("AssignmentContext", AssignmentContext);
}

public override string ToString()
{
return $"Scenario = {ScenarioId}";
}
}
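
The class above carries one scenario from `eval_inputs.json` into a test. As a rough sketch of the loading step, the snippet below parses such a file with `System.Text.Json`. Two things here are assumptions rather than facts from the commit: the file is taken to be a JSON array of objects whose property names match the four properties above (ignoring case), and the `EvalScenario` record and `EvalInputParser` class are hypothetical stand-ins so the example compiles without xUnit.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical stand-in that mirrors the four EvalQuestion properties.
public sealed record EvalScenario(
    int ScenarioId,
    string ResearchContext,
    string ProductContext,
    string AssignmentContext);

public static class EvalInputParser
{
    // Assumption: the input is a JSON array of scenario objects.
    // Property names are matched case-insensitively.
    public static IReadOnlyList<EvalScenario> Parse(string json)
    {
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        return JsonSerializer.Deserialize<List<EvalScenario>>(json, options)
               ?? new List<EvalScenario>();
    }
}
```

In a test, the parsed list could feed a `[Theory]` data source so that each scenario runs as its own test case; how the commit wires this up is not shown in this part of the diff.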