diff --git a/README.md b/README.md
index 1812d16..afdae65 100644
--- a/README.md
+++ b/README.md
@@ -59,14 +59,14 @@ This sample supports different architectural styles. It can be deployed as stand
 This repo is focused to showcase different options to implement **"chat with your private documents"** scenario using RAG patterns with Java, Azure OpenAI and Semantic Kernel. Below you can find the list of available implementations.
-| Conversational Style | RAG Approach | Description | Java Open AI SDK | Java Semantic Kernel |
-|:---------------------|:-------------|:------------|:-----------------|:---------------------|
-| One Shot Ask | [PlainJavaAskApproach](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/ask/approaches/PlainJavaAskApproach.java) | Use Azure AI Search and Java OpenAI APIs. It first retrieves top documents from search and use them to build a prompt. Then, it uses OpenAI to generate an answer for the user question.Several search retrieval options are available: Text, Vector, Hybrid. When Hybrid and Vector are selected an additional call to OpenAI is required to generate embeddings vector for the question. | :white_check_mark: | :x: |
-| Chat | [PlainJavaChatApproach](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/chat/approaches/PlainJavaChatApproach.java) | Use Azure AI Search and Java OpenAI APIs. It first calls OpenAI to generate a search keyword for the chat history and then answer to the last chat question. Several search retrieval options are available: Text, Vector, Hybrid. When Hybrid and Vector are selected an additional call to OpenAI is required to generate embeddings vector for the chat extracted keywords. | :white_check_mark: | :x: |
-| One Shot Ask | [JavaSemanticKernelWithMemoryApproach](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/ask/approaches/semantickernel/JavaSemanticKernelWithMemoryApproach.java) | Use Java Semantic Kernel framework with built-in MemoryStore for embeddings similarity search. A semantic function [RAG.AnswerQuestion](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/resources/semantickernel/Plugins/RAG/AnswerQuestion/config.json) is defined to build the prompt using Memory Store vector search results.A customized version of SK built-in [CognitiveSearchMemoryStore](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/ask/approaches/semantickernel/memory/CustomAzureCognitiveSearchMemoryStore.java.ignore) is used to map index fields populated by the documents ingestion process. | :x: | This approach is currently disabled within the UI, memory feature will be available in the next java Semantic Kernel GA release |
-| One Shot Ask | [JavaSemanticKernelChainsApproach](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/ask/approaches/semantickernel/JavaSemanticKernelChainsApproach.java) | Use Java Semantic Kernel framework with semantic and native functions chaining. It uses an imperative style for AI orchestration through semantic kernel functions chaining. [InformationFinder.SearchFromQuestion](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/retrieval/semantickernel/AzureAISearchPlugin.java) native function and [RAG.AnswerQuestion](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/resources/semantickernel/Plugins/RAG/AnswerQuestion/config.json) semantic function are called sequentially. Several search retrieval options are available: Text, Vector, Hybrid. | :x: | :white_check_mark: |
-| Chat | [JavaSemanticKernelWithMemoryApproach](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/chat/approaches/semantickernel/JavaSemanticKernelWithMemoryChatApproach.java.ignore) | Use Java Semantic Kernel framework with built-in MemoryStore for embeddings similarity search. A semantic function [RAG.AnswerConversation](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/resources/semantickernel/Plugins/RAG/AnswerQuestion/config.json) is defined to build the prompt using Memory Store vector search results. A customized version of SK built-in [CognitiveSearchMemoryStore](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/ask/approaches/semantickernel/memory/CustomAzureCognitiveSearchMemoryStore.java.ignore) is used to map index fields populated by the documents ingestion process. | :x: | :x: This approach is currently disabled within the UI, memory feature will be available in the next java Semantic Kernel GA release |
-| Chat | [JavaSemanticKernelChainsApproach](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/chat/approaches/semantickernel/JavaSemanticKernelChainsChatApproach.java) | Use Java Semantic Kernel framework with semantic and native functions chaining. It uses an imperative style for AI orchestration through semantic kernel functions chaining. [InformationFinder.SearchFromConversation](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/retrieval/semantickernel/AzureAISearchPlugin.java) native function and [RAG.AnswerConversation](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/resources/semantickernel/Plugins/RAG/AnswerConversation/config.json) semantic function are called sequentially. Several search retrieval options are available: Text, Vector, Hybrid. | :x: | :white_check_mark: |
+| Conversational Style | RAG Approach | Description | Java OpenAI SDK | Java Semantic Kernel |
+|:---------------------|:-------------|:------------|:----------------|:---------------------|
+| One Shot Ask | [PlainJavaAskApproach](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/ask/approaches/PlainJavaAskApproach.java) | Use Azure AI Search and Java OpenAI APIs. It first retrieves the top documents from search and uses them to build a prompt. Then, it uses OpenAI to generate an answer to the user question. Several search retrieval options are available: Text, Vector, Hybrid. When Hybrid or Vector is selected, an additional call to OpenAI is required to generate an embeddings vector for the question. | :white_check_mark: | :x: |
+| Chat | [PlainJavaChatApproach](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/chat/approaches/PlainJavaChatApproach.java) | Use Azure AI Search and Java OpenAI APIs. It first calls OpenAI to generate search keywords from the chat history and then answers the last chat question. Several search retrieval options are available: Text, Vector, Hybrid. When Hybrid or Vector is selected, an additional call to OpenAI is required to generate an embeddings vector for the extracted keywords. | :white_check_mark: | :x: |
+| One Shot Ask | [JavaSemanticKernelWithVectorStoreApproach](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/ask/approaches/semantickernel/JavaSemanticKernelWithVectorStoreApproach.java) | Use Java Semantic Kernel framework with the built-in VectorStore for embeddings similarity search. A semantic function [RAG.AnswerQuestion](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/resources/semantickernel/Plugins/RAG/AnswerQuestion/config.json) is defined to build the prompt using vector search results. An AzureAISearchVectorStoreRecordCollection instance is used to manage the Azure AI Search index populated by the documents ingestion process. | :x: | :white_check_mark: |
+| One Shot Ask | [JavaSemanticKernelChainsApproach](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/ask/approaches/semantickernel/JavaSemanticKernelChainsApproach.java) | Use Java Semantic Kernel framework with semantic and native functions chaining, an imperative style for AI orchestration. The [InformationFinder.SearchFromQuestion](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/retrieval/semantickernel/AzureAISearchPlugin.java) native function and the [RAG.AnswerQuestion](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/resources/semantickernel/Plugins/RAG/AnswerQuestion/config.json) semantic function are called sequentially. Several search retrieval options are available: Text, Vector, Hybrid. | :x: | :white_check_mark: |
+| Chat | [JavaSemanticKernelWithVectorStoreChatApproach](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/chat/approaches/semantickernel/JavaSemanticKernelWithVectorStoreChatApproach.java) | Use Java Semantic Kernel framework with the built-in VectorStore for embeddings similarity search. A semantic function [RAG.AnswerConversation](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/resources/semantickernel/Plugins/RAG/AnswerConversation/config.json) is defined to build the prompt using vector search results. An AzureAISearchVectorStoreRecordCollection instance is used to manage the Azure AI Search index populated by the documents ingestion process. | :x: | :white_check_mark: |
+| Chat | [JavaSemanticKernelChainsChatApproach](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/chat/approaches/semantickernel/JavaSemanticKernelChainsChatApproach.java) | Use Java Semantic Kernel framework with semantic and native functions chaining, an imperative style for AI orchestration. The [InformationFinder.SearchFromConversation](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/java/com/microsoft/openai/samples/rag/retrieval/semantickernel/AzureAISearchPlugin.java) native function and the [RAG.AnswerConversation](https://github.com/Azure-Samples/azure-search-openai-demo-java/blob/main/app/backend/src/main/resources/semantickernel/Plugins/RAG/AnswerConversation/config.json) semantic function are called sequentially. Several search retrieval options are available: Text, Vector, Hybrid. | :x: | :white_check_mark: |
 
 ## Getting Started
diff --git a/app/backend/pom.xml b/app/backend/pom.xml
index cfca289..81178e7 100644
--- a/app/backend/pom.xml
+++ b/app/backend/pom.xml
@@ -17,8 +17,8 @@
         17
         5.14.0
-        11.6.0-beta.8
-        1.2.2
+        11.7.2
+        1.4.2
         4.5.1
         3.11.0
@@ -123,6 +123,10 @@
         com.microsoft.semantic-kernel
         semantickernel-aiservices-openai
+
+        com.microsoft.semantic-kernel
+        semantickernel-data-azureaisearch
+
diff --git a/app/backend/src/main/java/com/microsoft/openai/samples/rag/approaches/RAGApproachFactorySpringBootImpl.java b/app/backend/src/main/java/com/microsoft/openai/samples/rag/approaches/RAGApproachFactorySpringBootImpl.java
index b7ff61d..53e1c31 100644
--- a/app/backend/src/main/java/com/microsoft/openai/samples/rag/approaches/RAGApproachFactorySpringBootImpl.java
+++ b/app/backend/src/main/java/com/microsoft/openai/samples/rag/approaches/RAGApproachFactorySpringBootImpl.java
@@ -2,8 +2,10 @@
 import com.microsoft.openai.samples.rag.ask.approaches.PlainJavaAskApproach;
 import com.microsoft.openai.samples.rag.ask.approaches.semantickernel.JavaSemanticKernelChainsApproach;
+import com.microsoft.openai.samples.rag.ask.approaches.semantickernel.JavaSemanticKernelWithVectorStoreApproach;
 import com.microsoft.openai.samples.rag.chat.approaches.PlainJavaChatApproach;
 import com.microsoft.openai.samples.rag.chat.approaches.semantickernel.JavaSemanticKernelChainsChatApproach;
+import
com.microsoft.openai.samples.rag.chat.approaches.semantickernel.JavaSemanticKernelWithVectorStoreChatApproach;
 import org.springframework.context.ApplicationContext;
 import org.springframework.context.ApplicationContextAware;
 import org.springframework.stereotype.Component;
@@ -13,7 +15,6 @@ public class RAGApproachFactorySpringBootImpl implements RAGApproachFactory, App
     private static final String JAVA_OPENAI_SDK = "jos";
     private static final String JAVA_SEMANTIC_KERNEL = "jsk";
-    private static final String JAVA_SEMANTIC_KERNEL_PLANNER = "jskp";
     private ApplicationContext applicationContext;
@@ -29,6 +30,8 @@ public RAGApproach createApproach(String approachName, RAGType ragType, RAGOptio
         if (ragType.equals(RAGType.CHAT)) {
             if (JAVA_OPENAI_SDK.equals(approachName)) {
                 return applicationContext.getBean(PlainJavaChatApproach.class);
+            } else if (JAVA_SEMANTIC_KERNEL.equals(approachName)) {
+                return applicationContext.getBean(JavaSemanticKernelWithVectorStoreChatApproach.class);
             } else if (
                 JAVA_SEMANTIC_KERNEL_PLANNER.equals(approachName) &&
                 ragOptions != null &&
@@ -39,6 +42,8 @@
         } else if (ragType.equals(RAGType.ASK)) {
             if (JAVA_OPENAI_SDK.equals(approachName))
                 return applicationContext.getBean(PlainJavaAskApproach.class);
+            else if (JAVA_SEMANTIC_KERNEL.equals(approachName))
+                return applicationContext.getBean(JavaSemanticKernelWithVectorStoreApproach.class);
             else if (JAVA_SEMANTIC_KERNEL_PLANNER.equals(approachName) && ragOptions != null && ragOptions.getSemantickKernelMode() != null && ragOptions.getSemantickKernelMode() == SemanticKernelMode.chains)
                 return applicationContext.getBean(JavaSemanticKernelChainsApproach.class);
         }
diff --git a/app/backend/src/main/java/com/microsoft/openai/samples/rag/ask/approaches/semantickernel/JavaSemanticKernelWithMemoryApproach.java.ignore
b/app/backend/src/main/java/com/microsoft/openai/samples/rag/ask/approaches/semantickernel/JavaSemanticKernelWithMemoryApproach.java.ignore deleted file mode 100644 index 099efb3..0000000 --- a/app/backend/src/main/java/com/microsoft/openai/samples/rag/ask/approaches/semantickernel/JavaSemanticKernelWithMemoryApproach.java.ignore +++ /dev/null @@ -1,195 +0,0 @@ -// Copyright (c) Microsoft. All rights reserved. -package com.microsoft.openai.samples.rag.ask.approaches.semantickernel; - -import com.azure.ai.openai.OpenAIAsyncClient; -import com.azure.core.credential.TokenCredential; -import com.azure.search.documents.SearchAsyncClient; -import com.azure.search.documents.SearchDocument; -import com.microsoft.openai.samples.rag.approaches.ContentSource; -import com.microsoft.openai.samples.rag.approaches.RAGApproach; -import com.microsoft.openai.samples.rag.approaches.RAGOptions; -import com.microsoft.openai.samples.rag.approaches.RAGResponse; -import com.microsoft.openai.samples.rag.ask.approaches.semantickernel.memory.CustomAzureCognitiveSearchMemoryStore; -import com.microsoft.semantickernel.Kernel; -import com.microsoft.semantickernel.SKBuilders; -import com.microsoft.semantickernel.ai.embeddings.Embedding; -import com.microsoft.semantickernel.memory.MemoryQueryResult; -import com.microsoft.semantickernel.memory.MemoryRecord; -import com.microsoft.semantickernel.orchestration.SKContext; -import java.io.OutputStream; -import java.util.List; -import java.util.function.Function; -import java.util.stream.Collectors; -import org.slf4j.Logger; -import org.slf4j.LoggerFactory; -import org.springframework.beans.factory.annotation.Value; -import org.springframework.stereotype.Component; -import reactor.core.publisher.Mono; - -/** - * Use Java Semantic Kernel framework with built-in MemoryStore for embeddings similarity search. 
A - * semantic function is defined in RAG.AnswerQuestion (src/main/resources/semantickernel/Plugins) to - * build the prompt template which is grounded using results from the Memory Store. A customized - * version of SK built-in CognitiveSearchMemoryStore is used to map index fields populated by the - * documents ingestion process. - */ -@Component -public class JavaSemanticKernelWithMemoryApproach implements RAGApproach { - private static final Logger LOGGER = - LoggerFactory.getLogger(JavaSemanticKernelWithMemoryApproach.class); - private final TokenCredential tokenCredential; - private final OpenAIAsyncClient openAIAsyncClient; - - private final SearchAsyncClient searchAsyncClient; - - private final String EMBEDDING_FIELD_NAME = "embedding"; - - @Value("${cognitive.search.service}") - String searchServiceName; - - @Value("${cognitive.search.index}") - String indexName; - - @Value("${openai.chatgpt.deployment}") - private String gptChatDeploymentModelId; - - @Value("${openai.embedding.deployment}") - private String embeddingDeploymentModelId; - - public JavaSemanticKernelWithMemoryApproach( - TokenCredential tokenCredential, - OpenAIAsyncClient openAIAsyncClient, - SearchAsyncClient searchAsyncClient) { - this.tokenCredential = tokenCredential; - this.openAIAsyncClient = openAIAsyncClient; - this.searchAsyncClient = searchAsyncClient; - } - - /** - * @param question - * @param options - * @return - */ - @Override - public RAGResponse run(String question, RAGOptions options) { - - // Build semantic kernel context with Azure Cognitive Search as memory store. AnswerQuestion - // skill is imported from src/main/resources/semantickernel/Plugins. - Kernel semanticKernel = buildSemanticKernel(options); - - /** - * STEP 1: Retrieve relevant documents using user question Use semantic kernel built-in - * memory.searchAsync. It uses OpenAI to generate embeddings for the provided question. - * Question embeddings are provided to cognitive search via search options. 
- */ - List memoryResult = - semanticKernel - .getMemory() - .searchAsync(indexName, question, options.getTop(), 0.5f, false) - .block(); - - LOGGER.info( - "Total {} sources found in cognitive vector store for search query[{}]", - memoryResult.size(), - question); - - String sources = buildSourcesText(memoryResult); - List sourcesList = buildSources(memoryResult); - - // STEP 2: Build a SK context with the sources retrieved from the memory store and the user - // question. - SKContext skcontext = - SKBuilders.context() - .build() - .setVariable("sources", sources) - .setVariable("input", question); - - // STEP 3: Get a reference of the semantic function [AnswerQuestion] of the [RAG] plugin - // (a.k.a. skill) from the SK skills registry and provide it with the pre-built context. - Mono result = - semanticKernel.getFunction("RAG", "AnswerQuestion").invokeAsync(skcontext); - - return new RAGResponse.Builder() - // .prompt(plan.toPlanString()) - .prompt( - "Prompt is managed by SK and can't be displayed here. 
See App logs for" - + " prompt") - // STEP 4: triggering Open AI to get an answer - .answer(result.block().getResult()) - .sources(sourcesList) - .sourcesAsText(sources) - .question(question) - .build(); - } - - @Override - public void runStreaming( - String questionOrConversation, RAGOptions options, OutputStream outputStream) { - throw new IllegalStateException("Streaming not supported for this approach"); - } - - private List buildSources(List memoryResult) { - return memoryResult.stream() - .map( - result -> { - return new ContentSource( - result.getMetadata().getId(), result.getMetadata().getText()); - }) - .collect(Collectors.toList()); - } - - private String buildSourcesText(List memoryResult) { - StringBuilder sourcesContentBuffer = new StringBuilder(); - memoryResult.stream() - .forEach( - memory -> { - sourcesContentBuffer - .append(memory.getMetadata().getId()) - .append(": ") - .append(memory.getMetadata().getText().replace("\n", "")) - .append("\n"); - }); - return sourcesContentBuffer.toString(); - } - - private Kernel buildSemanticKernel(RAGOptions options) { - var kernelWithACS = - SKBuilders.kernel() - .withMemoryStorage( - new CustomAzureCognitiveSearchMemoryStore( - "https://%s.search.windows.net" - .formatted(searchServiceName), - tokenCredential, - this.searchAsyncClient, - this.EMBEDDING_FIELD_NAME, - buildCustomMemoryMapper())) - .withDefaultAIService( - SKBuilders.textEmbeddingGeneration() - .withOpenAIClient(openAIAsyncClient) - .withModelId(embeddingDeploymentModelId) - .build()) - .withDefaultAIService( - SKBuilders.chatCompletion() - .withModelId(gptChatDeploymentModelId) - .withOpenAIClient(this.openAIAsyncClient) - .build()) - .build(); - - kernelWithACS.importSkillFromResources( - "semantickernel/Plugins", "RAG", "AnswerQuestion", null); - return kernelWithACS; - } - - private Function buildCustomMemoryMapper() { - return searchDocument -> { - return MemoryRecord.localRecord( - (String) searchDocument.get("sourcepage"), - (String) 
searchDocument.get("content"), - "chunked text from original source", - new Embedding((List) searchDocument.get(EMBEDDING_FIELD_NAME)), - (String) searchDocument.get("category"), - (String) searchDocument.get("id"), - null); - }; - } -} diff --git a/app/backend/src/main/java/com/microsoft/openai/samples/rag/ask/approaches/semantickernel/JavaSemanticKernelWithVectorStoreApproach.java b/app/backend/src/main/java/com/microsoft/openai/samples/rag/ask/approaches/semantickernel/JavaSemanticKernelWithVectorStoreApproach.java new file mode 100644 index 0000000..d55840b --- /dev/null +++ b/app/backend/src/main/java/com/microsoft/openai/samples/rag/ask/approaches/semantickernel/JavaSemanticKernelWithVectorStoreApproach.java @@ -0,0 +1,159 @@ +// Copyright (c) Microsoft. All rights reserved. +package com.microsoft.openai.samples.rag.ask.approaches.semantickernel; + +import com.azure.ai.openai.OpenAIAsyncClient; +import com.azure.search.documents.indexes.SearchIndexAsyncClient; +import com.microsoft.openai.samples.rag.approaches.ContentSource; +import com.microsoft.openai.samples.rag.approaches.RAGApproach; +import com.microsoft.openai.samples.rag.approaches.RAGOptions; +import com.microsoft.openai.samples.rag.approaches.RAGResponse; +import com.microsoft.openai.samples.rag.chat.approaches.semantickernel.JavaSemanticKernelWithVectorStoreChatApproach; +import com.microsoft.openai.samples.rag.retrieval.semantickernel.AzureAISearchVectorStoreUtils; +import com.microsoft.semantickernel.Kernel; + +import com.microsoft.semantickernel.aiservices.openai.chatcompletion.OpenAIChatCompletion; +import com.microsoft.semantickernel.aiservices.openai.textembedding.OpenAITextEmbeddingGenerationService; +import com.microsoft.semantickernel.data.azureaisearch.AzureAISearchVectorStoreRecordCollection; +import com.microsoft.semantickernel.data.azureaisearch.AzureAISearchVectorStoreRecordCollectionOptions; + +import java.io.IOException; +import java.io.OutputStream; +import java.util.List; + 
+import com.microsoft.semantickernel.implementation.EmbeddedResourceLoader; +import com.microsoft.semantickernel.orchestration.FunctionResult; +import com.microsoft.semantickernel.plugin.KernelPlugin; +import com.microsoft.semantickernel.plugin.KernelPluginFactory; +import com.microsoft.semantickernel.semanticfunctions.HandlebarsPromptTemplateFactory; +import com.microsoft.semantickernel.semanticfunctions.KernelFunction; +import com.microsoft.semantickernel.semanticfunctions.KernelFunctionArguments; +import com.microsoft.semantickernel.semanticfunctions.KernelFunctionYaml; +import com.microsoft.semantickernel.services.chatcompletion.ChatCompletionService; +import com.microsoft.semantickernel.services.textembedding.EmbeddingGenerationService; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; +import org.springframework.beans.factory.annotation.Value; +import org.springframework.stereotype.Component; + +import com.microsoft.openai.samples.rag.retrieval.semantickernel.AzureAISearchVectorStoreUtils.DocumentRecord; + +/** + * Use Java Semantic Kernel framework with built-in VectorStores for embeddings similarity search. A + * semantic function is defined in RAG.AnswerQuestion (src/main/resources/semantickernel/Plugins) to + * build the prompt template which is grounded using results from the VectorRecordCollection. + * An AzureAISearchVectorStoreRecordCollection is used to manage an AzureAISearch index populated by the + * documents ingestion process. 
+ */ +@Component +public class JavaSemanticKernelWithVectorStoreApproach implements RAGApproach { + private static final Logger LOGGER = + LoggerFactory.getLogger(JavaSemanticKernelWithVectorStoreApproach.class); + private final OpenAIAsyncClient openAIAsyncClient; + private final SearchIndexAsyncClient searchAsyncClient; + + + @Value("${cognitive.search.index}") + String indexName; + + @Value("${openai.chatgpt.deployment}") + private String gptChatDeploymentModelId; + + @Value("${openai.embedding.deployment}") + private String embeddingDeploymentModelId; + + public JavaSemanticKernelWithVectorStoreApproach( + OpenAIAsyncClient openAIAsyncClient, + SearchIndexAsyncClient searchAsyncClient) { + this.openAIAsyncClient = openAIAsyncClient; + this.searchAsyncClient = searchAsyncClient; + } + + /** + * @param question + * @param options + * @return + */ + @Override + public RAGResponse run(String question, RAGOptions options) { + // Build semantic kernel context with AnswerQuestion plugin, EmbeddingGenerationService and ChatCompletionService. + // skill is imported from src/main/resources/semantickernel/Plugins. + Kernel semanticKernel = buildSemanticKernel(options); + + // STEP 1: Build Vector Record Collection + AzureAISearchVectorStoreRecordCollection recordCollection = new AzureAISearchVectorStoreRecordCollection<>( + searchAsyncClient, + indexName, + AzureAISearchVectorStoreRecordCollectionOptions.builder() + .withRecordClass(DocumentRecord.class) + .build() + ); + + // STEP 2: Retrieve relevant documents using user question. 
+ List memoryResult = AzureAISearchVectorStoreUtils.searchAsync( + question, semanticKernel, recordCollection, options); + + String sources = AzureAISearchVectorStoreUtils.buildSourcesText(memoryResult); + List sourcesList = AzureAISearchVectorStoreUtils.buildSources(memoryResult); + + // STEP 3: Generate a contextual and content specific answer using the search results and question + KernelFunction answerQuestion = semanticKernel.getFunction("RAG", "AnswerQuestion"); + KernelFunctionArguments arguments = KernelFunctionArguments.builder() + .withVariable("sources", sourcesList) + .withVariable("input", question) + .build(); + + FunctionResult reply = answerQuestion.invokeAsync(semanticKernel) + .withArguments(arguments) + .block(); + + return new RAGResponse.Builder() + .prompt("Prompt is managed by SK and can't be displayed here. See App logs for" + + " prompt") + .answer(reply.getResult()) + .sources(sourcesList) + .sourcesAsText(sources) + .question(question) + .build(); + } + + @Override + public void runStreaming( + String questionOrConversation, RAGOptions options, OutputStream outputStream) { + throw new IllegalStateException("Streaming not supported for this approach"); + } + + private Kernel buildSemanticKernel(RAGOptions options) { + KernelPlugin answerPlugin; + try { + answerPlugin = KernelPluginFactory.createFromFunctions( + "RAG", + "AnswerQuestion", + List.of( + KernelFunctionYaml.fromPromptYaml( + EmbeddedResourceLoader.readFile( + "semantickernel/Plugins/RAG/AnswerQuestion/answerQuestion.prompt.yaml", + JavaSemanticKernelWithVectorStoreChatApproach.class, + EmbeddedResourceLoader.ResourceLocation.CLASSPATH_ROOT + ), + new HandlebarsPromptTemplateFactory()) + ) + ); + } catch (IOException e) { + throw new RuntimeException(e); + } + Kernel kernel = Kernel.builder() + .withAIService(EmbeddingGenerationService.class, OpenAITextEmbeddingGenerationService.builder() + .withOpenAIAsyncClient(openAIAsyncClient) + .withModelId(embeddingDeploymentModelId) + 
.withDimensions(1536) + .build()) + .withAIService(ChatCompletionService.class, OpenAIChatCompletion.builder() + .withOpenAIAsyncClient(this.openAIAsyncClient) + .withModelId(gptChatDeploymentModelId) + .build()) + .withPlugin(answerPlugin) + .build(); + + return kernel; + } +} diff --git a/app/backend/src/main/java/com/microsoft/openai/samples/rag/ask/approaches/semantickernel/memory/CustomAzureCognitiveSearchMemoryStore.java.ignore b/app/backend/src/main/java/com/microsoft/openai/samples/rag/ask/approaches/semantickernel/memory/CustomAzureCognitiveSearchMemoryStore.java.ignore deleted file mode 100644 index efe44e1..0000000 --- a/app/backend/src/main/java/com/microsoft/openai/samples/rag/ask/approaches/semantickernel/memory/CustomAzureCognitiveSearchMemoryStore.java.ignore +++ /dev/null @@ -1,94 +0,0 @@ -// Copyright (c) Microsoft. All rights reserved. -package com.microsoft.openai.samples.rag.ask.approaches.semantickernel.memory; - -import com.azure.core.credential.TokenCredential; -import com.azure.search.documents.SearchAsyncClient; -import com.azure.search.documents.SearchDocument; -import com.azure.search.documents.models.SearchOptions; -import com.azure.search.documents.models.SearchQueryVector; -import com.microsoft.semantickernel.ai.embeddings.Embedding; -import com.microsoft.semantickernel.connectors.memory.azurecognitivesearch.AzureCognitiveSearchMemoryRecord; -import com.microsoft.semantickernel.connectors.memory.azurecognitivesearch.AzureCognitiveSearchMemoryStore; -import com.microsoft.semantickernel.memory.MemoryRecord; -import java.util.Collection; -import java.util.function.Function; -import java.util.stream.Collectors; -import javax.annotation.Nonnull; -import reactor.core.publisher.Mono; -import reactor.util.function.Tuple2; -import reactor.util.function.Tuples; - -public class CustomAzureCognitiveSearchMemoryStore extends AzureCognitiveSearchMemoryStore { - - private SearchAsyncClient searchClient; - private String embeddingFieldMapping = 
-            "Embedding";
-
-    private Function<SearchDocument, MemoryRecord> memoryRecordMapper;
-
-    /**
-     * Create a new instance of custom memory storage using Azure Cognitive Search.
-     *
-     * @param endpoint Azure Cognitive Search URI, e.g. "https://contoso.search.windows.net"
-     * @param credentials Azure service credentials
-     * @param searchClient Another instance of cognitive search client. Unfortunately this is a hack
-     *     as current getSearchClient is private in parent class.
-     */
-    public CustomAzureCognitiveSearchMemoryStore(
-            @Nonnull String endpoint,
-            @Nonnull TokenCredential credentials,
-            @Nonnull SearchAsyncClient searchClient,
-            String embeddingFieldMapping) {
-        super(endpoint, credentials);
-        this.searchClient = searchClient;
-        if (embeddingFieldMapping != null && !embeddingFieldMapping.isEmpty())
-            this.embeddingFieldMapping = embeddingFieldMapping;
-    }
-
-    public CustomAzureCognitiveSearchMemoryStore(
-            @Nonnull String endpoint,
-            @Nonnull TokenCredential credentials,
-            @Nonnull SearchAsyncClient searchClient,
-            String embeddingFieldMapping,
-            Function<SearchDocument, MemoryRecord> memoryRecordMapper) {
-        this(endpoint, credentials, searchClient, embeddingFieldMapping);
-        this.memoryRecordMapper = memoryRecordMapper;
-    }
-
-    public Mono<Collection<Tuple2<MemoryRecord, Float>>> getNearestMatchesAsync(
-            @Nonnull String collectionName,
-            @Nonnull Embedding embedding,
-            int limit,
-            float minRelevanceScore,
-            boolean withEmbedding) {
-
-        SearchQueryVector searchVector =
-                new SearchQueryVector()
-                        .setKNearestNeighborsCount(limit)
-                        .setFields(embeddingFieldMapping)
-                        .setValue(embedding.getVector());
-
-        SearchOptions searchOptions = new SearchOptions().setVectors(searchVector);
-
-        return searchClient
-                .search(null, searchOptions)
-                .filter(result -> (double) minRelevanceScore <= result.getScore())
-                .map(
-                        result -> {
-                            MemoryRecord memoryRecord;
-                            // Use default SK mapper if no custom mapper is provided
-                            if (this.memoryRecordMapper == null) {
-                                memoryRecord =
-                                        result.getDocument(AzureCognitiveSearchMemoryRecord.class)
-                                                .toMemoryRecord(withEmbedding);
-                            } else {
-                                memoryRecord =
-                                        this.memoryRecordMapper.apply(
-                                                result.getDocument(SearchDocument.class));
-                            }
-
-                            float score = (float) result.getScore();
-                            return Tuples.of(memoryRecord, score);
-                        })
-                .collect(Collectors.toList());
-    }
-}
diff --git a/app/backend/src/main/java/com/microsoft/openai/samples/rag/chat/approaches/semantickernel/JavaSemanticKernelWithMemoryChatApproach.java.ignore b/app/backend/src/main/java/com/microsoft/openai/samples/rag/chat/approaches/semantickernel/JavaSemanticKernelWithMemoryChatApproach.java.ignore
deleted file mode 100644
index 73ac8da..0000000
--- a/app/backend/src/main/java/com/microsoft/openai/samples/rag/chat/approaches/semantickernel/JavaSemanticKernelWithMemoryChatApproach.java.ignore
+++ /dev/null
@@ -1,196 +0,0 @@
-package com.microsoft.openai.samples.rag.chat.approaches.semantickernel;
-
-import com.azure.ai.openai.OpenAIAsyncClient;
-import com.azure.core.credential.TokenCredential;
-import com.azure.search.documents.SearchAsyncClient;
-import com.azure.search.documents.SearchDocument;
-import com.fasterxml.jackson.databind.ObjectMapper;
-import com.microsoft.openai.samples.rag.approaches.ContentSource;
-import com.microsoft.openai.samples.rag.approaches.RAGApproach;
-import com.microsoft.openai.samples.rag.approaches.RAGOptions;
-import com.microsoft.openai.samples.rag.approaches.RAGResponse;
-import com.microsoft.openai.samples.rag.ask.approaches.semantickernel.memory.CustomAzureCognitiveSearchMemoryStore;
-import com.microsoft.openai.samples.rag.common.ChatGPTConversation;
-import com.microsoft.openai.samples.rag.common.ChatGPTUtils;
-import com.microsoft.semantickernel.Kernel;
-import com.microsoft.semantickernel.orchestration.FunctionResult;
-import com.microsoft.semantickernel.semanticfunctions.KernelFunction;
-import com.microsoft.semantickernel.semanticfunctions.KernelFunctionArguments;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-import org.springframework.beans.factory.annotation.Value;
-import org.springframework.stereotype.Component;
-
-import java.io.OutputStream;
-import java.util.List;
-import java.util.function.Function;
-import java.util.stream.Collectors;
-
-/**
- * Accomplish the same task as in the PlainJavaAskApproach approach but using Semantic Kernel framework:
- *   1. Memory abstraction is used for vector search capability. It uses Azure Cognitive Search as memory store.
- *   2. Semantic functions have been defined to ask question using sources from memory search results
- */
-@Component
-public class JavaSemanticKernelWithMemoryChatApproach implements RAGApproach<ChatGPTConversation, RAGResponse> {
-    private static final Logger LOGGER = LoggerFactory.getLogger(JavaSemanticKernelWithMemoryChatApproach.class);
-    private final TokenCredential tokenCredential;
-    private final OpenAIAsyncClient openAIAsyncClient;
-
-    private final SearchAsyncClient searchAsyncClient;
-
-    private final ObjectMapper objectMapper;
-
-    private final String EMBEDDING_FIELD_NAME = "embedding";
-
-    @Value("${cognitive.search.service}")
-    String searchServiceName;
-    @Value("${cognitive.search.index}")
-    String indexName;
-    @Value("${openai.chatgpt.deployment}")
-    private String gptChatDeploymentModelId;
-
-    @Value("${openai.embedding.deployment}")
-    private String embeddingDeploymentModelId;
-
-    public JavaSemanticKernelWithMemoryChatApproach(TokenCredential tokenCredential, OpenAIAsyncClient openAIAsyncClient, SearchAsyncClient searchAsyncClient, ObjectMapper objectMapper) {
-        this.tokenCredential = tokenCredential;
-        this.openAIAsyncClient = openAIAsyncClient;
-        this.searchAsyncClient = searchAsyncClient;
-        this.objectMapper = objectMapper;
-    }
-
-    @Override
-    public RAGResponse run(ChatGPTConversation questionOrConversation, RAGOptions options) {
-        String question = ChatGPTUtils.getLastUserQuestion(questionOrConversation.getMessages());
-
-        // STEP 1: Build semantic kernel with Azure Cognitive Search as memory store. AnswerQuestion skill is imported from resources.
-        Kernel semanticKernel = buildSemanticKernel(options);
-
-        // STEP 2: Retrieve relevant documents using keywords extracted from the chat history
-        String conversation = ChatGPTUtils.formatAsChatML(questionOrConversation.toOpenAIChatMessages());
-        List<MemoryQueryResult> sourcesResult = getSourcesFromConversation(conversation, semanticKernel, options);
-
-        LOGGER.info("Total {} sources found in cognitive vector store for search query[{}]", sourcesResult.size(), question);
-
-        String sources = buildSourcesText(sourcesResult);
-        List<ContentSource> sourcesList = buildSources(sourcesResult);
-
-        // STEP 3: Generate a contextual and content specific answer using the search results and chat history
-        KernelFunction<String> answerConversation = semanticKernel.getFunction("RAG", "AnswerConversation");
-        KernelFunctionArguments arguments = KernelFunctionArguments.builder()
-                .withVariable("sources", sources)
-                .withVariable("conversation", conversation)
-                .withVariable("suggestions", String.valueOf(options.isSuggestFollowupQuestions()))
-                .withVariable("input", question)
-                .build();
-
-        FunctionResult<String> reply = answerConversation.invokeAsync(semanticKernel)
-                .withArguments(arguments)
-                .block();
-
-        return new RAGResponse.Builder()
-                .prompt("Prompt is managed by Semantic Kernel")
-                .answer(reply.getResult())
-                .sources(sourcesList)
-                .sourcesAsText(sources)
-                .question(question)
-                .build();
-    }
-
-    @Override
-    public void runStreaming(ChatGPTConversation questionOrConversation, RAGOptions options, OutputStream outputStream) {
-        throw new IllegalStateException("Streaming not supported for this approach");
-    }
-
-    private List<MemoryQueryResult> getSourcesFromConversation(String conversation, Kernel kernel, RAGOptions options) {
-        KernelFunction<String> extractKeywords = kernel
-                .getPlugin("RAG")
-                .get("ExtractKeywords");
-
-        KernelFunctionArguments arguments = KernelFunctionArguments.builder()
-                .withVariable("conversation", conversation)
-                .build();
-
-        FunctionResult<String> result = extractKeywords
-                .invokeAsync(kernel)
-                .withArguments(arguments)
-                .block();
-        String searchQuery = result.getResult();
-
-
-        /**
-         * Use semantic kernel built-in memory.searchAsync. It uses OpenAI to generate embeddings for the provided question.
-         * Question embeddings are provided to cognitive search via search options.
-         */
-        List<MemoryQueryResult> memoryResult = kernel.getMemory().searchAsync(
-                        indexName,
-                        searchQuery,
-                        options.getTop(),
-                        0.5f,
-                        false)
-                .block();
-
-        return memoryResult;
-    }
-
-    private List<ContentSource> buildSources(List<MemoryQueryResult> memoryResult) {
-        return memoryResult
-                .stream()
-                .map(result -> {
-                    return new ContentSource(
-                            result.getMetadata().getId(),
-                            result.getMetadata().getText()
-                    );
-                })
-                .collect(Collectors.toList());
-    }
-
-    private String buildSourcesText(List<MemoryQueryResult> memoryResult) {
-        StringBuilder sourcesContentBuffer = new StringBuilder();
-        memoryResult.stream().forEach(memory -> {
-            sourcesContentBuffer.append(memory.getMetadata().getId())
-                    .append(": ")
-                    .append(memory.getMetadata().getText().replace("\n", ""))
-                    .append("\n");
-        });
-        return sourcesContentBuffer.toString();
-    }
-
-    private Kernel buildSemanticKernel(RAGOptions options) {
-        var kernelWithACS = SKBuilders.kernel()
-                .withMemoryStorage(
-                        new CustomAzureCognitiveSearchMemoryStore("https://%s.search.windows.net".formatted(searchServiceName),
-                                tokenCredential,
-                                this.searchAsyncClient,
-                                this.EMBEDDING_FIELD_NAME,
-                                buildCustomMemoryMapper()))
-                .withDefaultAIService(SKBuilders.textEmbeddingGeneration()
-                        .withOpenAIClient(openAIAsyncClient)
-                        .withModelId(embeddingDeploymentModelId)
-                        .build())
-                .withDefaultAIService(SKBuilders.chatCompletion()
-                        .withModelId(gptChatDeploymentModelId)
-                        .withOpenAIClient(this.openAIAsyncClient)
-                        .build())
-                .build();
-
-        kernelWithACS.importSkillFromResources("semantickernel/Plugins", "RAG", "AnswerConversation", null);
-        kernelWithACS.importSkillFromResources("semantickernel/Plugins", "RAG", "ExtractKeywords", null);
-        return kernelWithACS;
-    }
-
-    private Function<SearchDocument, MemoryRecord> buildCustomMemoryMapper() {
-        return searchDocument -> {
-            return MemoryRecord.localRecord(
-                    (String) searchDocument.get("sourcepage"),
-                    (String) searchDocument.get("content"),
-                    "chunked text from original source",
-                    new Embedding((List<Float>) searchDocument.get(EMBEDDING_FIELD_NAME)),
-                    (String) searchDocument.get("category"),
-                    (String) searchDocument.get("id"),
-                    null);
-
-        };
-    }
-}
diff --git a/app/backend/src/main/java/com/microsoft/openai/samples/rag/chat/approaches/semantickernel/JavaSemanticKernelWithVectorStoreChatApproach.java b/app/backend/src/main/java/com/microsoft/openai/samples/rag/chat/approaches/semantickernel/JavaSemanticKernelWithVectorStoreChatApproach.java
new file mode 100644
index 0000000..5b252ab
--- /dev/null
+++ b/app/backend/src/main/java/com/microsoft/openai/samples/rag/chat/approaches/semantickernel/JavaSemanticKernelWithVectorStoreChatApproach.java
@@ -0,0 +1,206 @@
+package com.microsoft.openai.samples.rag.chat.approaches.semantickernel;
+
+import com.azure.ai.openai.OpenAIAsyncClient;
+import com.azure.search.documents.indexes.SearchIndexAsyncClient;
+import com.microsoft.openai.samples.rag.approaches.ContentSource;
+import com.microsoft.openai.samples.rag.approaches.RAGApproach;
+import com.microsoft.openai.samples.rag.approaches.RAGOptions;
+import com.microsoft.openai.samples.rag.approaches.RAGResponse;
+import com.microsoft.openai.samples.rag.common.ChatGPTConversation;
+import com.microsoft.openai.samples.rag.common.ChatGPTUtils;
+import com.microsoft.openai.samples.rag.retrieval.semantickernel.AzureAISearchVectorStoreUtils;
+import com.microsoft.semantickernel.Kernel;
+import com.microsoft.semantickernel.aiservices.openai.chatcompletion.OpenAIChatCompletion;
+import com.microsoft.semantickernel.aiservices.openai.textembedding.OpenAITextEmbeddingGenerationService;
+import com.microsoft.semantickernel.data.azureaisearch.AzureAISearchVectorStoreRecordCollection;
+import com.microsoft.semantickernel.data.azureaisearch.AzureAISearchVectorStoreRecordCollectionOptions;
+import com.microsoft.semantickernel.implementation.EmbeddedResourceLoader;
+import com.microsoft.semantickernel.orchestration.FunctionResult;
+import com.microsoft.semantickernel.plugin.KernelPlugin;
+import com.microsoft.semantickernel.plugin.KernelPluginFactory;
+import com.microsoft.semantickernel.semanticfunctions.HandlebarsPromptTemplateFactory;
+import com.microsoft.semantickernel.semanticfunctions.KernelFunction;
+import com.microsoft.semantickernel.semanticfunctions.KernelFunctionArguments;
+import com.microsoft.semantickernel.semanticfunctions.KernelFunctionYaml;
+import com.microsoft.semantickernel.services.chatcompletion.ChatCompletionService;
+import com.microsoft.semantickernel.services.chatcompletion.ChatHistory;
+import com.microsoft.semantickernel.services.chatcompletion.ChatMessageContent;
+import com.microsoft.semantickernel.services.textembedding.EmbeddingGenerationService;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import org.springframework.beans.factory.annotation.Value;
+import org.springframework.stereotype.Component;
+
+import java.io.IOException;
+import java.io.OutputStream;
+import java.util.ArrayList;
+import java.util.List;
+
+import com.microsoft.openai.samples.rag.retrieval.semantickernel.AzureAISearchVectorStoreUtils.DocumentRecord;
+
+/**
+ * Use Java Semantic Kernel framework with built-in VectorStores for embeddings similarity search. A
+ * semantic function is defined in RAG.AnswerConversation (src/main/resources/semantickernel/Plugins) to
+ * build the prompt template which is grounded using results from the VectorRecordCollection.
+ * An AzureAISearchVectorStoreRecordCollection instance is used to manage an AzureAISearch index populated by the
+ * documents ingestion process.
+ */
+@Component
+public class JavaSemanticKernelWithVectorStoreChatApproach implements RAGApproach<ChatGPTConversation, RAGResponse> {
+    private static final Logger LOGGER = LoggerFactory.getLogger(JavaSemanticKernelWithVectorStoreChatApproach.class);
+    private final OpenAIAsyncClient openAIAsyncClient;
+    private final SearchIndexAsyncClient searchAsyncClient;
+    private String renderedConversation;
+
+    @Value("${cognitive.search.index}")
+    String indexName;
+    @Value("${openai.chatgpt.deployment}")
+    private String gptChatDeploymentModelId;
+    @Value("${openai.embedding.deployment}")
+    private String embeddingDeploymentModelId;
+
+    public JavaSemanticKernelWithVectorStoreChatApproach(OpenAIAsyncClient openAIAsyncClient, SearchIndexAsyncClient searchAsyncClient) {
+        this.openAIAsyncClient = openAIAsyncClient;
+        this.searchAsyncClient = searchAsyncClient;
+    }
+
+    @Override
+    public RAGResponse run(ChatGPTConversation questionOrConversation, RAGOptions options) {
+        ChatHistory conversation = questionOrConversation.toSKChatHistory();
+        ChatMessageContent<?> question = conversation.getLastMessage().get();
+
+        // Build semantic kernel context with AnswerConversation and ExtractKeywords plugins, EmbeddingGenerationService and ChatCompletionService.
+        Kernel semanticKernel = buildSemanticKernel();
+
+        // STEP 1: Build Vector Record Collection
+        AzureAISearchVectorStoreRecordCollection<DocumentRecord> recordCollection = new AzureAISearchVectorStoreRecordCollection<>(
+                searchAsyncClient,
+                indexName,
+                AzureAISearchVectorStoreRecordCollectionOptions.<DocumentRecord>builder()
+                        .withRecordClass(DocumentRecord.class)
+                        .build()
+        );
+
+        // STEP 2: Retrieve relevant documents using keywords extracted from the chat history
+        String conversationString = ChatGPTUtils.formatAsChatML(questionOrConversation.toOpenAIChatMessages());
+        List<DocumentRecord> sourcesResult = getSourcesFromConversation(conversationString, semanticKernel, recordCollection, options);
+
+        LOGGER.info("Total {} sources found in cognitive vector store for search query[{}]", sourcesResult.size(), question);
+
+        String sources = AzureAISearchVectorStoreUtils.buildSourcesText(sourcesResult);
+        List<ContentSource> sourcesList = AzureAISearchVectorStoreUtils.buildSources(sourcesResult);
+
+        // STEP 3: Generate a contextual and content specific answer using the search results and chat history
+        KernelFunction<String> answerConversation = semanticKernel.getFunction("RAG", "AnswerConversation");
+        KernelFunctionArguments arguments = KernelFunctionArguments.builder()
+                .withVariable("sources", sourcesList)
+                .withVariable("conversation", removeLastMessage(conversation))
+                .withVariable("suggestions", options.isSuggestFollowupQuestions())
+                .withVariable("input", question.getContent())
+                .build();
+
+        FunctionResult<String> reply = answerConversation.invokeAsync(semanticKernel)
+                .withArguments(arguments)
+                .block();
+
+        return new RAGResponse.Builder()
+                .prompt(renderedConversation)
+                .answer(reply.getResult())
+                .sources(sourcesList)
+                .sourcesAsText(sources)
+                .question(question.getContent())
+                .build();
+    }
+
+    @Override
+    public void runStreaming(ChatGPTConversation questionOrConversation, RAGOptions options, OutputStream outputStream) {
+        throw new IllegalStateException("Streaming not supported for this approach");
+    }
+
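The three-step flow in `run()` above (extract keywords from the chat history, retrieve matching sources, then generate a grounded answer) can be sketched independently of Semantic Kernel and Azure AI Search. This is a minimal illustration only: the method names `extractKeywords`, `retrieve`, and `answer` are placeholders, not part of the sample's API, and each stands in for a prompt or search call made in the real code.

```java
import java.util.List;

// Minimal sketch of the RAG chat flow used above, with the Semantic Kernel
// and Azure AI Search calls replaced by placeholder implementations.
public class RagChatFlowSketch {

    // STEP (keywords) stand-in: derive a search query from the conversation.
    // The real code invokes the RAG.ExtractKeywords prompt function.
    static String extractKeywords(String conversation) {
        return conversation.toLowerCase().replace("user:", "").trim();
    }

    // STEP (retrieve) stand-in: fetch sources for the query.
    // The real code runs a (hybrid) vector search against the index.
    static List<String> retrieve(String searchQuery) {
        return List.of("info1.txt: sample grounding text");
    }

    // STEP (answer) stand-in: produce an answer grounded in the sources.
    // The real code invokes the RAG.AnswerConversation prompt function.
    static String answer(String question, List<String> sources) {
        return "Answer to [" + question + "] grounded in " + sources.size() + " source(s)";
    }

    public static String run(String conversation) {
        String query = extractKeywords(conversation);
        List<String> sources = retrieve(query);
        return answer(conversation, sources);
    }

    public static void main(String[] args) {
        System.out.println(run("user: What is my deductible?"));
    }
}
```

The real implementation differs mainly in that each step is asynchronous (Reactor `Mono`) and the retrieval step may first call the embedding service to vectorize the query.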
+    private ChatHistory removeLastMessage(ChatHistory conversation) {
+        ArrayList<ChatMessageContent<?>> messages = new ArrayList<>(conversation.getMessages());
+        messages.remove(conversation.getMessages().size() - 1);
+        return new ChatHistory(messages);
+    }
+
+    private List<DocumentRecord> getSourcesFromConversation(String conversation,
+                                                            Kernel kernel,
+                                                            AzureAISearchVectorStoreRecordCollection<DocumentRecord> recordCollection,
+                                                            RAGOptions ragOptions) {
+        KernelFunction<String> extractKeywords = kernel
+                .getPlugin("RAG")
+                .get("ExtractKeywords");
+
+        KernelFunctionArguments arguments = KernelFunctionArguments.builder()
+                .withVariable("conversation", conversation)
+                .build();
+
+        FunctionResult<String> result = extractKeywords
+                .invokeAsync(kernel)
+                .withArguments(arguments)
+                .block();
+        String searchQuery = result.getResult();
+
+        return AzureAISearchVectorStoreUtils.searchAsync(
+                searchQuery,
+                kernel,
+                recordCollection,
+                ragOptions
+        );
+    }
+
+    private Kernel buildSemanticKernel() {
+        KernelPlugin answerPlugin, extractKeywordsPlugin;
+        try {
+            answerPlugin = KernelPluginFactory.createFromFunctions(
+                    "RAG",
+                    "AnswerConversation",
+                    List.of(
+                            KernelFunctionYaml.fromPromptYaml(
+                                    EmbeddedResourceLoader.readFile(
+                                            "semantickernel/Plugins/RAG/AnswerConversation/answerConversation.prompt.yaml",
+                                            JavaSemanticKernelWithVectorStoreChatApproach.class,
+                                            EmbeddedResourceLoader.ResourceLocation.CLASSPATH_ROOT
+                                    ),
+                                    new HandlebarsPromptTemplateFactory())
+                    )
+            );
+            extractKeywordsPlugin = KernelPluginFactory.createFromFunctions(
+                    "RAG",
+                    "ExtractKeywords",
+                    List.of(
+                            KernelFunctionYaml.fromPromptYaml(
+                                    EmbeddedResourceLoader.readFile(
+                                            "semantickernel/Plugins/RAG/ExtractKeywords/extractKeywords.prompt.yaml",
+                                            JavaSemanticKernelWithVectorStoreChatApproach.class,
+                                            EmbeddedResourceLoader.ResourceLocation.CLASSPATH_ROOT
+                                    ),
+                                    new HandlebarsPromptTemplateFactory())
+                    )
+            );
+        } catch (IOException e) {
+            throw new RuntimeException(e);
+        }
+
+        Kernel kernel = Kernel.builder()
+                .withAIService(EmbeddingGenerationService.class,
+                        OpenAITextEmbeddingGenerationService.builder()
+                                .withOpenAIAsyncClient(openAIAsyncClient)
+                                .withModelId(embeddingDeploymentModelId)
+                                .withDimensions(1536)
+                                .build())
+                .withAIService(ChatCompletionService.class, OpenAIChatCompletion.builder()
+                        .withOpenAIAsyncClient(this.openAIAsyncClient)
+                        .withModelId(gptChatDeploymentModelId)
+                        .build())
+                .withPlugin(answerPlugin)
+                .withPlugin(extractKeywordsPlugin)
+                .build();
+
+        kernel.getGlobalKernelHooks().addPreChatCompletionHook(event -> {
+            this.renderedConversation = ChatGPTUtils.formatAsChatML(event.getOptions().getMessages());
+            return event;
+        });
+
+        return kernel;
+    }
+}
diff --git a/app/backend/src/main/java/com/microsoft/openai/samples/rag/common/ChatGPTUtils.java b/app/backend/src/main/java/com/microsoft/openai/samples/rag/common/ChatGPTUtils.java
index b6a0d01..e5a77ba 100644
--- a/app/backend/src/main/java/com/microsoft/openai/samples/rag/common/ChatGPTUtils.java
+++ b/app/backend/src/main/java/com/microsoft/openai/samples/rag/common/ChatGPTUtils.java
@@ -44,10 +44,10 @@ public static String formatAsChatML(List<ChatRequestMessage> messages) {
             content = ((ChatRequestUserMessage) message).getContent().toString();
         } else if (message instanceof ChatRequestSystemMessage) {
             sb.append(IM_START_SYSTEM).append("\n");
-            content = ((ChatRequestSystemMessage) message).getContent();
+            content = ((ChatRequestSystemMessage) message).getContent().toString();
         } else if (message instanceof ChatRequestAssistantMessage) {
             sb.append(IM_START_ASSISTANT).append("\n");
-            content = ((ChatRequestAssistantMessage) message).getContent();
+            content = ((ChatRequestAssistantMessage) message).getContent().toString();
         }
 
         if (content != null) {
diff --git a/app/backend/src/main/java/com/microsoft/openai/samples/rag/config/AzureAISearchConfiguration.java b/app/backend/src/main/java/com/microsoft/openai/samples/rag/config/AzureAISearchConfiguration.java
index fc687af..ce4f04e 100644
--- a/app/backend/src/main/java/com/microsoft/openai/samples/rag/config/AzureAISearchConfiguration.java
+++ b/app/backend/src/main/java/com/microsoft/openai/samples/rag/config/AzureAISearchConfiguration.java
@@ -7,6 +7,8 @@
 import com.azure.search.documents.SearchAsyncClient;
 import com.azure.search.documents.SearchClient;
 import com.azure.search.documents.SearchClientBuilder;
+import com.azure.search.documents.indexes.SearchIndexAsyncClient;
+import com.azure.search.documents.indexes.SearchIndexClientBuilder;
 import org.springframework.beans.factory.annotation.Value;
 import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
 import org.springframework.context.annotation.Bean;
@@ -82,4 +84,15 @@ public SearchAsyncClient asyncSearchDefaultClient() {
                 .indexName(indexName)
                 .buildAsyncClient();
     }
+
+    @Bean
+    @ConditionalOnProperty(name = "cognitive.tracing.enabled", havingValue = "true")
+    public SearchIndexAsyncClient asyncSearchIndexDefaultClient() {
+        String endpoint = "https://%s.search.windows.net".formatted(searchServiceName);
+
+        return new SearchIndexClientBuilder()
+                .endpoint(endpoint)
+                .credential(tokenCredential)
+                .buildAsyncClient();
+    }
 }
diff --git a/app/backend/src/main/java/com/microsoft/openai/samples/rag/proxy/OpenAIProxy.java b/app/backend/src/main/java/com/microsoft/openai/samples/rag/proxy/OpenAIProxy.java
index 190104a..8eba767 100644
--- a/app/backend/src/main/java/com/microsoft/openai/samples/rag/proxy/OpenAIProxy.java
+++ b/app/backend/src/main/java/com/microsoft/openai/samples/rag/proxy/OpenAIProxy.java
@@ -94,6 +94,7 @@ public Embeddings getEmbeddings(List<String> texts) {
             EmbeddingsOptions embeddingsOptions = new EmbeddingsOptions(texts);
             embeddingsOptions.setUser("search-openai-demo-java");
             embeddingsOptions.setModel(this.embeddingDeploymentModelId);
+            embeddingsOptions.setDimensions(1536);
             embeddingsOptions.setInputType("query");
             embeddings = client.getEmbeddings(this.embeddingDeploymentModelId, embeddingsOptions);
         } catch
 (HttpResponseException e) {
diff --git a/app/backend/src/main/java/com/microsoft/openai/samples/rag/retrieval/AzureAISearchRetriever.java b/app/backend/src/main/java/com/microsoft/openai/samples/rag/retrieval/AzureAISearchRetriever.java
index edf7d1c..1644fda 100644
--- a/app/backend/src/main/java/com/microsoft/openai/samples/rag/retrieval/AzureAISearchRetriever.java
+++ b/app/backend/src/main/java/com/microsoft/openai/samples/rag/retrieval/AzureAISearchRetriever.java
@@ -5,12 +5,13 @@
 import com.azure.ai.openai.models.Embeddings;
 import com.azure.core.util.Context;
 import com.azure.search.documents.SearchDocument;
+import com.azure.search.documents.models.QueryCaption;
 import com.azure.search.documents.models.QueryCaptionType;
-import com.azure.search.documents.models.QueryLanguage;
-import com.azure.search.documents.models.QuerySpellerType;
 import com.azure.search.documents.models.QueryType;
 import com.azure.search.documents.models.SearchOptions;
-import com.azure.search.documents.models.SearchQueryVector;
+import com.azure.search.documents.models.SemanticSearchOptions;
+import com.azure.search.documents.models.VectorSearchOptions;
+import com.azure.search.documents.models.VectorizedQuery;
 import com.azure.search.documents.util.SearchPagedIterable;
 import com.microsoft.openai.samples.rag.approaches.ContentSource;
 import com.microsoft.openai.samples.rag.approaches.RAGOptions;
@@ -126,7 +127,7 @@ private List<ContentSource> buildSourcesFromSearchResults(
 
         if (options.isSemanticCaptions()) {
             StringBuilder sourcesContentBuffer = new StringBuilder();
-            result.getCaptions()
+            result.getSemanticSearch().getQueryCaptions()
                     .forEach(
                             caption ->
                                     sourcesContentBuffer
@@ -162,16 +163,15 @@ private void setSearchOptionsForVector(
         Optional.ofNullable(options.getTop())
                 .ifPresentOrElse(searchOptions::setTop, () -> searchOptions.setTop(3));
 
+        VectorizedQuery query = new VectorizedQuery(questionVector)
+                .setKNearestNeighborsCount(options.getTop())
+                .setFields("embedding");
+        // "embedding" is
+        // the field name in the index where the embeddings are stored.
-        searchOptions.setVectors(
-                new SearchQueryVector()
-                        .setValue(questionVector)
-                        .setKNearestNeighborsCount(options.getTop())
-                        .setFields("embedding"));
+        searchOptions.setVectorSearchOptions(new VectorSearchOptions().setQueries(query));
     }
 
     private void setSearchOptions(RAGOptions options, SearchOptions searchOptions) {
-
         Optional.ofNullable(options.getTop())
                 .ifPresentOrElse(searchOptions::setTop, () -> searchOptions.setTop(3));
         Optional.ofNullable(options.getExcludeCategory())
@@ -186,11 +186,14 @@ private void setSearchOptions(RAGOptions options, SearchOptions searchOptions) {
                         isSemanticRanker -> {
                             if (isSemanticRanker) {
                                 searchOptions.setQueryType(QueryType.SEMANTIC);
-                                searchOptions.setQueryLanguage(QueryLanguage.EN_US);
-                                searchOptions.setSpeller(QuerySpellerType.LEXICON);
-                                searchOptions.setSemanticConfigurationName("default");
-                                searchOptions.setQueryCaption(QueryCaptionType.EXTRACTIVE);
-                                searchOptions.setQueryCaptionHighlightEnabled(false);
+                                searchOptions.setSemanticSearchOptions(
+                                        new SemanticSearchOptions()
+                                                .setSemanticConfigurationName("default")
+                                                .setQueryCaption(
+                                                        new QueryCaption(QueryCaptionType.EXTRACTIVE)
+                                                                .setHighlightEnabled(false)
+                                                )
+                                );
                             }
                         });
     }
diff --git a/app/backend/src/main/java/com/microsoft/openai/samples/rag/retrieval/semantickernel/AzureAISearchVectorStoreUtils.java b/app/backend/src/main/java/com/microsoft/openai/samples/rag/retrieval/semantickernel/AzureAISearchVectorStoreUtils.java
new file mode 100644
index 0000000..160d2a9
--- /dev/null
+++ b/app/backend/src/main/java/com/microsoft/openai/samples/rag/retrieval/semantickernel/AzureAISearchVectorStoreUtils.java
@@ -0,0 +1,191 @@
+package com.microsoft.openai.samples.rag.retrieval.semantickernel;
+
+import com.azure.search.documents.models.QueryCaption;
+import com.azure.search.documents.models.QueryCaptionType;
+import com.azure.search.documents.models.QueryType;
+import com.azure.search.documents.models.SearchOptions;
+import com.azure.search.documents.models.SemanticSearchOptions;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.microsoft.openai.samples.rag.approaches.ContentSource;
+import com.microsoft.openai.samples.rag.approaches.RAGOptions;
+import com.microsoft.openai.samples.rag.approaches.RetrievalMode;
+import com.microsoft.semantickernel.Kernel;
+import com.microsoft.semantickernel.data.azureaisearch.AzureAISearchVectorStoreRecordCollection;
+import com.microsoft.semantickernel.data.vectorsearch.VectorSearchResult;
+import com.microsoft.semantickernel.data.vectorsearch.VectorSearchResults;
+import com.microsoft.semantickernel.data.vectorstorage.annotations.VectorStoreRecordData;
+import com.microsoft.semantickernel.data.vectorstorage.annotations.VectorStoreRecordKey;
+import com.microsoft.semantickernel.data.vectorstorage.annotations.VectorStoreRecordVector;
+import com.microsoft.semantickernel.data.vectorstorage.definition.DistanceFunction;
+import com.microsoft.semantickernel.data.vectorstorage.options.VectorSearchOptions;
+import com.microsoft.semantickernel.services.ServiceNotFoundException;
+import com.microsoft.semantickernel.services.textembedding.EmbeddingGenerationService;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.List;
+import java.util.Optional;
+import java.util.stream.Collectors;
+
+public class AzureAISearchVectorStoreUtils {
+
+    private static final Logger LOGGER = LoggerFactory.getLogger(AzureAISearchVectorStoreUtils.class);
+
+    private static final String EMBEDDING_FIELD_NAME = "embedding";
+
+    public static class DocumentRecord {
+        @VectorStoreRecordKey
+        private final String id;
+        @VectorStoreRecordData
+        private final String content;
+        @VectorStoreRecordVector(dimensions = 1536, distanceFunction = DistanceFunction.COSINE_DISTANCE)
+        private final List<Float> embedding;
+        @VectorStoreRecordData
+        private final String category;
+        @JsonProperty("sourcepage")
+        @VectorStoreRecordData
+        private final String sourcePage;
+        @JsonProperty("sourcefile")
+        @VectorStoreRecordData
+        private final String sourceFile;
+
+        public DocumentRecord(
+                @JsonProperty("id") String id,
+                @JsonProperty("content") String content,
+                @JsonProperty("embedding") List<Float> embedding,
+                @JsonProperty("sourcepage") String sourcePage,
+                @JsonProperty("sourcefile") String sourceFile,
+                @JsonProperty("category") String category) {
+            this.id = id;
+            this.content = content;
+            this.embedding = embedding;
+            this.sourcePage = sourcePage;
+            this.sourceFile = sourceFile;
+            this.category = category;
+        }
+
+        public String getId() {
+            return id;
+        }
+
+        public String getContent() {
+            return content;
+        }
+
+        public List<Float> getEmbedding() {
+            return embedding;
+        }
+
+        public String getCategory() {
+            return category;
+        }
+
+        public String getSourcePage() {
+            return sourcePage;
+        }
+
+        public String getSourceFile() {
+            return sourceFile;
+        }
+    }
+
+
+    public static List<DocumentRecord> searchAsync(String searchQuery,
+                                                   Kernel kernel,
+                                                   AzureAISearchVectorStoreRecordCollection<DocumentRecord> recordCollection,
+                                                   RAGOptions ragOptions) {
+        // Create VectorSearch options
+        VectorSearchOptions vectorSearchOptions = VectorSearchOptions.builder()
+                .withTop(ragOptions.getTop())
+                .withVectorFieldName(EMBEDDING_FIELD_NAME)
+                .build();
+
+        // Vector to search
+        List<Float> questionVector = null;
+
+        // Additional AzureAISearch options
+        SearchOptions searchOptions = getAdditionalSearchOptions(ragOptions);
+
+        // If the retrieval mode is set to vectors or hybrid, convert the user's query text to an
+        // embeddings vector. The embeddings vector is passed as search options to Azure AI Search index
+        if (ragOptions.getRetrievalMode() == RetrievalMode.vectors
+                || ragOptions.getRetrievalMode() == RetrievalMode.hybrid) {
+            LOGGER.info(
+                    "Retrieval mode is set to {}.
+                            Retrieving vectors for question [{}]",
+                    ragOptions.getRetrievalMode(),
+                    searchQuery);
+
+            try {
+                // Get the embedding service from the kernel
+                EmbeddingGenerationService<String> embeddingService = (EmbeddingGenerationService<String>) kernel.getService(EmbeddingGenerationService.class);
+                // Generate the embeddings
+                questionVector = embeddingService.generateEmbeddingAsync(searchQuery).block().getVector();
+            } catch (ServiceNotFoundException e) {
+                throw new RuntimeException(e);
+            }
+        }
+
+        // Search the vector store for the relevant documents with the generated embeddings
+        VectorSearchResults<DocumentRecord> memoryResult = recordCollection.hybridSearchAsync(searchQuery, questionVector, vectorSearchOptions, searchOptions)
+                .block();
+
+        // Remove the score from the result
+        return memoryResult.getResults().stream().map(VectorSearchResult::getRecord).collect(Collectors.toList());
+    }
+
+
+    public static List<ContentSource> buildSources(List<DocumentRecord> memoryResult) {
+        return memoryResult
+                .stream()
+                .map(result -> {
+                    return new ContentSource(
+                            result.getSourcePage(),
+                            result.getContent()
+                    );
+                })
+                .collect(Collectors.toList());
+    }
+
+
+    public static String buildSourcesText(List<DocumentRecord> memoryResult) {
+        StringBuilder sourcesContentBuffer = new StringBuilder();
+        memoryResult.stream().forEach(memory -> {
+            sourcesContentBuffer.append(memory.getSourceFile())
+                    .append(": ")
+                    .append(memory.getContent().replace("\n", ""))
+                    .append("\n");
+        });
+        return sourcesContentBuffer.toString();
+    }
+
+
+    private static SearchOptions getAdditionalSearchOptions(RAGOptions options) {
+        SearchOptions searchOptions = new SearchOptions();
+
+        Optional.ofNullable(options.getTop())
+                .ifPresentOrElse(searchOptions::setTop, () -> searchOptions.setTop(3));
+        Optional.ofNullable(options.getExcludeCategory())
+                .ifPresentOrElse(
+                        value ->
+                                searchOptions.setFilter(
+                                        "category ne '%s'".formatted(value.replace("'", "''"))),
+                        () -> searchOptions.setFilter(null));
+
+        Optional.ofNullable(options.isSemanticRanker())
+                .ifPresent(
isSemanticRanker -> { + if (isSemanticRanker) { + searchOptions.setQueryType(QueryType.SEMANTIC); + searchOptions.setSemanticSearchOptions( + new SemanticSearchOptions() + .setSemanticConfigurationName("default") + .setQueryCaption( + new QueryCaption(QueryCaptionType.EXTRACTIVE) + .setHighlightEnabled(false) + ) + ); + } + }); + return searchOptions; + } +} diff --git a/app/backend/src/main/resources/semantickernel/Plugins/RAG/ExtractKeywords/extractKeywords.prompt.yaml b/app/backend/src/main/resources/semantickernel/Plugins/RAG/ExtractKeywords/extractKeywords.prompt.yaml index 9847007..53467a3 100644 --- a/app/backend/src/main/resources/semantickernel/Plugins/RAG/ExtractKeywords/extractKeywords.prompt.yaml +++ b/app/backend/src/main/resources/semantickernel/Plugins/RAG/ExtractKeywords/extractKeywords.prompt.yaml @@ -4,8 +4,9 @@ template: | Generate a search query for the below conversation. Do not include cited source filenames and document names e.g info.txt or doc.pdf in the search query terms. - Do not include any text inside [] or <<>> in the search query terms. + Do not include any text inside [] or <<>> in the search query terms. Do not enclose the search query in quotes or double quotes. 
+
   conversation:
   {{#each conversation}}
diff --git a/app/backend/src/test/java/com/microsoft/openai/samples/rag/approaches/RAGApproachFactorySpringBootImplTest.java b/app/backend/src/test/java/com/microsoft/openai/samples/rag/approaches/RAGApproachFactorySpringBootImplTest.java
index 0c9d4ba..70208df 100644
--- a/app/backend/src/test/java/com/microsoft/openai/samples/rag/approaches/RAGApproachFactorySpringBootImplTest.java
+++ b/app/backend/src/test/java/com/microsoft/openai/samples/rag/approaches/RAGApproachFactorySpringBootImplTest.java
@@ -6,6 +6,7 @@
 import com.azure.search.documents.SearchAsyncClient;
 import com.microsoft.openai.samples.rag.ask.approaches.PlainJavaAskApproach;
 import com.microsoft.openai.samples.rag.ask.approaches.semantickernel.JavaSemanticKernelChainsApproach;
+import com.microsoft.openai.samples.rag.ask.approaches.semantickernel.JavaSemanticKernelWithVectorStoreApproach;
 import com.microsoft.openai.samples.rag.chat.approaches.PlainJavaChatApproach;
 import com.microsoft.openai.samples.rag.proxy.AzureAISearchProxy;
 import org.junit.jupiter.api.Test;
@@ -38,10 +39,8 @@ void testCreateApproachWithJavaPlain() {
 
     @Test
     void testCreateApproachWithJavaSemanticKernelMemory() {
-        assertThrows(IllegalArgumentException.class, () -> {
-            RAGApproach approach = ragApproachFactory.createApproach("jsk", RAGType.ASK, null);
-        });
-        //assertInstanceOf(JavaSemanticKernelWithMemoryApproach.class, approach);
+        RAGApproach approach = ragApproachFactory.createApproach("jsk", RAGType.ASK, null);
+        assertInstanceOf(JavaSemanticKernelWithVectorStoreApproach.class, approach);
     }
 
     @Test
diff --git a/app/frontend/src/pages/chat/Chat.tsx b/app/frontend/src/pages/chat/Chat.tsx
index cb86f51..c99da6d 100644
--- a/app/frontend/src/pages/chat/Chat.tsx
+++ b/app/frontend/src/pages/chat/Chat.tsx
@@ -251,15 +251,14 @@ const Chat = () => {
             key: Approaches.JAVA_OPENAI_SDK,
             text: "Java Azure Open AI SDK"
         },
-        /* Pending Semantic Kernel Memory implementation in V1.0.0
         {
             key: Approaches.JAVA_SEMANTIC_KERNEL,
-            text: "Java Semantic Kernel - Memory"
-        },*/
+            text: "Java Semantic Kernel"
+        }/**,
         {
             key: Approaches.JAVA_SEMANTIC_KERNEL_PLANNER,
-            text: "Java Semantic Kernel"
-        }
+            text: "Java Semantic Kernel - Chains"
+        },*/
     ];
 
     return (
@@ -375,7 +374,7 @@ const Chat = () => {
                         onChange={onApproachChange}
                     />
-                    {(approach === Approaches.JAVA_OPENAI_SDK || approach === Approaches.JAVA_SEMANTIC_KERNEL) && (
+                    {(approach === Approaches.JAVA_OPENAI_SDK) && (
                             label="Use semantic ranker for retrieval"
                             onChange={onUseSemanticRankerChange}
                         />
-
+                    {(approach === Approaches.JAVA_OPENAI_SDK || approach === Approaches.JAVA_SEMANTIC_KERNEL_PLANNER) && (
+
+                    )}
                     )*/}
-                    {(approach === Approaches.JAVA_OPENAI_SDK || approach === Approaches.JAVA_SEMANTIC_KERNEL) && (
+                    {(approach === Approaches.JAVA_OPENAI_SDK) && (
-                    {(approach === Approaches.JAVA_OPENAI_SDK || approach === Approaches.JAVA_SEMANTIC_KERNEL_PLANNER) && (
-                    )}
+                    {(approach === Approaches.JAVA_OPENAI_SDK || approach === Approaches.JAVA_SEMANTIC_KERNEL_PLANNER) && (
+                    )}
-                    {(approach === Approaches.JAVA_OPENAI_SDK || approach === Approaches.JAVA_SEMANTIC_KERNEL_PLANNER) && (
-                    )}
+
+
                     {useLogin && (
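
Two small pieces of backend logic in the diff above are easy to get wrong: the OData category filter must double single quotes before interpolation, and `buildSourcesText` flattens each retrieved document to a one-line `sourcefile: content` entry for the prompt. A minimal standalone sketch of just that formatting logic (the `SourcesSketch` class name and `String[]` stand-in for `DocumentRecord` are hypothetical, used here only to keep the example free of Azure/Semantic Kernel dependencies):

```java
import java.util.List;

public class SourcesSketch {
    // Mirrors getAdditionalSearchOptions: escape single quotes so the value
    // is safe inside an OData filter expression.
    static String categoryFilter(String category) {
        return "category ne '%s'".formatted(category.replace("'", "''"));
    }

    // Mirrors buildSourcesText: one "sourcefile: content" line per document,
    // with newlines stripped from the content. Each doc is {sourceFile, content}.
    static String buildSourcesText(List<String[]> docs) {
        StringBuilder sourcesContentBuffer = new StringBuilder();
        for (String[] doc : docs) {
            sourcesContentBuffer.append(doc[0])
                    .append(": ")
                    .append(doc[1].replace("\n", ""))
                    .append("\n");
        }
        return sourcesContentBuffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(categoryFilter("O'Reilly")); // prints: category ne 'O''Reilly'
        System.out.print(buildSourcesText(
                List.of(new String[] {"info.txt", "line1\nline2"})));
    }
}
```

Note the quote doubling: a category such as `O'Reilly` becomes `category ne 'O''Reilly'`, which Azure AI Search parses as a literal single quote rather than a string terminator.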