This notebook builds on top of Question answering using embeddings-based search, but the data is loaded from Wikipedia using LlamaIndex.
Install the Azure OpenAI SDK using the command below.
#r "nuget: Azure.AI.OpenAI, 1.0.0-beta.12"
#r "nuget:Microsoft.DotNet.Interactive.AIUtilities, 1.0.0-beta.24054.2"
using Microsoft.DotNet.Interactive;
using Microsoft.DotNet.Interactive.AIUtilities;
var azureOpenAIKey = await Kernel.GetPasswordAsync("Provide your OPEN_AI_KEY");
// Your endpoint should look like the following https://YOUR_OPEN_AI_RESOURCE_NAME.openai.azure.com/
var azureOpenAIEndpoint = await Kernel.GetInputAsync("Provide the OPEN_AI_ENDPOINT");
// Enter the deployment name you chose when you deployed the model.
var chatDeployment = await Kernel.GetInputAsync("Provide chat deployment name");
Create the OpenAIClient using the azureOpenAIEndpoint and the azureOpenAIKey
using Azure;
using Azure.AI.OpenAI;
OpenAIClient client = new (new Uri(azureOpenAIEndpoint), new AzureKeyCredential(azureOpenAIKey.GetClearTextPassword()));
We need to use Python to load and index data. Read the guide to get started with Python in Polyglot Notebooks and this doc on how to connect.
First we need to connect a Python kernel. In this example we are using an Anaconda-based deployment and a conda environment called AI.
The environment requires the packages imported in the Python setup cell below, including llama_index, wikipedia, and nest_asyncio.
#!connect jupyter --kernel-name python3 --kernel-spec python3 --conda-env AI
The #!connect jupyter feature is in preview. Please report any feedback or issues at https://github.com/dotnet/interactive/issues/new/choose.
Kernel added: #!python3
using System.Linq;
using System.Text.Json;
using Microsoft.DotNet.Interactive;
using Microsoft.DotNet.Interactive.Commands;
using Microsoft.DotNet.Interactive.Events;
using Microsoft.DotNet.Interactive.Formatting;
var pythonKernel = Kernel.Root.FindKernelByName("python3");
Exporting values to the python3 kernel
var azureOpenAIKeyAsString = azureOpenAIKey.GetClearTextPassword();
#!set --value @csharp:azureOpenAIKeyAsString --name azureOpenAIKey
#!set --value @csharp:chatDeployment --name chatDeployment
#!set --value @csharp:azureOpenAIEndpoint --name azureOpenAIEndpoint
Now we need to set up the Python kernel.
from typing import Any, Dict, List
import wikipedia
from llama_index import Document, ServiceContext, VectorStoreIndex, download_loader
from llama_index.embeddings import HuggingFaceEmbedding, OpenAIEmbedding
from llama_index.extractors import TitleExtractor
from llama_index.ingestion import IngestionCache, IngestionPipeline
from llama_index.llama_pack import download_llama_pack
from llama_index.llama_pack.base import BaseLlamaPack
from llama_index.llms import AzureOpenAI
from llama_index.node_parser import SemanticSplitterNodeParser, SentenceSplitter
from llama_index.query_engine import RetrieverQueryEngine
from llama_index.readers.base import BaseReader
from llama_index.retrievers import QueryFusionRetriever
from llama_index.schema import TextNode
import nest_asyncio
nest_asyncio.apply()
llm = AzureOpenAI(
    engine=chatDeployment,
    model=chatDeployment,
    temperature=0.0,
    azure_endpoint=azureOpenAIEndpoint,
    api_key=azureOpenAIKey,
    api_version="2023-07-01-preview"
)
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model)
QueryRewritingRetrieverPack = download_llama_pack("QueryRewritingRetrieverPack", "./query_rewriting_pack")
Loading pages from Wikipedia and wrapping them as LlamaIndex documents
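# Set the language once, before searching.
wikipedia.set_lang("en")
pages = wikipedia.search("2022 winter olympics")
documents = []
for page in pages:
    try:
        page_content = wikipedia.page(page).content
        documents.append(Document(text=page_content))
    except Exception:
        # Skip titles that cannot be loaded (for example disambiguation or missing pages).
        # The list is not modified while iterating, so no title is skipped by accident.
        continue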
Now we use a pipeline to create a set of nodes and compute embeddings.
splitter = SemanticSplitterNodeParser(buffer_size=1, breakpoint_percentile_threshold=95, embed_model=embed_model)
# create the pipeline with transformations
pipeline = IngestionPipeline( transformations=[ splitter, embed_model ])
# run the pipeline
nodes = pipeline.run(documents=documents)
index = VectorStoreIndex(nodes, service_context=service_context)
vector_retriever = index.as_retriever(similarity_top_k=10)
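The semantic splitter above is configured with breakpoint_percentile_threshold=95. The idea can be illustrated with a minimal sketch, which is not LlamaIndex's implementation: compute the distance between the embeddings of adjacent sentences and start a new chunk wherever that distance exceeds the chosen percentile of all distances. The helper names (cosine_distance, percentile, semantic_chunks) and the 2-dimensional toy embeddings are assumptions made for illustration, and the buffer_size grouping of neighboring sentences is left out.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def percentile(values, pct):
    """Percentile with linear interpolation between the two nearest ranks."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * pct / 100.0
    lower = math.floor(position)
    upper = math.ceil(position)
    weight = position - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight

def semantic_chunks(sentences, embeddings, threshold_pct=95):
    """Group sentences into chunks, splitting at unusually large embedding jumps."""
    if len(sentences) < 2:
        return [" ".join(sentences)] if sentences else []
    # Distance between each sentence and the one that follows it.
    distances = [
        cosine_distance(embeddings[i], embeddings[i + 1])
        for i in range(len(embeddings) - 1)
    ]
    cutoff = percentile(distances, threshold_pct)
    chunks, current = [], [sentences[0]]
    for sentence, distance in zip(sentences[1:], distances):
        if distance > cutoff:
            # The topic changed sharply, so close the current chunk.
            chunks.append(" ".join(current))
            current = []
        current.append(sentence)
    chunks.append(" ".join(current))
    return chunks

# Toy data: two sentences about one topic, then two about another.
sentences = ["Beijing hosted the Games.", "Venues were in three zones.",
             "Norway led the medal table.", "Germany came second."]
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
chunks = semantic_chunks(sentences, embeddings)
print(chunks)
```

A lower threshold produces more, smaller chunks, and a higher threshold splits only at the sharpest topic changes.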
fusion_retriever = QueryFusionRetriever(
    [vector_retriever],
    llm=service_context.llm,
    similarity_top_k=10,
    num_queries=16,  # set this to 1 to disable query generation
    mode="reciprocal_rerank",
    # query_gen_prompt="...",  # we could override the query generation prompt here
    verbose=True
)
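The fusion retriever above runs the original query together with the generated queries and merges the result lists using mode="reciprocal_rerank". As a rough illustration of reciprocal rank fusion, and not LlamaIndex's actual code, the sketch below gives each document a score of 1 / (rank + k) for every list it appears in and sorts by the summed score. The function name reciprocal_rank_fusion, the constant k=60, and the zero-based ranks are assumptions chosen for this example.

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60.0, top_k=10):
    """Fuse several ranked lists of document ids into a single ranking."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking):
            # Documents near the top of any list receive a larger contribution.
            scores[doc_id] += 1.0 / (rank + k)
    fused = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [doc_id for doc_id, _ in fused[:top_k]]

# One ranked list per query variant (hypothetical document ids).
rankings = [
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
]
fused = reciprocal_rank_fusion(rankings, top_k=3)
print(fused)  # ['b', 'a', 'c']
```

Document "b" ranks first because it is at or near the top of every list, even though "a" wins one of them. This is why fusing results from several rewritten queries tends to favor passages that are consistently relevant.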
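public async Task<string[]> Search(string query){
    // Escape backslashes and double quotes so the query can be embedded in a Python string literal.
    var escapedQuery = query.Replace("\\", "\\\\").Replace("\"", "\\\"");
    // Run the retrieval in the python3 kernel and collect the text of each retrieved node.
    await pythonKernel.SendAsync(new SubmitCode(
        $"""
        retrievedNodes = fusion_retriever.retrieve("{escapedQuery}")
        articles = []
        for node in retrievedNodes:
            articles.append(node.text)
        """));
    // Read the 'articles' variable back from the python3 kernel as JSON.
    var getValue = new RequestValue("articles", JsonFormatter.MimeType);
    var result = await pythonKernel.SendAsync(getValue);
    var returnValueProduced = result.Events.OfType<ValueProduced>().LastOrDefault()
        ?? throw new InvalidOperationException("The python3 kernel did not return a value for 'articles'.");
    var json = returnValueProduced.FormattedValue.Value;
    var searchResults = JsonSerializer.Deserialize<string[]>(json);
    return searchResults;
}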
var tokenizer = await Tokenizer.CreateAsync(TokenizerModel.gpt35);
public async Task<string> AskAsync(string question){
    var searchResults = await Search(question);
    var articles = string.Join("\n", searchResults.Select(s => $"""
        Wikipedia article section:
        {s}
        """));
    var userQuestion = $"""""
        Use the below articles on the 2022 Winter Olympics to answer the subsequent question. If the answer cannot be found in the articles, write "I could not find an answer."
        {articles}
        Question: {question}
        """"";
    var options = new ChatCompletionsOptions{
        Messages =
        {
            new ChatRequestSystemMessage(@"You answer questions about the 2022 Winter Olympics."),
            new ChatRequestUserMessage(userQuestion)
        },
        Temperature = 0f,
        MaxTokens = 3500,
        DeploymentName = chatDeployment
    };
    var response = await client.GetChatCompletionsAsync(options);
    var answer = response.Value.Choices.FirstOrDefault()?.Message?.Content;
    return answer;
}
await AskAsync("Where did the 2022 winter Olympics took place?")
Generated queries:
1. Location of 2022 Winter Olympics
2. City that hosted 2022 Winter Olympics
3. Country where 2022 Winter Olympics were held
4. Venues of 2022 Winter Olympics
5. 2022 Winter Olympics host city details
6. Information about the place where 2022 Winter Olympics took place
7. 2022 Winter Olympics location history
8. Details about the 2022 Winter Olympics location
9. Where were the 2022 Winter Olympics held?
10. Host city of the 2022 Winter Olympics
11. 2022 Winter Olympics host country
12. Location and venues of 2022 Winter Olympics
13. Information on 2022 Winter Olympics host city
14. 2022 Winter Olympics location and details
15. Which city hosted the 2022 Winter Olympics?
The 2022 Winter Olympics took place in Beijing, China.
await AskAsync("What countries did take part in the 2022 winter Olympics? Write me the complete list of the countries.")
Generated queries:
1. List of all countries that participated in the 2022 Winter Olympics
2. Which nations competed in the 2022 Winter Olympics?
3. Full list of countries in the 2022 Winter Olympics
4. Names of countries that took part in the 2022 Winter Olympics
5. How many countries participated in the 2022 Winter Olympics?
6. Participating nations in the 2022 Winter Olympics
7. Countries that competed in the 2022 Winter Olympics
8. Complete list of 2022 Winter Olympics participating countries
9. All countries that were in the 2022 Winter Olympics
10. 2022 Winter Olympics participants by country
11. Countries that sent athletes to the 2022 Winter Olympics
12. Which countries were represented in the 2022 Winter Olympics?
13. List of nations that competed in the 2022 Winter Olympics
14. Countries that took part in the 2022 Winter Olympics
15. Full roster of countries in the 2022 Winter Olympics.
I could not find an answer.
await AskAsync("What countries did take part in the 2022 winter Olympics, what months where they held?")
Generated queries:
1. List of countries participating in the 2022 winter Olympics
2. Winter Olympic countries in 2022
3. Which nations competed in the 2022 winter Olympics?
4. Countries involved in the 2022 winter Olympics
5. 2022 winter Olympics participants by country
6. What countries were represented in the 2022 winter Olympics?
7. Nations that took part in the 2022 winter Olympics
8. 2022 winter Olympics: participating countries
9. Countries that competed in the 2022 winter Olympics
10. Winter Olympic nations in 2022
11. List of countries and months for the 2022 winter Olympics
The countries that took part in the 2022 Winter Olympics were not mentioned in the provided articles. However, it is mentioned that Norway led the total medal standings with 39 medals, Germany had 31 medals, Canada had 29 medals, and South Korea won 17 medals. The Winter Olympics were held between 4 and 20 February 2022.