{"id":196,"date":"2024-03-17T11:42:02","date_gmt":"2024-03-17T11:42:02","guid":{"rendered":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/2024\/03\/17\/get-started-with-milvus-vector-db-in-net\/"},"modified":"2024-03-17T18:34:31","modified_gmt":"2024-03-17T18:34:31","slug":"get-started-with-milvus-vector-db-in-net","status":"publish","type":"post","link":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/2024\/03\/17\/get-started-with-milvus-vector-db-in-net\/","title":{"rendered":"Get Started with Milvus Vector DB in .NET"},"content":{"rendered":"<p>Vector databases have become an important component of Generative AI workloads powering scenarios like search and Retrieval Augmented Generation (RAG).<\/p>\n<p>The .NET team has worked closely with Milvus to enable .NET developers to use vector databases in their applications.<\/p>\n<p>In this post, we\u2019ll show how you can quickly get started using the Milvus .NET SDK currently in preview.<\/p>\n<h2>What is Milvus?<\/h2>\n<p>Milvus is a vector database that can store, index, and manage embedding vectors generated  by deep neural networks and other machine learning models. <\/p>\n<p>For more details, see the <a href=\"https:\/\/milvus.io\/\">Milvus website<\/a>.<\/p>\n<h2>What are embedding vectors?<\/h2>\n<p>Embedding vectors are numerical representations of data such as text, images, and audio.  These numerical representations can be thought of as a collection of floating point values. <\/p>\n<p>In this example, you\u2019re looking at a visualization of movies based on their embedding  vector representations.<\/p>\n\n<p>These vectors are created by machine learning models such as the <a href=\"https:\/\/learn.microsoft.com\/azure\/ai-services\/openai\/concepts\/models#embeddings\">text embedding models from OpenAI<\/a>.<\/p>\n\n<p>Similar movies end up with similar embedding vector representations. 
For example, movies like \u201cThe Lion King\u201d and \u201cToy Story\u201d might have similar vectors because they\u2019re both animated and family-friendly, while \u201cInception\u201d and \u201cPulp Fiction\u201d would have different vectors because they belong to different genres and styles.<\/p>\n<p>These embedding vectors help computers understand and compare movies, which is useful for search and recommendation systems.<\/p>\n<p>Generative AI applications need to supply relevant context so that Large Language Models (LLMs) like GPT can generate relevant responses. Embedding vectors help retrieve that context.<\/p>\n<p>To learn more about embeddings, check out the following posts:<\/p>\n<p><a href=\"https:\/\/openai.com\/blog\/introducing-text-and-code-embeddings\">OpenAI \u2013 Introducing Text and Code Embeddings<\/a><br \/>\n<a href=\"https:\/\/learn.microsoft.com\/azure\/ai-services\/openai\/how-to\/embeddings?tabs=csharp\">Generate embeddings with Azure OpenAI<\/a><br \/>\n<a href=\"https:\/\/devblogs.microsoft.com\/dotnet\/demystifying-retrieval-augmented-generation-with-dotnet\/#mind-the-gap\">Demystifying Retrieval Augmented Generation<\/a><br \/>\n<a href=\"https:\/\/devblogs.microsoft.com\/dotnet\/transform-business-smart-dotnet-apps-azure-chatgpt\/#chat-with-your-data\">Transform your business with smart .NET apps powered by Azure and ChatGPT<\/a><br \/>\n<a href=\"https:\/\/arxiv.org\/abs\/1301.3781\">Efficient Estimation of Word Representations in Vector Space<\/a><br \/>\n<a href=\"https:\/\/www.andrewng.org\/publications\/improving-word-representations-via-global-context-and-multiple-word-prototypes\/\">Improving Word Representations via Global Context and Multiple Word Prototypes<\/a><br \/>\n<a href=\"https:\/\/towardsdatascience.com\/deep-learning-for-nlp-word-embeddings-4f5c90bcdab5\">Deep Learning for NLP: Word Embeddings<\/a><\/p>\n<h2>Why Vector DBs?<\/h2>\n<p>Similar to how relational 
databases and document databases are optimized for structured and semi-structured data, vector databases are built to efficiently store, index, and manage data represented as embedding vectors. As a result, the indexing algorithms used by vector databases are optimized to efficiently retrieve data that can be used downstream in your applications, which may have search and AI components.<\/p>\n<h2>Get Started with Milvus in .NET<\/h2>\n<p>The code samples in this blog post are for illustration purposes. See the getting started sample for more detail.<\/p>\n<h3>Deploy Milvus to Azure<\/h3>\n<p>The easiest way for you to get started is by deploying an instance of the Milvus database to Azure.<\/p>\n<p>Milvus is available through the <a href=\"https:\/\/zilliz.com\/blog\/zilliz-cloud-now-available-on-microsoft-azure\">Zilliz Cloud for Azure<\/a>, the managed version of Milvus.<\/p>\n<p>It\u2019s also available as an <a href=\"https:\/\/learn.microsoft.com\/azure\/container-apps\/services\">Azure Container Apps Add-On<\/a>. In future blog posts, we\u2019ll explore how to get started with these add-ons. Stay tuned!<\/p>\n<h3>Connect to the database<\/h3>\n<p>Assuming you have an instance of Milvus deployed:<\/p>\n<p>Create a C# console application or Polyglot Notebook.<br \/>\nInstall the Milvus.Client NuGet package.<\/p>\n<p>Use the Milvus SDK to create a client and connect to your database. Make sure to replace \u201clocalhost\u201d with your Milvus service host, and the username and password with your own credentials.<\/p>\n<p>var milvusClient = new MilvusClient(&quot;localhost&quot;, username: &quot;username&quot;, password: &quot;password&quot;);<\/p>\n<h3>Create a collection<\/h3>\n<p>Data is organized in collections. Let\u2019s assume we\u2019re creating a collection to store movie data.<\/p>\n<p>Start by defining your schema. 
The schema will contain three fields:<\/p>\n<p><em>movie_id<\/em>: The unique identifier for a movie<br \/>\n<em>movie_name<\/em>: The title of the movie<br \/>\n<em>movie_description<\/em>: The embedding vector for the movie description<\/p>\n<p>var schema = new CollectionSchema<br \/>\n{<br \/>\n    Fields =<br \/>\n    {<br \/>\n        FieldSchema.Create&lt;long&gt;(&quot;movie_id&quot;, isPrimaryKey: true),<br \/>\n        FieldSchema.CreateVarchar(&quot;movie_name&quot;, maxLength: 200),<br \/>\n        FieldSchema.CreateFloatVector(&quot;movie_description&quot;, dimension: 2)<br \/>\n    }<br \/>\n};<\/p>\n<p>Then, create your collection.<\/p>\n<p>var collection = await milvusClient.CreateCollectionAsync(collectionName: &quot;movies&quot;, schema: schema, shardsNum: 2);<\/p>\n<h3>Add data to your collection<\/h3>\n<p>Once your collection is created, add data to it.<\/p>\n<p>Here\u2019s the data we\u2019re using. In this sample, the embedding vectors for the movie descriptions have been precomputed for convenience. In a real-world scenario, you\u2019d use an embedding model to generate them. The table also includes the text description for demonstration purposes only; the collection stores only the embedding vectors, not the text.<\/p>\n<table>\n<thead>\n<tr><th>movie_id<\/th><th>movie_name<\/th><th>movie_description (embedding)<\/th><th>movie_description (text)<\/th><\/tr>\n<\/thead>\n<tbody>\n<tr><td>1<\/td><td>The Lion King<\/td><td>[0.10022575, -0.23998135]<\/td><td>The Lion King is a classic Disney animated film that tells the story of a young lion named Simba who embarks on a journey to reclaim his throne as the king of the Pride Lands after the tragic death of his father.<\/td><\/tr>\n<tr><td>2<\/td><td>Inception<\/td><td>[0.10327095, 0.2563685]<\/td><td>Inception is a mind-bending science fiction film directed by Christopher Nolan. 
It follows the story of Dom Cobb, a skilled thief who specializes in entering people\u2019s dreams to steal their secrets. However, he is offered a final job that involves planting an idea into someone\u2019s mind.<\/td><\/tr>\n<tr><td>3<\/td><td>Toy Story<\/td><td>[0.095857024, -0.201278]<\/td><td>Toy Story is a groundbreaking animated film from Pixar. It follows the secret lives of toys when their owner, Andy, is not around. Woody and Buzz Lightyear are the main characters in this heartwarming tale.<\/td><\/tr>\n<tr><td>4<\/td><td>Pulp Fiction<\/td><td>[0.106827796, 0.21676421]<\/td><td>Pulp Fiction is a crime film directed by Quentin Tarantino. It weaves together interconnected stories of mobsters, hitmen, and other colorful characters in a non-linear narrative filled with dark humor and violence.<\/td><\/tr>\n<tr><td>5<\/td><td>Shrek<\/td><td>[0.09568083, -0.21177962]<\/td><td>Shrek is an animated comedy film that follows the adventures of Shrek, an ogre who embarks on a quest to rescue Princess Fiona from a dragon-guarded tower in order to get his swamp back.<\/td><\/tr>\n<\/tbody>\n<\/table>\n<p>var movieIds = new [] { 1L, 2L, 3L, 4L, 5L };<br \/>\nvar movieNames = new [] { &quot;The Lion King&quot;, &quot;Inception&quot;, &quot;Toy Story&quot;, &quot;Pulp Fiction&quot;, &quot;Shrek&quot; };<br \/>\n\/\/ Embedding values match the table above, one vector per movie.<br \/>\nvar movieDescriptions = new ReadOnlyMemory&lt;float&gt;[] {<br \/>\n    new [] { 0.10022575f, -0.23998135f },<br \/>\n    new [] { 0.10327095f, 0.2563685f },<br \/>\n    new [] { 0.095857024f, -0.201278f },<br \/>\n    new [] { 0.106827796f, 0.21676421f },<br \/>\n    new [] { 0.09568083f, -0.21177962f }<br \/>\n};<\/p>\n<p>await collection.InsertAsync(new FieldData[]<br \/>\n{<br \/>\n    FieldData.Create(&quot;movie_id&quot;, movieIds),<br \/>\n    FieldData.Create(&quot;movie_name&quot;, movieNames),<br \/>\n    FieldData.CreateFloatVector(&quot;movie_description&quot;, movieDescriptions)<br \/>\n});<\/p>\n<h3>Search for movies<\/h3>\n<p>Let\u2019s say that we want to find movies that match a search query, \u201cA movie 
that\u2019s fun for the whole family\u201d.<\/p>\n<table>\n<thead>\n<tr><th>Query<\/th><th>Embedding<\/th><\/tr>\n<\/thead>\n<tbody>\n<tr><td>A movie that\u2019s fun for the whole family<\/td><td>[0.12217915, -0.034832448]<\/td><\/tr>\n<\/tbody>\n<\/table>\n<p>Start by creating an index on your movie collection. In this case, the index is named movie_idx and the indexed field is movie_description, which contains the embedding vectors of the movie descriptions. The remaining arguments configure how the index organizes information and conducts searches. For more details, see the <a href=\"https:\/\/milvus.io\/docs\/v2.2.x\/index.md\">Milvus vector index<\/a> and <a href=\"https:\/\/milvus.io\/docs\/v2.2.x\/metric.md\">similarity metric<\/a> documentation.<\/p>\n<p>await collection.CreateIndexAsync(<br \/>\n    fieldName: &quot;movie_description&quot;,<br \/>\n    indexType: IndexType.Flat,<br \/>\n    metricType: SimilarityMetricType.L2,<br \/>\n    indexName: &quot;movie_idx&quot;);<\/p>\n<p>Once your index is created, load your collection.<\/p>\n<p>await collection.LoadAsync();<br \/>\nawait collection.WaitForCollectionLoadAsync();<\/p>\n<p>Define parameters for your search. In this case, you want the result of your query to display the names of the movies most relevant to your query, so you add movie_name to OutputFields.<\/p>\n<p>var parameters = new SearchParameters<br \/>\n{<br \/>\n    OutputFields = { &quot;movie_name&quot; },<br \/>\n    ConsistencyLevel = ConsistencyLevel.Strong,<br \/>\n    ExtraParameters = { [&quot;nprobe&quot;] = &quot;1024&quot; }<br \/>\n};<\/p>\n<p>Then, conduct the search. Note that for vectors, I\u2019m passing in the embedding vector representation of my search query. Like the movie descriptions, it has been precomputed for convenience.  
<\/p>\n<p>var results = await collection.SearchAsync(<br \/>\n    vectorFieldName: &quot;movie_description&quot;,<br \/>\n    vectors: new ReadOnlyMemory&lt;float&gt;[] { new[] { 0.12217915f, -0.034832448f } },<br \/>\n    metricType: SimilarityMetricType.L2,<br \/>\n    limit: 3,<br \/>\n    parameters: parameters);<\/p>\n<p>The result is the following:<\/p>\n<p>[ Toy Story, Shrek, The Lion King ]<\/p>\n<h2>Using Semantic Kernel<\/h2>\n<p>If you\u2019re using Milvus with Semantic Kernel, you can use the <a href=\"https:\/\/www.nuget.org\/packages\/Microsoft.SemanticKernel.Connectors.Milvus\">Milvus connector<\/a>.<\/p>\n<h2>Acknowledgements<\/h2>\n<p>Thanks to the Milvus organization and open-source community as well as the .NET Data Access, Azure App Services, and Semantic Kernel teams for collaborating on this effort.<\/p>\n<h2>Next steps<\/h2>\n<p>Try out the <a href=\"https:\/\/github.com\/Azure-Samples\/openai\/tree\/main\/Basic_Samples#datastores\">samples<\/a> and get started today!<\/p>\n<p>The post <a href=\"https:\/\/devblogs.microsoft.com\/dotnet\/get-started-milvus-vector-db-dotnet\/\">Get Started with Milvus Vector DB in .NET<\/a> appeared first on <a href=\"https:\/\/devblogs.microsoft.com\/dotnet\">.NET Blog<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>Vector databases have become an important component of Generative AI workloads powering scenarios like search and Retrieval Augmented Generation (RAG). 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":94,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","ast-disable-related-posts":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[7],"tags":[],"class_list":["post-196","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-dotnet"],"_links":{"self":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/posts\/196","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/comments?post=196"}],"version-history":[{"count":1,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/posts\/196\/revisions"}],"predecessor-version":[{"id":223,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/posts\/196\/revisions\/223"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/media\/94"}],"wp:attachment":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/media?parent=196"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/categories?post=196"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/tags?post=196"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}