bootcamp/Evaluation/eval_ragas.ipynb (1 addition, 1 deletion)
@@ -14,7 +14,7 @@
"Ragas is an open source project for evaluating RAG components. [Paper](https://arxiv.org/abs/2309.15217), [Code](https://docs.ragas.io/en/stable/getstarted/index.html), [Docs](https://docs.ragas.io/en/stable/getstarted/index.html), [Intro blog](https://medium.com/towards-data-science/rag-evaluation-using-ragas-4645a4c6c477).\n",
bootcamp/Integration/openai_embedding.ipynb (2 additions, 2 deletions)
@@ -15,11 +15,11 @@
"\n",
"On January 25, OpenAI released 2 latest embedding models, `text-embedding-3-small` and `text-embedding-3-large`. Both embedding models has better performance over `text-embedding-ada-002`. The `text-embedding-3-small` is a highly efficient model. With 5X cost reduction, it achieves slight higher [MTEB](https://huggingface.co/spaces/mteb/leaderboard) score of 62.3% compared to 61%. `text-embedding-3-large` is OpenAI's best performing model, with 64.6% MTEB score.\n",
"More impressively, both models support trading-off performance and cost with a technique called \"Matryoshka Representation Learning\". Users can get shorten embeddings for vast reduction of the vector storage cost, without sacrificing the retrieval quality much. For example, reducing the vector dimension from 3072 to 256 only reduces the MTEB score from 64.6% to 62%. However, it achieves 12X cost reduction!\n",
Milvus uses a shared-storage [architecture](https://milvus.io/docs/architecture_overview.md) with four layers that are mutually independent in terms of scaling and disaster recovery: 1) access layer, 2) coordinator service, 3) worker nodes, and 4) storage. Milvus also includes data sharding, logs-as-data persistence, and streaming data ingestion.
bootcamp/OpenAIAssistants/custom_RAG_workflow.ipynb (4 additions, 2 deletions)
@@ -20,7 +20,7 @@
"Using open-source Q&A with retrieval saves money since we make free calls to our own data almost all the time - retrieval, evaluation, and development iterations. We only make a paid call to OpenAI once for the final chat generation step. \n",
bootcamp/RAG/advanced_rag/README.md (15 additions, 15 deletions)
@@ -9,7 +9,7 @@ It's important to note that we'll only provide a high-level exploration of these
The diagram below shows the most straightforward vanilla RAG pipeline. First, document chunks are loaded into a vector store (such as [Milvus](https://milvus.io/docs) or [Zilliz cloud](https://zilliz.com/cloud)). Then, the vector store retrieves the Top-K most relevant chunks related to the query. These relevant chunks are then injected into the [LLM](https://zilliz.com/glossary/large-language-models-\(llms\))'s context prompt, and finally, the LLM returns the final answer.
*(diagram: the vanilla RAG pipeline)*
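For reference, here is a minimal sketch of that vanilla pipeline, assuming an OpenAI API key in the environment and Milvus Lite via `pymilvus`; the model names are examples, and the `embed()` and `ask()` helpers defined here are reused by the sketches that follow.

```python
from openai import OpenAI
from pymilvus import MilvusClient

oai = OpenAI()
milvus = MilvusClient("vanilla_rag.db")  # Milvus Lite: a local file, for illustration

def embed(text: str) -> list[float]:
    # text-embedding-3-small returns 1536-dimensional vectors
    return oai.embeddings.create(model="text-embedding-3-small", input=text).data[0].embedding

def ask(prompt: str) -> str:
    resp = oai.chat.completions.create(
        model="gpt-4o-mini", messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

# 1) Load document chunks into the vector store.
milvus.create_collection(collection_name="docs", dimension=1536)
chunks = [
    "Milvus supports HNSW and IVF vector indexes.",
    "Zilliz Cloud is a fully managed Milvus service.",
]
milvus.insert("docs", [{"id": i, "vector": embed(c), "text": c} for i, c in enumerate(chunks)])

# 2) Retrieve the Top-K chunks most relevant to the query.
query = "Which index types does Milvus support?"
hits = milvus.search("docs", data=[embed(query)], limit=2, output_fields=["text"])[0]
context = "\n".join(hit["entity"]["text"] for hit in hits)

# 3) Inject the retrieved chunks into the prompt and let the LLM produce the final answer.
print(ask(f"Answer using only this context:\n{context}\n\nQuestion: {query}"))
```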
## Various Types of RAG Enhancement Techniques
@@ -31,15 +31,15 @@ Let's explore four effective methods to enhance your query experience: Hypotheti
Creating hypothetical questions involves utilizing an LLM to generate multiple questions that users might ask about the content within each document chunk. Before the user's actual query reaches the LLM, the vector store retrieves the most relevant hypothetical questions related to the real query, along with their corresponding document chunks, and forwards them to the LLM.
This methodology bypasses the cross-domain asymmetry problem in the vector search process by directly engaging in query-to-query searches, alleviating the burden on vector searches. However, it introduces additional overhead and uncertainty in generating hypothetical questions.
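A minimal sketch of this idea, reusing the illustrative `embed()`/`ask()` helpers and `milvus` client from the vanilla pipeline above (collection and field names are assumptions, the prompt wording is only an example):

```python
# Generate hypothetical questions per chunk and index the questions themselves;
# each question record keeps a pointer back to its source chunk.
question_records = []
for chunk in chunks:
    lines = ask(
        "Write three short questions a user might ask that this passage answers, "
        f"one per line:\n{chunk}"
    ).splitlines()
    for q in (line.strip() for line in lines):
        if q:
            question_records.append(
                {"id": len(question_records), "vector": embed(q),
                 "question": q, "source_chunk": chunk}
            )

milvus.create_collection(collection_name="hypo_questions", dimension=1536)
milvus.insert("hypo_questions", question_records)

# At query time: a query-to-query search, then the matched questions' source
# chunks become the LLM context.
hits = milvus.search("hypo_questions", data=[embed(query)], limit=3,
                     output_fields=["source_chunk"])[0]
context = "\n".join(hit["entity"]["source_chunk"] for hit in hits)
```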
### HyDE (Hypothetical Document Embeddings)
HyDE stands for Hypothetical Document Embeddings. It leverages an LLM to craft a "***Hypothetical Document***", a ***fake*** answer written in response to the user query without any retrieved context. This fake answer is then converted into vector embeddings and used to query the most relevant document chunks in a vector database. The vector database retrieves the Top-K most relevant chunks and transmits them, together with the original user query, to the LLM to generate the final answer.
*(diagram: the HyDE workflow)*
This method is similar to the hypothetical question technique in addressing cross-domain asymmetry in vector searches. However, it also has drawbacks, such as the added computational costs and uncertainties of generating fake answers.
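A minimal HyDE sketch under the same assumptions (illustrative helpers and collection names from the vanilla pipeline above):

```python
user_query = "How does Milvus handle scalar filtering during vector search?"

# 1) Ask the LLM for a "hypothetical document": a plausible but unverified answer.
fake_answer = ask(f"Write a short passage that plausibly answers: {user_query}")

# 2) Embed the fake answer and use it to retrieve real chunks.
hits = milvus.search("docs", data=[embed(fake_answer)], limit=3, output_fields=["text"])[0]
context = "\n".join(hit["entity"]["text"] for hit in hits)

# 3) Generate the final answer from the real chunks plus the original query.
final_answer = ask(f"Context:\n{context}\n\nQuestion: {user_query}")
```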
@@ -56,7 +56,7 @@ Imagine a user asking: "***What are the differences in features between Milvus a
Once we have these sub-queries, we send them all to the vector database after converting them into vector embeddings. The vector database then finds the Top-K document chunks most relevant to each sub-query. Finally, the LLM uses this information to generate a better answer.
*(diagram: sub-query decomposition)*
By breaking down the user query into sub-queries, we make it easier for our system to find relevant information and provide accurate answers, even to complex questions.
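A sketch of sub-query decomposition with the same illustrative helpers; the prompt wording is only an example:

```python
complex_query = "What are the differences in features between Milvus and Zilliz Cloud?"

# 1) Decompose the complex question into simpler sub-questions.
sub_queries = [
    line.strip()
    for line in ask(
        "Split this question into the minimal set of simpler sub-questions, "
        f"one per line:\n{complex_query}"
    ).splitlines()
    if line.strip()
]

# 2) Retrieve Top-K chunks for each sub-question.
retrieved = []
for sq in sub_queries:
    hits = milvus.search("docs", data=[embed(sq)], limit=3, output_fields=["text"])[0]
    retrieved.extend(hit["entity"]["text"] for hit in hits)

# 3) Answer the original question over the de-duplicated union of results.
context = "\n".join(dict.fromkeys(retrieved))
answer = ask(f"Context:\n{context}\n\nOriginal question: {complex_query}")
```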
@@ -72,7 +72,7 @@ To simplify this user query, we can use an LLM to generate a more straightforwar
***Stepback Question: "What is the dataset size limit that Milvus can handle?"***
*(diagram: step-back prompting)*
This method can help us get better and more accurate answers to complex queries. It breaks down the original question into a simpler form, making it easier for our system to find relevant information and provide accurate responses.
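A sketch of step-back prompting with the same illustrative helpers; the user question below is an invented example:

```python
original_query = "Can Milvus handle my dataset of 250 million 1024-dimensional vectors?"

# 1) Ask the LLM for a more general "step-back" question.
stepback_query = ask(
    "Rewrite this question as a single, more general step-back question "
    f"about the underlying capability:\n{original_query}"
)

# 2) Retrieve with the step-back question, then 3) answer the original one.
hits = milvus.search("docs", data=[embed(stepback_query)], limit=3, output_fields=["text"])[0]
context = "\n".join(hit["entity"]["text"] for hit in hits)
answer = ask(f"Context:\n{context}\n\nQuestion: {original_query}")
```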
@@ -84,15 +84,15 @@ Enhancing indexing is another strategy for enhancing the performance of your RAG
When building an index, we can employ two granularity levels: child chunks and their corresponding parent chunks. Initially, we search for child chunks at a finer level of detail. Then, we apply a merging strategy: if a specific number, ***n***, of child chunks from the first ***k*** child chunks belong to the same parent chunk, we provide this parent chunk to the LLM as contextual information.
This methodology has been implemented in [LlamaIndex](https://docs.llamaindex.ai/en/stable/examples/retrievers/recursive_retriever_nodes.html).
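A minimal sketch of the merging rule itself, independent of any particular framework (the chunk ids and the threshold `n` are illustrative):

```python
# If at least `n` of the top-k child chunks share a parent, promote the parent
# chunk into the LLM context instead of the individual children.
from collections import Counter

def merge_to_parents(top_k_children, child_to_parent, n=2):
    """top_k_children: retrieved child-chunk ids, best-first; child_to_parent: id -> parent id."""
    parent_counts = Counter(child_to_parent[c] for c in top_k_children)
    promoted = {p for p, count in parent_counts.items() if count >= n}
    context = []
    for child in top_k_children:
        parent = child_to_parent[child]
        unit = parent if parent in promoted else child
        if unit not in context:  # de-duplicate while keeping rank order
            context.append(unit)
    return context

# e.g. children c1 and c2 both belong to parent P, so P replaces them in the context
print(merge_to_parents(["c1", "c3", "c2"], {"c1": "P", "c2": "P", "c3": "Q"}, n=2))
# ['P', 'c3']
```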
### Constructing Hierarchical Indices
When creating indices for documents, we can establish a two-level index: one for document summaries and another for document chunks. The vector search process comprises two stages: initially, we filter relevant documents based on the summary, and subsequently, we retrieve corresponding document chunks exclusively within these relevant documents.
This approach proves beneficial in situations involving extensive data volumes or instances where data is hierarchical, such as content retrieval within a library collection.
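A sketch of the two-stage search, assuming two collections, `summaries` (one vector per document) and `chunks` (one vector per chunk with an integer `doc_id` scalar field), and reusing the illustrative `embed()` helper and `milvus` client:

```python
q_vec = embed("How does Milvus replicate data across nodes?")

# Stage 1: pick the most relevant documents by their summaries.
doc_hits = milvus.search("summaries", data=[q_vec], limit=3, output_fields=["doc_id"])[0]
doc_ids = [hit["entity"]["doc_id"] for hit in doc_hits]

# Stage 2: search chunks, restricted to those documents via a metadata filter
# (doc_id is assumed to be an integer scalar field on the chunk collection).
chunk_hits = milvus.search(
    "chunks",
    data=[q_vec],
    limit=5,
    filter=f"doc_id in {doc_ids}",
    output_fields=["text"],
)[0]
```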
@@ -102,7 +102,7 @@ The Hybrid Retrieval and Reranking technique integrates one or more supplementar
Common supplementary retrieval algorithms include lexical frequency-based methods like [BM25](https://milvus.io/docs/embed-with-bm25.md) or models that produce sparse embeddings, like [Splade](https://zilliz.com/learn/discover-splade-revolutionize-sparse-data-processing). Re-ranking algorithms include RRF (Reciprocal Rank Fusion) or more sophisticated models such as [Cross-Encoder](https://www.sbert.net/examples/applications/cross-encoder/README.html), which uses a BERT-like architecture.
This approach leverages diverse retrieval methods to improve retrieval quality and address potential gaps in vector recall.
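For example, Reciprocal Rank Fusion (RRF) can merge a dense result list and a BM25 result list using nothing more than rank positions; a self-contained sketch:

```python
# Reciprocal Rank Fusion: score each document by the sum of 1 / (k + rank)
# across all result lists, then sort by the fused score.
def rrf(result_lists, k=60):
    """result_lists: lists of doc ids, each ordered best-first."""
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense_hits = ["doc3", "doc1", "doc7"]  # from vector search
bm25_hits = ["doc1", "doc9", "doc3"]   # from lexical search
print(rrf([dense_hits, bm25_hits]))    # doc1 and doc3 rise to the top
```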
@@ -114,15 +114,15 @@ Refinement of the retriever component within the RAG system can also improve RAG
In a basic RAG system, the document chunk used for embedding retrieval is the same chunk handed to the LLM. The Sentence Window Retrieval technique decouples the two: the chunk given to the LLM is a larger window encompassing the retrieved embedding chunk, so the information provided to the LLM covers a broader range of contextual details and minimizes information loss.
However, expanding the window size may introduce additional interfering information. We can adjust the size of the window expansion based on the specific business needs.
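A minimal sketch of the window-expansion step, assuming the document has already been split into an ordered list of sentences:

```python
# Embed and retrieve individual sentences, but hand the LLM a window of
# neighboring sentences around each hit.
def expand_window(sentences, hit_index, window=2):
    """Return the retrieved sentence plus `window` sentences on each side."""
    start = max(0, hit_index - window)
    end = min(len(sentences), hit_index + window + 1)
    return " ".join(sentences[start:end])

sentences = ["S0.", "S1.", "S2.", "S3.", "S4.", "S5."]
# Suppose vector search matched sentence index 3; the LLM receives S1..S5.
print(expand_window(sentences, hit_index=3, window=2))
```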
### Meta-data Filtering
To ensure more precise answers, we can refine the retrieved documents by filtering metadata like time and category before passing them to the LLM. For instance, if financial reports spanning multiple years are retrieved, filtering based on the desired year will refine the information to meet specific requirements. This method proves effective in situations with extensive data and detailed metadata, such as content retrieval in library collections.
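A sketch of such a filter using a Milvus boolean expression, assuming the collection stores scalar fields `year` and `category` alongside each chunk (reusing the illustrative client and `embed()` helper from above):

```python
# Combine vector similarity with a scalar metadata filter in one search call.
hits = milvus.search(
    "docs",
    data=[embed("What was the operating margin in 2023?")],
    limit=5,
    filter='year == 2023 and category == "financial_report"',
    output_fields=["text", "year"],
)[0]
```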
@@ -132,7 +132,7 @@ Let’s explore more RAG optimizing techniques by improving the generator within
The noise information within retrieved document chunks can significantly impact the accuracy of RAG's final answer. The limited prompt window in LLMs also presents a hurdle for more accurate answers. To address this challenge, we can compress irrelevant details, emphasize key paragraphs, and reduce the overall context length of retrieved document chunks.
This approach is similar to the earlier discussed hybrid retrieval and reranking method, wherein a reranker is utilized to sift out irrelevant document chunks.
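A minimal sketch of one such compression step: keep only chunks whose reranker score clears a threshold, within a rough context budget (the scores below are placeholders for whatever reranker you use):

```python
# Drop low-relevance chunks and cap the total context length before prompting.
def compress_context(scored_chunks, min_score=0.5, max_chars=2000):
    kept, used = [], 0
    for text, score in sorted(scored_chunks, key=lambda x: x[1], reverse=True):
        if score < min_score or used + len(text) > max_chars:
            continue
        kept.append(text)
        used += len(text)
    return "\n".join(kept)

scored = [("Relevant paragraph about index build times.", 0.91),
          ("Loosely related marketing copy.", 0.22),
          ("Key paragraph answering the question.", 0.88)]
print(compress_context(scored))
```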
@@ -142,7 +142,7 @@ In the paper "[Lost in the middle](https://arxiv.org/abs/2307.03172)," researche
Based on this observation, we can adjust the order of retrieved chunks to improve the answer quality: when retrieving multiple knowledge chunks, chunks with relatively low confidence are placed in the middle, and chunks with relatively high confidence are positioned at both ends.
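A self-contained sketch of that reordering, given chunks already ranked best-first:

```python
# Place the most relevant chunks at the two ends of the context and the least
# relevant ones in the middle ("lost in the middle" mitigation).
def reorder_for_llm(chunks_best_first):
    front, back = [], []
    for i, chunk in enumerate(chunks_best_first):
        (front if i % 2 == 0 else back).append(chunk)
    return front + back[::-1]

ranked = ["c1 (best)", "c2", "c3", "c4", "c5 (worst)"]
print(reorder_for_llm(ranked))  # ['c1 (best)', 'c3', 'c5 (worst)', 'c4', 'c2']
```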
@@ -156,16 +156,16 @@ Some initially retrieved Top-K document chunks are ambiguous and may not answer
We can conduct the reflection using efficient methods such as Natural Language Inference (NLI) models or additional tools such as internet searches for verification.
This concept of self-reflection has been explored in several papers or projects, including [Self-RAG](https://arxiv.org/pdf/2310.11511.pdf), [Corrective RAG](https://arxiv.org/pdf/2401.15884.pdf), [LangGraph](https://github.com/langchain-ai/langgraph/blob/main/examples/reflexion/reflexion.ipynb), etc.
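A rough sketch of the reflection loop; `nli_entails()` and `web_search()` are hypothetical stand-ins for an NLI model and a search tool, and `ask()` is the illustrative LLM helper sketched earlier:

```python
# Check whether the retrieved context actually supports the draft answer
# before returning it; otherwise verify/augment with an external search.
def reflect_and_answer(query, context_chunks):
    context = "\n".join(context_chunks)
    draft = ask(f"Context:\n{context}\n\nQuestion: {query}")
    # Does at least one retrieved chunk entail the draft answer? (hypothetical NLI call)
    supported = any(nli_entails(premise=c, hypothesis=draft) for c in context_chunks)
    if supported:
        return draft
    extra_context = web_search(query)  # hypothetical verification tool
    return ask(f"Context:\n{extra_context}\n\nQuestion: {query}")
```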
### Query Routing with an Agent
Sometimes, we don't have to use a RAG system to answer simple questions, since doing so might introduce misunderstanding or inferences drawn from misleading information. In such cases, we can use an agent as a router at the querying stage. This agent assesses whether the query needs to go through the RAG pipeline: if it does, the subsequent RAG pipeline is initiated; otherwise, the LLM addresses the query directly.
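A minimal routing sketch, reusing the illustrative `ask()` helper; `answer_with_rag()` stands for whichever RAG pipeline you built above, and the prompt wording is only an example:

```python
# Let an LLM decide whether the query needs retrieval at all.
def route(query):
    decision = ask(
        "Reply with exactly RAG or DIRECT. Does answering this question require "
        f"looking up facts from our private document collection?\n{query}"
    ).strip().upper()
    if decision.startswith("RAG"):
        return answer_with_rag(query)  # whichever pipeline from the sections above
    return ask(query)                  # simple question: let the LLM answer directly
```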
bootcamp/RAG/bedrock_langchain_zilliz_rag.ipynb (1 addition, 1 deletion)
@@ -131,7 +131,7 @@
"The zilliz cloud uri and zilliz api key can be obtained from the [Zilliz cloud console guide](https://docs.zilliz.com/docs/on-zilliz-cloud-console).\n",
"\n",
"In simple terms, you can access them on your zilliz cloud cluster page.\n",