RAG Tool

The RagTool is a dynamic knowledge base tool for answering questions using Retrieval-Augmented Generation.

RagTool

Description

The RagTool answers questions using Retrieval-Augmented Generation (RAG) through EmbedChain. It provides a dynamic knowledge base that can be queried to retrieve relevant information from various data sources. This makes it particularly useful for applications that draw on a large body of information and need to provide contextually relevant answers.
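To make the retrieval step of RAG concrete, here is a minimal, self-contained sketch. It does not use CrewAI or EmbedChain, and it is not how the RagTool is implemented: word-count vectors and cosine similarity stand in for a real embedding model and vector store, and the names tokenize, cosine_similarity, retrieve, and build_prompt are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, top_k=1):
    """Return the top_k documents most similar to the query."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine_similarity(query_vec, Counter(tokenize(doc))), doc)
        for doc in documents
    ]
    # Highest similarity first; the sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query, documents, top_k=1):
    """Place the retrieved context ahead of the question, as a RAG prompt would."""
    context = "\n".join(retrieve(query, documents, top_k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Toy knowledge base (illustrative sentences, not real documentation).
docs = [
    "CrewAI agents can be given tools to perform tasks.",
    "PostgreSQL is a relational database system.",
    "YouTube videos can be transcribed into text.",
]

print(build_prompt("Which database is relational?", docs))
```

In a real RAG pipeline the prompt produced by the last step would be sent to a language model, which answers using the retrieved context rather than its training data alone.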

Example

The following example demonstrates how to initialize the tool and use it with different data sources:

    from crewai import Agent
    from crewai.project import agent
    from crewai_tools import RagTool

    # Create a RAG tool with default settings
    rag_tool = RagTool()

    # Add content from a file
    rag_tool.add(data_type="file", path="path/to/your/document.pdf")

    # Add content from a web page
    rag_tool.add(data_type="web_page", url="https://example.com")

    # Define an agent with the RagTool (a method of your crew class, hence `self`)
    @agent
    def knowledge_expert(self) -> Agent:
        '''
        This agent uses the RagTool to answer questions about the knowledge base.
        '''
        return Agent(
            config=self.agents_config["knowledge_expert"],
            allow_delegation=False,
            tools=[rag_tool],
        )

Supported Data Sources

The RagTool can be used with a wide variety of data sources, including:

- PDF files
- CSV files
- JSON files
- Text
- Directories/Folders
- HTML Web pages
- YouTube Channels
- YouTube Videos
- Documentation websites
- MDX files
- DOCX files
- XML files
- Gmail
- GitHub repositories
- PostgreSQL databases
- MySQL databases
- Slack conversations
- Discord messages
- Discourse forums
- Substack newsletters
- Beehiiv content
- Dropbox files
- Images
- Custom data sources

Parameters

The RagTool accepts the following parameters:

- summarize: Optional. Whether to summarize the retrieved content. Default is False.
- adapter: Optional. A custom adapter for the knowledge base. If not provided, an EmbedchainAdapter will be used.
- config: Optional. Configuration for the underlying EmbedChain App.

Adding Content

You can add content to the knowledge base using the add method:

    # Add a PDF file
    rag_tool.add(data_type="file", path="path/to/your/document.pdf")

    # Add a web page
    rag_tool.add(data_type="web_page", url="https://example.com")

    # Add a YouTube video
    rag_tool.add(data_type="youtube_video", url="https://www.youtube.com/watch?v=VIDEO_ID")

    # Add a directory of files
    rag_tool.add(data_type="directory", path="path/to/your/directory")

Agent Integration Example

Here's how to integrate the RagTool with a CrewAI agent:

    from crewai import Agent
    from crewai.project import agent
    from crewai_tools import RagTool

    # Initialize the tool and add content
    rag_tool = RagTool()
    rag_tool.add(data_type="web_page", url="https://docs.crewai.com")
    rag_tool.add(data_type="file", path="company_data.pdf")

    # Define an agent with the RagTool (a method of your crew class, hence `self`)
    @agent
    def knowledge_expert(self) -> Agent:
        return Agent(
            config=self.agents_config["knowledge_expert"],
            allow_delegation=False,
            tools=[rag_tool],
        )

Advanced Configuration

You can customize the behavior of the RagTool by providing a configuration dictionary:

    from crewai_tools import RagTool

    # Create a RAG tool with custom configuration
    config = {
        "app": {
            "name": "custom_app",
        },
        "llm": {
            "provider": "openai",
            "config": {
                "model": "gpt-4",
            },
        },
        "embedding_model": {
            "provider": "openai",
            "config": {
                "model": "text-embedding-ada-002",
            },
        },
    }

    rag_tool = RagTool(config=config, summarize=True)

Conclusion

The RagTool lets you create and query a knowledge base built from many kinds of data sources. Using Retrieval-Augmented Generation, it gives agents access to relevant information so they can provide accurate, contextually appropriate responses.
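As a follow-up to the Advanced Configuration section, the sketch below assembles a configuration dictionary with the same three keys (app, llm, embedding_model) from a few arguments. The helper build_rag_config is hypothetical and not part of crewai_tools. It assumes both models come from the same provider, and it only checks that the model names are non-empty strings, not that the provider actually offers them.

```python
def build_rag_config(llm_model, embedding_model, provider="openai", app_name="custom_app"):
    """Assemble a config dict in the shape used in the Advanced Configuration section.

    Hypothetical helper for illustration; it is not provided by crewai_tools.
    """
    for label, value in (("llm_model", llm_model), ("embedding_model", embedding_model)):
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"{label} must be a non-empty string")
    return {
        "app": {"name": app_name},
        "llm": {"provider": provider, "config": {"model": llm_model}},
        "embedding_model": {"provider": provider, "config": {"model": embedding_model}},
    }


# Same values as the configuration dictionary in the Advanced Configuration section.
config = build_rag_config("gpt-4", "text-embedding-ada-002")
print(config)
```

The resulting dictionary can be passed to the tool as RagTool(config=config, summarize=True), exactly like a hand-written one.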