This guide is designed for educational purposes to help you understand RAG concepts and how they work in LarAgent and AI development. The prompts, configurations, and implementations provided here are not fine-tuned or extensively tested for production use. Use this guide to learn and experiment, then build upon it with production-grade practices.
Retrieval-as-Tool is an advanced RAG approach where the AI agent decides
when and what to retrieve based on the conversation context. Unlike
traditional RAG that retrieves context for every query, this method gives the
agent tools to fetch information only when needed, making it more efficient
and context-aware.
- SQL Query Tool - For retrieving structured data from databases (users, orders, settings, etc.)
- Document Search Tool - For searching unstructured documentation using vector embeddings
How Retrieval-as-Tool Works
The key difference:
- Traditional RAG: Retrieves context for every query in the prompt
- Retrieval-as-Tool: Agent decides when to retrieve and what data source to query
Understanding the SQL Query Approach
The SQL query tool in this guide is designed for structured data retrieval
(database records like users, orders, products) and NOT for document
retrieval. For unstructured documents and FAQs, we’ll use the Document
Search tool with vector embeddings.
With both tools, the agent can:
- Query database tables for precise, structured information
- Search documentation for conceptual knowledge and procedures
- Combine both when needed (e.g., “Show me user John’s order history and the return policy”)
Prerequisites
Before starting this guide, make sure you have:

LarAgent Installed
You should have LarAgent installed and configured. If not, check the Quickstart guide.
Database Access
Ensure your Laravel application has database access configured. We’ll be using DB::select() for safe, read-only queries.

Vector Search Service (for Document Tool)
You’ll need a vector search service or database where your documents are indexed, for use by the Document Search tool.

Embeddings Generator
We recommend using openai-php/client since LarAgent already provides it.

Make sure you have some data in your database and documents indexed in your vector database for testing.
Implementation Steps
Step 1: Create Your Support Agent
Create a new agent using the artisan command. This creates app/AiAgents/SmartSupportAgent.php.
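The generator invocation might look like this (a sketch; make:agent is LarAgent's generator command as introduced in the Quickstart, so double-check the exact signature there):

```shell
php artisan make:agent SmartSupportAgent
```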
Step 2: Define Agent Instructions
Create a Blade template for your agent’s instructions at resources/views/prompts/smart_support_instructions.blade.php.
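As a starting point, the instructions template could look something like this (a sketch only; the wording is illustrative, not a tuned production prompt):

```blade
You are a smart support agent for our application.

You have two retrieval tools:
- queryDatabase: run read-only SQL SELECT queries for structured data (users, orders, products)
- searchDocumentation: search unstructured documentation using vector embeddings

Decide when to retrieve: answer from conversation context when you can,
call a tool only when you need fresh data, and combine both tools when a
question involves structured data and documentation (for example, a user's
order history plus the return policy).
```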
Step 3: Create the SQL Query Guardrail Agent
Before implementing the main agent, create a guardrail agent that validates SQL queries for safety in app/AiAgents/SqlGuardAgent.php:
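A minimal sketch of such a guardrail agent, assuming LarAgent's Agent base class and instructions() method from the Quickstart (the model name and the plain SAFE/UNSAFE reply protocol are illustrative choices):

```php
<?php

namespace App\AiAgents;

use LarAgent\Agent;

class SqlGuardAgent extends Agent
{
    protected $model = 'gpt-4o-mini';   // illustrative model choice
    protected $history = 'in_memory';   // no need to persist guard checks

    public function instructions()
    {
        return "You are a SQL safety validator. Reply with exactly SAFE if the "
            . "input is a single read-only SELECT statement, and exactly UNSAFE "
            . "otherwise (INSERT, UPDATE, DELETE, DROP, ALTER, TRUNCATE, "
            . "multiple statements, or anything that modifies data or schema).";
    }
}
```

The main agent can then treat any reply other than SAFE as a rejection.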
Step 4: Implement SQL Query Tool in Your Agent
Now, let’s implement the SQL query tool using the #[Tool] attribute in app/AiAgents/SmartSupportAgent.php:
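A sketch of the tool, assuming the #[Tool] attribute lives in LarAgent\Attributes\Tool and that agents are invoked via AgentName::for(...)->respond(...) as in the Quickstart:

```php
<?php

namespace App\AiAgents;

use Illuminate\Support\Facades\DB;
use LarAgent\Agent;
use LarAgent\Attributes\Tool;

class SmartSupportAgent extends Agent
{
    protected $model = 'gpt-4o-mini';
    protected $history = 'in_memory';

    public function instructions()
    {
        return view('prompts.smart_support_instructions')->render();
    }

    // The description tells the LLM when this tool is the right choice.
    #[Tool('Run a read-only SQL SELECT query for structured data such as users, orders, and products')]
    public function queryDatabase(string $query): string
    {
        // Every query goes through the guardrail agent first.
        $verdict = trim(SqlGuardAgent::for('sql-guard')->respond($query));

        if (strtoupper($verdict) !== 'SAFE') {
            return 'Query rejected: only read-only SELECT statements are allowed.';
        }

        // DB::select() runs the raw query; safety is enforced by the guard above.
        return json_encode(DB::select($query));
    }
}
```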
The queryDatabase tool validates every SQL query through the SqlGuardAgent before execution, ensuring only safe SELECT statements are processed. This prevents any data modification attempts.

Step 5: Add Document Search Tool with Enum Constraints
Now let’s add the document search tool. First, create an Enum to constrain the limit parameter in app/Enums/DocumentLimit.php:
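With PHP 8.1+ backed enums, the constraint takes only a few lines (the case names are illustrative):

```php
<?php

namespace App\Enums;

// A backed int enum: the LLM can only pick one of these three values.
enum DocumentLimit: int
{
    case Three = 3;
    case Four = 4;
    case Five = 5;
}
```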
Using an Enum for the limit parameter constrains the LLM’s choices to a predefined set of valid options. Instead of allowing any integer (which could lead to values like 1, 100, or even negative numbers), the LLM can only select from the three specific cases: 3, 4, or 5. This ensures more predictable behavior and prevents edge cases while still giving the agent flexibility to adjust the number of retrieved documents based on query complexity.

The simplified implementation returns raw JSON results, allowing the agent to interpret and present the information in the most appropriate way based on the conversation context. The Enum constraint ensures the agent can only request 3, 4, or 5 documents, preventing excessive retrieval while maintaining flexibility.
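A sketch of the document search tool using openai-php/client for embeddings. Here VectorStore is a hypothetical wrapper around your vector database (Pinecone, pgvector, Meilisearch, etc.): swap in your own client, and adjust the API key lookup to match your config.

```php
use App\Enums\DocumentLimit;
use LarAgent\Attributes\Tool;

#[Tool('Search the documentation for conceptual knowledge, FAQs, and procedures')]
public function searchDocumentation(string $query, DocumentLimit $limit = DocumentLimit::Three): string
{
    // Embed the query with openai-php/client (already pulled in by LarAgent).
    $embedding = \OpenAI::client(config('services.openai.key'))
        ->embeddings()
        ->create([
            'model' => 'text-embedding-3-small',
            'input' => $query,
        ])
        ->embeddings[0]->embedding;

    // Hypothetical vector store wrapper: nearest-neighbor search, top N results.
    $results = \App\Services\VectorStore::search($embedding, $limit->value);

    // Return raw JSON and let the agent interpret and present the passages.
    return json_encode($results);
}
```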
Understanding Enum Benefits
The DocumentLimit Enum provides several advantages:
- Type Safety: Prevents invalid values at the language level
- Clear Options: LLM sees exactly what choices are available
- No Validation Needed: Removes need for min/max boundary checks
- Semantic Meaning: Agent understands these are the only valid options
As a guideline for the agent:
- Simple questions → 3 documents (default)
- Moderate complexity → 4 documents
- Complex or multi-faceted queries → 5 documents
Testing Your Implementation
Interactive Testing
Test your agent using the built-in chat command (php artisan agent:chat SmartSupportAgent).

Testing SQL Guardrails
The guardrail agent should reject unsafe queries, such as INSERT, UPDATE, or DELETE statements.

Programmatic Testing
You can also test programmatically.

Debugging Tips
1. Monitor Tool Calls: Add logging to see which tools are being called.
2. Check Guardrail Decisions: Log the guardrail agent's decisions.
3. Verify Tool Selection: Set file-based chat history and check the JSON file in storage/app/private to learn whether the agent is choosing the right tools for different question types.
4. Monitor Token Usage: Tool calls can increase token consumption significantly.
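For the first two tips, Laravel's standard Log facade inside the tool method is enough (a sketch; $query and $verdict are the tool's local variables in the queryDatabase tool described above):

```php
use Illuminate\Support\Facades\Log;

// Tip 1: log every tool invocation and its arguments.
Log::info('Tool called: queryDatabase', ['query' => $query]);

// Tip 2: log the guardrail agent's decision next to the query it judged.
Log::info('SqlGuardAgent verdict', ['query' => $query, 'verdict' => $verdict]);
```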
Advanced: Combining Results
The agent will automatically synthesize information from multiple tool calls. For example:

User Question: “Show me the top 5 customers and explain the loyalty program benefits”

Agent’s Process:
- Calls queryDatabase with: SELECT name, email, total_purchases FROM customers ORDER BY total_purchases DESC LIMIT 5
- Calls searchDocumentation with: “loyalty program benefits”
- Combines results into a comprehensive answer
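To see this synthesis end to end, you can drive the agent directly (assuming the AgentName::for(...)->respond(...) API from the Quickstart; 'support-session' is an arbitrary chat history key):

```php
use App\AiAgents\SmartSupportAgent;

$answer = SmartSupportAgent::for('support-session')
    ->respond('Show me the top 5 customers and explain the loyalty program benefits');

// The reply should weave the SQL results and the policy docs into one answer.
echo $answer;
```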
Next Steps
- Add More Tools: Extend with API calls, external services, or specialized data sources
- Implement Caching: Cache frequent queries to improve performance
- Enhanced Guardrails: Add table-level permissions and query complexity limits
- Monitoring & Analytics: Track tool usage patterns and optimize performance
Comparing RAG Approaches
Vector-Based RAG vs Retrieval-as-Tool

Vector-Based RAG (Traditional):
- ✅ Simpler to implement
- ✅ Consistent context injection
- ❌ Retrieves for every query (less efficient)
- ❌ No selective retrieval

Retrieval-as-Tool:
- ✅ Agent decides when to retrieve
- ✅ More efficient (only retrieves when needed)
- ✅ Can combine multiple data sources
- ❌ More complex implementation
- ❌ Requires careful tool design
When to Use Each Approach

Use Vector-Based RAG when:
- Every query needs context from documentation
- Building a simple FAQ bot
- Working with a single knowledge source

Use Retrieval-as-Tool when:
- Queries vary significantly in data needs
- Multiple data sources (DB + docs + APIs)
- Need fine-grained control over retrieval
- Building complex conversational agents
For more information about RAG fundamentals in LarAgent, check the RAG Core Concept and Vector-Based RAG guides.
Summary
You’ve now implemented a sophisticated Retrieval-as-Tool RAG system with LarAgent! Your agent can:
- ✅ Intelligently decide when to retrieve information
- ✅ Query databases safely with SQL guardrails
- ✅ Search documentation using vector embeddings
- ✅ Combine multiple sources for comprehensive answers
- ✅ Validate and secure all data access
Key Takeaways
- Tools as Retrieval Methods: Using the #[Tool] attribute makes retrieval explicit and controllable
- Dual Data Sources: Structured (SQL) and unstructured (vectors) data serve different purposes
- Security First: Guardrail agents validate operations before execution
- Smart Decisions: The agent chooses when and what to retrieve based on context
- Extensibility: Easy to add more tools for APIs, external services, or specialized sources